Open-sourcing production code

One of my very first projects that actually made it to real users was Skivsamlingen, developed in the dawn of my PHP days. Skivsamlingen means The Record Collection and is, not surprisingly, a web application where users can list all their records.

The target group has admittedly shrunk somewhat since the era of iTunes, Spotify and Grooveshark, among others, began, but there are still active users on the site. These users are the reason I felt I had to clear my conscience after barely working on the site for many years. The possible solutions were to make time for the site, take it down, or simply get other people to work on it.

I chose the last option. By making the code public on GitHub, the users themselves get the chance to improve it. There are, however, some things you need to keep in mind before throwing your code out there, especially if it is an older project that might not have been developed according to all of today’s best practices.

1. Remove sensitive data

First of all, make sure that any sensitive data is removed from files that will be made public. This includes passwords to your database and mail provider, encryption keys, API keys for third-party services and anything else that shouldn’t end up in the wrong hands. If this information has already been put under version control, simply removing those lines is of course not enough.

The first step is to find all these bits and pieces and gather them in a configuration file (or similar) that is kept out of the public repository. Make sure to add this file to your list of ignored files (.gitignore, if using Git). In my case, for example, a global salt used when hashing user passwords was located in the user model, which was under version control.
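As a rough illustration of that first step, here is a minimal Python sketch (the site itself is written in PHP, so this is not its actual code). The file name config/secrets.json, the key names and the APP_ prefix for environment variables are all assumptions made up for the example. The idea is simply that secrets live in one untracked file, with optional overrides from the environment, and that the application fails loudly if one is missing.

```python
import json
import os
from pathlib import Path

# Hypothetical location of the untracked secrets file (listed in .gitignore).
SECRETS_FILE = Path("config/secrets.json")

# Hypothetical key names, chosen only for this example.
REQUIRED_KEYS = ("db_password", "password_salt")


def load_secrets(path=SECRETS_FILE, environ=os.environ):
    """Read secrets from an untracked JSON file, with env var overrides."""
    secrets = {}
    path = Path(path)
    if path.exists():
        secrets = json.loads(path.read_text(encoding="utf-8"))

    # An environment variable such as APP_DB_PASSWORD overrides the file.
    for key in REQUIRED_KEYS:
        env_name = "APP_" + key.upper()
        if env_name in environ:
            secrets[key] = environ[env_name]

    # Fail early instead of running with a missing secret.
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    return secrets
```

The rest of the code would then ask load_secrets() for the salt instead of hard-coding it in the user model.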

The second step is to remove all traces of your secrets from old commit data, since they remain in the history even after the lines have been moved. If you are using Git, the git filter-branch command can rewrite the history for this purpose. GitHub has a guide on removing sensitive data which is very useful here. They also mention another important thing, namely changing any passwords used by the application, just in case something is missed.

2. Code review

Grab a cup of coffee. You will most likely need it. The next step is going through your code, line by line, and continually asking yourself: Where does this input come from? Am I enforcing any validation on this input? Where is it used? Is it properly escaped?

The goal is of course to cover your bases in terms of SQL injection, XSS, CSRF, session management and all the other possible attack scenarios. For those of you who barely understood what I was just talking about, I highly recommend that you visit OWASP, and especially that you make sure you grasp the concepts in the OWASP Top 10. Actually, I recommend it anyway. These are the must-haves of security.
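To make the "is it properly escaped?" question concrete, here is a small Python sketch of output escaping against XSS. The function name render_record_title and the markup are hypothetical and only serve the example. html.escape is part of the Python standard library.

```python
import html


def render_record_title(artist, title):
    """Build an HTML fragment, escaping both user-supplied values."""
    # html.escape replaces <, >, & and quotes with HTML entities, so
    # markup typed in by a user is shown as text instead of being executed.
    return "<li>{} - {}</li>".format(html.escape(artist), html.escape(title))
```

A title such as <script>alert(1)</script> would then be rendered as harmless text rather than run in the visitor's browser.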

I, for example, found an SQL injection vulnerability in the record listings. The sorting worked by adding the direction, ascending (ASC) or descending (DESC), to the URL. This parameter was then put directly into the SQL query that fetches the list of records. A stupid mistake, of course, but one that was easy to make, especially when new to the game. Try to patch these things up, so as not to make life easier for potential attackers.
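One common way to patch this kind of hole is to map the URL parameter to a fixed set of allowed keywords. The sketch below shows the idea in Python with the standard sqlite3 module. It is not the fix that went into the site, and the table and column names (records, user_id, title) are assumptions made up for the example.

```python
import sqlite3

# The only values that may ever end up in the ORDER BY clause.
ALLOWED_DIRECTIONS = {"asc": "ASC", "desc": "DESC"}


def list_records(conn, user_id, direction="asc"):
    """Return a user's record titles, sorted in the requested direction."""
    # Unknown or malicious values fall back to ascending order.
    order = ALLOWED_DIRECTIONS.get(str(direction).lower(), "ASC")

    # The keyword comes from our own whitelist, so appending it is safe.
    # The user id is still passed as a bound parameter.
    query = "SELECT title FROM records WHERE user_id = ? ORDER BY title " + order
    return [row[0] for row in conn.execute(query, (user_id,))]
```

Sort keywords cannot be sent as bound parameters, which is why a whitelist is used for the direction while ordinary values go through placeholders.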

3. Upgrade libraries

Once you have gone through your own code, some time should be spent on any external dependencies. Most of us use frameworks and libraries to speed up the development, and especially when a project has gone unattended for a while, these things tend to have moved on without you. Outdated libraries with known security flaws are common points of attack, since the problems are typically well-known and easy to exploit.

Not surprisingly, my site was running version 1.7.2 of CodeIgniter, which was released in September 2009. That’s three years ago. The latest update brought the version number to 2.1.2 and was released in June 2012. Looking through the change log revealed quite a few fixed security problems, as well as other improvements.

Final words

The steps described above can of course be applied to your code base even if you are not planning to go open-source. I don’t believe that open-sourcing code is always the way to go, obviously. The risks must always be weighed against the potential rewards, and these are fundamentally different for a small, non-profit project and a large business application.

On the other hand, I don’t buy the argument that revealing your code would make your site less secure. This concept is called security through obscurity and may work as an additional layer of security, but it does not by itself secure a flawed application.

Software and Other Mysteries

On code and productivity with a dash of unicorn dust.
