Implementing complex changes

Today I want to describe how I tackle implementing complex changes on a project. Example categories of such changes are:

  • Integrating a large library or component into an even larger program.
  • Supporting an additional environment.
  • Upgrading underlying framework or a library.

Specific examples that I dealt with in each of these categories:

  • In phpbb, making rewritten template engine work.
  • Adding python 3 support to pycurl.
  • Upgrading a rails 1 application to rails 2.

The shared trait of these changes is they deal with a large body of code and that code presently works. You want the code to remain working after the change. This is unlike adding a new feature where the feature did not "work" before by virtue of not existing. When changing a working application, the expectation is that it will largely if not entirely continue to work. Users are much more forgiving of new functionality not working than of old functionality breaking on them.

Typically the entire program or system is so large that manually testing it is impractical. In addition, the person implementing the change might not even have a very good grasp on the system in the first place. How can the change be done with as little breakage as possible in as little time as possible?

Test suite

A test suite is extremely important for validating that the system works correctly, or at all. Some environments and cultures write automated tests as part of normal development process, and some don't. My rules are:

  • If there is a test suite but it has failures, the failures must be fixed before any other work can commence.
  • If there is no test suite but I know ahead of time of the big project, I would start writing tests proactively. Any tests are better than nothing, and the more there are, the quicker code will be validated.
  • If there is no test suite and no opportunity to create one, I would forego the quality focus entirely. If the environment/framework being used ordinarily have great testing support, and this particular application did not bother with testing, it is a shortsightedness of whoever built it and they are now going to pay for it. If the environment/framework does not support testing, there is not much that I can do about it.

With the test suite implemeted and passing, the next step is to set up a continuous itegration server to run the tests automatically if one does not already exists. This is virtually a requirement for me to work on anything at all so might as well get it done now.


I assume you use version control. But for interactive rebasing you want git. Git can integrate with other revision control systems via bridges, so if your project does not use git ordinarily you can still make your changes in git.

One change per commit

Your commits should be small. It is much easier to squash commits than it is to split them, therefore, always err on the side of smaller commits.

Signs that your commits are doing too much:

  • You use the word "and" in a commit message.
  • You have a bulleted list in a commit message.
  • You commit once per day.
  • Your commit diff is more than a couple hundred lines long and it is not a result of a single "search and replace" operation.
  • Your commit is so large is slows your browser down to a crawl when you try to view it in github.

You should commit maybe every 30 minutes or so of actual code writing. If you are doing research, you are not writing code and this obviously does not apply. You should commit whenever you are done with each bug fix or feature.

Interactive rebasing

The following steps will heavily rely on git rebase feature, and particularly on interactive rebasing. There are four key operations that will be needed:

  1. Rebasing against current master, or development head.
  2. Reordering commits, to keep related commits together.
  3. Squashing commits, to eliminate changes that have been proven to be wrong and are of no value anymore.
  4. Splitting commits. Assuming you follow the rule of one change per commit, splitting typically is used when you make a change somewhere and update call sites in the same commit, and later want to redo how the change is made and kill the initial change entirely.

Keep task branch small

Your principal task is large. The branch it is in will be getting bigger every day. After a few days, it will stop being readable in github and then it will stop being reviewable. If you have a code review policy, the implications should be obvious. If you don't, guess what - you will not be able to review your work later when it breaks something and you investigate that problem.

The task branch is kept small by continuously extracting anything that is independent enough to work on its own, and making those changes to master. For example, in a Rails 2 to 3 upgrade one of the changes being done is replacement of RAILS_ENV with Rails.env. This is a large change in terms of number of lines of code changed, but easy to review and Rails 2 understands Rails.env as well. This is a perfect example of a change that should be done against master to keep the upgrade branch smaller.

After changes are merged to master, rebase the task branch on master. This is where interactive rebase and small commits start to come in. You can identify conflicts easier in small commits and you can check that the history is being correctly replanted on master via an interactive rebase.

Fix errors first

If you are dealing with a library or framework upgrade, two common issues are features that are removed and deprecated. Contrary to what many people seem to think, "removed" and "deprecated" are not the same thing: features that were removed produce errors when code attempts to use them, whereas features that were deprecated continue to work fine but maybe produce storms of warning messages.

Fix the code that uses removed features first. Ideally your tests should pass while your code continues to use deprecated features. Once all errors are taken care of, you can start dealing with deprecated features.

Do not get stuck trying to rebuild working functionality when parts of your application are outright broken!

Start with core changes

Many applications, such as those using Ruby on Rails, have levels of tests: unit tests on the bottom level, functional tests higher up, integration tests higher up still. If you are doing test-driven development and are fixing the failing tests resulting from your changes, start with the bottom level and work your way up. Make sure all unit tests pass before changing anything related to functional tests.

Keep master green

The test suite should always pass on master. Typically the tests will be broken on the task branch until the task is complete. If the task is not complete but the tests pass, write more tests! And once you add more tests, ask yourself if they will work on master, and if so backport them to master to keep the size of the task branch down.

Editing history

In a non-trivial project, you are likely to jump from one area of the code to another and back, many times. Eventually you will have a history with many commits, some of which turned out to not work. Some of those might be helpful to keep just so you know what you tried, but others are simply hopeless.

Don't keep the useless commits - instead, squash them with other commits to get a history that helps you get to your destination. The question "OK, what have I done with X?" should be easy to answer by glancing on your development history. You should not need to go through "first I did A, then that apparently did not work so I did B, hmm" type of process.

Reordering history - following changes

Especially if you are in a project or company that has code reviews, someone will need to review your work before it is accepted. If you give that person a single commit with a message "rails 2 to 3 upgrade" and a 10,000+ line diff, that person will not be doing any reviewing. Either the work will be accepted on faith, or you will be told to redo it. Companies will tend to accept the work because they paid for it. Free software projects will tend to reject a contribution like that because they did not pay for it.

This is where you use interactive rebase to reorder the commits, squash and split them to the point where each commit makes sense and logically follows from previous commits. Someone going through the commits should have a clear picture of the changes you made.

That someone might be you! If you are not the only developer, or if the complex task is only a part of what you have to do, chances are there are other commits happening on master concurrently with your work. If they conflict with your complex task you need to figure out how to resolve conflicts. As you will be changing your own implementation, it helps for your implementation to be readable, at least to you.

This process will not be easy to follow the first time, but the more you do it the better you will become. Good luck.

Rebase often

If the master branch continues to receive bugfixes or minor or even not so minor features while you are working on your complex task, make sure to rebase the task branch on master frequently. While you may need to deal with conflicts, each conflict will tend to be small, and it will be with recently done work so resolving it should be easier. Especially if you are the person who continues to commit to master, resolving conflicts with your own work right away is significantly more efficient than postponing conflict resolution until later.

Don't just fix test failures - think about why they happened

A failing test can, generally, be fixed in one of two ways:

  1. Change the test.
  2. Change the code the test is testing.

Usually, the test suites are supposed to change less frequently than the code they are testing. So, typically, if a change breaks a test then the code is broken in some way.

A similar situation can appear when upgrading frameworks and otherwise changing dependencies in major ways. Suppose you have code like this:

def foo():
    # ...

... and you have a test that foo works in a certain way. Now you upgrade a library and foo ceases to work properly. You find out that you can rewrite foo as follows:

def foo():
    # ...

... and that makes the test pass. Maybe bar and quux are different spellings of more-or-less the same thing. You can stop here, or you can think about why where bar worked previously it no longer does. Do you understand why the code used bar? Why bar changed behavior? Is quux really correct? Does anything else in your project rely on the behavior of bar that changed, perhaps calling some variation of bar or code that bar called or code that ends up calling bar in some fashion?

Especially for projects with low test coverage it is very important to think about these questions, because if the underlying semantics of what bar does changed the change may affect code you don't have test coverage for.

To give an example, one of the differences between Python 2 and 3 is string handling. In Python 3 there is no longer a "string-of-bytes" type; strings are either Unicode strings or byte sequences which are not really strings. PycURL has to pass a sequence of bytes to libcurl, and gets Unicode strings from Python code under Python 3 whereas previously it received "byte strings" that could be passed to libcurl as is. The submitted Python 3 patch converted Unicode data to bytes using C standard library, in the process of doing which it 1) leaked memory and 2) did not use Python-supplied encodings, so while the trivial tests using ASCII worked the code would have crashed and burned when faced with nontrivial Unicode data. The solution that I eventually went with involved requiring Python code to encode all Unicode strings before giving them to PycURL, because PycURL could not correctly encode them in all possible cases.