Agile Code Reviews

Have you ever been tasked with doing classic “Fagan” style code reviews? They’re boring. They’re tedious. They take up lots of time and yield few, if any, results. At a previous employer, we did these kinds of code reviews. The process consisted of printing out the source code to be reviewed, one copy per reviewer. Each of us reviewed the code alone, making notes on the printout. Then we’d gather in a conference room and discuss our findings while one person combined all our notes into a list of items to rework. That person would rework the code alone and then report the results back to the group. Whew! This is what the agile development world would call a “high ceremony” process: lots of formal steps and the associated paperwork. Is there a better way? In this post, I’ll discuss what we do at my current employer that makes code reviews a fun team activity.

First, we banish the paperwork! Instead of printouts we use a projector and a laptop with a development IDE running (or a remote desktop session to a development machine with the IDE running). We have quite a bit of legacy code on our hands, so we don’t have the luxury of reviewing everything in our code base. With so much code, where do we focus our attention?

[Figure: “Concentration” — a Venn diagram of high complexity, high bug counts, and high revision counts, with the hot spots at their intersection]

The diagram above visualizes the idea that the hot spots in the code are the areas where high complexity, high bug counts, and high revision counts converge. This fits with your intuition: the problem areas of your code are complex, keep getting bugs reported against them, and have to be edited again and again.

That’s great, but how do we measure this stuff so that we can find it? First, let’s start with complexity. SourceMonitor is a great tool for measuring cyclomatic complexity and other metrics. SourceMonitor is free and I highly recommend it. You can even automate the gathering of metrics in your nightly build and accumulate a running dataset of how your code’s complexity changes over time. If your code is in really good shape, you can even fail the build when complexity exceeds a certain threshold.
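
As a sketch of what that build gate could look like, suppose you’ve exported SourceMonitor’s method-level metrics to a comma-separated file where the first column is the method name and the second is its cyclomatic complexity (the actual column layout depends on how you configure the export, so adjust the indexes to match your setup). A few lines of JScript run under cscript can then fail the nightly build with a non-zero exit code:

    // ComplexityGate.js -- fail the build if any method's cyclomatic
    // complexity exceeds THRESHOLD.  Reads a SourceMonitor metrics
    // export on standard input:
    //   cscript //nologo ComplexityGate.js < metrics.csv
    var THRESHOLD = 10;         // pick a ceiling that fits your code base
    var failures = 0;
    while (!WScript.StdIn.AtEndOfStream) {
        var fields = WScript.StdIn.ReadLine().split(",");
        if (fields.length < 2) continue;
        var complexity = parseInt(fields[1], 10);   // assumed column layout
        if (!isNaN(complexity) && complexity > THRESHOLD) {
            WScript.Echo("complexity " + complexity + " exceeds " +
                THRESHOLD + ": " + fields[0]);
            failures++;
        }
    }
    WScript.Quit(failures > 0 ? 1 : 0);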

For bug and revision counts, you can adopt either a metrics approach or an informal approach. On my team, we’ve adopted an informal approach: we review the highest-complexity methods and classes in our code base and leverage the collective memory of the team to identify which code we’ve changed often in response to bugs.

So what would the metrics-based approach look like? For this to work, you have to have some sort of association between bug reports and code. Some bug databases require that you enter the files that changed in order to correct the bug. If you have that information, you can mine it for the files associated with the most bugs. If your version control commit messages follow a convention that marks a revision as a bug fix, you can mine the commit messages instead to associate files with bug counts. For revision counts, you can mine your source code control system to identify the files that change most often. Even if a file isn’t changing in response to bug reports, frequent change alone might indicate that the code needs a review.
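
Here’s a sketch of the commit-message mining idea as a JScript file for Windows. It reads the output of svn log -v on standard input and counts, for each file, the number of commits that touched it whose log message contains a bug marker. The “bug #NNNN” marker is just an assumption for illustration; substitute whatever convention your team uses:

    // BugCounts.js -- tally, per file, the number of bug-fix commits
    // that touched it:
    //   svn log -v | cscript //nologo BugCounts.js
    // ASSUMPTION: bug-fix commits mention "bug #NNNN" somewhere in the
    // log message; change BUG_MARKER to match your convention.
    var BUG_MARKER = /bug\s*#\d+/i;
    var counts = {};            // path -> bug-fix commit count
    var paths = [];             // files touched by the current revision
    var message = "";           // log message of the current revision
    var inMessage = false;

    function flush() {
        if (BUG_MARKER.test(message)) {
            for (var i = 0; i < paths.length; i++) {
                counts[paths[i]] = (counts[paths[i]] || 0) + 1;
            }
        }
        paths = [];
        message = "";
        inMessage = false;
    }

    while (!WScript.StdIn.AtEndOfStream) {
        var line = WScript.StdIn.ReadLine();
        if (/^-{10,}$/.test(line)) {        // revision separator
            flush();
        } else if (inMessage) {
            message += line + "\n";
        } else if (line === "") {           // blank line ends the path list
            inMessage = true;
        } else {
            // Changed-path lines look like "   M /trunk/src/foo.cpp".
            var m = /^   [A-Z] (\/\S+)/.exec(line);
            if (m) paths.push(m[1]);
        }
    }
    flush();

    // Print the files in descending order of bug-fix commits.
    var table = [];
    for (var path in counts) table.push({ path: path, count: counts[path] });
    table.sort(function (a, b) { return b.count - a.count; });
    for (var j = 0; j < table.length; j++) {
        WScript.Echo(table[j].count + "\t" + table[j].path);
    }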

For revision counts, some version control systems, such as CVS, track the number of revisions to each file directly. Subversion attaches each revision to a single snapshot of the entire repository, so it doesn’t track revisions per file. It does track which files were modified in each revision, so with a simple script you can accumulate the revisions per file. RevisionCounts.js is a simple JScript file for Windows that processes the output of svn log -v on its standard input and outputs a tabulated list of revision counts, with the most often revised files listed first.
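
The heart of such a script might look something like the sketch below (the same changed-path parsing as above, minus the bug-marker filter; not necessarily the exact contents of RevisionCounts.js):

    // Tally revisions per file from "svn log -v" output:
    //   svn log -v | cscript //nologo RevisionCounts.js
    var counts = {};
    while (!WScript.StdIn.AtEndOfStream) {
        // Changed-path lines look like "   M /trunk/src/foo.cpp".
        var m = /^   [A-Z] (\/\S+)/.exec(WScript.StdIn.ReadLine());
        if (m) counts[m[1]] = (counts[m[1]] || 0) + 1;
    }
    var table = [];
    for (var path in counts) table.push({ path: path, count: counts[path] });
    table.sort(function (a, b) { return b.count - a.count; });
    for (var i = 0; i < table.length; i++) {
        WScript.Echo(table[i].count + "\t" + table[i].path);
    }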

Once we’re armed with this information, we can identify the hot spots. So what do we do about them? We treat the code review as a team refactoring session instead of a formal process with paperwork and log sheets. We use Visual Assist X, an automated refactoring tool, to refactor our code. Because the refactorings are applied automatically, we are less likely to introduce defects as we change the code.

When we have long methods that are hard to understand, we apply the Compose Method refactoring to break them down into smaller methods.

When we have names that don’t reveal their semantic intention, we apply the Rename Method refactoring to the offending names.

When we have lots of code duplication, we apply the Extract Method refactoring to eliminate the duplication.
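
To make these concrete, here’s a schematic before-and-after on a made-up processOrder function (sketched in JScript for illustration; none of this comes from our code base). Composing the long method turns its explanatory comments into method names, extracting the pieces makes them reusable, and renaming gives every piece an intention-revealing name:

    // Before: one long function doing three jobs, each marked by a comment.
    function processOrder(order) {
        // validate
        if (!order.id) throw new Error("order has no id");
        if (order.items.length === 0) throw new Error("order is empty");
        // total
        var total = 0;
        for (var i = 0; i < order.items.length; i++) {
            total += order.items[i].price * order.items[i].quantity;
        }
        // report
        WScript.Echo("Order " + order.id + " total: " + total);
    }

    // After: the comments became method names and each step reads as a
    // single intention-revealing statement.
    function processOrder(order) {
        validateOrder(order);
        reportTotal(order, computeTotal(order));
    }

    function validateOrder(order) {
        if (!order.id) throw new Error("order has no id");
        if (order.items.length === 0) throw new Error("order is empty");
    }

    function computeTotal(order) {
        var total = 0;
        for (var i = 0; i < order.items.length; i++) {
            total += order.items[i].price * order.items[i].quantity;
        }
        return total;
    }

    function reportTotal(order, total) {
        WScript.Echo("Order " + order.id + " total: " + total);
    }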

What makes this a team activity is that we discuss the code as a group while one person drives the code changes in the IDE. We discuss what would be a good name when we change names. We discuss how best to eliminate code duplication when we extract methods. We discuss the best way to compose a long method into a series of steps. Because we have a group of eyeballs looking at the code as it’s being changed, it’s somewhat like a super-charged pair programming experience. Little typos are caught right away, and any error introduced into the logic is caught by a member of the team.

Naturally, it would all be more comforting if we had a suite of unit tests or regression tests covering the code as we change it. So far, we have restricted ourselves to small refactorings that can be performed in an automated way by tools like Visual Assist X to minimize the danger of the changes. Over time, as we take care of the worst offending long and complex methods, there will be less that we can do in an automated way without the safety of tests. When that time approaches, we’ll have to spend some time fitting tests onto the code we are changing before we can safely make the change. What’s really interesting is that these team refactoring exercises have made the team more comfortable with refactoring on real code, giving them confidence in when and how to apply the refactorings to the new code they’re writing on a daily basis. Best of all, it has turned what used to be a chore (“code cleanup”) into a fun activity that we perform together. Along the way, we all get a little more familiar with the code and with techniques for taming unruly code.

4 Responses to “Agile Code Reviews”

  1. Hal Arnold Says:

    Nice article. We’ve done a little of this, and I will get my teams to do more!
    One question, though, wouldn’t you want to nearly always write safety tests first? I know with ‘legacy code’, there are situations where you might just not be able to write a test, where creating a ‘seam’ [as Michael Feathers calls it] is necessary before you can isolate the code to accept tests. But your description above seems to suggest that you do this as a matter of practice?

    Hal Arnold


  2. legalize Says:

    So far the refactoring we have done in these code reviews has been driven in an automated way by Visual Assist X. For instance, selecting a block of code and then clicking “Extract Method” in the IDE is much less risky than extracting the method manually.

    This is a temporary state of affairs: eventually we will run out of places where we can improve the code through these automated tools alone, and we will need to make changes manually, which will require putting some tests in place first.

    Adopting a full test-driven development style requires changing lots of habits. By getting the team comfortable with refactorings and how they remedy the long-method and complex-method code smells, we are only changing one thing at a time.

    Rome wasn’t built in a day! :-)


  3. C++ Static Analysis « Legalize Adulthood! Says:

    […] I discussed in my post on agile code reviews, SourceMonitor is a tool for measuring cyclomatic complexity on a code base. […]


  4. Agile Code Reviews, Part 2 « Legalize Adulthood! Says:

    […] Code Reviews, Part 2 9-October-2009 — legalize In my post on agile code reviews, I described how we were attacking our code through automated refactorings that we trusted to apply […]


