In my post on agile code reviews, I described how we were improving our code through automated refactorings that we trusted enough to apply without tests. Mostly we applied a series of Extract Method refactorings to long methods as part of performing a Compose Method refactoring on each long method. Eventually, though, your team will run out of situations where automated refactorings alone can improve the design of your code, and you’ll have to start making actual code changes. Several commenters on my previous post remarked that making changes without unit tests carries risk that can only be averted with sufficient manual testing, and that such risk is better mitigated by automated tests. In this follow-up post, I’ll describe the results our team got from applying the decoupling techniques described by Michael Feathers to our legacy code, so that we could backfill unit tests onto the legacy code in preparation for a design change we want to make.
Our application consists of a core set of 3D graphics functionality and extensions provided by a set of hosted plugins. One of the plugins provides parsing support for Poser PZ3 scene files. The parsing code had already been identified in our previous code reviews as having some of the longest methods in our code base, some of the highest-complexity methods, and a disproportionate share of the bugs we have to fix. The complexity and file-revision metrics we gathered on our code base confirmed the team’s observation.
While we were applying a series of automated Extract Method refactorings as part of a larger Compose Method refactoring to the long methods, we identified a design choice that we wanted to change. The parsing code breaks up the input into a series of tokens. However, the tokens are passed around as arrays of char pointers. We would like to change the parser so that the tokens are represented by an enum (one for each keyword and value token) and a possible string value payload. This would avoid a large number of string compares that we currently perform while parsing the input.
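The token representation we have in mind might look something like the following minimal sketch. The names TokenType, Token, and the particular keyword values are illustrative assumptions, not our actual identifiers:

```cpp
#include <string>

// Hypothetical sketch of the planned token representation: one enum
// value per keyword, plus a value token that carries a string payload.
// Comparing enum values replaces the repeated string compares.
enum TokenType {
    TOK_VERSION,
    TOK_ACTOR,
    TOK_MATERIAL,
    TOK_VALUE,     // a literal value; the payload lives in Token::value
    TOK_UNKNOWN
};

struct Token {
    TokenType   type;
    std::string value;   // only meaningful for TOK_VALUE
};

// Keyword dispatch becomes an integer comparison instead of a strcmp:
bool isKeyword(const Token &t)
{
    return t.type != TOK_VALUE && t.type != TOK_UNKNOWN;
}
```

The tokenizer pays the string-comparison cost once, when it classifies each token; every parsing method after that switches on a cheap enum.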
Hour 1: Extract the Class to be the SUT
The existing parser was bundled along with the plugin class. This was violating the single responsibility principle for a class, so we started by extracting a parser class from the plugin class. This gives us a single class as the system under test (SUT) that we can cover with tests.
We left the existing parser in the existing plugin class so that we could work on the parser class without affecting the plugin until the new parser was ready to replace the one in the existing plugin. This will result in a temporary duplication of functionality until the replacement is ready. We chose this method as we wanted to be conservative in how we affected existing code as we worked on placing unit tests around the existing parser. Before we re-integrate our new unit tested parser into the plugin, we will fit acceptance tests onto the existing parser using Fitnesse and the scripting support in our application. This will give us high-level acceptance tests that we can apply to the new parser as a regression before incorporating it into our existing code.
Splitting the existing parser code into its own class and getting that class to compile in the existing plugin project took about an hour. We timebox our code reviews to roughly an hour so the meeting doesn’t drag on; on one occasion we ran about 90 minutes because we wanted to spend the extra time on the subject matter.
Hour 2: Build a Test Framework Around the SUT
On our second code review of the parser class, we constructed a unit test project using Boost.Test and created a simple construction test for the extracted parser class. When retrofitting unit tests onto C++ code, linker dependencies quickly become an issue. If you’re not careful, you can end up bundling the entire application into your unit test executable just to satisfy link dependencies. In our case, the production code is packaged as a plugin shared library that only exports a handful of symbols used by the application to bind the plugin into the application.
We want to write a test for a class that isn’t directly exposed through the shared library interface, so we need another mechanism for accessing the class. We could separate the code to be tested into a static library and use that static library to build both the shared library and our test executable. Another alternative is to share the source code directly between the plugin project and the test project. We chose to share the source code. Because the PZ3 parser class is at the center of the plugin, it’s directly coupled to all the other types in the plugin as well as a number of concrete types in our core library. We initially tried adding as few source files from the plugin project to the test project as possible, but resolving the link dependencies ultimately required adding all of them. To keep the test code separate from the production code, we organized all the production code into a “Production” folder in the test project.
Hour 3: Testing parseVersion
Now that we had dispensed with building the necessary infrastructure around our class so we could test it, we were ready to write some tests. When writing tests for legacy code, you often need to decouple the legacy code from its environment in order to exercise the SUT in isolation. We browsed through the various methods on the parser, looking for a good candidate for our first tests that wasn’t highly coupled. We had already roughly broken the parser out into a single method for each keyword clause during our previous Compose Method refactoring work. We chose a relatively simple method for our first tests: parseVersion. This method had some simple input validation and code to parse a version number from a string token value. The validation code would issue a warning message when insufficient tokens were passed to the method.
The first problem is that the parseVersion method was private, and we needed to expose it in some way in order to test it in isolation. In our case, we knew that consumers of the plugin would never be exposed to any methods on the internal parser class, so we simply made this method public on the parser class. That allows us to call it from our unit tests.
The second problem is that the parseVersion method called another method, lineWarning, to issue a warning message in the context of parsing. This method in turn was coupled to members and other methods that didn’t have any bearing on the tests we wanted to write for parseVersion. We decided to decouple the test from this part of the parser’s implementation by using a derive-and-override technique. First, lineWarning is made virtual. Next, we change its visibility from private to protected. Finally, we create a testing class called DzPZ3ParserTester that derives from the production class DzPZ3Parser in the test project. In DzPZ3ParserTester we create an overriding implementation of lineWarning that simply senses when it is called and remembers the arguments passed to it.
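A minimal sketch of the derive-and-override technique follows. The class and method names match the ones above, but the parseVersion body, its signature, and the members are simplified assumptions, not our production code:

```cpp
#include <string>
#include <vector>

// Simplified stand-in for the production parser class.
class DzPZ3Parser {
public:
    virtual ~DzPZ3Parser() {}

    // Made public so the tests can drive it directly.
    bool parseVersion(const std::vector<std::string> &tokens) {
        if (tokens.size() < 2) {
            lineWarning("version clause: not enough tokens");
            return false;
        }
        m_version = tokens[1];
        return true;
    }

    std::string version() const { return m_version; }

protected:
    // Was private and non-virtual; now protected virtual so a
    // testing subclass can substitute its own implementation.
    virtual void lineWarning(const std::string & /*msg*/) {
        // production implementation logs with file/line context
    }

private:
    std::string m_version;
};

// Testing subclass that senses calls to lineWarning and remembers
// the argument passed to it.
class DzPZ3ParserTester : public DzPZ3Parser {
public:
    DzPZ3ParserTester() : m_warningCount(0) {}
    int warningCount() const { return m_warningCount; }
    std::string lastWarning() const { return m_lastWarning; }

protected:
    virtual void lineWarning(const std::string &msg) {
        ++m_warningCount;
        m_lastWarning = msg;
    }

private:
    int m_warningCount;
    std::string m_lastWarning;
};
```

The tests instantiate DzPZ3ParserTester, call parseVersion with too few tokens, and assert that exactly one warning was sensed; the happy path asserts the parsed version and zero warnings.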
By changing the visibility of some internal methods and decoupling our test from collaborating methods on the SUT, we were able to write unit tests covering all the code paths in parseVersion. Had we written this code in a test-driven fashion, it’s unlikely we would have ended up with the same visibility and virtual methods on our class; that is often the case when retrofitting unit tests onto legacy code. The team learned how to decouple existing code from the rest of the implementation in order to test it in isolation, through a simple example of derive-and-override and of exposing methods for testing by relaxing their visibility.
Hour 4: Testing parseActorOrigin
Next we wanted to write a test for the code that parses the origin for the current node and sets it on the node. This meant we needed a way to sense the parser’s interaction with the current node. As is often the case with legacy code, it’s not so easy to instantiate the collaborators for the SUT. In our case, we couldn’t construct a node because the node’s public constructor interacted with the global undo stack, and we don’t have a global undo stack in our test environment. The parseActorOrigin method interacted with the current node through a member variable, m_currNode.
We needed a way to sneak in a fake node. Furthermore, it didn’t need to be a real node, because the method merely checked that the pointer was non-zero in an error-checking case and then passed the pointer on to another method that set the node’s current position based on the parsed values.
We investigated a number of alternatives, one of which was extracting an interface on the node and having the parser class collaborate with the interface. Then we could create a fake node for the parser to use in our test. Ultimately this is probably the direction we want to go, since having an interface on our node will benefit us in the long run. However, that was too large a change for what we could get done in an hour, and it would involve changing more classes than just the parser, with the commensurate compile and test cycle.
Instead we decided to create a local variable for the node and initialize it from a new protected virtual member on the parser class that returned the current node pointer. We then used the local variable where we used to use the member variable. Now the method was no longer coupled directly to the member variable and we had a protected method we could override in our testing class to provide a suitable value for the pointer for use in the parseActorOrigin method.
To test the error case when the pointer is zero, we don’t need to construct a node, so we’re good there. But how do we test the normal case, when the node pointer is non-zero? Normally I don’t recommend the use of reinterpret_cast, but in this case it’s exactly what we need. In our test we configure the fake result of the currentNode method as reinterpret_cast<DzNode *>(0x1), which we store in a local variable appropriately called magicNode to indicate that this is purely a magic number. Yes, it’s damned ugly! It does let us test the relevant “if zero then error, else do useful work” cases, however.
With our magic node in hand, we were able to test the normal case of setting the parsed origin on the current node. We did another derive-and-override for the method that sets the position on the current node and sensed that it was called with the appropriate floating-point values that should be parsed from the string tokens we supply to the parser.
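Put together, the technique looks roughly like the sketch below. The signatures are simplified assumptions (the real parseActorOrigin consumes string tokens), and DzNode is only forward-declared because the fake pointer is never dereferenced:

```cpp
#include <cstddef>

class DzNode;  // forward declaration only; we never dereference the fake

// Simplified stand-in for the production parser class.
class DzPZ3Parser {
public:
    DzPZ3Parser() : m_currNode(0) {}
    virtual ~DzPZ3Parser() {}

    bool parseActorOrigin(double x, double y, double z) {
        DzNode *node = currentNode();   // was: direct use of m_currNode
        if (node == 0)
            return false;               // error path needs no real node
        setNodeOrigin(node, x, y, z);
        return true;
    }

protected:
    // New protected virtual seam returning the current node pointer.
    virtual DzNode *currentNode() { return m_currNode; }
    virtual void setNodeOrigin(DzNode *, double, double, double) {
        // production code sets the position on the node
    }

private:
    DzNode *m_currNode;
};

// Testing subclass: supplies the fake node and senses setNodeOrigin.
class DzPZ3ParserTester : public DzPZ3Parser {
public:
    DzPZ3ParserTester() : m_fakeNode(0), m_originSet(false), m_x(0), m_y(0), m_z(0) {}
    void setFakeNode(DzNode *node) { m_fakeNode = node; }
    bool originSet() const { return m_originSet; }
    double x() const { return m_x; }
    double y() const { return m_y; }
    double z() const { return m_z; }

protected:
    virtual DzNode *currentNode() { return m_fakeNode; }
    virtual void setNodeOrigin(DzNode *, double x, double y, double z) {
        m_originSet = true;
        m_x = x; m_y = y; m_z = z;
    }

private:
    DzNode *m_fakeNode;
    bool m_originSet;
    double m_x, m_y, m_z;
};
```

In the test, the magic node is supplied as reinterpret_cast<DzNode *>(0x1); because setNodeOrigin is also overridden in the tester, the fake pointer is never dereferenced.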
An alternative approach to providing currentNode and overriding it would have been to provide an overload of parseActorOrigin that took the affected node explicitly. Then we could write a test for that overload that supplied the current node pointer directly instead of indirectly through an overridden method. It’s a judgment call really; there is no one right way to do these things, but they all involve a level of indirection.
Hour 5: Testing parseMaterial
For our next code review session, I wanted to demonstrate isolating a method from its collaborators through an interface. To avoid spending time in the code review with the whole team watching one person type in a bunch of boilerplate code, I did a little preparation by extracting an interface and building a fake for that interface in advance. The parseMaterial method on the parser extracts surface material properties and sets them on a class designed to hold those surface material properties. Surface material properties consist of things like diffuse color, ambient color, specular color, reflection color, bump parameters, texture maps and transparency information.
I have an existing concrete class that stores this information for a default material. I want my parser method to collaborate with the interface and not the concrete class so that I can supply a polymorphic test double under test circumstances. When I’m extracting the interface, I have a choice between extracting the interface in isolation or extracting it onto the concrete class. I chose to extract the interface in isolation to avoid any problems with binary compatibility of the concrete class. In our case the concrete class is exposed as part of a plug-in SDK and adding new virtual functions or base classes with virtual functions to the existing concrete class will change the layout of its virtual function table and break binary compatibility.
If binary compatibility were not a concern, I would have made the existing production class implement the new interface and added new virtual methods to the production class. Instead, I’ll create an adapter class that delegates the methods on the interface to the existing concrete class. The production code for the parseMaterial method will instantiate the adapter for the existing concrete class and pass the adapter to the method I’m going to test. I extract the body of the existing parseMaterial method into a new method which I called parseMaterialAux. The new method takes an extra parameter for the material interface it will use as the target of the successfully parsed material properties. Using this technique we were able to get four test cases written for the parseMaterial method, covering some of its existing functionality.
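In sketch form, with illustrative names standing in for the real SDK types and only one material property shown, the interface-plus-adapter arrangement looks something like this:

```cpp
#include <string>

// New interface, extracted in isolation so the existing concrete
// class (and its vtable layout) is left untouched.
class IMaterial {
public:
    virtual ~IMaterial() {}
    virtual void setDiffuseColor(const std::string &color) = 0;
};

// Existing concrete SDK class: unchanged, for binary compatibility.
class DzDefaultMaterial {
public:
    void setDiffuseColor(const std::string &color) { m_diffuse = color; }
    std::string diffuseColor() const { return m_diffuse; }
private:
    std::string m_diffuse;
};

// Adapter delegates the interface methods to the concrete class.
class DzMaterialAdapter : public IMaterial {
public:
    explicit DzMaterialAdapter(DzDefaultMaterial &target) : m_target(target) {}
    virtual void setDiffuseColor(const std::string &color) {
        m_target.setDiffuseColor(color);
    }
private:
    DzDefaultMaterial &m_target;
};

// The extracted, testable body of parseMaterial: it talks to the
// interface, so tests can pass in a fake instead of the adapter.
void parseMaterialAux(IMaterial &material, const std::string &diffuseToken) {
    material.setDiffuseColor(diffuseToken);
}
```

Production code constructs the adapter around the real material and calls parseMaterialAux; the tests hand parseMaterialAux a fake IMaterial and sense what was set on it.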
To Be Continued…
The example I’ve described here demonstrates a few things. First, you aren’t going to get all your legacy code easily backfilled with unit tests. It only pays to exert this level of effort on classes that are critical to your application and that already have a tendency to break. In our case, the parser file has 4036 statements, with 151 methods on the parser class. Even with all the Compose Method refactoring we’ve already performed on this class, we still have 17 methods with a complexity higher than 30. Getting even 50% coverage of the paths through all these methods is going to take quite a few tests and quite a bit of decoupling.
However, as a team we gained quite a bit by spending the time to fit tests onto this existing code. First, we’ve all had a chance to learn some decoupling techniques together and see how they can be used to retrofit tests onto our existing code. Sure, we picked some of the easier and shorter methods to learn these techniques on, but the same techniques apply to longer and more coupled methods than the examples we chose; the only difference is how much of the same technique you have to apply before a method is decoupled from its collaborators enough to test it in isolation. It’s much, much easier to write tests for new code in a test-driven development style, but it’s still possible to fit tests onto existing code when you want to change that code.