I’ve just finished reading “Working Effectively with Legacy Code” by Michael C. Feathers, a book that explains a variety of strategies and techniques for transforming a code base without unit tests into one with unit tests. Unit tests are small, quick tests of software components that validate their behavior.
Since 1999 I’ve been practicing “eXtreme programming” to one degree or another. Out of the XP processes came the “test-driven development” philosophy of writing the unit test first and then doing the implementation. The test fails until your implementation is complete. Once you have unit tests for code, you can apply design transformations, or refactorings, to improve the design or implementation. A refactoring is a code transformation intended to preserve the behavior of the application while changing its internal structure or implementation. The unit tests validate the behavior of the code, telling you when your refactoring has had unintended consequences. This gives you the confidence to make changes to the code without that nagging feeling that somewhere you might have introduced a bug that won’t be found until much, much later. Obviously, refactoring depends on unit tests. Without the unit tests, refactoring is just “code and pray” style programming.
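The cycle described above can be sketched concretely. This is a minimal illustration, not from the book (whose examples are in C++, Java, and C#); the `word_count` function and its tests are hypothetical names I made up:

```python
import unittest

# Hypothetical function under test. In TDD it is written *after* the
# tests below, which fail until the implementation is complete.
def word_count(text):
    """Count words separated by whitespace."""
    return len(text.split())

class WordCountTest(unittest.TestCase):
    # Written first, before word_count existed.
    def test_counts_whitespace_separated_words(self):
        self.assertEqual(word_count("legacy code needs tests"), 4)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=0)
```

With these tests in place, you could refactor the body of `word_count` (say, to a regular expression split) and the tests would immediately flag any change in observable behavior.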
Well, that’s all fine and dandy for code that you are writing from scratch: first you write the test, then you write the code. But when was the last time you had the luxury of working entirely on code that you built from scratch? Even when you get to code new features, it must usually integrate with existing code. What do you do if that existing code doesn’t have unit tests? How can you test your integration?
Michael Feathers’s book is the first book I’ve seen that answers these questions in a rich and robust manner. The book is full of practical advice and specific techniques for working tests into your legacy code. Each technique is presented with its pros and cons and there are examples in C++, C# and Java. The book is divided into three parts: The Mechanics of Change, Changing Software and Dependency-Breaking Techniques.
Part I: The Mechanics of Change starts out by defining legacy code:
“Legacy code is code without tests. Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.”
This is a pretty broad definition of legacy code, and I’m willing to bet that most of the code in existence qualifies under this model. I know that virtually all of the code I have ever written falls into this category, and I’ve been doing this crap since 1979! While I’ve read (and known) about test-driven development and unit tests since around 1999, it’s been all too easy to write the code first and then the unit tests (if we even get around to writing them), and I’m just as guilty of that sloth as anyone else. That’s because we’ve been writing new code that is coupled to an existing code base that isn’t under test, so it looks like we’d have to undertake a major effort to rework the existing code for unit testing before we could write unit tests for the new code. We shrug our shoulders, add the new code, and write unit tests for any new isolated components we introduce (which probably won’t be many, because the existing code base is coupled). We then rely on developer and manual testing to find the defects. If we’re lucky we’ll have a functional regression suite we can run against the product (but many products don’t have one of those either). So yeah, I’ve worked on a lot of legacy code!
Next, Michael gives his definition of a unit test:
“Here are the qualities of good unit tests:
- They run fast. A unit test that takes 1/10th of a second to run is a slow unit test.
- They help us localize problems.”
His argument is that when a suite of unit tests takes a long time to run, it’s less likely that people will run it often to validate their most recent changes. This increases the length of the feedback cycle: how long it takes from the time you make a change to the time you’re confident that the change works correctly and didn’t break any existing behavior. The point of unit tests is to shrink this feedback cycle to the time it takes to do a build. You build your code, the unit tests run on every build, and you get instant feedback on whether your change represents forward or backward progress. What are the implications of this approach to unit tests? Unit tests don’t talk over the network. Unit tests don’t talk to databases, except possibly in-memory “fake” databases. You can have tests that do these things, but they aren’t unit tests; they are higher-level tests, often called functional tests or integration tests.
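The in-memory “fake” idea can be sketched like this. The names here (`UserStore`, `FakeUserStore`, `notification_address`) are my own hypothetical example, not from the book:

```python
import unittest

# Hypothetical production interface; the real implementation would
# query a database over the network.
class UserStore:
    def find_email(self, user_id):
        raise NotImplementedError

# An in-memory "fake" database: no network, no disk, so tests using it
# stay fast enough to run on every build.
class FakeUserStore(UserStore):
    def __init__(self, users):
        self._users = dict(users)

    def find_email(self, user_id):
        return self._users.get(user_id)

# Code under test depends only on the UserStore interface.
def notification_address(store, user_id):
    email = store.find_email(user_id)
    return email if email is not None else "unknown@example.com"

class NotificationAddressTest(unittest.TestCase):
    def test_known_user(self):
        store = FakeUserStore({42: "ada@example.com"})
        self.assertEqual(notification_address(store, 42), "ada@example.com")

    def test_unknown_user_gets_fallback(self):
        store = FakeUserStore({})
        self.assertEqual(notification_address(store, 7), "unknown@example.com")

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=0)
```

A test against the real `UserStore` backed by a live database would still be valuable, but by the book’s definition it would be an integration test, not a unit test.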
Next, Michael covers the concepts of sensing and separation:
“1. Sensing–We break dependencies to sense when we can’t access values our code computes.
2. Separation–We break dependencies to separate when we can’t even get a piece of code into a test harness to run.”
Both of these concepts were new to me, but they are excellent observations about the kinds of things you need to do to a piece of code in order to get it into a test harness. Have you ever looked at writing a test for a class and said to yourself, “there’s no way I can get this class instantiated by itself in order to test this one method”? That’s because your class was coupled to lots of other classes, or even to external APIs, and it would be difficult if not impossible to instantiate it in isolation. That coupling, to another class or to an API, is the dependency that makes the class hard to test. By breaking the dependency you can get the class to work in isolation and make it testable. Of course, you don’t want to remove the coupling from the production code, because the whole reason this class talked to other classes in the first place was to collaborate with them to get work done.
Getting legacy code under test is the art and practice of identifying the points where you can break the dependencies of a class in order to get the class under unit tests. Michael introduces the concept of a seam to identify points where you can insert layers to break apart the dependencies and get a test harness wrapped around the class.
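Here is one way to picture a seam, using the book’s “Subclass and Override Method” technique from Part III. The `BillingService` class and its mailer dependency are hypothetical names of my own:

```python
# Hypothetical legacy class: sending the receipt is the dependency we
# can't afford in a unit test.
class BillingService:
    def charge(self, amount):
        receipt = "charged %.2f" % amount
        self.send_receipt(receipt)   # this method call is the seam
        return receipt

    def send_receipt(self, receipt):
        # Imagine this talks to a real mail server in production.
        raise RuntimeError("no mail server in tests!")

# Subclass and Override Method: the testing subclass exploits the seam,
# giving us *separation* (no mail server needed) and *sensing* (we can
# inspect what the class tried to send).
class TestableBillingService(BillingService):
    def __init__(self):
        self.sent = []

    def send_receipt(self, receipt):
        self.sent.append(receipt)

service = TestableBillingService()
service.charge(19.99)
print(service.sent)  # ['charged 19.99']
```

The production code is untouched; only the point of variation (the overridable method) is used to swap behavior at the seam.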
Part II: Changing Software is the meat of the book and gives practical strategies for getting legacy code under control of unit tests using the concepts introduced in Part I. Just look at the titles of the chapters in Part II and you’ll see many familiar conundrums you encounter when staring legacy code in the face:
- I Don’t Have Much Time and I Have to Change It
- It Takes Forever to Make a Change
- How Do I Add a Feature?
- I Can’t Get This Class into a Test Harness
- I Need to Make a Change. What Methods Should I Test?
- I Need to Make Many Changes in One Area
- I Need to Make a Change, but I Don’t Know What Tests to Write
- Dependencies on Libraries Are Killing Me
- My Application Is All API Calls
- I Don’t Understand the Code Well Enough to Change It
- My Application Has No Structure
- My Test Code Is in the Way
- My Project Is Not Object Oriented. How Do I Make Safe Changes?
- This Class Is Too Big and I Don’t Want It to Get Any Bigger
- I’m Changing the Same Code All Over the Place
- I Need to Change a Monster Method and I Can’t Write Tests For It
- How Do I Know That I’m Not Breaking Anything?
- We Feel Overwhelmed. It Isn’t Going to Get Any Better
This is where you’ll find the best nuggets, the place where you’ll be reading a chapter and saying to yourself, “man, I have faced that problem so many times!” Michael’s approach is very hands-on, and each chapter is illustrated with an example in C++, Java, or C#. Some techniques are only possible in certain languages, but all of them can be applied in one way or another in any object-oriented language. If you’re looking at a big pile of procedural C code, many of the techniques can still be applied, though in a limited or different form than the one you’d use in an object-oriented language. Each chapter in Part II refers to the dependency-breaking techniques of Part III that you use to tame the problem in question.
Part III: Dependency-Breaking Techniques covers the recipes used to separate a problematic class from its dependencies so that you can get the class instantiated in isolation under a test harness. One or more of these techniques are referenced as part of the solution to each of the problems discussed by a chapter in Part II. The techniques are cataloged in a fashion similar to “Refactoring: Improving the Design of Existing Code” by Martin Fowler. The techniques are:
- Adapt Parameter
- Break Out Method Object
- Definition Completion
- Encapsulate Global References
- Expose Static Method
- Extract and Override Call
- Extract and Override Factory Method
- Extract and Override Getter
- Extract Implementer
- Extract Interface
- Introduce Instance Delegator
- Introduce Static Setter
- Link Substitution
- Parameterize Constructor
- Parameterize Method
- Primitivize Parameter
- Pull Up Feature
- Push Down Dependency
- Replace Function with Function Pointer
- Replace Global Reference with Getter
- Subclass and Override Method
- Supersede Instance Variable
- Template Redefinition
- Text Redefinition
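To give a flavor of the catalog, here is a rough sketch of one entry, Parameterize Constructor. The `Session` class and its clock dependency are hypothetical names I chose for illustration; the book presents the technique in C++/Java/C#:

```python
import time

# Before: the constructor hard-wires its dependency, so every test of
# Session drags in the real system clock.
class SessionBefore:
    def __init__(self):
        self._clock = time.time
        self._started = time.time()

# After Parameterize Constructor: the dependency becomes a parameter
# with the old behavior as the default, so existing production callers
# are unaffected while tests can pass in a fake.
class Session:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._started = clock()

    def age(self):
        return self._clock() - self._started

# In a test, a controllable fake clock replaces time.time entirely:
ticks = iter([100.0, 160.0])
session = Session(clock=lambda: next(ticks))
print(session.age())  # 60.0
```

The defaulted parameter is what makes this a dependency-*breaking* technique rather than a redesign: behavior in production is preserved while the seam for testing is opened up.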
This book is an excellent partner to Fowler’s Refactoring. Fowler’s book tells you about transformations you can make on code that is already under test in order to improve its design. Feathers’s book tells you how to get legacy code under test so that you can do the refactoring with confidence. Both books demonstrate incremental approaches to improving code quality and represent the distilled essence of many, many consulting hours with development teams by their respective authors.
I strongly recommend Feathers’s book to anyone interested in applying unit tests to legacy code.