How Do I Unit Test Something That Writes To A File?

This is a question that comes up periodically. Remember that a good unit test is one that is fast (< 100ms/test). A test that accesses files, databases or the network is not going to be this fast. But, we still have code that writes data to files or reads data from files and we want to unit test that code. So how can we do that?

The typical situation in C++ is that we have some code like this:

std::ofstream output("filename.txt");
// do a bunch of stuff that writes to stream 'output'
output.close();

If we take a look at the stream classes for C++, we’ll see that std::ofstream derives from std::ostream, which encapsulates the concept of an output stream. If we look deeper, we’ll see that std::ostringstream also derives from std::ostream. So if we change our code to use an ostream instead of an ofstream, we can test it with an ostringstream in our unit test and then validate the contents of the string stream.

Going back to our fictitious example, if we extract a new function from our existing code:

void write_output(std::ostream &output)
{
// do a bunch of stuff that writes to stream 'output'
}

// ...
std::ofstream output("filename.txt");
write_output(output);
output.close();

Now we can write a unit test for write_output by passing it an ostringstream:

BOOST_AUTO_TEST_CASE(write_output_to_string_stream)
{
    std::ostringstream output;
    write_output(output);
    verify_output(output.str());
}

In this example, I’m using the Boost.Test unit testing framework for C++. The function verify_output is something you write that validates that write_output wrote the proper data. Since my example is fictitious, I’m just showing this to indicate the structure of the test.
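To make that structure concrete, here is a minimal sketch. The body of write_output, its name and value parameters, the save function and the expected string are all assumptions made up for illustration, and plain assert stands in for Boost.Test so the snippet has no dependencies.

```cpp
#include <cassert>
#include <fstream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical body: the real function writes whatever your format needs.
void write_output(std::ostream &output, const std::string &name, int value)
{
    output << name << ": " << value << '\n';
}

// Production caller: the only code that knows a file is involved.
bool save(const std::string &filename, const std::string &name, int value)
{
    std::ofstream output(filename.c_str());
    if (!output)
        return false;
    write_output(output, name, value);
    return output.good();
}

// Test: no file is touched; the output lands in a string we can inspect.
void test_write_output()
{
    std::ostringstream output;
    write_output(output, "answer", 42);
    assert(output.str() == "answer: 42\n");
}
```

Only save still depends on the file system; everything that decides what gets written sits behind the std::ostream reference.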

Unit testing the file creation process

That lets you validate the contents of the file, but what about the enclosing logic that interacts with the filesystem? Things like checking whether a file exists, prompting the user before overwriting an existing file, and so on. How do I validate the error handling when creating a file fails because, say, the target folder is read-only?

To get at testing these interactions with the filesystem, you need to insert some sort of interface between your application and the filesystem so that you can supply a test double for the file system. The test double does everything in memory and “lies” to your application telling it about synthesized errors like no more room on the device or a read-only target directory.

The part that makes this sticky in traditional C++ code is that the construction of the ostream object is inlined and directly couples your production code to std::ofstream. However, if you push the construction of the ostream into an interface that encapsulates the file system operations, then you can configure your production code to use the test double under unit test and use the production implementation of the filesystem interface in the normal case.
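One possible shape for such an interface is sketched below. Every name here (file_system_interface, real_file_system, fake_file_system, save_greeting, the save_result values) is hypothetical, and the sketch assumes C++11 for std::unique_ptr. The fake hands out an ostream backed by an in-memory buffer and can be told to report that the file exists or that creation failed.

```cpp
#include <cassert>
#include <fstream>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical seam between the application and the file system.
class file_system_interface
{
public:
    virtual ~file_system_interface() { }
    virtual bool exists(const std::string &path) const = 0;
    // Returns a null pointer when the file cannot be created.
    virtual std::unique_ptr<std::ostream> create(const std::string &path) = 0;
};

// Production implementation: the only place that names std::ofstream.
class real_file_system : public file_system_interface
{
public:
    virtual bool exists(const std::string &path) const
    { return std::ifstream(path.c_str()).good(); }

    virtual std::unique_ptr<std::ostream> create(const std::string &path)
    {
        std::unique_ptr<std::ofstream> file(new std::ofstream(path.c_str()));
        if (!file->is_open())
            return std::unique_ptr<std::ostream>();
        return std::unique_ptr<std::ostream>(file.release());
    }
};

// Test double: everything happens in memory, and errors are synthesized.
class fake_file_system : public file_system_interface
{
public:
    fake_file_system() : file_exists(false), fail_create(false) { }

    virtual bool exists(const std::string &) const
    { return file_exists; }

    virtual std::unique_ptr<std::ostream> create(const std::string &)
    {
        if (fail_create)
            return std::unique_ptr<std::ostream>();
        // The stream writes into _buffer, which outlives the stream.
        return std::unique_ptr<std::ostream>(new std::ostream(&_buffer));
    }

    std::string contents() const
    { return _buffer.str(); }

    bool file_exists;  // pretend the target file is already there
    bool fail_create;  // pretend the directory is read-only, disk full, ...

private:
    std::stringbuf _buffer;
};

enum save_result { saved, already_exists, create_failed };

// Hypothetical production logic that we want to cover with unit tests.
save_result save_greeting(file_system_interface &fs, const std::string &path)
{
    if (fs.exists(path))
        return already_exists;
    std::unique_ptr<std::ostream> output = fs.create(path);
    if (!output)
        return create_failed;
    *output << "hello\n";
    return saved;
}

void test_save_greeting()
{
    fake_file_system ok;
    assert(save_greeting(ok, "greeting.txt") == saved);
    assert(ok.contents() == "hello\n");

    fake_file_system read_only;
    read_only.fail_create = true;
    assert(save_greeting(read_only, "greeting.txt") == create_failed);

    fake_file_system occupied;
    occupied.file_exists = true;
    assert(save_greeting(occupied, "greeting.txt") == already_exists);
}
```

Production code would pass a real_file_system to save_greeting; the tests never touch the disk.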

Unit testing with FILE *

I know what you’re thinking… you’ve got this pile of C++ code that was written in the style of C and uses FILE * all over the place. It’s too much of a pain to convert everything over to C++ iostreams (although Boost.Format will go a long way towards getting you there), but you still would like to retrofit some unit tests over all this file I/O without forcing the unit test to call the routine to write out to a temporary file and then validate the contents of the temporary file. (Although this would be better than nothing, it’s more of a “functional test” than a unit test because it’s too slow.)

Well, something you might be able to do is to wrap the FILE * opaque data type into a C++ class that adheres to an interface. Here’s an example of what the interface might look like:

class c_file_interface
{
public:
    virtual ~c_file_interface() { }

    virtual int fputc(int c) = 0;
    virtual int fprintf(const char *format, ...) = 0;
    // ... other members that manipulate an output FILE *
};

Now ordinarily you don’t want to use variable arguments (...) in a C++ class because it isn’t typesafe. But in this instance, what we’re trying to do is preserve the signature of the C style functions that manipulate a FILE *. So we’ve just removed the FILE * argument from the global functions when creating the signature of our member functions.

Here’s the production implementation of this interface that we use to write to real files:

class c_file : public c_file_interface
{
public:
    c_file(FILE *file) : _file(file) { }
    virtual ~c_file() { }

    // delegate to global ::fputc(int, FILE *)
    virtual int fputc(int c)
    { return ::fputc(c, _file); }

    // delegate to global ::vfprintf(FILE *, const char *, va_list)
    virtual int fprintf(const char *format, ...);

private:
    FILE *_file;
};

For brevity, I’ve left out the implementation of c_file::fprintf, but it’s just simple delegation like we did for fputc; because of the variable arguments, it has to forward them as a va_list to ::vfprintf rather than call ::fprintf directly. We’ll have to retrofit this interface onto code that writes to a FILE *. To make things simpler, pretend that we already did an “extract function” on the code that does the actual writing. We take this code:

void write_output(FILE *output)
{
    fprintf(output, "%s: %d", _name, _value);
}

…and turn it into this code:

void write_output(c_file_interface *output)
{
    output->fprintf("%s: %d", _name, _value);
}

This change is “mechanical” enough that we can automate it with a search and replace type operation in our favorite editor or IDE. We’ll still have a chunk of untestable code that surrounds the call to write_output: it will open the file with fopen, construct c_file from the FILE *, call write_output and then close the FILE * with fclose.
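The surrounding code described above might look roughly like the sketch below, which also fills in one possible c_file::fprintf. The function names save_output and check_save_output, the name and value parameters (members in the original snippet) and the file name are assumptions for illustration.

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string>

class c_file_interface
{
public:
    virtual ~c_file_interface() { }
    virtual int fputc(int c) = 0;
    virtual int fprintf(const char *format, ...) = 0;
};

class c_file : public c_file_interface
{
public:
    explicit c_file(FILE *file) : _file(file) { }
    virtual ~c_file() { }

    virtual int fputc(int c)
    { return ::fputc(c, _file); }

    // A "..." argument list cannot be passed on to ::fprintf directly,
    // so it is forwarded as a va_list to ::vfprintf instead.
    virtual int fprintf(const char *format, ...)
    {
        va_list args;
        va_start(args, format);
        const int result = ::vfprintf(_file, format, args);
        va_end(args);
        return result;
    }

private:
    FILE *_file;
};

// name and value are parameters here only to keep the sketch standalone.
void write_output(c_file_interface *output, const char *name, int value)
{
    output->fprintf("%s: %d", name, value);
}

// The chunk that stays outside unit tests: it talks to the real file system.
bool save_output(const char *path, const char *name, int value)
{
    FILE *file = ::fopen(path, "w");
    if (file == NULL)
        return false;
    c_file output(file);
    write_output(&output, name, value);
    return ::fclose(file) == 0;
}

// Slow end-to-end check. It touches the disk, so it is a functional test
// rather than a unit test. The file name is arbitrary.
void check_save_output()
{
    const char *path = "save_output_check.txt";
    const bool ok = save_output(path, "answer", 42);
    assert(ok);

    char buffer[32] = { 0 };
    FILE *in = ::fopen(path, "r");
    assert(in != NULL);
    const char *line = ::fgets(buffer, sizeof buffer, in);
    assert(line != NULL);
    ::fclose(in);
    ::remove(path);
    assert(std::string(buffer) == "answer: 42");
}
```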

This sounds like a bunch of busy work, but did we get anything for it? Yes: we can now write unit tests for all this legacy code that does file output through FILE * without reworking that code. We did have to transform it to use our interface, but because those changes are mechanical we can automate them with our editor or IDE. That keeps the likelihood of breaking something low, although we still have to be careful making the change. Once the change is in place, we can make enhancements or bug fixes to the file output routines and validate them with unit tests.

I’m not really inventing anything new here–I’m just following the advice Michael Feathers has in “Working Effectively with Legacy Code” and inserting what Michael calls an “object seam” in between my output code and the C file stream.

So great, now we’ve refactored the production code to use this abstraction. Now we can write a unit test for write_output using a test double:

class fake_c_file : public c_file_interface
{
public:
    fake_c_file() : _stream() { }
    virtual ~fake_c_file() { }

    // c_file_interface stuff:
    // write char to our string stream
    virtual int fputc(int c)
    { _stream << char(c); return c; }
    // write formatted arguments to our string stream
    virtual int fprintf(const char *format, ...);

    // return by value: ostringstream::str() yields a temporary string
    std::string str() const
    { return _stream.str(); }

private:
    std::ostringstream _stream;
};

Again, for brevity I’ve left out the implementation of fake_c_file::fprintf. This test double gathers all of its output into an ostringstream and lets us query the contents of the stream through fake_c_file::str. We’ll use str in our unit test to get at the written output and validate its contents.

BOOST_AUTO_TEST_CASE(write_output_to_fake_c_file)
{
    fake_c_file output;
    write_output(&output);
    verify_output(output.str());
}
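For completeness, here is one way the omitted fake_c_file::fprintf could be written. This is a sketch, not the only option: it assumes vsnprintf and va_copy are available (C99 / C++11), passes name and value as parameters to keep the snippet standalone, and uses plain assert in place of Boost.Test.

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <sstream>
#include <string>
#include <vector>

class c_file_interface
{
public:
    virtual ~c_file_interface() { }
    virtual int fputc(int c) = 0;
    virtual int fprintf(const char *format, ...) = 0;
};

class fake_c_file : public c_file_interface
{
public:
    virtual int fputc(int c)
    { _stream << char(c); return c; }

    virtual int fprintf(const char *format, ...)
    {
        va_list args;
        va_start(args, format);

        // First pass: ask vsnprintf how long the formatted text will be.
        va_list measure;
        va_copy(measure, args);
        const int length = ::vsnprintf(NULL, 0, format, measure);
        va_end(measure);

        if (length > 0)
        {
            // Second pass: format into a buffer with room for the '\0'.
            std::vector<char> buffer(static_cast<size_t>(length) + 1);
            ::vsnprintf(&buffer[0], buffer.size(), format, args);
            _stream.write(&buffer[0], length);
        }
        va_end(args);
        return length;
    }

    std::string str() const
    { return _stream.str(); }

private:
    std::ostringstream _stream;
};

// Same shape as the refactored write_output, with explicit parameters.
void write_output(c_file_interface *output, const char *name, int value)
{
    output->fprintf("%s: %d", name, value);
}

void test_write_output()
{
    fake_c_file output;
    write_output(&output, "answer", 42);
    output.fputc('\n');
    assert(output.str() == "answer: 42\n");
}
```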

What if someone else needs the FILE *?

Now this approach is all fine and dandy when you have enough control over the source base to make the change from FILE * to c_file_interface *. What if you’re using 3rd party libraries that take your FILE * to write part of the output? Well, there are a couple of things you can do in that case. You can decide not to test the functionality of the 3rd party component. After all, if you didn’t write this component, you probably figured that it worked as advertised anyway. Still, you may have calls to this 3rd party component intermingled with your own output code and you want to write a unit test to cover the whole thing. You can encapsulate the 3rd party component into an interface just like we did with FILE * so that you can fake out the 3rd party component in your unit test with a test double. Or you can separate your output code from the 3rd party component calls and unit test your output only.
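As a rough illustration of the encapsulation option, the sketch below wraps a made-up third-party call. vendor_write_footer is only a stand-in defined here so the snippet compiles; footer_writer_interface, vendor_footer_writer, fake_footer_writer and finish_report are likewise hypothetical names.

```cpp
#include <assert.h>
#include <stdio.h>

// Stand-in for a third-party function that insists on a FILE *.
// (Hypothetical; substitute the real library call.)
void vendor_write_footer(FILE *file)
{
    ::fputs("end of report\n", file);
}

// Interface that hides the third-party call from our output code.
class footer_writer_interface
{
public:
    virtual ~footer_writer_interface() { }
    virtual void write_footer() = 0;
};

// Production implementation: forwards to the third-party component.
class vendor_footer_writer : public footer_writer_interface
{
public:
    explicit vendor_footer_writer(FILE *file) : _file(file) { }
    virtual void write_footer()
    { vendor_write_footer(_file); }

private:
    FILE *_file;
};

// Test double: records that the call happened and writes nothing.
class fake_footer_writer : public footer_writer_interface
{
public:
    fake_footer_writer() : calls(0) { }
    virtual void write_footer()
    { ++calls; }

    int calls;
};

// Hypothetical code of ours that decides whether the footer is written.
void finish_report(footer_writer_interface &footer, bool include_footer)
{
    if (include_footer)
        footer.write_footer();
}

void test_finish_report()
{
    fake_footer_writer with_footer;
    finish_report(with_footer, true);
    assert(with_footer.calls == 1);

    fake_footer_writer without_footer;
    finish_report(without_footer, false);
    assert(without_footer.calls == 0);
}
```

The test covers our decision logic without running the third-party code at all, which matches the assumption that the component works as advertised.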
