Reduce Technical Debt by Using Unit Tests as Documentation

Technical debt is an inevitable side effect of legacy code. Some code can (and should) be pruned, but institutional memory fades—what if there's a reason certain lines were included that may not be immediately obvious? Done right, unit tests can serve as documentation. Later on, these tests can illuminate what the developer was thinking when they created the code.

G. K. Chesterton wrote murder mysteries a century ago, but I wonder what sort of developer he would have been. In one of his books, he described a metaphorical fence or gate that obstructs the way with no discernible purpose. Then he advised that we ought not tear it down until we understand why someone troubled themselves to put it up in the first place.

This principle, known as Chesterton’s fence, can help us curate existing code, such as when you're improving an existing codebase and come across something you do not understand. But it can also paralyze us.

Technical Debt in Legacy Code

Compromises and bad design decisions slip into software as expedients during development and maintenance. We might be pressed for time, working from incomplete or erroneous requirements, or short of resources. These flaws damage quality, and we term them technical debt.

Left alone, technical debt grows, and our velocity grinds to a halt. It's not as if there's some payday lender who repossesses your ride; it's more like a loan shark who breaks your legs.

Of course, there is no literal loan shark to lend functionality in the present while threatening leg-breaking in the future. But a codebase afflicted by technical debt manifests debt-like behaviors: future obligations and compounding expenses. And, just like monetary debt, there are two cures: pay it off or go bankrupt.

We can pay off debt by refactoring our code to fix problems introduced earlier. Alternatively, we identify "bankruptcy" with a complete rewrite. Both approaches entail problems, as we shall see.

Right after you finish your code, you know where all the bodies are buried. But over time, your memory fades, and more significantly, so does your organization's institutional memory. This is a primary contributor to the compounding expenses of technical debt. Suboptimal interfaces add to the expense: they hobble everything that talks to your system, so you'll need to coordinate repairs to those interfaces with everything that uses your system.

Customers who rely on side effects may resist your attempts to effect repairs that would jeopardize those side effects. It's as if you failed to remove the scaffolding from a building project, and the new homeowner uses it to access the second floor because everyone overlooked the missing staircase.

All these things cost money and effort that increase with time. These are expenses to live with now that turn into greater expenses to correct later.

When I examine legacy code, my first impression is that it seems needlessly complex. I may ask, "Who was the idiot who wrote this?" Usually, that idiot was me. When my memory returns, I remember that the suspect logic stemmed from bug fixes for the corner and edge cases that I overlooked in my overly utopian design.

Unit Tests as Documentation

Ugliness in legacy code constitutes a veritable maze of Chesterton's fences. If you're lucky, there's a unit test that shows why each one exists. If you're unlucky, you'll have to either find out why it's there or risk breaking something after you remove it. But there are things you can do to make yourself lucky.

Done right, unit tests serve as documentation. I fantasize about a three-ring binder filled with little notes to myself reminding me of what I was thinking when I wrote certain bits of code. Instead of this three-ring binder, we have a project file. Memorialize every significant decision during development: Tersely state the issue, your thinking about it, and how you resolved it.

Sadly, project files are seldom started or maintained. They're too easily rendered obsolete when we fail to keep them up to date. The funny thing is, when I'm too hurried and harried to write a one-page note, I'll dash off a ten-page note instead. Too much useless documentation is worse than none at all.

Alternatively, well-thought-out unit tests can document every decision made in your code. Give each unit test one reason to fail and one thing to demonstrate.
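To make this concrete, here is a minimal sketch in Python. The function parse_quantity and the decisions behind its two odd-looking branches are hypothetical, invented only for illustration. Each test has one reason to fail, and its name states the decision it documents.

```python
import unittest


def parse_quantity(text):
    """Hypothetical legacy function: parse the quantity field of an order form."""
    cleaned = text.strip()
    # Looks needless; see test_blank_field_means_one_item.
    if cleaned == "":
        return 1
    # Looks needless; see test_trailing_unit_suffix_is_ignored.
    if cleaned.lower().endswith("pcs"):
        cleaned = cleaned[:-3].strip()
    return int(cleaned)


class ParseQuantityDecisions(unittest.TestCase):
    def test_plain_number_is_parsed(self):
        self.assertEqual(parse_quantity("12"), 12)

    def test_blank_field_means_one_item(self):
        # Assumed decision: a blank field means a single item, not an error.
        self.assertEqual(parse_quantity("   "), 1)

    def test_trailing_unit_suffix_is_ignored(self):
        # Assumed decision: some forms append "pcs" to the number.
        self.assertEqual(parse_quantity("3 pcs"), 3)
```

Saved to a file, these tests can be run with python -m unittest. Someone tempted to delete either branch will see a named test fail, and the name tells them why the branch was put there.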

(If you want to write code comments describing the thinking behind some hinky logic, go ahead. But remember that such comments constitute a code smell that can lead someone smarter than you to refactor them out.)

My manager once asked me for sample code to give a customer for a feature I'd implemented. Happily, he asked after I'd drunk the test-driven development “Kool-Aid.” I found the unit test that proved the feature works and gave him that. Because the unit test passed, I knew the sample code was good. Had I been a little smarter, I'd have made it easier for my manager to see the relationship between the unit test and the feature himself.

Another time I wrote and maintained a system without any unit tests. It was a C++ system I had built before test-driven development was a thing, and CppUnit was not yet available. Regression testing was manual and tedious, which hobbled refactoring. I only dared to make tiny changes; big changes got stuck on the side for fear of breaking something. Though immediately safer, gluing things on the side increases technical debt.

Near the end of this system's lifecycle, I isolated certain key functions and wrote unit tests for them. I could then boldly refactor these parts to reduce technical debt. Unit tests have your back. Though incomplete, this expedient method extended the system's useful life for years.

Reducing Technical Debt

This suggests the first step in paying down technical debt: Write unit tests for what you care about most. Add tests for everything you don't quite understand so you can learn by stepping through the code in the debugger. Only once you are convinced a piece of code is unnecessary should you yank it out. The main thing is to get your head around the system's core functionality in unit-test-sized chunks.
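One way to test code you don't quite understand is to pin down what it does today and compare that against the simplification you have in mind. The sketch below is in Python, and every name in it (legacy_round_price, candidate_round_price, the sample inputs) is hypothetical. It is an illustration of the idea, not a recipe.

```python
def legacy_round_price(cents):
    """Hypothetical legacy function with a branch nobody remembers adding."""
    if cents % 100 == 99:  # the fence: why is this here?
        return cents + 1
    return cents


def candidate_round_price(cents):
    """Proposed simplification with the mysterious branch removed."""
    return cents


def find_differences(old, new, inputs):
    """Return the inputs on which the two implementations disagree."""
    return [x for x in inputs if old(x) != new(x)]


SAMPLE_INPUTS = [0, 1, 99, 100, 199, 250, 1099]
differences = find_differences(
    legacy_round_price, candidate_round_price, SAMPLE_INPUTS
)
print(differences)  # prints [99, 199, 1099]
```

An empty list would be evidence, over these inputs only, that the branch does nothing observable. A non-empty list names the exact cases to investigate before the fence comes down.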

Unit tests can guide your decision to refactor a debt-laden system or to rewrite it afresh. If you're just refactoring out technical debt, the unit tests will spare you the embarrassment of breaking important functionality. (If functionality wasn't important enough to test, maybe it still isn't important.)

Unit tests can also serve when you're eliminating technical debt in a clean-page rewrite. The aged system I mentioned eventually had to be rewritten in C#. Happily, I could take the unit tests I'd added to the older system and port them almost unchanged to the new system.

I ran both systems in parallel before retiring the older system. The bugs in the new system arose from its greater flexibility and expanded domain. This caused some spurious failures and other coverage gaps when comparing older and newer tests. I resolved this by partitioning the test domain into three parts: old only, new only, and shared. Old tests could only be trusted in the shared part. I had to manually verify test correctness in the “new only” part, but I only had to do so once to establish baselines.
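The three-way partition can be sketched as follows. This is Python rather than the C++ and C# of the real systems, and the domains, the stand-in systems, and the baseline values are all assumptions made up for illustration.

```python
def old_system(n):
    """Stand-in for the retired system (hypothetical behavior)."""
    return n * 2


def new_system(n):
    """Stand-in for the rewritten system (hypothetical behavior)."""
    return n * 2


def old_supports(n):
    return 0 <= n <= 100  # assumed domain of the old system


def new_supports(n):
    return 50 <= n <= 1000  # assumed domain of the new system


def partition(n):
    """Classify a test input as old only, new only, shared, or unsupported."""
    in_old, in_new = old_supports(n), new_supports(n)
    if in_old and in_new:
        return "shared"
    if in_old:
        return "old only"
    if in_new:
        return "new only"
    return "unsupported"


# Manually verified once, then kept as the baseline for the new-only part.
BASELINES = {500: 1000}


def verify(n):
    """Check the new system against whichever oracle applies to this input."""
    part = partition(n)
    if part == "shared":
        return new_system(n) == old_system(n)  # the old system is the oracle
    if part == "new only":
        return new_system(n) == BASELINES[n]  # the baseline is the oracle
    return None  # nothing to check in the new system


print(partition(10), partition(75), partition(500))
```

The old system is trusted as an oracle only where both systems apply. Outside that overlap, the new system is checked against baselines that were verified by hand one time.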

Let Unit Tests Shine a Light

Technical debt is often accrued during development because exigent circumstances narrow the engineer's focus to only the functionality needed yesterday. In the push to ship code ASAP, we skip documenting and testing our functionality. Our incomplete work bleeds the project until it is finished.

The first step to pay down technical debt is finishing the work. We're not done until there's a unit test for everything important and every unit test passes. These unit tests can illuminate our thinking when those who follow ask, "What's that doing there?"

User Comments

Morgan Robson

Self-documenting programs that use expressive variable names to make a system simple to use without prior knowledge, documentation guidelines, and required peer reviews of documentation are other recommended practices that will prevent technical debt.

July 28, 2018 - 12:27am
Steve Poling

Remember that code is there to convey programmer intent. Good variable naming conventions are one dimension of expressing oneself clearly. But it isn't the only one. I've recently been investigating declarative/functional programming languages where the programmer writes "what" with the compiler left to work out "how." This leads to solutions using 1/10th as many lines of code. This lets our intent shine through uncluttered by implementation details.

July 28, 2018 - 12:31pm


StickyMinds is a TechWell community.
