Programming with GUTs
 By Kevlin Henney

  
 Summary: Since tests are commonly viewed in terms of offering quantitative feedback on the presence or absence of defects in specific situations, Good Unit Tests need both to illustrate and define the behavioral contract of the unit in question. Do you have GUTs?
Looking around at some of the writing on the subject, both online and off, you could be forgiven for thinking that test-driven development (TDD) is just another way of saying testing, simply a synonym for programmer testing, or a fancy way of promoting plain ol' unit testing. It is not that these concepts do not have value--they do--it's that the arbitrary use of the term TDD diminishes the value both of TDD and of these other points of view. Not all testing is unit testing, and not all unit testing derives from TDD. As Alistair Cockburn noted in his blog earlier this year [1]:
Very many people say "TDD" when they really mean, "I have good unit tests" ("I have GUTs"?) Ron Jeffries tried for years to explain what this was, but we never got a catch-phrase for it, and now TDD is being watered down to mean GUTs.
Well, now at least there is a catch phrase for it! The distinction being drawn is both clear and useful: TDD is a way of doing something--i.e., it is a process; GUTs are an outcome--i.e., they describe an artifact and a quality of that artifact.
What Do You Want From Your Tests?
Tests are commonly viewed in terms of offering quantitative feedback on the presence or absence of defects in specific situations. They are, however, in a position to offer much more than this.
For example, our tests can give us feedback on how loosely coupled our code is. Some portion of any nontrivial system is necessarily not unit testable: The portion of the code that must interact across the system boundary may be integration testable, but it is not unit testable. This is the code that may be mocked out for purposes of unit testing other parts of a system, but should also be tested in its own right.
So, if there is some portion of our code that is necessarily not unit testable, how much should be unit testable that isn't? If the only code-facing tests we can write are integration tests, the chances are good that our code is too tightly coupled, no matter what dewy-eyed vision we may hold of its architecture. Furthermore, this also says something about the relationship between test quality and code quality: How good our set of unit tests can be is constrained by the quality of the production code’s architecture.
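To make the coupling point concrete, here is a minimal C# sketch. All of the names in it (IClock, SystemClock, FixedClock, Greeter) are hypothetical and chosen only for illustration. The code that crosses the system boundary sits behind a small interface, so the logic that depends on it can be unit tested against a stand-in, while the real boundary code is left to integration tests.

```csharp
using System;

// Hypothetical boundary: anything that reaches outside the process.
public interface IClock
{
    DateTime Now { get; }
}

// Crosses the system boundary, so it is integration testable
// rather than unit testable.
public class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

// Hand-written stand-in used when unit testing other code.
public class FixedClock : IClock
{
    private readonly DateTime now;

    public FixedClock(DateTime now) { this.now = now; }

    public DateTime Now { get { return now; } }
}

// Depends only on the interface, so it can be unit tested in isolation.
public class Greeter
{
    private readonly IClock clock;

    public Greeter(IClock clock) { this.clock = clock; }

    public string Greeting()
    {
        return clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}
```

Had Greeter called DateTime.Now directly, the only code-facing test we could write for it would depend on the time of day at which it ran.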
For us to consider our tests good, we may want other things from them, such as coverage. However, because it's less familiar territory for many programmers, let's keep exploring qualitative rather than quantitative feedback. When we look at test code, do we expect to see something written only for machine execution, or is human consumption also a key consideration? In the blurb for their conference session, "Are Your Tests Really Driving Your Development?" [2], Nat Pryce and Steve Freeman make clear the importance of the human audience:
Everybody knows that TDD stands for Test Driven Development. However, people too often concentrate on the words "Test" and "Development" and don't consider what the word "Driven" really implies. For tests to drive development they must do more than just test that code performs its required functionality: they must clearly express that required functionality to the reader. That is, they must be clear specifications of the required functionality. Tests that are not written with their role as specifications in mind can be very confusing to read. The difficulty in understanding what they are testing can greatly reduce the velocity at which a codebase can be changed.
This consideration is perhaps more explicit and more obviously enabled with TDD, but the sense and sensibility of treating tests as specs are also relevant when using other unit-testing approaches.
Style and Substance
Consider a simple example: a recently used list, which is a collection that holds strings uniquely and in reverse order of their insertion. The C# code in listing 1 shows the key features for using such a class: a default constructor, an Add method, a Count property to query size, and an indexer for subscripting.
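The listing itself is not reproduced here, so what follows is a minimal sketch rather than the original code. It assumes only the interface described above (a default constructor, Add, Count, and an indexer). The internals, the parameter name, and the choice of ArgumentOutOfRangeException for a bad index are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical implementation: only the public interface is taken
// from the description in the text.
public class RecentlyUsedList
{
    private readonly List<string> items = new List<string>();

    // Number of distinct strings currently held.
    public int Count
    {
        get { return items.Count; }
    }

    // Index 0 is the most recently added item.
    public string this[int index]
    {
        get
        {
            if (index < 0 || index >= items.Count)
                throw new ArgumentOutOfRangeException("index");
            return items[index];
        }
    }

    public void Add(string newItem)
    {
        items.Remove(newItem);      // drop any earlier occurrence to keep items unique
        items.Insert(0, newItem);   // the most recent item goes to the front
    }
}

public static class Program
{
    public static void Main()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        list.Add("Zebra");
        list.Add("Aardvark");   // duplicate: moved to the front, not added again

        Console.WriteLine(list.Count);   // 2
        Console.WriteLine(list[0]);      // Aardvark
        Console.WriteLine(list[1]);      // Zebra
    }
}
```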
So what style should our tests adopt in order to communicate functionality to the reader? The NUnit test case in listing 2 certainly exercises the class, but lumping all test logic into an undifferentiated, monolithic slab called Test has little communication value. It can be justified for code with simple behaviors, but it does not scale well to larger tests.
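As a sketch of what such a monolithic test might look like (not the original listing), consider the following. The RecentlyUsedList here is a hypothetical stand-in that assumes the interface described earlier, and the fixture name and test data are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical stand-in for the class under test.
public class RecentlyUsedList
{
    private readonly List<string> items = new List<string>();

    public int Count { get { return items.Count; } }

    public string this[int index]
    {
        get
        {
            if (index < 0 || index >= items.Count)
                throw new ArgumentOutOfRangeException("index");
            return items[index];
        }
    }

    public void Add(string newItem)
    {
        items.Remove(newItem);      // keep items unique
        items.Insert(0, newItem);   // most recent first
    }
}

[TestFixture]
public class MonolithicTests
{
    // Everything in one slab: the name says nothing about what is
    // being specified, and a failure could point at any behavior.
    [Test]
    public void Test()
    {
        var list = new RecentlyUsedList();
        Assert.That(list.Count, Is.EqualTo(0));

        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(1));
        Assert.That(list[0], Is.EqualTo("Aardvark"));

        list.Add("Zebra");
        list.Add("Mongoose");
        Assert.That(list.Count, Is.EqualTo(3));
        Assert.That(list[0], Is.EqualTo("Mongoose"));
        Assert.That(list[1], Is.EqualTo("Zebra"));
        Assert.That(list[2], Is.EqualTo("Aardvark"));

        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(3));
        Assert.That(list[0], Is.EqualTo("Aardvark"));
        Assert.That(list[1], Is.EqualTo("Mongoose"));
        Assert.That(list[2], Is.EqualTo("Zebra"));

        Assert.Throws<ArgumentOutOfRangeException>(
            () => { var unused = list[3]; });
    }
}
```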
Dividing up the single test method into a number of arbitrary test methods--for example, Test1, Test2, and Test3--does not really address the issue. Such a division and labeling is, well, arbitrary, so it does not communicate anything useful to the reader.
Programmers who have moved beyond the monolithic and arbitrary styles tend to gravitate toward a procedural style. A procedural style can be characterized in terms of "I have a procedure foo, therefore I have a corresponding test procedure that tests foo." Listing 3 shows procedural unit tests for our RecentlyUsedList class, each test method aligned with a method in the class under test.
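A procedural arrangement might be sketched as follows. This is an illustration under the same assumptions as before, not the original listing: the RecentlyUsedList is a hypothetical stand-in, and the test names simply mirror the members they target.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical stand-in for the class under test.
public class RecentlyUsedList
{
    private readonly List<string> items = new List<string>();

    public int Count { get { return items.Count; } }

    public string this[int index]
    {
        get
        {
            if (index < 0 || index >= items.Count)
                throw new ArgumentOutOfRangeException("index");
            return items[index];
        }
    }

    public void Add(string newItem)
    {
        items.Remove(newItem);      // keep items unique
        items.Insert(0, newItem);   // most recent first
    }
}

[TestFixture]
public class ProceduralTests
{
    [Test]
    public void Constructor()
    {
        var list = new RecentlyUsedList();
        Assert.That(list.Count, Is.EqualTo(0));
    }

    [Test]
    public void Add()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(1));
        list.Add("Zebra");
        list.Add("Mongoose");
        Assert.That(list.Count, Is.EqualTo(3));
        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(3));
    }

    [Test]
    public void Indexer()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        list.Add("Zebra");
        list.Add("Mongoose");
        Assert.That(list[0], Is.EqualTo("Mongoose"));
        Assert.That(list[1], Is.EqualTo("Zebra"));
        Assert.That(list[2], Is.EqualTo("Aardvark"));
        list.Add("Aardvark");
        Assert.That(list[0], Is.EqualTo("Aardvark"));
        Assert.That(list[1], Is.EqualTo("Mongoose"));
        Assert.That(list[2], Is.EqualTo("Zebra"));
        Assert.Throws<ArgumentOutOfRangeException>(
            () => { var unused = list[3]; });
    }
}
```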
Although there is a certain logic to procedural testing, the rationale is somewhat weak (even for procedural code). Testing individual methods in isolation doesn't really make sense in terms of what a RecentlyUsedList is and how it is used. In many cases it is impossible to focus on just one method without involving others. For example, we cannot call the Add method without also having executed a constructor. Likewise, even in the constructor's test, we still end up querying the Count property in order to say something meaningful about the result of construction.
A per-method testing style also leads to an uneven spread of test-case length and depth. For example, most of what a RecentlyUsedList is about is covered by the Add method. A single test case for the Add method slops a number of different usage scenarios into one bucket. Indexing is also somewhat important, but unless we use Add we cannot usefully test the indexer. Indeed, much of the testing of Add's behavior ends up in the indexer’s test!
We can do better. A good unit test needs both to illustrate and define the behavioral contract of the unit in question. Behavior is more than just individual methods, so we need a style that cuts across the interface, a style where each test case is structured in terms of a meaningful and specific behavioral goal, as shown in listing 4.
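One way such behavioral tests might look is sketched below. Again, this is not the original listing. The RecentlyUsedList is a hypothetical stand-in with the interface described earlier, and the test data and exception type are assumptions. Each test sets up just enough state to make one statement about the contract.

```csharp
using System;
using System.Collections.Generic;
using NUnit.Framework;

// Hypothetical stand-in for the class under test.
public class RecentlyUsedList
{
    private readonly List<string> items = new List<string>();

    public int Count { get { return items.Count; } }

    public string this[int index]
    {
        get
        {
            if (index < 0 || index >= items.Count)
                throw new ArgumentOutOfRangeException("index");
            return items[index];
        }
    }

    public void Add(string newItem)
    {
        items.Remove(newItem);      // keep items unique
        items.Insert(0, newItem);   // most recent first
    }
}

[TestFixture]
public class BehavioralTests
{
    [Test]
    public void InitialListIsEmpty()
    {
        var list = new RecentlyUsedList();
        Assert.That(list.Count, Is.EqualTo(0));
    }

    [Test]
    public void AdditionOfSingleItemToEmptyListIsRetained()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(1));
        Assert.That(list[0], Is.EqualTo("Aardvark"));
    }

    [Test]
    public void AdditionOfDistinctItemsIsRetainedInStackOrder()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        list.Add("Zebra");
        list.Add("Mongoose");
        Assert.That(list.Count, Is.EqualTo(3));
        Assert.That(list[0], Is.EqualTo("Mongoose"));
        Assert.That(list[1], Is.EqualTo("Zebra"));
        Assert.That(list[2], Is.EqualTo("Aardvark"));
    }

    [Test]
    public void DuplicateItemsAreMovedToFrontButNotAdded()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        list.Add("Mongoose");
        list.Add("Aardvark");
        Assert.That(list.Count, Is.EqualTo(2));
        Assert.That(list[0], Is.EqualTo("Aardvark"));
        Assert.That(list[1], Is.EqualTo("Mongoose"));
    }

    [Test]
    public void OutOfRangeIndexThrowsException()
    {
        var list = new RecentlyUsedList();
        list.Add("Aardvark");
        Assert.Throws<ArgumentOutOfRangeException>(
            () => { var unused = list[-1]; });
        Assert.Throws<ArgumentOutOfRangeException>(
            () => { var unused = list[1]; });
    }
}
```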
One of the more noticeable consequences of a behavioral style is the naming. The idea is that each method is named as a requirement and therefore provides an outline of the contract:
- Initial list is empty.
- Addition of single item to empty list is retained.
- Addition of distinct items is retained in stack order.
- Duplicate items are moved to front but not added.
- Out-of-range index throws exception.
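Compare this rich description with the somewhat simplistic procedural summary, which amounts to little more than the names of the members under test: the constructor, the Add method, and the indexer.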
As an aside, a behavioral style is one cornerstone of behavior-driven development, which Dan North described in his Better Software article "Behavior Modification" [3].
In comparing the monolithic, arbitrary, procedural, and behavioral styles, we can see there is a great deal more to good unit tests than simply mastering the syntax of an assertion.
References:
[1] Cockburn, Alistair. "The modern programming professional has GUTs."
[2] Pryce, Nat; Freeman, Steve. "Are Your Tests Really Driving Your Development?"
[3] North, Dan. "Behavior Modification." Better Software magazine. March, 2006.
About the Author: Kevlin Henney is an independent consultant and trainer based in the UK. He provides consultancy and training in programming techniques, software architecture, and development process. He is co-author of two recent books on patterns, A Pattern Language for Distributed Computing and On Patterns and Pattern Languages.