StickyMinds.com: brain food for building better software

GUT Instinct

By Kevlin Henney

Summary: Whether or not a unit test is considered good is not simply about what it tests: It is also very much about "how" it tests. Is the test readable and maintainable? Does it define the expected behavior or merely assume it? To be sustainable, the style of a unit test is just as important as the style of any other code. Perhaps a little surprisingly, the most commonly favored test partitioning style does not meet these expectations.

In one of my previous columns, "Programming with GUTs" [1], we looked at some of the stylistic qualities that make up good unit tests, or GUTs, following Alistair Cockburn's coinage [2]. A tour of articles, books, and the blogosphere reveals a lot of emphasis on the importance of quality in production code and exhortations to test, but perhaps less on the quality of the tests themselves. Automated unit test cases are also code, and the case for quality applies just as much to their sustainability as to that of production code.

Obviously, GUTs are better than no unit tests (NUTs), but it is not clear that bad unit tests (BUTs) are necessarily better than NUTs. BUTs can hold back both a team and a codebase by restricting rather than supporting the kinds of changes that can be made. Without the appropriate sensibility, automated tests can degenerate into an accumulated jumble of test cases with no obvious narrative or direction.

Procedural Problems
What are unit tests for? One of the most common responses is that unit tests are used to demonstrate that code is doing the "right thing." But, what exactly is the right thing? Without a clear idea of what the right thing is, programmers can end up reducing testing to arbitrary and dispirited code poking—or to nothing at all.

Common but unjustified beliefs about testing are not just restricted to purpose; they can also be found in practice. For example, perhaps the most popular and seemingly intuitive technique for partitioning test cases is in a procedural style: For each method in a class, write a corresponding test method; for method foo, define testFoo.

Aligning method under test with test method undermines the simpler and more important correlation of a test case corresponding to a test method: Testing a method with rich behavior in a single method fails to differentiate the underlying test cases; trying to test query methods on their own misses the point that their results can only be interpreted meaningfully within specific contexts, following previous calls. Lumping together all the common cases, boundary cases, and failure cases for a method into a single test case makes it harder to see the forest for the trees. Such arbitrary grouping is a surefire way to create messy tests and to overlook important cases.

If test cases are intended to define and confirm the behavior of a class, then the test cases need to align with cases of use that may cut across many methods, rather than align with methods, which may enclose a wide range of behavior.

The popularity of the procedural testing style is carried through to automated perfection in many code-generating wizards, which generate test stubs for each method in a production class. These address the trivial work—writing method signatures—and carefully sidestep the hard part—the details of the actual test. Such tools nudge programmers into the wrong frame of mind for describing the usage behavior of their code. Along with the "helpful" tools that autolitter code with setters and getters, such tools bear a stronger resemblance to so(u)rcerer's apprentices than they do to wizards.

From Prodding Procedures to Revealing Requirements
It seems clear that procedural testing does not work well with objects, but is the same also true for code that is procedural or functional in style? Consider, for example, a single function that determines whether or not a given year is a leap year. To keep things simple, let's assume that the calendar follows the ISO 8601 standard, which includes a year 0 and is proleptic (i.e., applies to dates before the Gregorian calendar was introduced). In Groovy, the appropriate predicate would look something like listing 1.
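
A predicate along these lines might be sketched as follows. This is an illustration in Python rather than the Groovy of the listing, and the name is_leap_year is an assumed stand-in for isLeapYear.

```python
def is_leap_year(year: int) -> bool:
    """Return True if year is a leap year in the proleptic Gregorian calendar.

    Assumes ISO 8601 year numbering, in which year 0 exists.
    """
    # Divisible by 4, unless it is a century year that is not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

Under this rule year 0 counts as a leap year, because it is divisible by 400.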



Focusing just on method names, the procedural style of testing would lead to a single test method, as shown in listing 2a. Anyone unfamiliar with the rule for leap years will be none the wiser. The name tells you nothing about the behavior, only that a single method is being tested. If it fails, all you can deduce is that something about isLeapYear (or perhaps testIsLeapYear) is incorrect.
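
To make the problem concrete, here is a minimal sketch of the procedural style. It uses Python's unittest as a stand-in for Groovy and JUnit 3, and the names is_leap_year, LeapYearTest, and test_is_leap_year are illustrative assumptions rather than the listing's code.

```python
import unittest


def is_leap_year(year):
    # Hypothetical function under test (proleptic Gregorian rule).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    def test_is_leap_year(self):
        # One method named after the function: every case is lumped together,
        # and the name says nothing about the rule being checked.
        self.assertTrue(is_leap_year(2008))
        self.assertFalse(is_leap_year(2009))
        self.assertFalse(is_leap_year(1900))
        self.assertTrue(is_leap_year(2000))
```

Saved in a module, this can be run with python -m unittest. A failure on any of the four assertions is reported under the same uninformative name.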



It is clear that the behavior needs to be partitioned in some way to clarify what behavior is expected. A first cut at partitioning might be to divide the test cases according to the result of the function—i.e., leap years and non-leap years (uncommonly known as common years). The code in listing 2b is an improvement, reflecting a little more of the problem domain in the tests. However, the actual nature of the rule is still unclear. Given that most people believe that leap years occur every four years regardless, this doesn't clarify matters one way or another and does nothing to correct that belief.
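
A first cut of this kind, partitioned by result, might look like the sketch below. It is again written with Python's unittest as a stand-in for Groovy and JUnit 3, and all names are assumptions made for illustration.

```python
import unittest


def is_leap_year(year):
    # Hypothetical function under test (proleptic Gregorian rule).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    def test_leap_years(self):
        # Grouped by outcome: these years are expected to be leap years.
        self.assertTrue(is_leap_year(2008))
        self.assertTrue(is_leap_year(2000))

    def test_common_years(self):
        # Grouped by outcome: these years are expected to be common years.
        self.assertFalse(is_leap_year(2009))
        self.assertFalse(is_leap_year(1900))
```

The names now mention the problem domain, but a reader still cannot tell from them why 1900 and 2009 belong together.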

Even when people are aware that leap year determination involves more complexity, they often fail to recall the detail. At least part of the reason for this is that humans are good with straightforward rules—and even rules with an exception—but they are not so good at dealing with rules that have exceptions to exceptions (and beyond). For leap years and common years there are four distinct cases, as shown in listing 2c.
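
The four cases can be spelled out as an equivalence partitioning of years. The sketch below is only an illustration in Python. The function leap_year_partition and its labels are hypothetical, chosen to make the rule, its exception, and the exception to the exception explicit.

```python
def leap_year_partition(year: int) -> str:
    """Name the equivalence class that a year falls into."""
    if year % 400 == 0:
        return "divisible by 400"                 # exception to the exception: leap
    if year % 100 == 0:
        return "divisible by 100 but not by 400"  # exception: common
    if year % 4 == 0:
        return "divisible by 4 but not by 100"    # basic rule: leap
    return "not divisible by 4"                   # default: common


# One representative year from each class.
for year in (2009, 2008, 1900, 2000):
    print(year, "->", leap_year_partition(year))
```

Each class needs at least one representative example in the tests. Any year within a class should behave like any other year in that class.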



This equivalence partitioning delineates the cases explicitly. But there's still something missing: It's like a joke without a punch line. We know the situation we're testing, but not why or what we're expecting from it. The test cases in listing 2d include the punch line.



The Naming of Names
There is a world of difference between the last four test cases shown in listing 2d and the initial testIsLeapYear in listing 2a. These last test cases specify what the "right thing" is and, in their bodies, employ representative examples that demonstrate each case, as in listing 3, and check that the right thing happens. The procedural style merely indicates that some named piece of code is being tested without reference to what is needed or expected. The behavioral style reveals the rules and intentions that inform the code.
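
As an illustration of the behavioral style, the sketch below names each case together with its expected outcome and backs it with representative examples. It uses Python's unittest as a stand-in for Groovy and JUnit 3. The method names and the example years are assumptions, not a reproduction of the listings.

```python
import unittest


def is_leap_year(year):
    # Hypothetical function under test (proleptic Gregorian rule).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    def test_years_not_divisible_by_4_are_not_leap_years(self):
        for year in (1999, 2009, 2010, 2011):
            self.assertFalse(is_leap_year(year))

    def test_years_divisible_by_4_but_not_by_100_are_leap_years(self):
        for year in (1996, 2004, 2008):
            self.assertTrue(is_leap_year(year))

    def test_years_divisible_by_100_but_not_by_400_are_not_leap_years(self):
        for year in (1700, 1800, 1900):
            self.assertFalse(is_leap_year(year))

    def test_years_divisible_by_400_are_leap_years(self):
        for year in (0, 1600, 2000):
            self.assertTrue(is_leap_year(year))
```

Read in sequence, the four method names amount to a statement of the leap year rule.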



Of course, the same test code could have been written within a single test method, testIsLeapYear, and grouped into test cases, each grouping headed by a suitable comment. But whenever you are tempted to use a comment, first determine whether the code could be rearranged to express more directly what the comment is trying to say. In this case, don't comment blocks of code; name them as methods. Method-level partitioning also offers more precise and meaningful information when a test case fails.

Ideally, test names should be simple statements of fact about what is needed from the code. In this case, we are constrained by the decision to take advantage of JUnit 3, which is conveniently integrated with the Groovy runtime. JUnit 3 requires that all test method names begin with test. Unfortunately, this particular verb on its own does little to encourage a specification-driven naming style. If you are writing to this technical constraint, or you still like to think of tests as confirmations rather than as specifications, try using testThat as your prefix. This alternative prefix naturally encourages name completion with something a little more statement-like.
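
Python's unittest applies a similar default convention, so it can serve as a stand-in for JUnit 3 here. By default its loader collects only methods whose names start with test. The sketch below uses assumed names throughout and shows a statement-like name being skipped, while its test_that counterpart is collected.

```python
import unittest


def is_leap_year(year):
    # Hypothetical function under test (proleptic Gregorian rule).
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearSpecification(unittest.TestCase):
    def years_divisible_by_400_are_leap_years(self):
        # Reads as a statement of fact, but lacks the required prefix,
        # so the default loader never collects it.
        self.assertTrue(is_leap_year(2000))

    def test_that_years_divisible_by_400_are_leap_years(self):
        # The same statement behind a test_that prefix: collected and run.
        self.assertTrue(is_leap_year(2000))


collected = unittest.TestLoader().getTestCaseNames(LeapYearSpecification)
print(list(collected))
```

With the test_that prefix, the rest of the name can remain a plain statement of what is required.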

Whether or not the code under test is procedural in style turns out to be irrelevant. A procedural partitioning of tests is invariably the wrong choice for GUTs.

References
[1] Henney, Kevlin. "Programming with GUTs." Better Software magazine, July/August 2008.
[2] Cockburn, Alistair. "The modern programming professional has GUTs."

About the Author
Kevlin Henney is an independent consultant and trainer based in the UK. He provides consultancy and training in programming techniques, software architecture, and development process. He is co-author of two recent books on patterns, A Pattern Language for Distributed Computing and On Patterns and Pattern Languages.

Member Comments
by Jim Sears 12/15/2009

As an automation engineer, I've decided to give my testers the ability to enter data into the application and decide for themselves if the result is valid. It is pure data driven automation including the validation. Would that still be called functional by the industry? I automate components, screens, fields, etc.; and the testers enter data for the application then validate in the app GUI or in the backend database. I don't create 'fixed' tests. I create a framework for any data entry and any data validation related to the data entry. I'll ask again, is that functional?

© 2010 StickyMinds.com. All rights reserved.
StickyMinds.com is a division of Software Quality Engineering.