Matthew Heusser goes beyond trivial examples to talk about the tradeoffs and ancillary benefits of a large-scale acceptance check automation strategy.
I previously introduced a long-term acceptance checking strategy that I have been using with a couple of teams. In this article, I revisit that strategy two years into its operation.
Picture yourself on site with the client, reviewing the list of checks that run in every build. One day, you find something like the test run in figure 1, which is the output from one particular check.
The tool we used is called
The check creates eight orders, only four of which are current, and it properly displays eight submitted and four eligible.
But, wait a minute. Notice that every column has four Ys of some type. Our summary shows four eligible orders. Our summary also shows four processed orders. Which four eligible orders go with which processed orders? Which Ys in the test correspond with "eligible" and "processed"? We don't know.
To test this check, I click the edit button, change the N in the eligible column for order 1007 to a Y, and rerun the test. Everything still passes. The number of eligible orders in the header and the eligible count in the status file are both still four. I expected them to increase to five.
What's going on here?
Notice the name of the test, "EligibleCountBasedOnSampleMonth," and that the number of orders in the sample month is still four. The count follows the sample month, not the column I edited, so it did not move. My naive reading of the check led me to ask, "What if some columns are different?" and that led me to change the check. I ran a test, and now we have a more powerful check. We're now better at demonstrating where eligibility comes from, and the check is more likely to catch regression errors. For that matter, I can change all the columns so that each has a different number of Ys and explore the results.
And that's kind of the point.
Running these checks over and over without thinking about them has some value; they are effective change detectors. When I click the edit button, change the inputs and expected results, and run again, I am testing. I am thinking in the moment. The tool gives me the ability to ask "What if?"—to explore and to find out if a problem exists.
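The "What if?" edit described above can be sketched in a few lines of Python. This is an illustration only, not the tool or fixture the team used: the `Order` fields, the order IDs other than 1007, the `eligible_count` rule, and the `columns_matching` helper are all assumptions made up for the example. The sketch shows why a fixture with four Ys in every column cannot say which column drives the count, and why giving each column a different number of Ys can.

```python
from dataclasses import dataclass, replace

# Hypothetical column names; the real fixture's columns may differ.
COLUMNS = ("current", "eligible", "in_sample_month")


@dataclass(frozen=True)
class Order:
    order_id: int
    current: bool
    eligible: bool
    in_sample_month: bool


def make_orders(current_ids, eligible_ids, sample_ids):
    """Build eight orders (1001-1008); an ID in a set means Y in that column."""
    return [
        Order(i, i in current_ids, i in eligible_ids, i in sample_ids)
        for i in range(1001, 1009)
    ]


def eligible_count(orders):
    # Assumed rule, suggested by the check's name: the sample-month column
    # drives the eligible count, not the eligible column itself.
    return sum(o.in_sample_month for o in orders)


def columns_matching(orders, observed):
    """Names of the columns whose number of Ys equals the observed count."""
    return [c for c in COLUMNS if sum(getattr(o, c) for o in orders) == observed]


# Original fixture: every column has exactly four Ys.
ambiguous = make_orders(
    {1001, 1002, 1003, 1004},
    {1001, 1002, 1005, 1006},
    {1003, 1004, 1005, 1006},
)

# "What if?": flip order 1007 from N to Y in the eligible column.
flipped = [replace(o, eligible=True) if o.order_id == 1007 else o for o in ambiguous]

# Revised fixture: three, five, and four Ys, so only one column can explain a 4.
distinct = make_orders(
    {1001, 1002, 1003},
    {1001, 1002, 1005, 1006, 1007},
    {1003, 1004, 1005, 1006},
)

print(eligible_count(ambiguous), columns_matching(ambiguous, 4))
print(eligible_count(flipped))
print(eligible_count(distinct), columns_matching(distinct, 4))
```

With the original fixture, all three columns are consistent with a count of four. After the flip, the count stays at four even though five orders are marked eligible. With the revised fixture, only the sample-month column matches.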
Two years ago, if I wanted to explore in this fashion, I would need either to load an input file with these values into the database or to run a series of INSERT and UPDATE statements on the existing database, hope I got everything right, run an MS-DOS executable, and then view the resulting file.
Whew. That isn’t fun.
Of course, I'm going to run some tests by hand to make sure the checks are still relevant. Once I believe the checks really do tie out end-to-end, I can play "What if?" all day and get feedback in seconds rather than the ten to thirty minutes an old run took. That is power.
But, Is That All?
The type of testing I describe in my previous article is feature testing, and, if you keep them around, the tests do produce a regression suite. That has some limited value, but regression testing the same values is sort of like walking the same steps through a minefield, checking to see if anyone has planted a new mine in exactly the same spot you walked before. You'll miss anything else that could be defective in the software under test. When we consider the additional time to institutionalize the test—to write the hooks in the code, to create the fixture—it turns out to be a