Testing What You Can’t See: Risk Blindness in Coverage Models

[article]
Summary:
The way we think about what it takes for test coverage to be “complete” influences how we test and the cases we create. After all, you won’t design tests for situations that don’t occur to you, and you can’t test what you can’t see. It’s time to take off the blinders. Here’s how to find where the bugs in your products are occurring, then adjust your strategy to pinpoint them.

When I was employed by an insurance company, I worked on a large number of data extract programs. In all that time I never saw a requirements document that addressed what to do if the database was down, even though most of the testing was based on those requirements.

Any guru could take the requirements document and build a map, bullet points, or a spreadsheet to make a “coverage model.” Entire software companies exist to visualize those models, building dashboards that blink from blank to green as tests “cover” the functionality.

Yet inevitably, when the interface ran while the database was down, what happened was … unpredictable.

There are blind spots all over the place. We just can’t see them.

Automated Testing

In the testing-like-a-customer world, today’s dominant way of thinking about testing is based on the user interface and tools. Press a button, watch the screens fly by (locally or in a cloud-based client), and come back in an hour to get results. It is amazing—almost magical.

My experience with such tools is that they cannot exercise printing; or, if they can, they lack the ability to inspect the generated PDF to see whether it renders the right text. They certainly can’t check whether the results look right. Most of the tools are not even able to notice that something looks wrong by visual comparison. Where a human can check tab order by pressing Tab ten times in a row on a complex form, this work is so hard to program that most people simply skip any automated checks of tab order.
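To make that gap concrete, here is a minimal sketch of what an automated tab-order check might look like, assuming Selenium’s Python bindings. The URL and element IDs are hypothetical placeholders; the point is how much machinery it takes to replace ten presses of the Tab key.

```python
# Sketch of an automated tab-order check, assuming Selenium's Python bindings.
# The URL and element IDs below are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

EXPECTED_TAB_ORDER = ["first-name", "last-name", "email", "phone", "submit"]

driver = webdriver.Chrome()
driver.get("https://example.com/complex-form")  # placeholder

# Focus the first field, then walk the form the way a human would:
# look at what has focus, press Tab, look again.
driver.find_element(By.ID, EXPECTED_TAB_ORDER[0]).click()
for expected_id in EXPECTED_TAB_ORDER:
    focused = driver.switch_to.active_element
    actual_id = focused.get_attribute("id")
    assert actual_id == expected_id, f"expected focus on {expected_id}, got {actual_id}"
    ActionChains(driver).send_keys(Keys.TAB).perform()

driver.quit()
```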

The tools are getting better. A few years ago, it was common for the layout to get confused so that a button or link was invisible to the user because it was hidden under something else, yet a tool like Selenium could still click it. Today, some tools address this issue. Others add the capability to do user-interface comparisons of a portion of the screen, along with workflows for quick approval.
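For teams without one of those tools, a rough version of the same check can be hand-rolled. The sketch below, again assuming Selenium’s Python bindings and a made-up page, refuses to click a button unless the browser reports that the button is displayed and is the topmost element at its own center point.

```python
# Sketch: refuse to click an element a real user could not see.
# URL and element ID are hypothetical placeholders, and the check is
# simplistic (a real one would also accept descendants of the button).
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/checkout")  # placeholder

button = driver.find_element(By.ID, "place-order")

# is_displayed() catches display:none and visibility:hidden, but not an
# element sitting underneath another one, so also ask the browser what is
# actually on top at the button's center point.
center = driver.execute_script(
    "const r = arguments[0].getBoundingClientRect();"
    "return {x: r.left + r.width / 2, y: r.top + r.height / 2};",
    button,
)
top_id = driver.execute_script(
    "const el = document.elementFromPoint(arguments[0], arguments[1]);"
    "return el ? el.id : null;",
    center["x"], center["y"],
)

assert button.is_displayed(), "button is hidden from the user"
assert top_id == "place-order", f"button is covered by element with id {top_id!r}"
button.click()
driver.quit()
```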

Still, you see my point. The tool makes testing for printing impossible. When we don’t create an automated test for printing, eventually we just stop testing printing.

The way we think about our test coverage being “complete,” or what “Testing is complete” might mean, influences what we do.

Functional Testing

When I teach testing, one of my early lessons is on failure modes—that is, the ways it is possible for the software to break. That includes both common defects, such as programmers making the same mistakes over and over again (“No matter what they touch, the devs seem to keep breaking the ALF component”), and the common failure modes of the platform. If I’m testing a mobile application with a low-maturity team, I’m likely to experiment with moving from having wireless data to no coverage and back again.
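That coverage experiment doesn’t need an expensive device lab. One way to fake losing and regaining coverage on an Android test device is to shell out to adb, as in the sketch below; the svc commands are standard, though some OS versions and locked-down devices restrict them.

```python
# Sketch: toggle an Android device between "has data" and "no coverage"
# mid-test by shelling out to adb. Assumes adb is on PATH and the attached
# device or emulator honors the svc commands (some OS versions restrict them).
import subprocess
import time

def set_connectivity(enabled: bool) -> None:
    state = "enable" if enabled else "disable"
    subprocess.run(["adb", "shell", "svc", "wifi", state], check=True)
    subprocess.run(["adb", "shell", "svc", "data", state], check=True)

# Drive the failure mode: drop the network while the app is mid-workflow,
# give it time to notice, then restore coverage and watch how it recovers.
set_connectivity(False)
time.sleep(10)
set_connectivity(True)
```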

For a web application with complex graphics, I’m going to resize the window—a lot. I will open multiple tabs and flip between them. I’ll try the application on a tablet and flip the tablet sideways. I will send a link that requires a login to a different device and try to open it without logging in. All of these are common failure modes for responsive design, yet I’ve never seen them listed in any dashboard or formalized test steps, unless my company created the list.
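Some of that list can at least be smoke-checked by a machine. Here is a sketch, assuming Selenium’s Python bindings and made-up breakpoints, that resizes the window and flags the most mechanical symptom of a broken responsive layout: horizontal overflow. Judging whether the page actually looks right still falls to a person.

```python
# Sketch: walk a page through common viewport sizes and flag horizontal
# overflow. URL and breakpoints are placeholders; "looks right" still
# needs human eyes.
from selenium import webdriver

BREAKPOINTS = [(375, 667), (768, 1024), (1024, 768), (1920, 1080)]

driver = webdriver.Chrome()
driver.get("https://example.com/dashboard")  # placeholder

for width, height in BREAKPOINTS:
    driver.set_window_size(width, height)
    overflows = driver.execute_script(
        "return document.documentElement.scrollWidth"
        " > document.documentElement.clientWidth;"
    )
    assert not overflows, f"horizontal overflow at {width}x{height}"

driver.quit()
```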

Perhaps senior people “just know” to test this way, based on experience. Junior people certainly do not. The more prescriptive the test directions become, and the more we focus on writing things down so that we can hire low-skill people, the more likely we are to get this outcome: People do exactly what the test says, and they miss all the bugs.

With test tooling, this is not a risk; it is guaranteed. The automated tooling is only going to do exactly what you tell it to do. Meanwhile, evaluating the results of a resized window is an incredibly complex task that is exceedingly difficult to automate. As a result, people simply do not record the automation or build the logic to simulate these behaviors.

Perhaps, to some extent, that is okay. Fellow testing trainer James Bach once told me that some people don’t want to pay for hand-crafted wood furniture. For them, cheap, assemble-it-yourself pieces are just fine. Make the plans and let the computer do the cutting—good enough is, after all, good enough.

But in software, cheap, cookie-cutter testing (and test tooling) means missing bugs, likely because the model of coverage does not consider them.

A New Strategy

Here’s a way to find out what you’re missing, one that could take you anywhere from a lunch hour to a day.

Look at all the bugs that escaped a programmer over the past six months. (If your agile team doesn’t track those, you might want to start, at least for a short time.) For each one, tie it back not to a root cause, but to how it became visible: how it manifests.

Then look at each of your mechanisms to reduce defects, from code review to human exploration to tooling. Ask if the existing formal process you have in place should have found the problem, if it could, and if it did.

Then do two things. Work on the people and process to prevent entire categories of defects, and take off the blinders. Find where the bugs are and adjust your test strategy to find them. Consider a coverage model that is about types of defects and all the ways you “cover” them with tests, checks, inspections, and walkthroughs.
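One way to keep that model honest is to treat it as data rather than a feeling. The sketch below uses invented records to show the shape of the audit: each escaped bug is tagged with how it manifested, each detection mechanism is scored on whether it should have, could have, and did catch it, and the blind spots fall out of a simple count.

```python
# Sketch of the "escaped bug" audit as data. The records are invented for
# illustration; each mechanism gets a (should, could, did) verdict.
from collections import Counter

escaped_bugs = [
    {"manifests_as": "layout breaks on resize",
     "mechanisms": {"code review": ("no", "no", "no"),
                    "ui automation": ("no", "yes", "no"),
                    "human exploration": ("yes", "yes", "no")}},
    {"manifests_as": "crash when database is down",
     "mechanisms": {"code review": ("yes", "yes", "no"),
                    "ui automation": ("no", "no", "no"),
                    "human exploration": ("no", "yes", "no")}},
]

# Where do escaped bugs cluster, and which mechanisms keep letting through
# problems they could have caught?
blind_spots = Counter(bug["manifests_as"] for bug in escaped_bugs)
missed_but_possible = Counter(
    mechanism
    for bug in escaped_bugs
    for mechanism, (should, could, did) in bug["mechanisms"].items()
    if could == "yes" and did == "no"
)

print("defect types escaping to production:", blind_spots.most_common())
print("mechanisms that could have caught them:", missed_but_possible.most_common())
```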

There is a case to be made for blinders; they help horses walk straight. They also prevent exploration.

Who wants that?
