Kill by Wire

Summary:
Linda Hayes has worked in the software industry for a long time and through a lot of changes. But a series of recent events has led her to question whether the industry has changed for the better or worse. In this article, she recommends some attitudes we should lose and some we should adopt in order to save our software and—in some cases—our lives.

I'm going to go out on a limb and predict that the ultimate culprit in the Toyota sudden-acceleration cases making the news is—you guessed it—the electronics. I say this for three reasons. First, what's with that story that the gas pedal sticks to the floor? Unless you're in a race or in extremis, who pushes the pedal to the proverbial metal? And even if you do, what's it made of, Velcro?

Second, my car was made by Toyota under a more expensive name, and it suddenly decelerates, usually during a turn. The service manager blamed it on "drive by wire," which basically means "computer controlled." He didn't ask me if I wanted to have it fixed or report the problem to the manufacturer. I guess I am lucky that—so far—it hasn't suddenly accelerated instead. As it is, I never cut left turns too close.

The third reason is that a different news story also caught my eye. At first, I thought they were dredging up the old story about people killed in the '90s by a radiation machine, when a combination of user and software interactions accidentally ramped up the dosage. But it turns out this news is current, and more patients have been injured or killed recently for a different but similar reason. So much for once burned, twice shy.

The insinuation of technology into every aspect of our lives means that its inherent risks are here to stay. As Malcolm Gladwell points out in his recent anthology, What the Dog Saw: And Other Adventures, accidents may be the inescapable result of the complexity of technological systems.

I've already come to the conclusion that large enterprise IT environments contain so many variables—especially the possible combinations between systems and users—that you could not test them all in your lifetime and that of your children. But, where does that leave us? We can't just give up, and we certainly shouldn't where lives are involved. What do we do?
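
To put that claim in rough numbers, here is a back-of-the-envelope calculation. The figures are assumptions chosen for illustration, not measurements of any real environment: suppose an enterprise landscape exposes just fifty independent on/off configuration choices and you could somehow run one test per second.

    # Illustrative arithmetic only; the fifty on/off choices and the
    # one-test-per-second rate are assumptions, not data about a real system.
    combinations = 2 ** 50                 # every possible on/off combination
    seconds_per_year = 60 * 60 * 24 * 365  # ignoring leap years
    years_needed = combinations / seconds_per_year
    print(f"{combinations:,} combinations ~ {years_needed:,.0f} years at one test per second")

That works out to more than a quadrillion combinations and over thirty-five million years of nonstop testing, and it ignores data values, timing, and user behavior entirely.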

First, we need to lose some basic attitudes:

  1. Forget quality. In fact, lose the moniker "quality assurance" altogether, because it is too nebulous and creates unrealistic expectations. The word "quality" encompasses too many attributes to be a valid measure, and, frankly, testers can't "assure" quality or much else for that matter. Customers love and use flawed systems all the time because their benefits outweigh their issues. After all, I'm still driving my car.
  2. Forget defects. I've never believed that the goal of testing is to remove defects, because that requires us to prove a negative—i.e., that there are no more defects. And there are defects that no one cares about, because they never encounter them or, when they do, they are easy to overcome. Almost every shop I know of has a long, long list of reported defects that will never be fixed because other things are more important.
  3. Expect disaster. How many times has there been a blowup of major proportions despite our best efforts to test all that we could? I'm sure the makers of the Toyotas and the radiation machines exercised extreme care in the testing of their systems, yet people still died. In the more common case, the best-designed and best-executed test plans can still leave behind the potential for financial or operational catastrophe. We often hope for the best but fail to plan for the worst.

Then, we need to adopt new attitudes:

  1. Focus on risk reduction. Instead of trying to make sure systems work, identify the greatest risks should the systems fail, and then set about removing them. Although testing is one way to reduce the risk, it is not foolproof. There is always the chance of that one remote corner case that you might miss. For risks that are truly unacceptable, develop a fail-safe strategy (a sketch of the idea follows this list). In the case of a radiation machine, that might mean calibrating the equipment so that it is physically incapable of delivering a killing blast regardless of what the user or software says. For a car, that might mean a manual override mode. For software, that usually translates into a failover backup system.
  2. Hold development accountable. If I hear one more test group telling me they test boundaries, equivalence classes, or error handling, I will go nonlinear. These types of tests are designed to catch programming errors. If programmers can't write and test code so that it works, then you need new programmers—not more testers. Granted, it's hard to test your own work; fine, let developers test each other's. Testers should be looking for requirements misunderstandings, not coding mistakes.
  3. Hold the business accountable. Am I the only one who thinks it is bizarre for testers to write requirements? Yet, it happens all the time, usually because they want a traceability matrix. But, since the testers didn't ask for the system, aren't going to pay for it, and probably won't use it, how can they be expected to write the requirements? We often let the users slide with vague, sloppy, or just absent requirements, only to deal with their displeasure when "we" haven't tested something they think is important. Here's a strategy: Let everyone know we will test every requirement we are given. Period.
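
To make the fail-safe idea in point 1 concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the dose ceiling, the deliver_dose function, and the interlock flag are illustrative assumptions, not a model of any real radiation device. The point is only the principle that the last line of defense should not trust the user or the software upstream of it.

    # Hypothetical fail-safe guard; names and limits are illustrative assumptions.
    MAX_SAFE_DOSE_CGY = 200  # assumed hard safety ceiling, in centigray

    def deliver_dose(requested_cgy: float, hardware_interlock_ok: bool) -> float:
        """Return the dose actually delivered, never exceeding the hard ceiling."""
        if not hardware_interlock_ok:
            # The independent interlock is open: refuse to fire at all.
            raise RuntimeError("Interlock open; no dose delivered")
        # Do not trust the caller: enforce the ceiling unconditionally,
        # no matter what the user interface or upstream logic requested.
        return min(requested_cgy, MAX_SAFE_DOSE_CGY)

    # Even a wildly wrong request is clamped to the safe maximum.
    print(deliver_dose(5000, True))  # prints 200

The same shape applies to the other examples in that point: a manual override in a car or a failover backup for software is a separate, simpler path that still works when the primary logic misbehaves.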

Forgive my tone if I seem grumpy. Maybe I've been in software testing for too long, but it seems that the only changes over the past twenty years have been for the worse. Unless we take the initiative to change ourselves, I don't see it getting any better. Do you?
