Measuring activities are vital to the software test process. On this site, there are more than 200 items (articles, tools, templates, etc.) classified under the topic "measurement." But what good are all the bits and pieces of data that you collect? In this week's column, veteran software tester Rick Craig outlines some of the practical uses for metrics.
To manage a testing effort, test managers and testers need information that will help them make timely and informed decisions. This information is often called "metrics." I am often asked to provide a client or student with a list of metrics they need to do their job. Unfortunately, such a standard list doesn't exist because the measurement needs of each team and project are usually different. At the very least, most teams will need measures of quality, resources, time, and size to do their job. In this short article, I am going to address some of the things metrics can do for you rather than discussing which metrics or types of metrics you should collect.
Provide a Basis for Estimating
Without some information to use as a basis of comparison, there can be no estimate, only a guess. Sometimes testers, test managers, and project managers make estimates based upon their experience. These are not necessarily guesses, since most of these practitioners have a reservoir of metrics on past projects stored in their heads. Estimation can often be improved, however, by recording the time, effort, and characteristics of each testing effort to provide a sounder basis for future estimates. Because projects differ in size and characteristics, software quality, staff skill, and so on, the stored information has to be normalized before it can serve as the basis for estimating a new testing effort.
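As a rough illustration of what normalizing recorded data might look like, the Python sketch below divides past testing effort by project size to get hours per unit of size, then scales that rate to a new project. The record layout, the size unit, the sample figures, and the `adjustment` factor (a stand-in for differences in quality or staff skill) are all assumptions made for the example, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class PastEffort:
    """One recorded testing effort (hypothetical record layout)."""
    name: str
    size: float        # assumed size unit, e.g. requirements or function points
    test_hours: float  # testing effort actually recorded


def estimate_test_hours(history, new_size, adjustment=1.0):
    """Scale the historical hours-per-size-unit rate to a new project.

    `adjustment` is an assumed judgment factor (> 1.0 for a harder
    project, < 1.0 for an easier one).
    """
    total_size = sum(p.size for p in history)
    if total_size <= 0:
        raise ValueError("history must contain projects with positive size")
    hours_per_unit = sum(p.test_hours for p in history) / total_size
    return hours_per_unit * new_size * adjustment


# Illustrative numbers only.
history = [
    PastEffort("billing", size=120, test_hours=480),
    PastEffort("reporting", size=80, test_hours=240),
]
estimate = estimate_test_hours(history, new_size=100, adjustment=1.5)
print(f"Estimated testing effort: {estimate:.0f} hours")
```

With only two past projects the rate is crude; the point is simply that the estimate is traceable to recorded data rather than to memory.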
Provide a Means of Control/Status Reporting
I often joke about projects that are always 90 percent done, but without metrics, a progress report will often be based on "gut feel" (metrics in the mind?). Testing status can be measured by the number or percentage of test cases written or executed, requirements tested, modules tested, business functions tested, and so on. Of course, to be useful, this information has to be reconciled against the schedule. A word of caution here: not all test cases and requirements are created equal. For example, some test cases take a very short time to run and others take much longer, so if you're measuring against a schedule, the test cases may have to be weighted by the time each one takes. Similarly, some test cases exercise more important functions, or more code, than others; completing 50 percent of the test cases therefore doesn't necessarily mean that 50 percent of the system has been tested. In that case, the test cases have to be weighted by their coverage.
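The gap between a raw count and a weighted measure is easy to see with a small calculation. The Python sketch below is only an illustration: the `PlannedTest` record, its `weight` field (which could hold estimated run time or a coverage score), and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlannedTest:
    """One test case in the plan (hypothetical record layout)."""
    name: str
    weight: float   # assumed: estimated hours to run, or a coverage score
    executed: bool


def raw_progress(tests):
    """Percent of test cases executed, treating every case as equal."""
    return 100 * sum(t.executed for t in tests) / len(tests)


def weighted_progress(tests):
    """Percent of total weight accounted for by executed test cases."""
    total = sum(t.weight for t in tests)
    done = sum(t.weight for t in tests if t.executed)
    return 100 * done / total


# Illustrative data: three quick tests are done, one long test is not.
plan = [
    PlannedTest("login", weight=1, executed=True),
    PlannedTest("logout", weight=1, executed=True),
    PlannedTest("password reset", weight=2, executed=True),
    PlannedTest("month-end batch run", weight=12, executed=False),
]

print(f"Raw progress:      {raw_progress(plan):.0f}%")       # 75%
print(f"Weighted progress: {weighted_progress(plan):.0f}%")  # 25%
```

In this made-up plan, three of four test cases are complete, yet only a quarter of the weighted work is done.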
Identify Risky Areas That Require More Testing
Anyone who has ever been a maintenance programmer knows that when they are called upon to fix a problem (especially an emergency!), the problem is often found in a module or function that has already been fixed repeatedly in the past. Without belaboring the cause of this phenomenon in this short article, suffice it to say that identifying the parts of the application that are prone to failure can give testers insight into areas that require greater care during testing. By measuring the relative defect density of each module (or function, or piece of code), the tester can focus additional testing on the error-prone components.
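One common way to express relative defect density is defects per thousand lines of code (KLOC). The Python sketch below ranks modules on that basis; the choice of KLOC as the size measure, the module names, and the counts are assumptions for illustration, and a team could just as well normalize by function points or another size measure.

```python
def defect_density(defects, lines_of_code):
    """Defects per thousand lines of code (KLOC), an assumed size measure."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def rank_by_density(modules):
    """Return (module, density) pairs, most defect-dense first.

    `modules` maps a module name to a (defects, lines_of_code) tuple.
    """
    densities = {
        name: defect_density(defects, loc)
        for name, (defects, loc) in modules.items()
    }
    return sorted(densities.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: (defects found, lines of code).
modules = {
    "billing": (30, 10_000),
    "auth": (12, 2_000),
    "reports": (2, 4_000),
}

ranking = rank_by_density(modules)
for name, density in ranking:
    print(f"{name:8s} {density:.1f} defects/KLOC")
```

Note that in this invented data "billing" has the most defects in absolute terms, but "auth" is the densest, which is why a relative measure is more useful than a raw count for deciding where to focus.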
Provide Meters to Flag Actions
Metrics that flag an action are sometimes called meters. Examples include exit criteria, suspension criteria, and criteria that call for reinspection of a piece of code. These meters are established to signal an action that should occur if the threshold is met. For example, some organizations establish entry criteria into the system test group to demonstrate that the application is complete and stable enough to allow the testers to begin testing without repeatedly