Choosing the Right Testing Metrics



Part of a good testing strategy is to use the right metrics and KPIs to your advantage. Testing always looks to provide more information in order to have less uncertainty and better control over risk, but that information has to be analyzed carefully.
A problem with metrics is that they determine behavior, so we have to be very careful when choosing them. It’s like that saying, “When measuring a system, we alter it with the measurement process itself.”
For example, if a tester knows that someone is measuring the number of issues they report, the tester will want to report as many bugs as possible in order to improve the metrics. What’s the problem with that? Instead of focusing testing on more important and risky areas, the tester may dedicate more time to less mature features with more incidents, or to incidents that are easier to find. Plus, they are likely to report anything, even details that add no value to the business, simply because doing so improves the metrics.
Another problem is that all metrics can be gamed, like the saying: “Every law has its loophole.” Imagine we have decided and even configured some tool to ensure that all code must have 80 percent unit test coverage. If not, it will be rejected. What someone may do, in this case, if there is a method invoked by a unit test, is add many useless and innocuous lines (as many as needed) that are easy to code yet won’t hurt the system, like this:
String coverage = "with this I improve the coverage";

In doing this, you could meet or even exceed the coverage target without testing anything more. Because the act of measuring can change what we measure, we have to think carefully about what to measure. The focus should always be on metrics that matter to the business: ones that, if improved, would lead to a real improvement for the business. It’s also important to note that a single metric tells only part of the story, and what matters most is the team’s commitment to quality, not the metrics alone.
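To make the padding trick above concrete, here is a minimal sketch of the arithmetic behind it. It assumes a simple line-coverage ratio (covered lines divided by total lines); the class name, method names, and numbers are hypothetical and not taken from any real coverage tool.

```java
// Hypothetical illustration of how filler lines inflate a line-coverage ratio.
final class CoveragePadding {

    /** Line coverage as a percentage: covered lines / total lines * 100. */
    static double lineCoveragePercent(int coveredLines, int totalLines) {
        if (totalLines <= 0 || coveredLines < 0 || coveredLines > totalLines) {
            throw new IllegalArgumentException("need 0 <= coveredLines <= totalLines, totalLines > 0");
        }
        return 100.0 * coveredLines / totalLines;
    }

    /**
     * Smallest number of always-executed filler lines needed to reach the target.
     * Each filler line adds one to both the covered count and the total count.
     */
    static int fillerLinesNeeded(int coveredLines, int totalLines, double targetPercent) {
        if (targetPercent <= 0 || targetPercent >= 100) {
            throw new IllegalArgumentException("targetPercent must be between 0 and 100 (exclusive)");
        }
        int filler = 0;
        while (lineCoveragePercent(coveredLines + filler, totalLines + filler) < targetPercent) {
            filler++;
        }
        return filler;
    }

    public static void main(String[] args) {
        int covered = 50;  // assumed numbers for illustration only
        int total = 100;
        int filler = fillerLinesNeeded(covered, total, 80.0);
        System.out.println("Honest coverage: " + lineCoveragePercent(covered, total) + "%");
        System.out.println("Filler lines needed to reach 80%: " + filler);
        System.out.println("Padded coverage: "
                + lineCoveragePercent(covered + filler, total + filler) + "%");
    }
}
```

With these assumed numbers, 150 meaningless lines lift the reported coverage from 50 to 80 percent while the amount of untested code stays exactly the same.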
Start With Why 

As Simon Sinek famously said, “Start with why.” Let’s start thinking about the reason and the purpose. Why do we want to measure something?
We shouldn’t just pick metrics and record them. We should also define thresholds and decide what action to take when a threshold is crossed. To do this, we must have a clear why. For instance, if an automated test suite takes more than two hours to execute, it could be divided into two pipelines: one with the most critical test cases, and another with the complete regression test suite that runs at night. In this example, the why could be “to ensure that the automation gives quick feedback.”
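The two-hour example can be written down as an explicit rule, which is one way to keep the threshold and the resulting decision together. This is only a sketch: the class, the enum, and its values are hypothetical names, and the threshold should be whatever your own why calls for.

```java
import java.time.Duration;

// Hypothetical policy object pairing a threshold with the decision it triggers.
final class SuiteSplitPolicy {

    // Threshold taken from the example in the text; adjust it to your context.
    static final Duration MAX_FEEDBACK_TIME = Duration.ofHours(2);

    enum Decision { KEEP_SINGLE_PIPELINE, SPLIT_CRITICAL_AND_NIGHTLY }

    /** Returns the action to take for a suite that ran for the given duration. */
    static Decision decide(Duration suiteDuration) {
        return suiteDuration.compareTo(MAX_FEEDBACK_TIME) > 0
                ? Decision.SPLIT_CRITICAL_AND_NIGHTLY
                : Decision.KEEP_SINGLE_PIPELINE;
    }

    public static void main(String[] args) {
        System.out.println(decide(Duration.ofMinutes(90)));   // KEEP_SINGLE_PIPELINE
        System.out.println(decide(Duration.ofMinutes(150)));  // SPLIT_CRITICAL_AND_NIGHTLY
    }
}
```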
Testing Metrics to Consider

Okay, now onto some concrete examples! I must preface this by saying that these ideas should only be used as inspiration, because different aspects of quality matter to varying degrees in different contexts.
User Satisfaction 
Here you will want to examine whether the client’s reaction to the product is one of confusion or amazement. You could analyze satisfaction surveys and bugs reported by users. If we measure these quality metrics and aim at improving them, the business will consequently improve because we’ll have more satisfied and returning users. If things go wrong, we’ll have to conduct a causal analysis of the problems.
Process Metrics
These metrics are internal, but they may have a big impact on the quality of your software. For example, you might want to measure lead time: the time from when a requirement is conceived to when the code implementing it is deployed to production. Another metric is cycle time: the time taken to develop a feature, counted from the moment work on it is approved to start. Lastly, you could measure the response time for solving problems, meaning the time from when a ticket or bug is first reported to when it is resolved.
It may be tricky to measure how long these take, so another way to address the efficiency of your processes is to pay attention to where unfinished work begins accumulating in the queue. This may indicate a bottleneck where you could find ways to optimize.
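If your tracker does record timestamps, the lead time, cycle time, and resolution time described above reduce to simple differences between two points in time. The sketch below assumes hypothetical event names (conceived, approved, finished, deployed, reported, resolved), and it assumes cycle time ends when development finishes, which the definition above leaves open.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helpers; the event names are assumptions, not a tracker's API.
final class ProcessTimes {

    /** Lead time: from the requirement being conceived to deployment in production. */
    static Duration leadTime(Instant conceived, Instant deployed) {
        return Duration.between(conceived, deployed);
    }

    /** Cycle time: from approval to start work until development finishes (assumed end point). */
    static Duration cycleTime(Instant approved, Instant finished) {
        return Duration.between(approved, finished);
    }

    /** Resolution time: from a ticket or bug being reported to being resolved. */
    static Duration resolutionTime(Instant reported, Instant resolved) {
        return Duration.between(reported, resolved);
    }

    public static void main(String[] args) {
        Instant conceived = Instant.parse("2021-05-01T09:00:00Z");
        Instant approved  = Instant.parse("2021-05-06T09:00:00Z");
        Instant finished  = Instant.parse("2021-05-10T09:00:00Z");
        Instant deployed  = Instant.parse("2021-05-11T09:00:00Z");

        System.out.println("Lead time (days): " + leadTime(conceived, deployed).toDays());   // 10
        System.out.println("Cycle time (days): " + cycleTime(approved, finished).toDays());  // 4
    }
}
```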
Coverage Metrics
Test coverage is another measurement of test quality; it tells us how much of the code has been tested. Here, I recommend a top-down approach. Begin with module coverage, then functionalities, and then analyze the coverage of data in each functionality (how many combinations of the possible data inputs we are covering, from something as simple as the “each choice” criterion to the other extreme, the Cartesian product).
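The gap between the two extremes of data coverage is easy to underestimate, so a small calculation may help. The sketch below assumes each input parameter has a known number of possible values; the class and method names are hypothetical. “Each choice” needs at least as many tests as the parameter with the most values, while the Cartesian product needs the product of all the counts.

```java
// Hypothetical calculator for the size of a data-coverage test set.
final class DataCoverage {

    /** Each choice: every value of every parameter appears in at least one test. */
    static long eachChoiceMinTests(int... valuesPerParameter) {
        requireValid(valuesPerParameter);
        long max = 0;
        for (int count : valuesPerParameter) {
            max = Math.max(max, count);
        }
        return max;
    }

    /** Cartesian product: every combination of values is tested. */
    static long cartesianProductTests(int... valuesPerParameter) {
        requireValid(valuesPerParameter);
        long product = 1;
        for (int count : valuesPerParameter) {
            product = Math.multiplyExact(product, (long) count); // fails loudly on overflow
        }
        return product;
    }

    private static void requireValid(int[] valuesPerParameter) {
        if (valuesPerParameter.length == 0) {
            throw new IllegalArgumentException("at least one parameter is required");
        }
        for (int count : valuesPerParameter) {
            if (count <= 0) {
                throw new IllegalArgumentException("each parameter needs at least one value");
            }
        }
    }

    public static void main(String[] args) {
        // Assumed example: 3 browsers, 2 user roles, 4 payment methods.
        System.out.println(eachChoiceMinTests(3, 2, 4));     // 4
        System.out.println(cartesianProductTests(3, 2, 4));  // 24
    }
}
```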
Code Quality Metrics
Code quality is often measured using tools like SonarQube, which help you uncover how much technical debt is in a system. You will need to triage both issues and vulnerabilities, order them by priority, and decide what to focus on, because you will face a large number of them, especially at the beginning.
Bug or Incident Metrics 
Here I think it’s important to at least bear in mind the severity of each issue and not give all issues the same weight. When reporting issues, it’s important to classify how many are severe and how many are more like suggestions for improvement. It’s also important to know which aspects of quality are more critical than others for your business. For some business models, performance may matter most, such as e-commerce businesses with peaks in traffic on busy shopping days.
That said, for the metrics you will analyze, I suggest not only paying attention to the number of bugs. Drill down into your numbers, taking into account useful categories to have a better understanding of where you can improve your process. 
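One way to avoid giving every issue the same weight is to break the bug count down by severity and, if it helps, compute a weighted score. The following is a sketch only: the severity levels and the weights are hypothetical, and you would pick categories and values that match your own business risk.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical severity breakdown; levels and weights are illustrative assumptions.
final class BugSeverityReport {

    enum Severity { CRITICAL, MAJOR, MINOR, SUGGESTION }

    /** Assumed weights: a critical bug counts ten times as much as a minor one. */
    static int weightOf(Severity severity) {
        switch (severity) {
            case CRITICAL:   return 10;
            case MAJOR:      return 5;
            case MINOR:      return 1;
            case SUGGESTION: return 0;
            default: throw new IllegalArgumentException("Unknown severity: " + severity);
        }
    }

    /** Counts how many bugs fall into each severity level. */
    static Map<Severity, Integer> countBySeverity(Severity... bugs) {
        Map<Severity, Integer> counts = new EnumMap<>(Severity.class);
        for (Severity level : Severity.values()) {
            counts.put(level, 0);
        }
        for (Severity bug : bugs) {
            counts.merge(bug, 1, Integer::sum);
        }
        return counts;
    }

    /** Sums the weights of all bugs, so severe issues dominate the score. */
    static int weightedScore(Severity... bugs) {
        int score = 0;
        for (Severity bug : bugs) {
            score += weightOf(bug);
        }
        return score;
    }

    public static void main(String[] args) {
        Severity[] checkout = { Severity.CRITICAL, Severity.MAJOR };
        Severity[] settings = {
            Severity.MINOR, Severity.MINOR, Severity.MINOR,
            Severity.MINOR, Severity.MINOR, Severity.MINOR,
            Severity.SUGGESTION, Severity.SUGGESTION
        };
        // Two bugs outweigh eight once severity is taken into account.
        System.out.println("checkout: " + checkout.length + " bugs, score " + weightedScore(checkout));
        System.out.println("settings: " + settings.length + " bugs, score " + weightedScore(settings));
        System.out.println(countBySeverity(settings));
    }
}
```

In this made-up data, the module with two bugs scores 15 and the module with eight bugs scores 6, which is the opposite of what the raw counts suggest.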
Exploratory Testing Metrics
In each session, record how much time was spent on both the setup and execution, as well as the number of bugs found. For each session, record the time spent focused on the exploratory testing charter versus the time spent investigating interesting behaviors observed in the product. These are two metrics proposed by Michael Bolton and James Bach.
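A session record along the lines described above could be as small as the sketch below. The field names are hypothetical, and treating the two breakdowns (setup versus execution, charter versus investigating other behaviors) as independent percentages is an assumption of this example rather than a prescribed format.

```java
// Hypothetical record of one exploratory testing session, with times in minutes.
final class ExploratorySession {

    final int setupMinutes;
    final int executionMinutes;
    final int onCharterMinutes;     // time spent focused on the charter
    final int offCharterMinutes;    // time spent investigating other interesting behaviors
    final int bugsFound;

    ExploratorySession(int setupMinutes, int executionMinutes,
                       int onCharterMinutes, int offCharterMinutes, int bugsFound) {
        if (setupMinutes < 0 || executionMinutes < 0 || onCharterMinutes < 0
                || offCharterMinutes < 0 || bugsFound < 0) {
            throw new IllegalArgumentException("values must not be negative");
        }
        this.setupMinutes = setupMinutes;
        this.executionMinutes = executionMinutes;
        this.onCharterMinutes = onCharterMinutes;
        this.offCharterMinutes = offCharterMinutes;
        this.bugsFound = bugsFound;
    }

    /** Share of setup in the combined setup and execution time, as a percentage. */
    double setupSharePercent() {
        return share(setupMinutes, setupMinutes + executionMinutes);
    }

    /** Share of on-charter time in the combined on-charter and off-charter time. */
    double onCharterSharePercent() {
        return share(onCharterMinutes, onCharterMinutes + offCharterMinutes);
    }

    private static double share(int part, int whole) {
        return whole == 0 ? 0.0 : 100.0 * part / whole;
    }

    public static void main(String[] args) {
        ExploratorySession session = new ExploratorySession(18, 72, 63, 27, 4);
        System.out.println("Setup share: " + session.setupSharePercent() + "%");          // 20.0%
        System.out.println("On-charter share: " + session.onCharterSharePercent() + "%"); // 70.0%
        System.out.println("Bugs found: " + session.bugsFound);
    }
}
```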
Test Automation Metrics
If we are going to count the test cases, think of grouping them by module, criticality, risk, priority, etc. Measure stability, or the level of trust that can be assigned to a test case given the ever-present possibility of false positives. Measure execution time: how long does it take for the feedback to arrive?
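Stability can be turned into a number in several ways. One possible sketch, shown here, is the share of runs in which a test did not raise a false alarm, where a false alarm means a failure that was not caused by a real defect. That definition, the class name, and the sample numbers are all assumptions made for illustration.

```java
// Hypothetical stability measure for a single automated test case.
final class TestStability {

    /**
     * Percentage of runs without a false alarm.
     * A false alarm is assumed to be a failure not caused by a real defect.
     */
    static double stabilityPercent(int totalRuns, int falseAlarmFailures) {
        if (totalRuns <= 0 || falseAlarmFailures < 0 || falseAlarmFailures > totalRuns) {
            throw new IllegalArgumentException("need totalRuns > 0 and 0 <= falseAlarmFailures <= totalRuns");
        }
        return 100.0 * (totalRuns - falseAlarmFailures) / totalRuns;
    }

    public static void main(String[] args) {
        // Assumed history: 200 runs, 10 of which failed for reasons unrelated to the product.
        System.out.println(stabilityPercent(200, 10) + "%");  // 95.0%
    }
}
```

A test with low stability tends to be ignored when it fails, so tracking this figure per test case is one way to decide which tests to fix or quarantine first.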
Performance Metrics
For performance, it’s common to measure response time and throughput. Another valuable metric may be resource usage, which helps confirm whether the components we suspect of being bottlenecks really are. You could also track the statistics behind performance results, such as averages and percentiles.
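Averages and percentiles can tell quite different stories about the same response times, which is why it helps to look at both. The sketch below uses the nearest-rank method for percentiles, which is one of several common definitions; the class name and the sample data are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper comparing the average with nearest-rank percentiles.
final class ResponseTimeStats {

    /** Arithmetic mean of the samples, in milliseconds. */
    static double average(long[] samplesMs) {
        requireNonEmpty(samplesMs);
        long sum = 0;
        for (long sample : samplesMs) {
            sum += sample;
        }
        return (double) sum / samplesMs.length;
    }

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMs, double p) {
        requireNonEmpty(samplesMs);
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMs.clone();  // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    private static void requireNonEmpty(long[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
    }

    public static void main(String[] args) {
        // Nine fast responses and one very slow one.
        long[] samplesMs = { 100, 110, 120, 130, 140, 150, 160, 170, 180, 2000 };
        System.out.println("average: " + average(samplesMs));        // 326.0
        System.out.println("p50: " + percentile(samplesMs, 50));     // 140
        System.out.println("p95: " + percentile(samplesMs, 95));     // 2000
    }
}
```

Here a single slow request pulls the average to 326 ms even though half of the requests finish within 140 ms, and only the 95th percentile reveals how bad the worst case is.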
“Happiness” Metrics
This set of metrics may be used to measure how well the team is faring. How confident is the team that if someone goes on vacation their tasks will be covered? Are some members approaching burnout? It’s important to know about the health of the team, because this directly impacts the quality of their work and the product.
These are some points I’ve been discussing with Melissa Eaden over coffee. I believe these metrics are worth considering, although they may be difficult to measure, and they are not the only ones. By thinking about these things, the team’s bonding and well-being will improve, which helps the business run better.
Testing Metrics to Provide Visibility 

Metrics are useful for giving visibility and enabling better decisions. You have to be careful, though, because they can also generate undesired behavior.
Going back full circle, what do we want the metrics for? Start with that, discuss it, and then decide what to measure and how to measure it, because everything will make more sense. And what’s more, the team will be more motivated to disclose that information if there is a solid purpose behind it.

User Comments

1 comment
Alok Nag

Good article, lot of new insights for me ex. the "Happiness metrics" and why "Cycle Times".

I do have a question though: 
How do we measure
Debuggability: for response times for incoming customer issues and
Quality of documentation: on how we communicate to customers to use/deploy the solutions.

What do you think are some parameters we can consider here?


May 20, 2021 - 11:28am

StickyMinds is a TechWell community.

Through conferences, training, consulting, and online resources, TechWell helps you develop and deliver great software every day.