Test automation can reach a point at which it no longer supports organizational goals. Martin Ivison examines four key causes of this unhealthy state and finds that carefully chosen metrics and a holistic, adaptive, and risk-driven approach go a long way toward preventing and remedying the problem.
Test automation has become so key to modern methods of software delivery that it might seem almost sacrilegious to ask the question: “Is there such a thing as too much test automation?” After all, are we not pursuing the nirvana of reduced manual labor, as well as greater reliability, flexibility, and faster time to market? Can there ever be too much of that?
Well, I would say that test automation doesn't automatically equate to any of these things, and it can, in fact, burden you with the exact opposite. So how can we tell whether we've got it right, or whether we have too much or too little?
To ground that question, let's look at an example.
A 200 Hour Green Run
I've recently spoken to someone who was proud to tell me that their team had created 10,000 automated tests that took 200 hours to run and always passed.
Now, this was the result of many years of dedicated work, and the team was highly satisfied with their accomplishments; and rightly so, because it is no small feat to do that. It's like building a life-size model of the pyramids in your backyard. Respect to you for all that hard work and engineering. The question is just: “Is it needed?”
Upon prying a bit, I found out that the company was in the business of delivering networked IoT devices, and their principal concern was the firmware on those devices and some web-based administration software. They shipped updates three times a year, and they ran the full test suite before each release.
So, to answer the most basic question: they could afford a 200-hour runtime, because it meant they only had to invest a week or so of final testing after four months of development work, which sounds like a good proposition. This, of course, was only possible because the suite always went green and never had to be re-run. And it always went green because the team spent those four months testing the system manually to find its bugs, and maintaining the automated suite to identify and make any needed changes.
In essence, the team was testing and developing two systems instead of one. And the benefit of their automated test suite was chiefly to produce a green light before release.
Now, there is no doubt that this green light made everybody feel good. What made them feel less good (and the reason we had this chat) was that the team was faced with lay-offs due to cost-cutting measures.
You can probably spot the irony here. The development team could have increased its productivity significantly by shelving this test suite, with identical results for the customer. Reducing their test automation to zero would have been a vast improvement. And there are far easier ways of producing a green light.
So what has gone wrong here? How can we evaluate whether our test automation is working for us instead of against us? And how do we know whether we have the right amount and the right kind?
To point us in the right direction, let's look at some common pitfalls. They are:
- Methods have changed
- Not looking at it holistically
- Not understanding risk
- Not continuously evaluating health
Methods Have Changed
There have been tectonic shifts in the way software systems are designed, developed, tested, and delivered in the last few years. Chances are that your organization is shifting its methods, process, and technology in response to this rapidly developing environment.
Pitfall #1 is not to adapt your automation approach to these changes.
Take, as an example, agile transformation. Many organizations are adopting agile methods to respond to their customers better and be more competitive in the market. This usually means changing from a traditional SDLC to an agile lifecycle.
The traditional approach is that automation replaces manual labor. The agile approach is that it provides rapid feedback and timely risk-mitigation. The difference matters enormously.
Under traditional criteria, the size and speed of a test suite do not matter, as long as it can do the job more efficiently than manual labor. Think of a conveyor belt instead of a bucket brigade, which takes one person to operate instead of fifty.
Under agile criteria, however, the speed and timeliness of feedback are critical. Automation has to be the lubricant of development, not some sort of additional machinery that itself requires constant lubrication. This means it has to be fast and low on maintenance.
In other words, if you are trying to be agile and you are still measuring success by how many manual test cases you've replaced or hours of manual testing you've saved, you're not looking at the right criteria. You'd do better to worry about running in minutes instead of hours, running seamlessly and continuously in your integration and deployment toolchain, and monitoring test stability at each test level.
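These criteria are easy to track mechanically. As a minimal sketch (in Python, with invented record fields and level names; a real pipeline would read these from its CI test reports), here is one way to aggregate pass rate and runtime per test level:

```python
from dataclasses import dataclass

# Hypothetical test result record; field names are illustrative,
# not taken from any particular test framework or CI tool.
@dataclass
class RunRecord:
    level: str        # e.g. "unit", "api", "gui"
    passed: bool
    seconds: float

def health_by_level(records: list[RunRecord]) -> dict[str, dict[str, float]]:
    """Aggregate pass rate and total runtime (in minutes) per test level."""
    totals: dict[str, dict[str, float]] = {}
    for r in records:
        t = totals.setdefault(r.level, {"runs": 0, "passes": 0, "seconds": 0.0})
        t["runs"] += 1
        t["passes"] += r.passed       # bool counts as 0 or 1
        t["seconds"] += r.seconds
    return {
        level: {
            "pass_rate": t["passes"] / t["runs"],
            "runtime_min": t["seconds"] / 60,
        }
        for level, t in totals.items()
    }
```

Trending these two numbers per level over time tells you far more about agile fitness than a count of replaced manual test cases ever will.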
Not Looking at It Holistically
The consulting industry likes to promote the concept of best practices. This gives us the idea that there are certain concepts so good that they fit any and all situations. In truth, there is no such thing: context matters, and even a fully manual testing approach can beat $5 million worth of test automation in the right scenario (as in our example above).
Pitfall #2 is not to know or care what your organization needs you to achieve.
There are two contexts to consider. In the first, test automation is part of a maintenance process. This is quite typical for organizations in which IT is a support function and not a product. In the second, it is part of a production process, which is typical for organizations that sell software products and services.
In both cases, there are abilities other than quality that matter. For maintenance it is usually cost-effectiveness. For production it may additionally include time-to-market (speed), flexibility to react and innovate (agility), productivity, and more. These abilities directly affect your organization's success in its business domain; therefore you cannot afford to look at test automation purely from a quality standpoint.
In our example above, the 200-hour suite made perfect sense until we found out that the company needed it to be cheaper. Or consider a different example, in which you are a start-up and still in the phase of discovering your customers, business model, and market opportunities. In that case, carrying any kind of heavy automation gives you just one more thing that needs to be changed while you are molting and adapting to the market (a better fit in this case would be to keep it light and left).
The lesson here is to shape your test automation so it can deliver on all needed abilities. If speed is needed, make it fast. If flawless quality is needed, give it coverage. If cost-effectiveness is needed, keep it lean and shift it left. But remember that every ability you give it usually comes at a cost to another ability, so pick wisely.
Not Understanding Risk
With the emergence of automation specialists and engineers in test, teams often find themselves perfectly capable of solving the technical obstacles of automated testing but less adept at understanding what to test and why.
Pitfall #3 is to forget that testing is there to mitigate risk.
Testing is really only worth doing if what we test can break. And it is efficient when the damage of that breakage is more costly than our efforts to prevent or catch it. A common problem for test automation is when it is detached from that assessment and grows in areas where it does not mitigate sufficient risk.
As an example, I recently encountered a back-end system that, in addition to its ~10K unit tests, had another 20K higher-level regression tests, a significant part of which were Selenium GUI tests. If we look at the test pyramid model, we would suspect that a large percentage of the higher-level tests are redundant and not mitigating any significant risk. And we would also ask why a back-end system in which 99.99% of all interactions happen through APIs has several thousand GUI tests.
What happened here is that this automation suite had grown over years, driven by a definition-of-done that required automated test coverage for all acceptance criteria on a ticket. And stories were written with a user-centric view, which encouraged testing to see every flow with human interaction in mind.
What did not happen was routinely asking what the risk of failure on a particular change was (e.g. what part was most likely or impactful to fail, or which parts were tightly-coupled with other parts of the system and so at a risk of unwanted regression) and which part of the risk was already covered by unit or other existing tests.
To avoid the pitfall of unpruned and very quickly unsustainable growth, we need to make good old testing smarts part of the process. This means risk-awareness and quantification, knowledge of defect patterns, employing test design patterns such as the testing pyramid, and clear success criteria for testing.
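To make risk quantification concrete, one simple and widely used model scores each area of the system as likelihood × impact and lets automation effort follow the ranking. A minimal sketch (component names and ratings below are invented for illustration, not from the systems discussed above):

```python
# Risk-based prioritization sketch: score = likelihood x impact,
# both rated on a 1-5 scale. Higher score = test first and deepest.
def risk_score(likelihood: int, impact: int) -> int:
    return likelihood * impact

# Hypothetical components with (likelihood, impact) ratings.
components = {
    "payment-api":  (4, 5),  # changes often, failure is very costly
    "admin-ui":     (2, 2),  # stable, low business impact
    "firmware-ota": (3, 5),  # changes sometimes, failure bricks devices
}

# Rank components by risk, highest first.
ranked = sorted(components, key=lambda c: risk_score(*components[c]), reverse=True)
```

Even a crude model like this forces the team to ask, per change, where regression risk actually lives and what existing tests already cover, instead of automating every acceptance criterion by rote.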
Which leads us to our last point.
Not Continuously Evaluating Health
In software engineering we rely on metrics and success criteria to tell us how we are doing. Yet in test automation we frequently encounter measurements that are detached from any kind of business purpose or fitness and are often simply self-serving (e.g., the popular 'percent automated').
Assuming that test automation should mitigate risk, deliver on needed business abilities (speed, cost control, agility etc.), and support your transformational goals, how do we know if we're successful without measuring for these intentions?
Pitfall #4 is not to have effective flight controls.
In our first example, the 200-hour green run, success was measured in functional coverage and equivalent manual hours saved. A simple calculation might demonstrate that an hour of automation covers 20 hours of manual testing. If so, the suite saved us 4,000 hours of testing, regardless of the fact that our cost-cutting company would never have entertained spending that time to find exactly no defects (the green run). Given the cost pressure, perhaps healthier metrics would have been cost of maintenance and effectiveness in finding defects.
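For concreteness, here is that calculation next to the healthier alternative. The maintenance figure is an assumption for illustration; the point is that 'hours saved' looks impressive while 'cost per defect found' degenerates when a suite finds nothing:

```python
# The 'hours saved' arithmetic from the example, using the article's numbers.
automation_runtime_h = 200
manual_equivalence = 20                  # 1 automated hour ~ 20 manual hours
hours_saved = automation_runtime_h * manual_equivalence   # 4,000 manual hours "saved"

# A healthier lens: what does each caught defect actually cost?
defects_found = 0                        # the suite always went green
maintenance_h = 4 * 160                  # assumed: roughly one engineer-quarter of upkeep

cost_per_defect = (
    maintenance_h / defects_found if defects_found else float("inf")
)
```

An infinite cost per defect is exactly the signal the 'hours saved' metric hid: the suite consumed real effort while mitigating no observable risk.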
In our second example, the Selenium-tested back-end system, success was measured in ticket completion (labor-driven). The oversized framework, however, had the interdependency and stability problems that come with top-heavy tests, and a run-time so long that testing many changes and branches in parallel required a significant number of static test environments. In other words, it was brittle, slow, and expensive. To turn this trend around, more useful flight controls would show that we are successful when run-time decreases to acceptable levels, test stability rises, and maintenance and environment costs go down over time while quality (e.g., defect slippage) stays level.
Implementing a working automation capability is hard, and it is seductive to rest on our laurels when we get it running and accepted. However, we risk becoming victims of our own success when it becomes detached from risk, change and business health.
The cost we may pay is the loss of agility (more rigid, tightly coupled code), productivity (more maintenance), speed, and of course cost-effectiveness. And these, you may notice, are all the things test automation was supposed to help us achieve.
A remedy is awareness, carefully chosen metrics, and a holistic, risk-driven approach with built-in adaptivity.