Exit Criteria, Software Quality, and Gut Feelings

November 14, 2016

Summary

Bug counts and trends don't cover all the quality aspects of a product. A good exit criteria list provides an orderly list of attributes that research and experience have shown to affect product quality, so you can monitor the product quality at any given time and forecast the expected status at release. That's how you improve your product.

At a social gathering a few months ago, I happened to meet an old friend who is also a software tester. Soon enough we started to talk shop, and my friend shared his bleak situation.

“Tomorrow we have a project meeting to approve the release of our latest version,” he said. “I am going to recommend we don’t release it, but I know I will be ignored.”

“Why this recommendation?”

“Because I have a strong gut feeling that the last version we got—the one about to be released to our customers—is completely messed up. But the product manager just follows our exit criteria list, which we meet, and ‘gut feeling’ is not considered to be solid data.”

I inquired about the source of this gut feeling and got the whole story: Two weeks ago his team received a build for a last test cycle before the official release. Theoretically, everything was supposed to be perfect. But on the first day of testing, the testers found two critical bugs.

The development team rushed to fix them and released a new build. However, for two weeks, the cycle repeated: The testers would find one or two critical bugs and the developers would release a new build with fixes. This created the problem my friend had.

“By our exit criteria, we can’t release a version that has critical bugs,” he explained. “But the development team fixed every bug we found, so at least for this moment in time, there are no open critical bugs. In the last two weeks I got eight new builds. Do you think we got even close to testing the product properly? Based on the quality level of builds in the last two weeks, I can guarantee that the build we got today contains a good number of additional critical bugs. If I get a few more days to test it, we will find them, but release day is tomorrow, and on paper everything is fine.”

What can we do to deal with such situations? How do we translate our professional intuition, our gut feeling, into hard data?

Creating Comprehensive Exit Criteria

The ideal way to communicate these feelings is to be proactive. You need exit criteria that are a bit more sophisticated than a simple bug count and bug severity. One has to add trends: the rate of incoming new bugs, the rate of bugs being closed, and the predicted bug count based on the number of yet-to-be-executed tests.

Here are some examples for these criteria:

The number of critical bugs opened in the last test cycle is less than w
The count of new bugs found in the last few weeks is trending down at a rate of x bugs/week
Testing was executed for at least y days on the last version without any new critical bugs and not more than z new high-severity bugs
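The three example criteria above can be checked mechanically once the limits w, x, y, and z are agreed on. The following is only a minimal sketch: the class names, field names, and the way the trend is computed (average week-over-week drop between the first and last weekly counts) are assumptions made for illustration, not part of any particular bug tracker or of the spreadsheet mentioned in this article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CycleStats:
    """Hypothetical numbers pulled from a bug database."""
    critical_opened_last_cycle: int
    weekly_new_bugs: List[int]   # new bugs per week, oldest week first
    clean_test_days: int         # days tested on the last version with no new critical bugs
    new_high_severity: int       # new high-severity bugs found on the last version


@dataclass
class Limits:
    """The agreed limits, named after the placeholders in the criteria."""
    w: int      # critical bugs opened in the last cycle must be below w
    x: float    # new bugs must trend down by at least x bugs/week
    y: int      # at least y days of testing without new critical bugs
    z: int      # at most z new high-severity bugs


def weekly_decline(counts: List[int]) -> float:
    """Average week-over-week drop in new bugs (positive means trending down)."""
    if len(counts) < 2:
        return 0.0
    return (counts[0] - counts[-1]) / (len(counts) - 1)


def evaluate(stats: CycleStats, limits: Limits) -> Dict[str, bool]:
    """Return a pass/fail flag for each of the three example criteria."""
    return {
        "critical_bugs": stats.critical_opened_last_cycle < limits.w,
        "trend": weekly_decline(stats.weekly_new_bugs) >= limits.x,
        "stable_testing": (stats.clean_test_days >= limits.y
                           and stats.new_high_severity <= limits.z),
    }


# Illustrative values only; they are not taken from a real project.
stats = CycleStats(critical_opened_last_cycle=8,
                   weekly_new_bugs=[30, 24, 20, 15],
                   clean_test_days=1,
                   new_high_severity=4)
limits = Limits(w=3, x=4.0, y=5, z=2)

for name, passed in evaluate(stats, limits).items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")
```

With these made-up numbers the bug trend looks healthy, yet the other two criteria fail, which is the kind of situation a plain count of open critical bugs would hide.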

Note, however, that bug counts and trends do not cover all the quality aspects of a product. An exit criteria list that tracks only bug statistics does not guarantee much; you can meet all the bug-related criteria and still have a bad product. Your requirements, product manuals, support readiness, hardware quality, etc., may not be as good as they should be. Additionally, the limits you put in the exit criteria (e.g., maximum open bugs) may be too relaxed; it takes some experience to know how to set them for a specific product line.

A good exit criteria list provides an orderly list of attributes that research and experience have shown to affect product quality. Improving the results on these attributes is known to have a positive impact on product quality.

For example, I found this outline for release criteria useful. You can also download my exit criteria spreadsheet. (Make sure to read the “How to use this file” text before adopting it wholesale.)

Monitoring Quality and Forecasting Status

A list of exit criteria can be used as a simple go/no-go decision tool when a release date arrives. However, it can be used much more effectively if it is referred to all along the product development timeline.

For example, let’s say you have a criterion of “No more than one hundred open bugs with medium or low severity.” Assume you also have the following information (you can get some of it from the bug database, and other pieces come from the development and test teams):

Current open bugs: 300
Time to release: 10 weeks
Projected new bug submissions in the next 10 weeks: 100
Number of engineers assigned to fix bugs: 2
Average time to fix a bug: 0.5 days (so in the next 10 weeks, two engineers can fix 200 bugs)

From these data you can estimate the bug count expected on the release date: 300 open bugs plus 100 new ones, minus the 200 that will be fixed, leaves two hundred. Your exit criterion, to remind you, is no more than one hundred.

This means that by the current estimation, which is done ten weeks ahead of release time (!), you can proactively raise a red flag and inform the development manager that they are at risk of not meeting a specific exit criterion. They need to assign more people to work on bug fixes.
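The forecast in this example is simple enough to keep as a small script and rerun every week as the inputs change. The sketch below uses the numbers from the example; the function names are hypothetical, and the five-day work week is an assumption (it is what makes two engineers at 0.5 days per bug come out to 200 fixes in 10 weeks).

```python
import math


def fix_capacity(engineers, weeks, days_per_fix, workdays_per_week=5):
    """Number of bugs the assigned engineers can fix before release.

    Assumes a 5-day work week and that engineers do nothing but fix bugs.
    """
    return engineers * weeks * workdays_per_week / days_per_fix


def forecast_open_bugs(current_open, projected_new, capacity):
    """Open bugs expected on release day (never below zero)."""
    return max(0, current_open + projected_new - capacity)


def engineers_needed(current_open, projected_new, limit,
                     weeks, days_per_fix, workdays_per_week=5):
    """Smallest number of engineers that brings the forecast down to the limit."""
    bugs_to_fix = max(0, current_open + projected_new - limit)
    fixes_per_engineer = weeks * workdays_per_week / days_per_fix
    return math.ceil(bugs_to_fix / fixes_per_engineer)


# Numbers from the example in the text.
capacity = fix_capacity(engineers=2, weeks=10, days_per_fix=0.5)
expected = forecast_open_bugs(current_open=300, projected_new=100,
                              capacity=capacity)
needed = engineers_needed(current_open=300, projected_new=100, limit=100,
                          weeks=10, days_per_fix=0.5)

print(f"Fix capacity: {capacity:.0f} bugs")           # 200
print(f"Forecast at release: {expected:.0f} bugs")    # 200
print(f"Engineers needed to reach the limit: {needed}")  # 3
```

Under these assumptions, one additional engineer would be enough to meet the criterion, which turns the red flag into a concrete request.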

While the example can be seen as fabricated (well … it is), it does convey an important message: Exit criteria are a useful tool to monitor the product quality at any given time and to forecast the expected status at release date. When used properly and consulted throughout the development time, these criteria can help avoid at least some of the firefighting we all experience just before a release date.

Many of the criteria in the list I propose are simple to track and do not rely as heavily on estimations as the given example does. If your exit criteria require that all the committed design documents are published, reviewed, and updated, the simple act of collecting the actual data will give an early warning when too many documents are not completed.

Getting Your Team On Board with Exit Criteria

You can’t just set the exit criteria or their limits without wide agreement. Any team that will need to deliver in order to meet the criteria must agree that the criteria are worthwhile and the limits are fair and achievable.

You also must set the criteria as early as possible in the project timeline. At that time all the stakeholders are optimistic, full of good intentions about quality, and not under pressure. They are in a mindset that allows discussing the criteria and limits in an objective manner. If you try to agree on criteria and limits a few weeks before release time, the limits that the development team will agree to will be significantly influenced by the current state of affairs rather than what good quality is.

Back to my friend.

Adding exit criteria ahead of time would have revealed a clear stop sign to the project managers, and my friend could have enjoyed the party. But what can be done in a case similar to the one described here, where such criteria were not established in advance?

I suggested to my friend that he “translate” his gut feeling to hard data and present it in a clear manner: “This is the current status: There are too many new critical bugs in the code and not enough time for proper testing. Based on these data, I conclude that the version quality is not fit for release.” These aren’t feelings or intuitions, but a professional opinion based on facts, experience, and in-depth knowledge of development processes.

I called a few weeks later to hear how it went. My friend had presented graphs:

The number of new critical bugs on a daily basis (one or two per day; no improvement trend)
The time the testers had to test each new release in the last two weeks (one or two days)
The percent of tests that were executed on the last version (about 25 percent of the planned tests)

This was enough to give everyone a bad gut feeling, and the release was delayed.

The moral of the story? It’s possible to translate gut feelings about quality to hard data. However, as a rule you want to avoid such a situation altogether. The early definition of exit criteria, combined with data, lets you monitor quality throughout the project’s lifetime and avoid potential last-minute disasters.

Topics:

process improvement quality assurance quality management testing

About The Author

Michael Stahl

Michael Stahl is a SW Validation Architect at Intel. In this role, he defines testing strategies and work methodologies for test teams, and sometimes even gets to test something himself - which he enjoys most. Michael presented papers in SIGiST Israel, STARWest, EuroStar and other international conferences, and is teaching SW Testing in the Hebrew University in Jerusalem. Before starting his career in testing in 2000, Michael worked at Intel’s manufacturing facility in Jerusalem, Israel, as a chip-level test engineer.

Michael is an executive board member of the Israeli Test Certification Board (ITCB), holds a full Advanced ISTQB Certification, and chairs ITCB’s advisory board. Some of Michael's presentations and papers are available on www.testprincipia.com.

User Comments

18 comments

Diwakar Menon

November 15, 2016 - 10:20pm EST

The problem with coverage data is that it ignores the question "What if I go live?" Testers caught up in trying to prove, as your friend's team required, that you have fewer than x critical defects, ignore the perils of going live. We have gone live with critical defects, but always with a thorough analysis of which part of the business would be impacted, what risks we faced and whether it was well worth it given the business imperatives.

Diwakar

Michael Stahl

November 16, 2016 - 5:25pm EST

Hi Diwakar -

I don't think there is a contradiction. Possible exit criteria for you would be "Analysis of possible impact to the business for each of the critical bugs was done"; "A risk analysis of all critical bugs was done".

Exit criteria are there to make sure you go through the process your organization thinks is needed to take place, before a milestone can be announced. The details will vary between organizations.

Michael.

Johnny Marin

November 15, 2016 - 11:06pm EST

I just read your article, really liked it, and want to use it as reading in a testing course I’m making. I wanted to know if you have a Spanish version.

If not, would you grant me permission to translate it?

Michael Stahl

November 16, 2016 - 5:18pm EST

Hi Johnny -

You can use and translate it. Please remember to give credit to the source and to keep the copyright note.

Regards, Michael

John Wilson

November 16, 2016 - 4:21am EST

Interesting that a quality problem has been perceived as a test problem and not a whole team problem, and that the fix has to come from test not the whole team. There's a far bigger problem than exit criteria to be resolved here and putting in place 'improved' exit criteria, at whatever stage, just ain't gonna fix it.

Michael Stahl

November 16, 2016 - 5:32pm EST

Hi John -

I agree that if exit criteria (putting them in place; reviewing them) are considered to be solely Test's responsibility, it won't lead to a marked improvement in the product; it may provide better release control, but indeed you don't want to even get to this point.

As I recommend, defining the exit criteria is a cooperative effort that should take place early in the product life cycle, with all the involved stakeholders. Tracking them should be owned by a specific person (e.g. product manager), because spread ownership does not generally work in my opinion. Taking measures to meet them is again a cooperative effort.

In the case of my friend, I agree that the fact the program manager only considered the "dry" numbers and did not read further into the larger picture hints at a bigger quality problem.

Thanks for the comment!

Michael

Herb Ford

November 21, 2016 - 6:24pm EST

Did your friend have regression testing as part of his plan? The normal practice that I try to follow is to always have regression time baked into my QA strategy.

Michael Stahl

November 22, 2016 - 5:44am EST

Hi Herb -

Planning for regression time is fine - but in this case, regression failed and failed again.... Even if you do plan for regression time (as was the case here), there is a limit to how much time you'd assign to it. Just before release time, you assume things will be pretty much healthy.

(but all this is beside the point; the story is just a way to show how Exit Criteria should / could work for you).

Michael

Tim Thompson

November 22, 2016 - 6:19am EST

Regression is important, but why does it always happen at the end of a cycle just weeks before release? There is not much time left to uncover and fix issues. Regression tests are in my opinion a subset of the regular tests and regression tests are most effective if they are fully automated and can run autonomously. That way regression can happen every week or even every day depending on how many resources are available for automation.

Things being healthy just before release time is also not a given. Feature development is pushed to the end, ignoring that time is needed to properly test and fix. The only way out of this is to eliminate set release dates. Setting release to a date is artificial time boxing. Sometimes things take a few days longer. It will also benefit the customer because if features are done they do not have to wait until release date to get the features. Continuous delivery in unison with continuous regression will generate more value and also take the pressure off teams. The business and customers will get features when they are done and working, not on a date that was selected without quality in mind. I think everyone is served better when we take that extra week to get it right the first time rather than constantly chase preventable issues in production.

Michael Stahl

February 5, 2017 - 9:54am EST

Hi -

Re: Regression: Of course it's not done only before the release; it's done every drop. Still, you may have a situation where the last run revealed bad bugs. This is what regression is...

As for set release date: it depends on the industry. In some cases, if you don't deliver a SW drop up to a certain date, the customer shuts the door on new releases since they need the time to do integration to meet their product delivery time. Their delivery time may be set by external events, like the Holiday season.

Thanks for the comments, Michael

Tim Thompson

November 22, 2016 - 6:09am EST

Ideally, yes, we have exit requirements and we have a project management team that looks at quality as a factor in deciding if a release goes out or not. Reality is that we do not get requirements which means we cannot really say if we met the goals and no matter how many high or critical bugs are in place, the release ships on the date set months ago.

As always, the truth is in the middle. Does it really matter if there are 200 bugs if most are visual issues that are subjective to begin with and all of them only occur in features used by less than 10% of users once a year? Setting exit criteria as "less than x bugs" is falling short of what matters. It also assumes that all testers stick with the plan and craft detailed test cases that are highly likely to find all bugs. Many simply rely on exploratory testing at best and typically only go down the beaten path. Just because there is no bug report does not mean that the bug does not exist.

It is also more and more common to de-emphasize quality over features. Ask any product manager what they will rather do in a set amount of time: fix bugs or add new features? I'd be surprised if any one of them says fixing bugs...unless we are talking about control software for medical devices or nuclear power plants. I do not disagree with that approach as long as there is always time set aside for bug fixing. Sadly, often there is not and we ship when planned and fix stuff if customers complain. That also happens less, look at the craptastic state of most mobile apps. Users were groomed over the past years to put up with a lot.

Tejaswini U

January 25, 2017 - 1:22am EST

I just read your article... It's a really nice brief on Exit Criteria. Thank you.

Per Ekvall

February 10, 2017 - 5:16am EST

We did this back in 2001.

We calculated known quality as the number of open critical bugs in the latest build and each earlier build, back to the 10th build, divided by the number of executed prio A tests, normalized with the number of tests passed. The goal was to reach 75% known quality.

There was no such thing as gut feeling.

Costas Chantzis

February 13, 2017 - 11:25am EST

Well, I am NOT surprised. When you have companies like Microsoft marketing fundamentally flawed new products (meaning limited due diligence to verify/validate that each new product consistently meets its labeling claims) and then fixing them by 24/7 retrieval of related data from users who have paid to use them, then why should others marketing the same or similar products be any different?

I provide below what I believe are the proper NPDLC phases for developing and commercializing high quality and low cost new product winners consistently over time and invite your comments:

1. Ideation: Come-up with an idea you think it could be a market winner.

2. Patenting: Evaluate the merit of getting Intellectual Property (IP) protection for your new product idea.

3. Design: Develop non or partially functioning models of your new product idea and start soliciting user feedback.

4. Feasibility Proof: Perform various technical/business studies to determine IF your new product idea is feasible.

5. Development: Develop a product/process to yield a number of fully functioning new product prototypes as if each of them were to be used by the final user/customer.

6. Controlled Field Use Trial: Let potential users use the new product as IF it were purchased by them and generate feedback based on a previously approved protocol.

7. Design-Verification/Validation: Fine-tune new product design/development and complete design verification/validation studies to confirm the new product meets User Requirement Specifications (URSs) per its respective Labeling Claims.

8. Regulatory Filing (if required): Prepare any required documentation and submit it to appropriate Regulatory Authorities for securing their approval to market it.

9. Full Field Use Trial: Repeat above phase 6 but using a comprehensive pool of potential users/customers.

10. Scaling-up: Scale-up your process so your new product could be manufactured in high quality/low cost conditions.

11. Technology-Transfer: Develop documents' list for new product's production in high volume.

12. Process Validation: Validate your process so it produces final product consistently meeting specifications.

13. Manufacturing: Start manufacture and ship product to market.

14. Launch: Launch product and monitor its acceptance by the user.

15. Line Maintenance/ Improvement: Compile, analyze user feedback and continuously improve product.

What is your opinion? Please explain, if you could.

Michael Stahl

February 13, 2017 - 3:39pm EST

Hi -

The flow you have above seems to me a classic waterfall one which is not always fitting the situation at hand; but as waterfall life-cycle descriptions go, it's usable.

To link this back to my article, I'd expect exit criteria to be discussed and agreed upon around steps 3 or 4 in your flow.

A side note: I don't share your view on Microsoft. It is true that their products have bugs, but you need to balance this with two things before writing them off as non-quality minded:

- The task they have is huge - because of the huge installed base, platforms, HW combinations, backwards compatibility promise etc. It's a miracle it works at all.

- To the best of my knowledge Microsoft invests a lot in testing, more than other companies (e.g. Google). You can argue that they don't use this investment effectively, but not that they don't care about quality. If every company would invest what they invest in testing, overall SW quality would probably benefit.

(I don't know though if and what Exit criteria they use, nor when in the PLC they set them)

Michael

Costas Chantzis

February 13, 2017 - 5:27pm EST

Michael,

Thx for your comments. I appreciate them. My response follows:

1. I believe my posted NPD outline is applicable to any product. Software-only products might not use phases 8, 10, 11 and 13, depending on complexity and regulatory requirements, if any.

2. Yes, you are correct. User Requirement Specifications (URSs), Product Labeling Claims, Product Specifications, Critical Quality Attributes (CQAs), etc. must get locked down no later than Phase 4. Unfortunately, evidence suggests a significant amount of software is developed/commercialized with such deliverables either absent or extremely poorly completed.

3. I wish I could agree with you about Microsoft's NPD and commercialization strategies. Yet, there is an abundance of evidence on the record about Microsoft's at best questionable NPD and commercialization strategies.

Again, thx very much for your time and thoughts.

Joseph Cachia

October 29, 2017 - 2:04am EDT

Great contribution, as always, valid even reading it months later. One aspect I believe is important is what to do to get to your exit criteria, which are essential as proposed in your post. Thanks and keep them coming ;)

Joe
