The huge software development project was in trouble—integration testing was discovering significant issues that somehow escaped unit testing and confidence in the development vendor was plummeting. Cost and schedule overruns had been staggering. The sponsoring client was wondering what they had to show for their hundreds of millions of dollars invested to date: whether the project was “almost there” as was being reported or a black hole into which they could pour additional cash without making a difference.
The root causes of the problems weren’t subtle: the project had started two years earlier with a vague notion of requirements and a fixed end date. Project scope was defined broadly and there was evidence of feature creep from the start. In trying to adhere to the schedule, the team had rushed the requirements process. When concerns about requirements were raised, the project management team (inexperienced with projects of this size and complexity) had said, “We will correct any issues in testing.”
During design there was a half-hearted attempt by the vendor to establish traceability between product and requirements, but “There wasn’t time for all that process” so a spreadsheet was used that few people understood and that wasn’t maintained. The detailed design document was more than 15,000 pages long. The client team had (contractually) thirty days to review it and identify errors, omissions, and items requiring clarification. To facilitate review, the document was divided among a large team of people, each reviewing their own section. The thirty-day mark came and went, then the sixty-day mark ... ninety days ... individual reviewers met with designers throughout, identifying individual issues and processing them in parallel. There wasn’t time to look for patterns. There wasn’t time to understand the whole of the design. At the six-month mark, the client management team and the vendor declared design “sufficient” because much of the coding (which was happening in parallel) was complete and any further design issues would emerge during testing.
Unit testing was reported to be going smoothly, but integration and acceptance testing (running concurrently because of schedule concerns) hit a wall. The number, severity, and “surprise” of the issues that emerged from “acceptance testing” resulted in gnashing of teeth, rending of hair, and a sudden one-year slip in the project. The year-long delay was probably what prompted the sponsor to request the project management review that got me involved.
Identifying the source of the problems wasn’t difficult (I imagine you see them in the preceding paragraphs). What was challenging for me as a reviewer was deciding what to recommend as next steps. The recommendations were due just as the project was scheduled to emerge from its one-year “quality rehabilitation” period. Decisions were needed about whether to continue the project as initially envisioned, reduce scope and salvage the work to date, or euthanize the whole undertaking. A recommendation to kill the project would have been gladly received: sponsors were outraged that the project had gotten so far out of hand, and a blood sacrifice would only begin to appease them.
The project review I was participating in was not technical, so I had no direct visibility into the quality of the technical solution. My concern was that the client might be just a few payments away from owning a Ferrari. If the quality issues had indeed been addressed during the one-year hiatus, it might make sense to continue the project.
I recommended a technical quality audit to inform the go/no-go decision, but the audit had to be done quickly or the impatient sponsors would make the decision without the input (they were getting torches and pitchforks ready for the meeting where the future of the project would be decided). I reached out to people I trust for ideas about what to review: how to triage five million lines of Java code in the space of about two weeks to assess the project’s health, and so inform a decision about whether to kill the huge project (forfeiting the investment to date) or invest hundreds of millions of dollars more rolling the product out.
The list I accumulated from my friends’ expert input is worth sharing. We decided on a combination of the following criteria:
1) Random code—Roll dice to make sure that you don’t just look in obvious places
2) Code that has a history of high fault rates identified in prior testing
3) Code that is called frequently
4) The largest programs (lines of code)
5) Sections of the system that represent a high rate of change (requirements, design, code, test cases)
6) Code with high cyclomatic complexity—A mathematical way of determining code complexity based on the amount of branching within a module
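The six criteria above can be combined mechanically once per-module metrics are available. What follows is only a minimal sketch in Python, not the procedure the review team actually used. The `Module` record, its field names, the `top_n` and `random_n` parameters, and the choice to draw the random picks only from modules that no other criterion flagged are all assumptions made for illustration. The complexity estimate is a rough text-based approximation (one plus the number of branching constructs found); it does not skip comments or string literals, and a real audit would use a proper static-analysis tool instead.

```python
import random
import re
from dataclasses import dataclass

# Branching constructs counted by the rough estimate below (illustrative only).
_BRANCH_PATTERN = re.compile(r"\b(?:if|for|while|case|catch)\b|&&|\|\||\?")


def approx_cyclomatic_complexity(java_source: str) -> int:
    """Rough estimate: 1 + the number of branching constructs in the text."""
    return 1 + len(_BRANCH_PATTERN.findall(java_source))


@dataclass(frozen=True)
class Module:
    """Hypothetical per-module metrics; the field names are assumptions."""
    name: str
    fault_count: int     # criterion 2: faults found in prior testing
    call_count: int      # criterion 3: how often the code is called
    lines_of_code: int   # criterion 4: size
    change_count: int    # criterion 5: changes to requirements, design, code, tests
    complexity: int      # criterion 6: cyclomatic complexity


METRICS = ["fault_count", "call_count", "lines_of_code", "change_count", "complexity"]


def select_review_sample(modules, top_n=2, random_n=2, seed=None):
    """Return a dict mapping module name to the set of reasons it was selected."""
    selected = {}
    # Criteria 2 to 6: take the top_n modules for each metric.
    for metric in METRICS:
        ranked = sorted(modules, key=lambda m: getattr(m, metric), reverse=True)
        for module in ranked[:top_n]:
            selected.setdefault(module.name, set()).add(metric)
    # Criterion 1: random picks from modules that no metric flagged,
    # so the sample also covers the non-obvious places.
    rng = random.Random(seed)
    remaining = [m for m in modules if m.name not in selected]
    for module in rng.sample(remaining, min(random_n, len(remaining))):
        selected[module.name] = {"random"}
    return selected
```

Recording the reason each module was chosen lets reviewers see which findings came from "obvious" places and which came from the random draw. Passing a fixed `seed` makes the random part of the sample repeatable if the selection has to be justified later.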
The review involved checking selected code for maintainability, consistency with standards, thoroughness of coverage for unit tests, correctness of unit test results, clarity of traceability to design and requirements, and anything else the reviewers thought was interesting or relevant.
The goal of the review was to get a sense of the quality of the current work product to inform a decision about whether the project was built on a fundamentally flawed foundation, or if the year-long delay to “address quality issues” had indeed been used to address the problems and create a system that had a chance of being maintained in the future. The result of the review would fall somewhere on a continuum between those two extremes.
The review wasn’t going to make the decision; it was going to inform the decision, primarily by identifying the risk of using the code reviewed as a foundation for further development and rendering an opinion about maintainability.
As you can imagine, the results weren’t at either extreme, but the review team felt, based upon the limited scope of their review, that the code reviewed was of reasonable quality and was likely maintainable, although they had recommendations for the project about artifacts they might request from their vendor to improve maintainability in the future.
Reviews can be messy. Sometimes it’s hard to know where to start, particularly when you are in triage mode and can only review a small sample. I was pretty happy with the criteria for selection that we came up with (particularly “Random”, which had not occurred to me and I think improved the sample and the team’s confidence in their findings).
Postlude: The project was cancelled shortly after the review because the total cost to roll the system into production was estimated to be prohibitive. The project was only able to get to that assessment because the code review did not suggest immediate termination.
The purpose of this article is to share a useful list of review criteria. One purpose for the comments section below is for you to add your ideas. If you could only review a few things, what criteria would you use to select what is reviewed?