Automatic Metrics: Turn Down the Volume and Increase Awareness

Summary:

Linda Hayes has always found software development metrics to be problematic. The data can skew either perceptions of the project or the actions of the team members. In this article, Linda explains that by turning down the volume of what is being said and measured and simply watching what is actually happening, you can strip away the assumptions and biases that often obscure the truth.

I have always found software development metrics to be problematic. On one hand, we desperately need to improve schedule and quality visibility throughout the process. On the other hand, we don't want to slow things down and distract the team with onerous metrics collection. Measuring things that are easy to count, like lines of code, may not be meaningful unless you value your software by the pound. Even harder is deciding what to do with the information to keep it from being biased or politicized.

Ironically, the metrics themselves may drive bad behavior. If lines of code are measured, then more lines will be written whether they are needed or not. If defect rates are counted, then fierce debate may arise about what is actually a defect as opposed to a missing or ambiguous requirement. I was once on a project where the developers conspired with the testers to cook the defect books, inventing creative new classifications to ensure their bonuses.

If misused, metrics may become meaningless. In another case, the project manager simply required that all status reports be "green," which meant all was well—whether it was or not. Any other status would invoke unwelcome management scrutiny, so everyone reported as green and then created a black metrics market on the side to communicate the real status.

After more than twenty years in the software development business, I have not found a way to measure developers that is objective, unobtrusive, and meaningful. Until now, that is.

Actions Speak Louder
The company 6th Sense Analytics offers an innovative technology that lets customers track a set of common development activities, such as checking code in or out, automatically in the background. The company supports "sensors" for a wide range of development tools that measure what it calls Active Time and Flow Time. Active Time is the time spent constructing software; Flow Time is an uninterrupted period of at least twenty minutes during which a developer is likely operating at peak efficiency.
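To make these two measures concrete, here is a minimal Python sketch of how Active Time and Flow Time could be computed from a stream of timestamped tool events. This is my own illustration, not 6th Sense's actual implementation: the twenty-minute Flow Time floor comes from the definition above, while the event format, the function names, and the five-minute idle gap that ends a working session are assumptions.

from datetime import timedelta

# Assumed thresholds: the twenty-minute Flow Time floor is from the
# article; the five-minute idle gap is an illustrative guess, not a
# documented 6th Sense Analytics setting.
IDLE_GAP = timedelta(minutes=5)
FLOW_FLOOR = timedelta(minutes=20)

def summarize(events):
    """Given timestamps of tool events (check-ins, saves, builds),
    return (active_time, flow_time) as timedeltas."""
    active = flow = timedelta(0)
    session_start = prev = None
    for ts in sorted(events):
        if prev is not None and ts - prev > IDLE_GAP:
            # The idle gap ended the previous session; tally it.
            length = prev - session_start
            active += length
            if length >= FLOW_FLOOR:
                flow += length
            session_start = ts
        elif prev is None:
            session_start = ts
        prev = ts
    if session_start is not None:
        # Tally the final session.
        length = prev - session_start
        active += length
        if length >= FLOW_FLOOR:
            flow += length
    return active, flow

Sessionizing on an idle gap is the simplest plausible model; a real sensor would presumably also record which tool and which component was active, which is what makes the per-person, per-project drill-down described next possible.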

Activities can be tracked by person and project, even down to the component level. This provides a level of granularity and visibility that is otherwise unavailable without onerous time-reporting systems that rely on individual recollection and manual capture. Granted, this level of detail may be overkill at first until you know what you are looking for. But as you gain a feel for the patterns that develop, you can always expand your analysis.

The beauty of this approach is that it is not only fully automated, objective, and unobtrusive, but it also measures activities in real time. You can analyze what is happening as it occurs, at whatever intervals and increments you choose, and thus explore trends, patterns, and correlations among many factors.

The downside of this approach is the inevitable specter of Big Brother. There is a knee-jerk resistance to being "watched," with its hidden assumption that if someone knew what we were really doing, it would be used against us. I find this reaction especially incongruous given that nothing is more visible than missing a deadline or delivering substandard software.

But it turns out that this resistance is probably misplaced. Early adopters of this technology found it more likely that the truth set them free. For example, some managers harbor an unstated suspicion that if they can't see you working, then you probably aren't working, which leads to restrictive policies about working from home. Yet one company was surprised to find that its most productive developers were those who telecommuted. Even more revealing, developer productivity in the office increased measurably when the manager was out of town. In other cases, tradition was reinforced: Pareto's principle held true, with 80 percent of the work accomplished by 20 percent of the people.

By turning down the volume of what is being said and simply watching what is actually happening, you can strip away the assumptions and biases that often obscure the truth.

From Data to Decisions
Of course, these are not the only metrics you will need. Activity does not convert directly into achievement, so you will still measure traditional benchmarks, including requirements coverage, defect arrival and closure rates, and post-release issues. But the key here is that you now have a way to uncover patterns that may provide early indicators of trouble. For example, you may believe that you have your priorities clearly established, only to discover that resources are being hijacked by interruptions or back-door demands.
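As a small example of turning one of those traditional benchmarks into an early indicator, the sketch below tallies defect arrivals against closures week by week. The record format is hypothetical, but the signal it surfaces, a backlog that grows because arrivals outpace closures, is exactly the kind of pattern worth catching early.

from collections import Counter

def weekly_defect_trend(defects):
    """defects: iterable of (opened_date, closed_date_or_None) pairs.
    Returns {(year, week): (arrivals, closures, open_backlog)}."""
    arrivals, closures = Counter(), Counter()
    for opened, closed in defects:
        arrivals[opened.isocalendar()[:2]] += 1
        if closed is not None:
            closures[closed.isocalendar()[:2]] += 1
    trend, backlog = {}, 0
    for week in sorted(set(arrivals) | set(closures)):
        backlog += arrivals[week] - closures[week]
        trend[week] = (arrivals[week], closures[week], backlog)
    return trend

If the backlog column climbs for several consecutive weeks, arrivals are outpacing closures and the team is falling behind, long before a slipped release date makes it obvious.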

It may also take some time before these types of metrics acquire meaning, in the sense that you may first see the symptoms and only then look back for the cause. Instead of deciding in advance what is important and measuring it going forward, you can look back at what actually happened and search for correlations. The cool part is that the cost of collecting the data is so low that you can afford to track more than you think you need, so if you later discover a use for the data, you will already have it.

And, as always, you have to be thoughtful about how you use the data. Decreeing that a particular measurement is important may simply warp behavior without improving results. The hidden meaning behind metrics is not how many or how much, but so what? Put extra care into determining how the data informs decisions, not the other way around.
