The Journey Through Traceability to Process Control

November 30, 2005

Summary

Taking a team from an undisciplined product development strategy, through an organized process with visible tracks, to a mostly automated, self-improving process is a long journey. It requires a good understanding of change, an adequate SCM tool or tool suite, good people for sure, and a lot of common sense. The journey is well worth the effort, though. I've been down the road more than once. It leads you to the path where you can manage properly and let the configuration management be handled automatically.

Traceability
Traceability is a cornerstone of change management. It tells you why things have changed. It lets you identify the level of verification for particular changes. It links the Requirements and Requests of customers and product managers down through code, test cases and test run data.

Granularity
Here's an example of a fully traced system development.

Requirement R1: Build a CM system
Request Q1: Make it easier to use
Request Q2: It's not working, we need it to work
Activity A1: Implement R1
Activity A2: Implement Q1
Problem P1: It doesn't work spawned from Q2
Problem P2: Testing failed - see R1
Change C1: Implement A1 part 1
Change C2: Implement A1 part 2
Create Build B1 using C1 and C2
Change C3: Implement A2 and fix P1 and P2
Create Build B2 using B1 and C3
Testcase T1: Test the system to ensure it meets R1
Testrun TR1: System failed T1 testing run on B1 - Created P2
Testrun TR2: System passed T1 testing run on B2

It doesn't really matter which system you're developing; you can probably shoe-horn your system into this template. The problem is that, although I have full traceability, the granularity of my traceability makes the data all but useless. Granularity will certainly dictate how useful the data is.

Although the above example gives little traceability information, it does cover a lot of bases. It identifies the fact that requirements (Rs) and requests (Qs) are needed for development to proceed. It shows that the Rs and Qs spawn design activities (As) and problem reports (Ps). It shows that Changes (Cs) address the As and Ps. It shows that builds (Bs) are created with certain Cs. It shows that test cases (Ts) address Rs. It shows that test runs (TRs) run Ts using Bs and spawn Ps where necessary.

There are a lot of traceability items here - and we haven't even mentioned source code. The goal is to get the right level of granularity and process so that this simple example expands to adequately express your development.
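
One way to picture these links is as a small graph of records, each pointing back at the items it traces to. The following is only a sketch: the `Item` class, the `traces_to` field and the `trace_back` function are hypothetical names, and linking P2 to R1 is my reading of "see R1" in the example.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One traceable record, e.g. requirement R1 or change C3."""
    ident: str
    title: str
    traces_to: list[str] = field(default_factory=list)  # idents this item links back to


# The example above, recorded as items with links back to what they trace to.
ITEMS = {item.ident: item for item in [
    Item("R1", "Build a CM system"),
    Item("Q1", "Make it easier to use"),
    Item("Q2", "It's not working, we need it to work"),
    Item("A1", "Implement R1", ["R1"]),
    Item("A2", "Implement Q1", ["Q1"]),
    Item("P1", "It doesn't work", ["Q2"]),
    Item("P2", "Testing failed", ["R1"]),
    Item("C1", "Implement A1 part 1", ["A1"]),
    Item("C2", "Implement A1 part 2", ["A1"]),
    Item("C3", "Implement A2 and fix P1 and P2", ["A2", "P1", "P2"]),
    Item("B1", "Build 1", ["C1", "C2"]),
    Item("B2", "Build 2", ["B1", "C3"]),
]}


def trace_back(ident, items=ITEMS):
    """Return every item reachable from ident by following its links."""
    seen = set()
    stack = [ident]
    while stack:
        current = stack.pop()
        for linked in items[current].traces_to:
            if linked not in seen:
                seen.add(linked)
                stack.append(linked)
    return sorted(seen)


print(trace_back("B1"))  # ['A1', 'C1', 'C2', 'R1']
```

Asking the same question of B2 pulls in every R, Q, A, P and C in the example, which is exactly why coarse granularity makes the answer uninformative.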

Process and Relationships
Many newer development teams are afraid of all of these relationships. They just want files glued together with make files. That's how their developers work, after all. There's nothing wrong with that, as long as the developers design whatever they want and never deliver it to anyone.

All of these relationships are required (and perhaps a few more). The business decides what business it's in and develops products to meet the target market. Product requirements are broken down and turned into design activities. These are implemented as a series of changes, perhaps broken up into several major and minor releases. And so forth.

So where do you start? We can start from the inside out. Have developers work on and check in changes. A change combines new revisions of files together for a particular reason (or set of reasons if you allow less granular traceability). Start there. Make sure you're starting with change packages so that you can manage changes through your system. Otherwise you'll be managing file revisions through your system and that's a much more complex task (although maybe it's easier for a basic version control tool, like CVS or VSS, to support).

Then formalize your reasons. When a change is created, have a rule that it must have reasons. That is, it must reference one (or more) problem report or activity. Problem reports should always be reproducible problem descriptions. Activities may be coarse granularity to start out, but you'll quickly find the benefit of sub-dividing such activities into smaller tasks. Hence a work breakdown structure (WBS) is ideal for expressing activities.
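
The "must have reasons" rule is easy to express as a check that runs when a change is created. This is a minimal sketch, not any particular tool's API: `check_change_reasons` and the `item_kinds` lookup (item id mapped to its kind) are assumptions made for illustration.

```python
def check_change_reasons(change_id, reasons, item_kinds):
    """Return a list of rule violations; an empty list means the change is acceptable.

    item_kinds maps a known item id to its kind, e.g. {"P1": "problem"}.
    """
    errors = []
    if not reasons:
        errors.append(f"{change_id}: must reference at least one problem or activity")
    for ref in reasons:
        kind = item_kinds.get(ref)
        if kind is None:
            errors.append(f"{change_id}: unknown reason {ref}")
        elif kind not in ("problem", "activity"):
            errors.append(f"{change_id}: {ref} is a {kind}, not a problem or activity")
    return errors


kinds = {"A2": "activity", "P1": "problem", "P2": "problem", "R1": "requirement"}
print(check_change_reasons("C3", ["A2", "P1", "P2"], kinds))  # []
```

A change with no reasons, or one that points straight at a requirement instead of an activity, comes back with a message rather than being silently accepted.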

In code reviews, make sure that the code changes reflect all of the reasons specified, and only those reasons. Do not promote part of a change - don't even have that capability.

Promote entire changes. If the change is incomplete, complete it first or break it into several complete changes and revise the reasons.

Start with a baseline (possibly empty), and add changes to the baseline to produce a new build. Track the baseline and these changes against the build record. Use a previous build as the starting point for each new build. Every so often, usually around significant milestones, create a new baseline to work from.

There's not a lot of rocket science here. If you get this far you're doing well. You can rest here for a while, but to improve further, you'll want to go back and identify where your activities are coming from: product requirements and requests, whether or not these are written down. The next step is to write them down and use them as reasons for spawning an activity. Eventually, you'll find that you need a requirements tree, in parallel with the WBS, which sub-divides product requirements and makes it easier to map them onto the design. When you do map them, trace the mapping by storing the Rs against the As.
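
Returning to the build step described above, the bookkeeping amounts to a build record that names its baseline and the changes applied on top of it. The sketch below assumes a hypothetical `Build` record and `next_build` helper; a real SCM tool would store this in its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    name: str
    baseline: str             # baseline this build ultimately starts from
    changes: tuple[str, ...]  # every change applied on top of that baseline


def next_build(name, previous, new_changes):
    """Create a build record from a previous build plus newly promoted changes."""
    duplicates = set(previous.changes) & set(new_changes)
    if duplicates:
        raise ValueError(f"already in {previous.name}: {sorted(duplicates)}")
    return Build(name, previous.baseline, previous.changes + tuple(new_changes))


empty = Build("BASE0", "BASE0", ())          # a possibly empty starting baseline
b1 = next_build("B1", empty, ["C1", "C2"])
b2 = next_build("B2", b1, ["C3"])
print(b2.changes)  # ('C1', 'C2', 'C3')
```

Because each build starts from the previous one, the record for B2 answers "what is in this build?" without anyone having to reconstruct it from file revisions.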

The next major step is to look at your product requirements and make sure that there are test cases that can verify that the product meets the requirements. From there, track a set of test runs that run (and re-run) test cases against a series of builds until they all pass.

The initial simple example gives the relationships you need. All you have to do is get the granularity right and establish a working process that everyone will follow.

The Pay Back: Make it More Than Easy
Dead stop. Not everyone is going to follow the process. Legislate it! It still doesn't work.

All is not lost. You have to have the right incentives for people to follow process. Does this mean big bonuses and promotions? No. What people want is to make their jobs easier. This typically means that they want fewer meetings and fewer communication interruptions. They also want to know what they've accomplished.

This is where your CM tool comes into play. Start by showing them that if they use a change package, they don't have to check in each file one by one - just the change package. They don't have to do a delta/difference report on each file and glue them together for a review. Just do a delta report for the change. They don't have to type in a reason for each file revision. Just specify the reason(s) once against the change and check out each file against the change - and hopefully this happens by default, without any extra clicks. As a bonus, tell them that they can check the change in whenever they think it is done (i.e., once they have done their unit testing), and perhaps give them the option of promoting the change a level further to indicate that the build manager can now incorporate it into the build. That way they're not worried about breaking the build, and they can check in a number of changes before deciding which are ready for the build.

Once they've seen all of these benefits, tell them that you won't need to bug them in the middle of the night, because their change identifies its reasons, collects together all of the files and, ideally without any extra effort on the part of the developer, contains all of the context information that indicates which product and release the change is targeted to.

The SCM Tool Requirements
This is a successful plan, but you need the right tool. You need to track these objects and the relationships among them. You need an adequate database, and one that will allow you to change your process over time.

Some tools will give you a labeling scheme and let you fend for yourself, labeling different things in different ways so that the labels simulate different types of objects: change identification, baselines, builds, development streams, activities, etc. Other tools will have all of these concepts encapsulated as first order objects against which you can store the data you need.

Having the data is only one step. You want to be able to easily query and navigate it, especially the traceability links. If you can navigate the links quickly and easily, you'll be able to get the answers you need on the spot, without having to run to a third party to interpret the data. This is crucial for a CCB meeting, for example.

You need to be able to produce your reports: version description documents, testing coverage, etc. Your SCM tool should have a number of pre-canned reports, but more importantly, should let you easily define your own reports, because your process isn't going to come out of the same can. Your product manager will thank you that a report ready for the technical writers took a few seconds, rather than a few days or weeks, to produce.

Go further with your traceability. Ensure your SCM tool tracks a unique identifier for each build and can (possibly automatically) insert that identifier into your builds, so that you can query exactly what your customers and/or deployments have. You may be able to go one step further and configure your SCM tool to insert source code revision identifiers into the code automatically - without any need for developer compliance with a special standard.

Any SCM tool should also be able to trace any line of code back to the reason it was introduced. This should not require detective work. It should be quick and easy.

You'll find that traceability really defines the term integration when you look at the integration of your CM functions. If you have good integration, traceability navigation will be quick. You won't be constrained by which data is imported and/or exported from each function. If you have seamless integration, you'll be able to add to or change your traceability links on the fly, and you'll be able to navigate those as soon as they're defined.

Testing Traceability Matrix
A testing traceability matrix helps ensure that:

A test case exists for each requirement
A test case was run and successfully passed for each requirement

Conceptually, this is great. In practice, a matrix is overkill. Typically there are hundreds or thousands of test cases and one or a few that exercise a particular requirement (unless your requirements are very coarse). So you end up with a very, very sparse matrix. I'd much prefer a couple of simple checklists, with a list of test case numbers/names beside each requirement and vice versa.

In any event, there are a couple of prerequisites for this to work. One is sufficient granularity. If you have half a dozen requirements for implementing a new CM Tool, the matrix will be next to useless. You haven't identified your use cases sufficiently (unless you write a book for each requirement). And the test matrix won't tell you if you've covered your requirements. You'll get a large number of test cases against each requirement, with no indication as to whether or not they completely cover the requirement. Keep requirements simple so that the coverage of the corresponding test case(s) can be readily seen.

In turn, keep your test cases specific. Don't have a test case that says "test the user interface" or even one that says "test the file menu". It might be ok to have one for a particular menu button, or you may have a number to test the various significant options. Either way, coverage would be identified a lot more easily. If you have a single test case covering several options on a button, remember that the test case will be marked failed if any of the options fail. So it might be better to have specific test cases. This is most easily facilitated by a hierarchical test case definition, where you can decompose a test case into sub-cases if necessary, much like decomposing an activity into several tasks.
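
The hierarchical idea can be sketched as a tree in which a parent's result is rolled up from its sub-cases. The `Case` class and the menu test names below are hypothetical, chosen only to illustrate the roll-up rule described above (any failing option fails the parent).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Case:
    name: str
    passed: Optional[bool] = None  # leaf result; None means not yet run
    subcases: list["Case"] = field(default_factory=list)

    def status(self):
        """Roll results up: any failure fails the parent, any gap leaves it not run."""
        if not self.subcases:
            return {True: "passed", False: "failed", None: "not run"}[self.passed]
        results = [sub.status() for sub in self.subcases]
        if "failed" in results:
            return "failed"
        if "not run" in results:
            return "not run"
        return "passed"

    def failing_leaves(self):
        """Name the specific sub-cases that failed, so the parent result is actionable."""
        if not self.subcases:
            return [self.name] if self.passed is False else []
        return [name for sub in self.subcases for name in sub.failing_leaves()]


save = Case("File menu: Save", subcases=[
    Case("Save a new file", passed=True),
    Case("Save a read-only file", passed=False),
])
print(save.status())          # failed
print(save.failing_leaves())  # ['Save a read-only file']
```

The parent still reads as failed, but the decomposition tells you which option to look at.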

The second prerequisite is to have your test process feeding its data into your test run repository. Ideally, your test run repository will track a test case against either a specific build or a sequence of closely related builds. If the process doesn't feed the data properly or the repository doesn't track it properly, your traceability will be compromised.
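
With test runs recorded against builds, the checklist I prefer over a matrix falls out of a simple query. This is a sketch under assumed data shapes: `coverage_checklist`, the `tests_for` mapping and the run tuples are hypothetical, and R2 is an invented requirement added to show the "no test case" outcome.

```python
def coverage_checklist(requirements, tests_for, runs, build):
    """Report, per requirement, whether its test cases have passed on one build.

    requirements: list of requirement ids
    tests_for:    dict mapping requirement id -> list of test case ids
    runs:         list of (test case id, build id, passed) tuples, oldest first
    """
    latest = {}
    for test_id, run_build, passed in runs:
        if run_build == build:
            latest[test_id] = passed  # later runs overwrite earlier ones
    report = {}
    for req in requirements:
        tests = tests_for.get(req, [])
        if not tests:
            report[req] = "no test case"
        elif all(latest.get(test) is True for test in tests):
            report[req] = "verified"
        else:
            report[req] = "not verified"
    return report


runs = [("T1", "B1", False), ("T1", "B2", True)]  # TR1 and TR2 from the example
print(coverage_checklist(["R1", "R2"], {"R1": ["T1"]}, runs, "B2"))
# {'R1': 'verified', 'R2': 'no test case'}
```

Run against B1 instead, R1 comes back as not verified, which matches TR1 in the earlier example.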

I would recommend that you clearly segregate your test cases into specific areas (see my May, 2005 CM Journal article). In particular, there are different levels of requirements: customer requirements, product requirements, and software/system requirements that need to be dealt with separately. Customer requirements are tested to verify that you meet the needs of the customer. Product requirements are tested to ensure that your product meets its specification. Software/system requirements are tested to ensure that your design was properly implemented.

Configuring Your Process
Your SCM tool must allow you a good level of customization so that you may configure your state and task-based workflows. Your process is going to evolve. If you can't embody it in your CM tool, you'll never reach levels 4 and 5 of your capability maturity model. You need to be able to change it while maintaining control and improving traceability.

Process configuration means configuration of your object workflows, your data schema, your roles, your user interface, your rules and triggers, your data access permissions. If your SCM tool allows incremental changes to these without any end-user down time, all the better. You can continuously improve your process. You can define your process loosely and then tighten up the rules. Or you can over-tighten the rules and then relax them.

Avoid an SCM tool that requires you to stand down while you script a new process.

Hard Rules versus Policy versus Convention
I like to look at process at various levels: rules, policy and convention. All are needed.

Let me give you an example of a hard rule versus a policy. Suppose the CM tool forces you to use a change package to check-out/check-in files. This is a hard rule. You won't get files into the system without a level of traceability. But if change packages are just a policy (e.g., they are optional database lists that link revisions together), be careful. You're likely to find some revisions not included in a change package at some point or other, and the basic assumptions underlying all of your queries will no longer hold, so the queries may yield incorrect results.

I like the following loose definitions:

Hard Rule: Implemented by the CM tool or by the instantiation of the CM tool (e.g. the tool is configured from the start to support CM this way), as in the above change package example.

Policy: Configured within the CM tool. For example, a change package may link to one or more problems. Perhaps you want a policy that forces more granularity, thus allowing you to link to only a single problem. You might change the schema so that the problems field goes from a list value to a single value.

Convention: The CM tool allows multiple options and will support a convention that you choose to use. A common case of this is a naming convention for labels which correspond to baselines. A more critical one might be that baseline labels are not modified or applied after the baseline has been defined. The product will continue to tolerate abuses of these conventions and will still work. The users may have more difficulty in identifying their data or, in the worst case, definitions may change (e.g., what is a baseline if it can change?). Convention turns into policy when the tool is re-configured to take advantage of the convention. For example, a "display baselines" button may show all labels which correspond to the Baseline naming convention. If this becomes a key step in your process, you had better start to validate the naming of baselines.

In general, process control needs to be made up of a combination of these three items. There are things we know and that should be enforced as hard rules. There are policies that will change over time as our process changes. And there are conventions that may or may not become policy. When we introduce them we want to start working with them before reconfiguring things to enforce them.

Process Control: Basic Functionality
Process control has many different connotations. Think of an assembly line and then think of software development. The link between these, as far as process control is concerned, is the improvement of quality. Process control allows you to improve quality by helping you to automate, support and enforce process.

Process control can be viewed both within a function and between functions. For example, a process state diagram may be enforced for problem workflow. At the same time, checking in a source change could trigger a change in the problem state.

Process control can take on many forms:

Enforcing a specific workflow for an object
Restricting operations based on a set of rules
Triggering events on data state transitions
Protecting data based on role
Configuring the user interface based on role

These are key elements which your SCM tool or tool suite should support. If they are not there, you're in for a lot of glue.
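
To make a few of these forms concrete, here is a minimal sketch of an enforced problem workflow, a role restriction and a check-in trigger. Everything in it is assumed for illustration: the states, the roles, the `Problem` class and the `on_change_checked_in` hook are hypothetical, not the behavior of any specific tool.

```python
# Allowed state transitions for a problem report, and the roles that may make them.
PROBLEM_WORKFLOW = {
    ("open", "assigned"): {"manager"},
    ("assigned", "fixed"): {"developer"},
    ("fixed", "verified"): {"tester"},
    ("fixed", "assigned"): {"tester"},  # verification failed, send it back
}


class Problem:
    def __init__(self, ident):
        self.ident = ident
        self.state = "open"

    def transition(self, new_state, role):
        """Enforce the workflow: reject unknown transitions and unauthorized roles."""
        allowed_roles = PROBLEM_WORKFLOW.get((self.state, new_state))
        if allowed_roles is None:
            raise ValueError(f"{self.ident}: {self.state} -> {new_state} is not allowed")
        if role not in allowed_roles:
            raise PermissionError(f"{self.ident}: role {role} may not set {new_state}")
        self.state = new_state


def on_change_checked_in(change_reasons, problems):
    """Trigger: checking in a change moves each assigned problem it addresses to fixed."""
    for ref in change_reasons:
        problem = problems.get(ref)
        if problem is not None and problem.state == "assigned":
            problem.transition("fixed", role="developer")


p1 = Problem("P1")
p1.transition("assigned", role="manager")
on_change_checked_in(["A2", "P1"], {"P1": p1})
print(p1.state)  # fixed
```

The point of keeping the workflow in a table rather than in code paths is that the process can be tightened or relaxed by editing data, which is the kind of configurability argued for earlier.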

Talking to the Rest of the World
Process control is going to be easy or difficult to implement, depending on your tool suite. If your suite of tools all work together to a common architecture (e.g., Unix commands), you'll be able to take the output of one tool and feed it directly into another. Even better, if you have a single tool integrating a set of functions with an adequate process engine and a single repository (e.g., CM+, MKS), process control among these functions will be a matter of focusing on the process, and not the implementation.

If you have a set of tools that are glued together, even if they're from the same vendor, look closely at them. Do they have a natural means to talk to one another? Is it different for each component tool? Do they share a single repository? If the answer to any of these is no, you'll want to be careful. Even a perfectly pre-packaged configuration may fall on its face when you try to change the process to meet your own corporate requirements.

Even a single integrated tool will have to talk to other external tools at some point - to pass financial data, inventory info, sales and customer info back and forth. You may be able to extend your tool suite to include some of these functions, but eventually, you have to talk to some other tool. The key will be to have tools that communicate easily. If they have command line capabilities, they usually listen well. If they can issue commands at the operating system level (e.g., from within a trigger), they usually talk well. A good communicator does both. A great communicator also speaks the lingo of the major integration standards (e.g., Eclipse integration).

Summary
We have covered a lot of ground: traceability, process, process control. We've also looked at areas where SCM technology should be helping you. If you have a well-oiled process, use this article to review it and identify the areas that may need a bit of work. If you don't have a good process in place, start from the inside out as described earlier: start with changes and build your process around them.

There are a lot of key areas I haven't addressed here, though I have covered some of them in earlier articles. I will continue in upcoming articles, with areas such as applying change management (and even traceability) to requirements and the pervasive theme of streaming your development, from the R's through to the TR's, and then into release streams. Somehow I feel that if you start down the road in the right direction, you'll start to see your way through to the desired end point.

About The Author

Joe Farah

President and CEO of Neuma Technology, Joe Farah is a regular contributor to the CM Journal. Prior to cofounding Neuma in 1990, he was a director of software at Mitel. In the 1970s, Joe developed the Program Library System (PLS), still heavily used by Nortel (Bell-Northern Research), where he worked at the time. He's been a software developer since the late 1960s.