In his CM: the Next Generation series, Joe Farah gives us a glimpse into what CM experts will need to tackle and master, based on industry trends and future technology challenges.
Unix, with its SCCS capability, practically made version control a household name - at least for Unix software developers. On the VAX, it was CMS. Then came RCS and CVS. Some of these set out to be version control (VC) tools and did very well because they addressed the VC problem properly. They also tried to do more, but did not do it quite as well. Why? Primarily because VC typically deals with a single object at a time, while configuration management (CM) deals with an entire product or set of products. At its core, version control is about saving multiple revisions of a file so that any revision can be retrieved at a future date. The version control function includes:
- Recreating an old version of a file
- Allowing creation of multiple versions and parallel versions
- Identifying differences between different versions of a file
- Merging file versions that have diverged from each other
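As a rough illustration of the first and third functions in this list, here is a minimal Python sketch. The `FileHistory` class and its method names are hypothetical and are not taken from SCCS, RCS, CVS or any other tool; it keeps whole copies of each revision in memory and leaves out parallel versions and merging entirely.

```python
import difflib


class FileHistory:
    """Hypothetical in-memory store of every saved revision of one file."""

    def __init__(self):
        self._revisions = []  # revision n is stored at index n - 1

    def check_in(self, text):
        """Save a new revision and return its 1-based revision number."""
        self._revisions.append(text)
        return len(self._revisions)

    def retrieve(self, revision):
        """Recreate the file exactly as it was at the given revision."""
        return self._revisions[revision - 1]

    def diff(self, old, new):
        """Return a unified diff between two revisions as a list of lines."""
        return list(difflib.unified_diff(
            self.retrieve(old).splitlines(),
            self.retrieve(new).splitlines(),
            fromfile=f"r{old}", tofile=f"r{new}", lineterm=""))


history = FileHistory()
r1 = history.check_in("alpha\nbeta\n")
r2 = history.check_in("alpha\ngamma\n")
print(history.retrieve(r1))             # the old version, recreated
print("\n".join(history.diff(r1, r2)))  # what changed between r1 and r2
```

Real tools typically store deltas rather than full copies, but the retrieve and compare operations they expose are the same in spirit.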
Files are generally bits and bytes. However, when a file is a directory, it has a more structured content, at least as far as the version control and CM functions are concerned. A directory contains a set of files and possibly other directories. In fact, it is the contents of the directory that defines it, for the most part. And just as files can change as they are edited, so too do directories change as their contents change. The change could be subtle - one of the member files was edited. Or it could be more visible - a new file was added or an existing one renamed or removed. Just as files need version control, so too do directories. So the version control function might also include:
- Defining the contents of a directory as a specific version of the directory
- Allowing parallel evolution of the directory, such as for a supported product versus the next release of the product
- Identifying differences between different versions of a directory
- Providing a means of modifying the directory contents
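One way to picture a directory version is as a mapping from member names to the file revision each member has in that version. The sketch below assumes that representation (it is an illustration, not how any particular tool stores directories) and shows how two directory versions could be compared. The file names and revision numbers are made up.

```python
def diff_directories(old, new):
    """Compare two directory versions, each a dict of member name -> file revision."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # A member present in both versions has changed if its revision differs.
    changed = sorted(name for name in set(old) & set(new) if old[name] != new[name])
    return {"added": added, "removed": removed, "changed": changed}


# Two hypothetical versions of the same directory.
src_v1 = {"main.c": 3, "util.c": 1, "README": 2}
src_v2 = {"main.c": 4, "util.c": 1, "io.c": 1}

print(diff_directories(src_v1, src_v2))
```

Here the subtle change (a member file was edited) and the more visible changes (a file added, a file removed) both show up as differences between the two directory versions. A rename would appear as one removal plus one addition unless the tool tracks it explicitly.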
The directory object may be a file system object, or a logical version control object whose definition is stored in a file or a repository. The control portion of version control is often extended to mean change control. Change control defines:
- Who may check out a file
- Under what conditions
- When and how to branch
- When and how to merge
- How to add a comment for a new revision of a file
Change control, at the version control level, deals with controlling who may make changes to a file, and how its version tree may evolve. A version control tool may do a good job of change control for a given file, but there is a bigger picture that needs to be tied together, and because data is generally stored at the file level, that bigger picture is harder to evaluate. Still, add some basic reporting tools, such as listing versions of a file, listing directory revisions recursively, etc., and you have a fairly full set of version control functionality. Give me a branch identifier and a revision identifier within each branch, and I'm happy. I can identify any version.
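The idea that a branch identifier plus a revision identifier pins down any version, together with file-level change control, can be sketched as follows. `VersionId` and `ControlledFile` are hypothetical names, and the two rules enforced here (only authorized users may check in, and every revision needs a comment) are assumed examples of a policy. The sketch does not record where a branch forked from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionId:
    """Identifies one version of a file: a branch plus a revision on that branch."""
    branch: str
    revision: int


class ControlledFile:
    """Hypothetical file under change control; not modeled on any real tool."""

    def __init__(self, authorized_users):
        self._authorized = set(authorized_users)
        self._versions = {}  # VersionId -> (text, comment)
        self._latest = {}    # branch name -> highest revision number

    def check_in(self, user, branch, text, comment):
        """Record a new revision on a branch, enforcing two simple rules."""
        if user not in self._authorized:
            raise PermissionError(f"{user} may not change this file")
        if not comment.strip():
            raise ValueError("a comment is required for every new revision")
        revision = self._latest.get(branch, 0) + 1
        version = VersionId(branch, revision)
        self._versions[version] = (text, comment)
        self._latest[branch] = revision
        return version

    def retrieve(self, version):
        """Return the text stored for the given branch and revision."""
        return self._versions[version][0]

    def history(self, branch):
        """List (revision, comment) pairs for one branch, oldest first."""
        return [(v.revision, self._versions[v][1])
                for v in sorted(self._versions, key=lambda v: v.revision)
                if v.branch == branch]


f = ControlledFile({"ann"})
v1 = f.check_in("ann", "main", "int x;\n", "initial version")
v2 = f.check_in("ann", "main", "int x = 0;\n", "initialize x")
p1 = f.check_in("ann", "rel-1", "int x = 1;\n", "patch for release 1")
print(f.history("main"))
```

Note that everything here lives inside a single file's history, which is exactly the limitation discussed above: nothing ties `v2` to the revisions of other files that were made for the same reason.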
A valid CM system must deal with the four basics: identification, management/control, status accounting, and audit. Many use this as the definition of CM, or at least as its cornerstone. Version control is an important part of configuration management. In fact, first generation CM systems were just version control systems with a bunch of scripting around them. I remember doing the scripting on a project in the mid '70s and again in the early '80s - as a stop-gap measure until a real CM system could be put in place. The goal was primarily to capture information from the developers in such a way that a nightly build could be successfully accomplished: lots of glue, lots of short cuts, but it bought the time necessary to put a valid CM tool in place.
A good version control system is just that: a version control system. The problem comes when one tries to use it as a configuration management tool. It doesn't fit the bill. It does help with identification: because the version control system allows you to identify revisions of an object, it facilitates identification of a baseline as a collection of (generally compatible) object revisions. But not all version control functionality carries over to CM, because VC is applied to individual objects while configuration management deals with the entire product picture. So although a VC tool might attempt some change control, you quickly find that change control is not naturally done on a file revision basis. Rather, it is better done on a change basis. Similarly, baselines deal with a collection of revisions.
Many are frightened by the complexity of the CM problem. Try to build a CM tool that works and you'll see what I mean. I've designed compilers, database systems and numerous other middleware and tools. None comes close to the complexity of a CM tool. Compilers have very straightforward specifications, and database users basically all have the same set of expectations, give or take a bit. You can't really say the same about CM tools. Just look at a few (ClearCase, MKS, CM+, AccuRev) and you'll see widely different perspectives on the problem. Maybe we can propose some specifications and expectations. An ideal CM system will:
- Automate all of the CM tasks except the high level directives
- Allow me to manage Changes rather than file revisions
- Let me easily specify a context for my queries and for my work
- Be easy to use with little training
- Perform reliably and responsively
- Let my distributed team work as if it were all under one roof
- Provide the traceability and reporting necessary to make timely product decisions
- Support the processes and data that my organization decides are required
- Make each of my team members' tasks easier rather than more difficult
- Be easy to roll out, with neither high risk nor high costs
It is a partial list, but maybe it's not a bad start. There are some challenging expectations there. The key to a successful CM solution is to complete this list and then find a solution that will deliver. Twenty years ago the only way to deliver was to hire a large consulting firm that each team member could turn to whenever a task had to be performed. Even then, timeliness left much to be desired, not to mention cost. Today's CM solutions have evolved considerably. Most second generation CM solutions can deliver on a handful of these expectations. Third generation tools will do even better. In the end, what you want to be left with is customization to fill in the gaps: not customization of the tool itself, but customization of your processes and user interface to match your organizational requirements. The leading edge of the CM industry is rapidly approaching this capability.
Taking Short Cuts Doesn't Work, Simplicity Does
So what do you look for, specifically, in the technology of a particular solution? First of all, the solution must hold the view that Configuration Management requires a broader data perspective of a product, or even a collection of products. This perspective should include all of the data and processes necessary to successfully manage the product. The scope of management may start with the development team and over time grow to the entire product team. The next thing to look at is how the solution expands over time to meet the widening scope. There are some solutions out there that are barely tapping their technology's potential, while others have been maxed out for years. It makes sense to look at the history of the solution to understand not only how it has grown, but how it can grow. Then remember that the road from version control to configuration management is complex, but it doesn't help to take short cuts. So here are a few other guidelines.
- Make sure that your data is centralized. What I mean by this is that it's not distributed across files, such as in your version control files. It's fine to have data there, but it's not going to be of much use if I have to open and query 20,000 files to perform a simple query. Your CM data must reside in some central database.
- Version control systems are limited to performing change control at the file level. This is very unnatural. Change control needs to be done at the change level. Otherwise I have to implement a way of linking all of my file revisions to a common cause so that I can perform change-based operations such as ensuring that the entire change is checked in, creating a delta/difference report for the change, and ensuring that all revisions get promoted for a change. Make sure that the change package technology is central to the CM solution. Adding change packages into a file-based solution doesn't cut it. An automatic baseline generation tool that works for file revisions will not adjust easily to one that works on changes. A file promotion solution has numerous problems that cry out for a change-based promotion solution. Change packages should be the central point of traceability.
- Change management involves change requests (CRs) as well as changes. These are not the same. A CR is basically a problem report or a feature request. It is requesting a change to the product. However, development teams need the flexibility of implementing CRs in the most beneficial fashion. For a complex CR, typically an initial change will be made to allow easier transition to the new functionality, at either or both of the Product and Design levels. Perhaps an initial change goes in to ensure that previously unused data fields are initialized. Or perhaps existing APIs are modified so that the development may proceed more easily. CRs will often involve a series of changes. Sometimes a change may address more than one CR. If you don't treat these separately, you'll find that you'll be shoe-horning your process into an inadequate model.
- Workflow management must be central to the CM solution. It's not sufficient to have a nice modeling tool if the CM solution has to be retrofitted to support the model. It's perhaps even worse to have one type of workflow management for problem/issue tracking, another for change management and another for build management.
- Look for a seamlessly integrated set of applications. You want to avoid the costs of training on different tools, of having vendors point fingers at one another, of coordinating upgrades across vendors. You want to avoid having to do the integration glue yourself. You don't want to have to deal with multiple repositories for different parts of your solution - it will inevitably lead to inconsistent data and limited traceability and reporting. And you don't want to have multiple administration and customization teams, one for each tool.
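To make the first three guidelines more concrete, here is a small sketch using Python's standard `sqlite3` module. It assumes a deliberately tiny schema in one central database: change requests, change packages, a link table between them (because one CR may need several changes and one change may address several CRs), and file revisions that each belong to a change package. All table names, column names and sample rows are hypothetical and are not drawn from any of the products named above.

```python
import sqlite3

# One central repository instead of data scattered across version files.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE change_request (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE change_package (id TEXT PRIMARY KEY, summary TEXT);
CREATE TABLE change_cr (change_id TEXT, cr_id TEXT);
CREATE TABLE revision (path TEXT, branch TEXT, rev INTEGER,
                       change_id TEXT, checked_in INTEGER);
""")

db.execute("INSERT INTO change_request VALUES ('CR-7', 'Add export feature')")
db.executemany("INSERT INTO change_package VALUES (?, ?)", [
    ("C-101", "Initialize previously unused data fields"),
    ("C-102", "Add export API"),
])
# Both changes contribute to the same change request.
db.executemany("INSERT INTO change_cr VALUES (?, ?)", [
    ("C-101", "CR-7"),
    ("C-102", "CR-7"),
])
db.executemany("INSERT INTO revision VALUES (?, ?, ?, ?, ?)", [
    ("db/init.c", "main", 5, "C-101", 1),
    ("api/export.h", "main", 2, "C-102", 1),
    ("api/export.c", "main", 1, "C-102", 0),  # still checked out
])


def changes_for_cr(cr_id):
    """Return the ids of all change packages that implement a change request."""
    rows = db.execute(
        "SELECT change_id FROM change_cr WHERE cr_id = ? ORDER BY change_id",
        (cr_id,))
    return [row[0] for row in rows]


def fully_checked_in(change_id):
    """A change can be promoted only if none of its revisions is still pending."""
    (pending,) = db.execute(
        "SELECT COUNT(*) FROM revision WHERE change_id = ? AND checked_in = 0",
        (change_id,)).fetchone()
    return pending == 0


for change_id in changes_for_cr("CR-7"):
    print(change_id, "ready to promote:", fully_checked_in(change_id))
```

Each question is a single query against the central store rather than a scan of thousands of version files, and the unit being checked or promoted is the change package, not the individual file revision.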
Once you find the ideal solution, have another look. What short cuts does it take? Does it use administratively complex technology to deliver benefits? How reliable is it? A solution that is simple can often deliver much more than a complex one.
Go bigger. If you've used a VC tool and moved to a CM solution, you know the benefits (and hopefully your solution wasn't a painful one). You moved from a developer-centric tool to one that was development team-centric. The next step is to go to one that is product-centric. Your product management solution should bring in all team players: the product managers, key customer representatives, QA members, the development team, the verification team, the technical support and customer management teams, project managers, and executives. The next generation of CM/ALM tools will bring together the entire product team. That in turn will require an easier-to-use solution, and one that is even more reliable and can scale easily. This is not a pie-in-the-sky dream. Technology is moving ahead rapidly to provide these capabilities. Make sure the solution you move to is not on a dead-end track. If it is, find another train on a track that reaches as far as you can see.