Lean Development Principles for Branching and Merging

Summary:
By reworking lean principles for the branching and merging arena, we're able to create automated builds and unit tests to increase effectiveness and improve quality in software configuration management. Individual developers and teams alike can benefit from this process-improving strategy.

With lean principles, we can:

  1. Eliminate Waste - Eliminate avoidable merge-propagation (multiple maintenance), duplication (long-lived variant branches), and stale code in infrequently synchronized workspaces (partially completed work). Detect these sorts of situations using some judicious metrics (discussed further below).
  2. Build Quality In - Maintain codeline integrity with (preferably automated) unit & integration tests and a Codeline Policy to establish a set of invariant conditions that all checkins/commits to the codeline must preserve (e.g., running and passing all the tests :-)
  3. Amplify Learning - Facilitate frequent feedback via frequent/continuous integration and workspace update.
  4. Defer Commitment (Decide as late as possible) -- Branch as late as possible! Create a label to take a "snapshot" of where you MIGHT have to branch from, but don't actually create the branch until parallelism is needed. See the example below on "branch on conflict".
  5. Deliver Fast (Deliver as fast as possible) -- complete and commit change-tasks and short-lived branches (such as task-branches, private-branches, and release-prep branches) as early as possible and merge back to the mainline.
  6. Empower the Team (Decide as low as possible) -- let developers reconcile merges and commit their own changes (as opposed to some "dedicated integrator/builder"). Educate and train developers in patterns and their tools such that they are able to select the most appropriate pattern and apply it.
  7. Optimize the "Whole" -- when/if branches are created, use the Mainline pattern to maintain a "leaner" and more manageable branching structure. Use an appropriate balance of methods, some of which conflict.

Much of the above is fairly obvious, and yet there are some implications and advice that we can perhaps tease out.

Metrics for Waste and Whole Process Optimization
Principles such as Eliminate Waste, Deliver Fast and Optimize the Whole all benefit from the use of appropriate metrics to measure and track what is happening and allow feedback on the process (Amplify Learning) to be implemented.

Useful metrics (this is usually easier in those tools which track such information centrally) include:

    • Changes done in Task Branches vs Changes in the Mainline - are small tasks done directly in the mainline? How many changes does it take to implement tasks in different branches - is there a pattern?
    • Changes not yet merged from Task Branches to Mainline (WIP). Ensure this doesn't get too high.
    • Age for Changes not yet merged (how stale)
    • Files checked out in Workspaces (by age) - well worth keeping an eye on. It's amazing how many old workspaces can lurk around, owned by people who have long since left the company. Is your SCM tool linked to the HR system to manage company leavers (i.e., is it included in the process that revokes or removes user access)?
    • Number of conflicts - either for Task Branches or for Workspace updates - indicates which to use, or perhaps even some repartitioning of the code to reduce conflicts.
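As a rough illustration of the WIP and staleness metrics above, the sketch below counts unmerged changes per task branch and reports the age of the oldest one. The `Change` record, its field names, and the age threshold are hypothetical; a real script would populate these records from whatever change history your SCM tool exposes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Change:
    """One checked-in change. All field names here are hypothetical."""
    branch: str
    submitted: date
    merged_to_mainline: bool


def wip_report(changes, today):
    """Return {branch: (unmerged_count, age_in_days_of_oldest_unmerged)}."""
    report = {}
    for change in changes:
        if change.merged_to_mainline:
            continue  # already delivered, so not work in progress
        age = (today - change.submitted).days
        count, oldest = report.get(change.branch, (0, 0))
        report[change.branch] = (count + 1, max(oldest, age))
    return report


def stale_branches(report, max_age_days):
    """Branches whose oldest unmerged change is older than the threshold."""
    return sorted(b for b, (_, age) in report.items() if age > max_age_days)
```

Tracking both numbers over time, rather than as a one-off snapshot, is what makes it possible to see whether WIP is creeping up.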

Collection of all metrics needs to be automated, and should preferably require no extra work on the part of the developers creating branches or checking in changes.

Some people use automated scripts driven by a configuration file that lists, for example, the current set of active branches to be processed (mined for data). That file then has to be maintained by hand. Consider instead whether a repository structure or branch naming convention can be made regular enough for scripts to deduce the presence of new branches from their location in the repository. For example, in tools such as Subversion, Perforce or Team Foundation Server, branches exist in the path space of the repository - make sure their location and the naming standard used are regular enough to be automatable.
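A minimal sketch of that idea, assuming a purely hypothetical layout of `//depot/<project>/task/<name>/...` and `//depot/<project>/release/<name>/...`: given a listing of repository paths (obtained however your tool provides it), a single regular expression is enough to find every branch without a hand-maintained list.

```python
import re

# Hypothetical convention: //depot/<project>/<kind>/<name>/...
# where <kind> is "task" or "release". Adjust to your own standard.
BRANCH_PATTERN = re.compile(
    r"^//depot/(?P<project>[^/]+)/(?P<kind>task|release)/(?P<name>[^/]+)(?:/|$)"
)


def discover_branches(paths):
    """Return sorted, de-duplicated (project, kind, name) tuples."""
    found = set()
    for path in paths:
        match = BRANCH_PATTERN.match(path)
        if match:
            found.add(
                (match.group("project"), match.group("kind"), match.group("name"))
            )
    return sorted(found)
```

Paths that do not fit the convention are simply ignored, which is also a cheap way to spot branches created in the wrong place: anything the metrics scripts never report on is worth a second look.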

Task Branches - Help or Hindrance?
Task branches are aimed at allowing changes to be made independently from a mainline and merged back as one complete unit. They permit frequent check-ins which may contain value, yet which are perhaps not fully tested and thus risk breaking the mainline.

There are many organizations, particularly agile development shops, which do not like to use Task Branches at all due to the perceived overhead of merging changes, and the dangers of delaying changes. They prefer to make all changes on the mainline and to deal with conflicts within the workspace.

As we mentioned in our original Branching article [1], there is in fact no difference between resolving a conflict which results from a Private Workspace Update, and the conflict which results from merging between two branches. In the same article we suggested the possibility of branching on conflict (the equivalent of the Private Checkpoint pattern) to ensure that the original workspace version is saved in the repository. This was specifically to address the requirements of an agile project.
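The branch-on-conflict idea can be sketched as a small decision routine. This is only an illustration: the three callables are hypothetical stand-ins for whatever commands your SCM tool provides to preview an update, check the workspace in to a new private branch, and bring the workspace up to date.

```python
def update_workspace(has_conflict, checkpoint_to_branch, sync_with_mainline):
    """Update a workspace, branching only if the update would conflict.

    Returns the name of the checkpoint branch, or None when no branch
    was needed. All three arguments are hypothetical callables.
    """
    branch = None
    if has_conflict():
        # Save the pre-merge workspace state in the repository first,
        # so the original version survives whatever the merge does.
        branch = checkpoint_to_branch()
    sync_with_mainline()
    return branch
```

In the common case where there is no conflict, no branch is created at all, which keeps the overhead close to that of working directly on the mainline.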

Automated Merging Between Branches
Automated merging between branches may seem rather dangerous to many people, and yet it is well worth considering.

To be able to do it at all will require good development practices and tools:

    • Reliable automatic merge tool
    • Developers checking in consistent change sets
    • Automated builds and unit tests to provide suitable quality guarantees

Of course it is not possible to automate 100% of merges, and there will always be the need for developers to get involved to resolve conflicts or failed tests.

The more you keep your change sets small and consistent, and merge each one individually, not in one big lump, the more likely it is that automated merge will be of benefit. In addition, the cleaner the code under development, the easier life will be too!
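To make the "one change set at a time" point concrete, here is a minimal sketch of an automated merge loop. It is not the implementation of any particular tool: `try_merge`, `run_tests`, `commit` and `revert` are hypothetical callables that you would wire up to your own SCM and build commands.

```python
from dataclasses import dataclass, field


@dataclass
class MergeResult:
    merged: list = field(default_factory=list)  # change ids propagated
    stopped_at: object = None                   # change id needing a human
    reason: str = ""                            # "conflict" or "tests failed"


def auto_merge(change_ids, try_merge, run_tests, commit, revert):
    """Propagate change sets individually, in order, stopping at trouble.

    try_merge(change_id) -> bool   True if the merge applied cleanly
    run_tests() -> bool            True if build and unit tests pass
    commit(change_id)              check in the merged result
    revert(change_id)              discard the uncommitted merge
    """
    result = MergeResult()
    for change_id in change_ids:
        if not try_merge(change_id):
            result.stopped_at, result.reason = change_id, "conflict"
            break
        if not run_tests():
            revert(change_id)
            result.stopped_at, result.reason = change_id, "tests failed"
            break
        commit(change_id)
        result.merged.append(change_id)
    return result
```

Stopping at the first problem, rather than skipping it, keeps later change sets from being merged out of order; the developer who owns the failing change resolves it and the loop resumes from there.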

In the worst-case scenario a subtle bug might be introduced. Some organizations are so fearful of this that they ban automated merging outright, but bugs happen often enough with ordinary developer changes anyway, so the overall increase in productivity from automated merging (in spite of occasional introduced issues) is likely to be significant (consider appropriate metrics to track this).

Chris Berarducci describes the use of automated merging, particularly in maintaining localized versions of a product. While the original article was written in 2003, the approach is still in use within Palm and of major benefit.


Critical to the success of "Merge as you go" are:

    • The SCM Tool's merge facilities
    • Established roles and responsibilities
    • A single automated daemon
    • Conflicts can be handled on the engineer's CPU

Development engineers like it:

    • They own the configuration and merging; they do not need to consult with another group or person to turn on or off the auto merge daemon.
    • They are relieved from thinking about merges unless there is a conflict or configuration change needed. When it is on and configured, changes checked into one tree will be migrated to the destination tree(s) in a timely and reliable manner.
    • It's easy to set up and the configuration files are tracked in the SCM tool.
    • It's easy to turn off and on

More information is provided in the submission comments:

  • Easier for QA and Program Management to generate release notes, and for other non-technical users to understand changes and their history

While the system as defined works very well, Palm are looking at managing conflicts via a centralized web interface - this would allow conflicts to be resolved from pretty much anywhere and by anyone and would reduce the overhead.

There is some similarity between this situation and the centralized code review system which has been implemented by Guido van Rossum at Google [3].

Conclusion
Branching and merging are key practices in Software Configuration Management, yet many organizations do not get the best value out of them. Applying Lean Principles can make a significant difference to your effectiveness.

The principles, and indeed the mindset, are key factors - if you are aware and looking for possibilities to improve your process you will find them (and most of the time they will not be difficult to implement). If you don't look, and just rely on your tool's support or on individual developers or teams to address this area, you are missing out big time.

We are keen to learn of more examples of good practice - let us know and continue to share.

References

        • [3] Google Code Review Process, Guido van Rossum

        About the author


        StickyMinds is a TechWell community.
