Cost Effective: Pre vs Post Comparison Regression Testing

Summary:

Comparing API responses between production and test code versions is a very effective regression-testing technique, producing the required results release over release. Like most, if not all, tech solutions, however, it must evolve to address changing needs; the economics principle of the 'Law of Diminishing Marginal Utility' also applies to software. This article describes two cost-saving refinements: replacing parallel setups with a single setup, and replacing production data with consistent, obfuscated test data.

Overview

The approach of testing via comparison of multiple API responses between production and test code versions is very effective and produces the required results, release over release. Improvements and changes, however, are needed to address changing needs. This is true for most, if not all, tech solutions; the economics principle of the 'Law of Diminishing Marginal Utility' also applies to software. A tech solution that excited stakeholders when first introduced can become stale very quickly, and a revamp or a new solution is needed to match evolving expectations. On the bright side, this results in an ever-improving world around us.

To address the leakage of issues in critical business workflows caused by tightly coupled and legacy code, parallel setups are created with the production and test versions of the code, and the database on each is restored from the same production backup. Each workflow step is completed on both setups in parallel, and all reports are compared at workflow milestones. Because the data is identical on every setup, any differences are a direct result of code changes. The differences are then verified to classify them as expected or unexpected. [Further Read: An Automated Approach to Regression Testing]
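The core comparison step can be sketched as a recursive diff of two JSON responses, reporting every path where the production and test setups disagree. This is a minimal illustration; the field names and payloads are hypothetical, not taken from the article's system.

```python
# Minimal sketch: recursively diff two JSON-like API responses and report
# the paths where they differ. Payload structure is hypothetical.

def diff_json(prod, test, path=""):
    """Return a list of (path, prod_value, test_value) differences."""
    diffs = []
    if isinstance(prod, dict) and isinstance(test, dict):
        for key in sorted(set(prod) | set(test)):
            diffs += diff_json(prod.get(key), test.get(key), f"{path}/{key}")
    elif isinstance(prod, list) and isinstance(test, list):
        for i, (p, t) in enumerate(zip(prod, test)):
            diffs += diff_json(p, t, f"{path}[{i}]")
        if len(prod) != len(test):
            diffs.append((path, f"len={len(prod)}", f"len={len(test)}"))
    elif prod != test:
        diffs.append((path, prod, test))
    return diffs

# Example: the same milestone report fetched from both setups.
prod_resp = {"order": {"id": 42, "total": 99.5, "status": "SHIPPED"}}
test_resp = {"order": {"id": 42, "total": 101.0, "status": "SHIPPED"}}
for path, old, new in diff_json(prod_resp, test_resp):
    print(f"{path}: production={old!r} test={new!r}")
```

Because the data on both setups is identical, every path this diff reports maps directly to a code change awaiting classification as expected or unexpected.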
This approach raises two challenges:

  • Maintaining parallel setups
  • Unavailability of production data for testing

This solution is very effective at finding issues in a critical flow that are missed in feature testing, where only the specific result of an action is verified. At the same time, it is a costly solution: the hardware and license costs for testing are multiplied by the need for parallel setups. Another constraint is the use of a production data backup. The confidence achieved by using production data is unparalleled, but it comes with problems of scale: as the number of clients increases, testing with one complex client's dataset is no longer enough. Clients are often not comfortable sharing data for testing, or cannot do so due to organizational policy.

Using Single Setup Instead of Parallel Setups

Using multiple setups offers the advantage of comparing multiple versions, though in practice the testing generally compares production with the version under test. The initial thought was to use one setup for multiple runs, one per version, saving the responses and comparing the results afterwards. That is, prepare the setup with the production version, run the workflow steps, and save the API responses; then refresh the setup with the version under test, execute the workflow steps again, and compare the API responses at each milestone. This option fell flat because it doubled the testing time, making release deadlines very difficult to meet.
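The two-pass idea above can be sketched as a small helper that runs the workflow milestones for one code version and persists each response to disk, so the second run can be compared against the first. The `call_api` stub, milestone names, and file layout are all assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

def call_api(milestone, version):
    # Placeholder for the real HTTP call the test suite would make.
    return {"milestone": milestone, "version": version, "row_count": 10}

def run_and_save(version, milestones, out_dir):
    """Execute workflow milestones for one code version, saving each response."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for m in milestones:
        resp = call_api(m, version)
        (out / f"{m}.json").write_text(json.dumps(resp, indent=2, sort_keys=True))
    return out

# First pass with the production version; second pass after refreshing
# the setup with the version under test.
workspace = Path(tempfile.mkdtemp())
prod_dir = run_and_save("prod-1.0", ["order-created", "order-shipped"], workspace / "prod")
test_dir = run_and_save("test-1.1", ["order-created", "order-shipped"], workspace / "test")
```

Saving responses as sorted, pretty-printed JSON keeps the on-disk files diff-friendly for later comparison.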

To optimize this further, the API responses from the previous release were saved in the code repository. Let's elaborate. The release branch is cut for the features to be released, and the GitLab pipeline deploys the build on the test server and triggers the test suite. The suite makes the API calls and compares each response with the saved response from the previous release build. Any difference is reported for manual verification, to check whether it is expected as a result of a change implementation or is unexpected and requires a code fix. If the change is expected, the saved response is updated in the repo; if it is unexpected, a bug is logged.
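That pipeline step can be sketched as follows: compare a live response against the baseline committed from the previous release, refresh the baseline when a reviewer confirms the change is expected, and otherwise flag it for review. The file layout and milestone name are assumptions, not the article's actual repository structure.

```python
import json
import tempfile
from pathlib import Path

def check_against_baseline(milestone, live_response, baseline_dir, expected=False):
    """Compare a live response with the saved baseline from the previous release."""
    baseline_file = Path(baseline_dir) / f"{milestone}.json"
    baseline = json.loads(baseline_file.read_text())
    if live_response == baseline:
        return "match"
    if expected:
        # Reviewer confirmed the change is intentional: refresh the baseline.
        baseline_file.write_text(json.dumps(live_response, indent=2, sort_keys=True))
        return "baseline-updated"
    # Unexpected difference: report for manual verification / bug logging.
    return "needs-review"

# Seed a baseline as if it were committed from the previous release.
repo = Path(tempfile.mkdtemp())
(repo / "invoice-report.json").write_text(json.dumps({"total": 100}))
print(check_against_baseline("invoice-report", {"total": 100}, repo))
```

In a real pipeline the `expected=True` path would correspond to a human approving the diff and committing the updated baseline back to the repo.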

The solution is cost effective in terms of resources and time: only one setup is used, and each consecutive release needs only one setup and one execution. It does come with a downside, in that the comparison can only be done against the one version for which the responses are saved.

Using Non-Production Data

Testing with production data gives an extra level of confidence, and no tester would willingly let go of that privilege. For a single client, as in a captive organization, this is fairly simple, and production data can be made available for testing. In many other cases, however, production data is not easily available: a client's data policy may not allow data to be shared with service providers, or the data may be accessible only to a fixed set of people in the organization. Also, as the number of clients grows and testing must cover scenarios spanning multiple clients, no single production backup can cover them all. This creates a need for generating test data. In the solution discussed here, it is critical that the data be consistent between the parallel executions; if the data is not consistent, the responses cannot be compared.

To solve this while keeping (to some extent) the confidence of using production data, an obfuscation script can be created. Using an orchestration tool [like Jenkins], the production backup is restored, the script is executed to obfuscate the data, and a new backup is created. This second backup is used for testing. It becomes the base: from the next iteration onward it is used to prepare the test setup, and the obfuscation script is not executed again. Instead, a script to add new data is executed, improving coverage and covering new features. Another step in the process is executing the schema changes that are part of the new version, which keeps the testing backup current with the latest release.
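A key property of such an obfuscation script is determinism: the same input must always map to the same masked value, so the data stays internally consistent (for example, a customer name that appears in several tables is masked identically everywhere). A minimal sketch, with hypothetical field names:

```python
# Minimal obfuscation sketch: deterministically mask personal fields so the
# backup stays internally consistent. Field names are hypothetical.
import hashlib

PII_FIELDS = {"name", "email", "phone"}

def mask(value):
    """Map a value to a stable, non-reversible placeholder."""
    digest = hashlib.sha256(str(value).encode()).hexdigest()[:10]
    return f"masked_{digest}"

def obfuscate_row(row):
    """Mask only the PII fields; keep ids and business values untouched."""
    return {k: (mask(v) if k in PII_FIELDS else v) for k, v in row.items()}

row = {"id": 7, "name": "Alice", "email": "alice@example.com", "amount": 120}
print(obfuscate_row(row))
```

A production script would of course operate on database tables rather than dicts, but the same principle applies: mask identifying fields, preserve keys and business values, and keep the mapping stable across tables.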

Conclusion

The combination of these steps solves the issues with parallel regression testing. Comparing responses between releases uncovers potential incidents and production issues: in contrast to testing an expected reaction to an action, it verifies everything else, catching unwanted changes before they leak to production. The underlying theme of testing everything related to a critical business workflow provides a cornerstone for stable releases, enhances client confidence in the product, and supports business expansion. The use of a single setup and generated data makes this approach more practical and easier to adopt.

StickyMinds is a TechWell community.