Effective Test Data Design: An Interview with Rajini Padmanaban


At STARWEST 2012, Rajini Padmanaban is delivering a session titled “The Art of Designing Test Data” and she recently spoke with TechWell Editor Noel Wurst to share her knowledge on this topic.

As Director of Engagement, Global Testing Services, at QA InfoTech, Rajini Padmanaban leads the engagement and relationship management for some of QA InfoTech's largest and most strategic accounts. Rajini has more than ten years of professional experience, primarily in the software quality assurance space.

Noel Wurst: What factors go into making test data truly “effective”?

Rajini Padmanaban: By definition the word “effective” means something that is successful in producing intended results. Test data is said to be effective when it is a set of inputs that are fed into a program to produce desired results. By “desired” I do not mean a test case that passes; it is a test case that passes or fails but helps us understand system behavior under the given scenario to verify whether it has been implemented correctly or not. Some of the core factors that make test data effective include:

1. Whether it gives us useful information about the system under test

2. Whether it helps us influence system behavior both positively and negatively, to simulate end user actions

3. Whether it helps us achieve the goal of the test under execution

Noel Wurst: You’ve recommended using a “reverse engineering process” for test data generation. How did you come to the conclusion that this was an effective strategy, and does it apply to multiple areas of testing?

Rajini Padmanaban: Good question. Test data creation precedes test execution. But before you delve into actual creation, it is very important to keep in mind what test data you need for a given scenario, what the best way to create it is, and whether there is scope for re-use. If you look at this closely, it turns out to be a reverse engineering process: instead of randomly creating data and using it in your tests, you think about the test to be executed, understand its goal, and align your test data creation effort with that goal. This not only helps you optimize and re-use your test data but also helps you get the maximum return on investment by giving you the most meaningful information about the system under test. Such a reverse engineering process also helps you streamline your test efforts and optimize your test execution time and costs. This strategy applies to all types of testing, be it functional, UI, usability/accessibility, performance, security, globalization, etc.
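To make the idea of working backwards from a test's goal concrete, here is a minimal Python sketch. It is not from the interview: the `Goal` class, the `derive_test_data` function, and the age-range example are all hypothetical, and boundary-value selection is just one possible way to align data with a goal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """Hypothetical description of what a test is meant to verify."""
    field: str
    min_valid: int
    max_valid: int


def derive_test_data(goal):
    """Work backwards from the goal to a small, purposeful data set.

    Returns (value, expected_valid) pairs clustered around both edges
    of the valid range instead of randomly chosen values.
    """
    candidates = [
        goal.min_valid - 1, goal.min_valid, goal.min_valid + 1,
        goal.max_valid - 1, goal.max_valid, goal.max_valid + 1,
    ]
    return [(v, goal.min_valid <= v <= goal.max_valid) for v in candidates]


# Assumed goal: verify that an "age" field accepts 18 through 120 only.
age_goal = Goal(field="age", min_valid=18, max_valid=120)
for value, expected_valid in derive_test_data(age_goal):
    print(f"{age_goal.field}={value} expected_valid={expected_valid}")
```

Each generated value exists because the goal calls for it, so both a passing and a failing result say something specific about how the range check was implemented.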

Noel Wurst: Is there a misconception that a high level of creative test data generation is going to take a significantly increased amount of time?

Rajini Padmanaban: There is an element of truth in such a misconception. Test data creation is an art and calls for careful upfront planning and thought. That being said, the effort reduces or even plateaus over a period of time (across multiple releases of the same product) if one diligently reuses the data. The effort spent upfront is totally worth it in bringing in the required breadth and depth in test coverage. After every release, it is a good practice to clean and archive test data for subsequent re-use, especially in areas such as functional and performance testing where large data sets are required. Choosing deliberately where, when, and how to re-use test data can make this a less cumbersome yet very effective part of the overall test execution effort over several releases.

Noel Wurst: Are there any algorithms that you employ to reduce the time it takes to generate a significant amount of test data?

Rajini Padmanaban: Depending on what type of test data you need and what tests you would use it on, there are 2 alternatives for generating a large amount of test data:

1. One could explore the possibility of using live data from production databases for test purposes, especially in areas such as performance testing where you typically need hundreds of thousands of records. It is very important, however, that you get the required stakeholder approvals and use the right practices, such as masking the data during use and storage, to ensure end users’ sensitive information is not compromised.

2. Automated tools are also available to generate various kinds of test data. One can search for open source, free or commercial test data generators depending on their project requirements. I have also seen teams develop internal tools and frameworks to meet their ongoing test data generation needs, especially around fields such as names, addresses, dates of birth and other user identifiable information that you may need in volumes for functional and performance testing.
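The two alternatives above can be sketched in a few lines of Python. This is an illustration only, not a tool mentioned in the interview: the field names, name lists, salt value, and date range are all assumptions, and a salted hash is a simple stand-in for masking, not a substitute for an approved anonymization process.

```python
import hashlib
import random
from datetime import date, timedelta

# Hypothetical list of fields treated as sensitive in a production record.
SENSITIVE_FIELDS = ("name", "email")


def mask_record(record, salt="test-env"):
    """Return a copy of a record with sensitive fields replaced by tokens.

    The same input always maps to the same token, so relationships
    between records survive masking.
    """
    masked = dict(record)
    for field in SENSITIVE_FIELDS:
        if field in masked:
            raw = (salt + str(masked[field])).encode("utf-8")
            masked[field] = f"{field}_{hashlib.sha256(raw).hexdigest()[:10]}"
    return masked


# Assumed sample values for synthetic users.
FIRST_NAMES = ["Asha", "Ben", "Carla", "Dev", "Elena"]
LAST_NAMES = ["Iyer", "Jones", "Kim", "Lopez", "Meyer"]


def generate_users(count, seed=0):
    """Generate synthetic user records; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    start = date(1950, 1, 1)
    span_days = (date(2000, 12, 31) - start).days
    users = []
    for i in range(count):
        birth = start + timedelta(days=rng.randint(0, span_days))
        users.append({
            "id": i + 1,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "email": f"user{i + 1}@example.com",
            "date_of_birth": birth.isoformat(),
        })
    return users


production_row = {"id": 7, "name": "Jane Doe", "email": "jane@example.com"}
print(mask_record(production_row))
print(generate_users(2, seed=42))
```

Seeding the generator means the same data set can be recreated on demand, which fits the point about re-use across releases without having to archive every record.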

Noel Wurst: Could you describe when reusing test data has the most benefits, and are there any scenarios where you would not want to reuse test data?

Rajini Padmanaban:

1. Test types that need volumes of predictable data, such as performance and functional testing again, benefit the most from re-used data.

2. Test areas such as security need very specific, limited data that takes a lot of time to create. This is again a place where re-use might be useful, but sometimes, depending on database changes or other feature changes in the system, only partial re-use might be possible.

3. Other areas of testing, such as UI and usability, that typically don’t need large volumes but need very specific test data can also benefit from re-used data. However, these are areas where the tester’s creativity, brought in through extempore testing, yields a larger return on investment than the time saved by re-using data.

4. Every test cycle will typically have some time set aside for exploratory testing across its testing types. In my experience, as far as possible it is a good idea to not re-use test data in exploratory testing, because you want the tester to play around with the system like how an end-user would. Re-using data in such scenarios often hampers the tester’s creativity, which is the key to successful exploratory testing.

5. I would also not recommend data re-use in cases where cleaning up and archiving previous data sets is cumbersome and not worth the effort. Some areas of database and admin module testing deal with complex user scenarios (for example, order transactions and user history when testing ecommerce applications) and might involve complex data clean-up. In such cases you would be better off creating user data with test data generation tools rather than re-using data that is not only complex to clean up but also invalid to use if the clean-up process is not foolproof.

Noel Wurst: At the upcoming STARWEST 2012, you’re delivering a session titled “The Art of Creating Effective Test Data.” What do you want attendees to be able to take back home?

Rajini Padmanaban: Test data generation needs to be given due importance in the overall test execution effort. It needs to be started early. Understanding how it aligns with the various testing types, and backtracking from each scenario to see what kind of test data is needed, will go a long way toward a successful test execution effort. Very often gaps exist around:

1. The lack of a planned test data creation effort

2. The absence of good practices for making test data usage a group-inclusive process

3. Insufficient breadth in generated test data, even when there is enough depth

I want these to be the key takeaways from this session, which will help the audience take back actionable items to evaluate the test data generation effort in their teams and work on fixing any such open gaps that exist as of today. This will go a long way in bringing in a meaningful test execution effort and contributing to a product of exceptional quality and increased user acceptance in the marketplace.
