When you search through an ordered list, like a dictionary or phone book, you're probably using the bisection technique to find the information you need rather than starting at the beginning of the book and turning the pages one at a time. In the world of testing, bisection can make your testing much more efficient. In this week's column, Danny Faught describes the basic bisection technique and how to modify it in order to better respond to the real world.
Let's say I'm testing an integer input field. I input 0 and the test passes, and then I input 1,000,000 and the software crashes. Before I file a bug report, I want to find where the transition point is between a passing test and a crash. Just like it speeds up a search through a dictionary, bisection (also known as a "binary search") can help to find the number we're looking for.
Here's how bisection works: Keep track of the largest number that hasn't failed the test and the smallest number that failed. In our example, that's 0 and 1,000,000, respectively. Then try the number halfway between the two, which is 500,000 in this case. (We can find this middle point by averaging the two endpoints.) If the test fails, then we can make a hypothesis that all inputs between 500,000 and 1,000,000 will cause a crash. Our new endpoints are 0 and 500,000; we've eliminated half of our search space. If the test had passed instead, the new endpoints would be 500,000 and 1,000,000. Either way, we repeat the process with the new endpoints until they are adjacent numbers.
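The procedure just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the names find_transition, passes, and hypothetical_test are invented here for illustration, and the sketch assumes the failure point does not move, so every input below the transition passes and every input at or above it fails.

```python
from typing import Callable, Tuple


def find_transition(passes: Callable[[int], bool],
                    last_pass: int,
                    first_fail: int) -> Tuple[int, int]:
    """Narrow (last_pass, first_fail) until the two values are adjacent.

    Assumes passes(last_pass) is True, passes(first_fail) is False, and
    there is a single transition point between them.
    """
    if last_pass >= first_fail:
        raise ValueError("last_pass must be smaller than first_fail")
    while first_fail - last_pass > 1:
        # Average the endpoints, truncating any fractional part.
        middle = (last_pass + first_fail) // 2
        if passes(middle):
            last_pass = middle    # the transition is above middle
        else:
            first_fail = middle   # the transition is at or below middle
    return last_pass, first_fail


# Stand-in for the real test: a hypothetical failure point at 32,950.
def hypothetical_test(value: int) -> bool:
    return value < 32_950


boundary = find_transition(hypothetical_test, 0, 1_000_000)
print(boundary)  # (32949, 32950)
```

In a real session, passes would feed the value to the software under test and report whether the test passed; the stand-in above only makes the sketch runnable.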
Here's a hypothetical example of a testing session where we find our needle in a haystack--a single value out of 1,000,001 that represents the lowest number that results in a failure. Each row is a test case, showing our two endpoints and the average of the endpoints (truncating fractional values) that becomes the input for that test. The result of each test determines how the endpoints will be adjusted for the test on the next row. It doesn't matter what kind of software you're testing or what your test is; the approach will work for any integer input.
Table 1: A hypothetical example of a testing session using the bisection technique.
After twenty-two tests, we know that the test passed at 32,949 and failed at 32,950. We now have much greater detail to include in the bug report.
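A session like this one can be replayed with a short Python sketch that yields one row per test. Everything in it is an assumption made for illustration: the names Row, run_test, and session are hypothetical, the failure point is hard-coded to the 32,950 of the example, and the two initial tests at 0 and 1,000,000 are counted as tests 1 and 2.

```python
from typing import Iterator, NamedTuple


class Row(NamedTuple):
    number: int   # test number within the session
    low: int      # lower endpoint of the search range before this test
    high: int     # upper endpoint of the search range before this test
    value: int    # input tried in this test
    passed: bool  # result of this test


FAILURE_POINT = 32_950  # hypothetical: inputs at or above this value crash


def run_test(value: int) -> bool:
    """Stand-in for the real test; True means the test passed."""
    return value < FAILURE_POINT


def session(low: int, high: int) -> Iterator[Row]:
    """Yield one row per test, starting with the two initial endpoints."""
    yield Row(1, low, high, low, run_test(low))
    yield Row(2, low, high, high, run_test(high))
    number = 2
    while high - low > 1:
        number += 1
        value = (low + high) // 2  # truncate fractional values
        passed = run_test(value)
        yield Row(number, low, high, value, passed)
        if passed:
            low = value   # largest input known to pass
        else:
            high = value  # smallest input known to fail


rows = list(session(0, 1_000_000))
for row in rows:
    result = "pass" if row.passed else "FAIL"
    print(f"{row.number:>2}  {row.low:>9,}  {row.high:>9,}  "
          f"{row.value:>9,}  {result}")
```

Under these assumptions the sketch produces 22 rows, and the last one is a passing test at 32,949, one below the failure at 32,950.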
By the way, if you know your powers of two, you might recognize that my hypothetical failure point of 32,950 is close to 2^15, or 32,768. While I do often find failure points at powers of two, just as often there's no obvious reason why the software fails at that particular point.
The real world can throw you some curve balls that you have to hit. One that can cause massive confusion is a moving failure point. As you're trying test values, you may see the same input passing your test on one occasion but failing it on another. This sometimes happens within the same testing session, but it happens most often when you restart the application or run it on a different computer, especially when hours or days have passed since your last test.
In our example above, I recommend adding two tests: Try 32,949 and 32,950 again to see if the failure point is moving. This also helps to verify that you didn't make an error recording the results of a previous test. It's easy to get lost in the details and make a mistake. Even if you don't immediately see a moving boundary, there still could be conditions that cause it to move, so keep an open mind when you help the developer further isolate the cause of the bug.
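If the test is automated, the recheck of the two boundary values might look like the following Python sketch. The names recheck_boundary, boundary_is_stable, and hypothetical_test are invented for illustration, and the repeats parameter is an assumption that goes beyond the single retry of each value suggested above.

```python
from typing import Callable, Dict, List


def recheck_boundary(passes: Callable[[int], bool],
                     last_pass: int,
                     first_fail: int,
                     repeats: int = 3) -> Dict[int, List[bool]]:
    """Rerun the test at both boundary values and collect every result."""
    return {
        value: [passes(value) for _ in range(repeats)]
        for value in (last_pass, first_fail)
    }


def boundary_is_stable(results: Dict[int, List[bool]],
                       last_pass: int,
                       first_fail: int) -> bool:
    """True only if last_pass always passed and first_fail always failed."""
    return all(results[last_pass]) and not any(results[first_fail])


# Stand-in for the real test: a hypothetical, non-moving failure point.
def hypothetical_test(value: int) -> bool:
    return value < 32_950


results = recheck_boundary(hypothetical_test, 32_949, 32_950)
print(results)
print(boundary_is_stable(results, 32_949, 32_950))
```

A stable result here only means the boundary did not move during these reruns; it does not rule out movement after a restart or on another computer.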
Bisection works most efficiently when your next test input is exactly halfway between your two endpoints (assuming you don't have any clues about where the failure may lie). But don't feel obligated to be so exact if you're testing manually. If I were testing manually in our example above, once I got to 125,000, I probably would have chosen rounder numbers that I could calculate easily in my head like 60,000,