Automated tools are essential to software development. Tools can take the drudgery out of the more tedious development and testing tasks and let us get back to what we love: writing code (or in the tester's case, breaking code). This is especially true for security testing, where the goal is not to prove that the software does what it is supposed to do, but rather that it doesn't do what it's not supposed to do. This is a much more difficult, if not actually impossible, task, but, thankfully, we have some great tools to help us out. In this week's column, Bryan Sullivan covers one of the most valuable of these tools: the fuzzer.
Fuzzers are tools that feed random or deliberately malformed input to software programs for the purpose of revealing defects in those programs' input-processing logic. For example, if you create a malformed data file and open it in a word processor, will the word processor fail gracefully and inform the user that the file appears to be corrupt, or will it crash? Worse, does it start executing the data in the file as if it were machine code, potentially leading to a complete compromise of the system's security? This scenario is not as far-fetched as you might think if the program contains a buffer-overflow vulnerability. Fuzzers can reveal vulnerabilities like this by automatically generating variations of malformed input and feeding those variations to the program being tested.
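To make the idea concrete, here is a minimal sketch of a mutation fuzzer in Python. It is an illustration only, not a production tool: the names mutate, fuzz, and toy_parser are hypothetical, the "file format" (one length byte followed by a payload) is invented for the example, and the sketch assumes that a parser signals graceful rejection by raising ValueError, so any other exception counts as a defect.

```python
import random


def mutate(data: bytes, rng: random.Random, max_changes: int = 8) -> bytes:
    """Return a copy of data with a few randomly chosen bytes overwritten."""
    if not data:
        return data
    buf = bytearray(data)
    for _ in range(rng.randint(1, max_changes)):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed_input: bytes, iterations: int, seed: int = 0):
    """Feed mutated variants of seed_input to target and collect failures.

    Assumption: target raises ValueError when it rejects bad input
    gracefully. Any other exception is recorded as a defect.
    """
    rng = random.Random(seed)  # fixed seed so failures can be reproduced
    failures = []
    for _ in range(iterations):
        sample = mutate(seed_input, rng)
        try:
            target(sample)
        except ValueError:
            pass  # graceful rejection: the desired behavior
        except Exception as exc:
            failures.append((sample, exc))
    return failures


def toy_parser(data: bytes) -> bytes:
    """Hypothetical parser: byte 0 is a length field, the rest is payload."""
    if len(data) < 1:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:]
    # Deliberate bug: the length field is trusted without a bounds check,
    # so an oversized length reads past the end of the payload.
    return bytes(payload[i] for i in range(length))
```

Calling fuzz(toy_parser, bytes([4]) + b"ABCD", iterations=1000) returns the mutated inputs that triggered the unchecked length field, each paired with the exception it raised. In a memory-safe language the bug surfaces as an exception; in C or C++ the same mistake is the kind of out-of-bounds access that can become exploitable.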
You might ask, "How much fuzzing is enough?" It's unlikely that you will find a critical security vulnerability with just a single fuzzed input variation. If you do, it's extremely unlikely that you've found the only one in the code and, therefore, one iteration is clearly not sufficient. On the other hand, you could easily let the fuzzer run for days or weeks, testing millions of iterations, but you still wouldn't be assured that every possibility has been covered. The rule that we use in the Microsoft Security Development Lifecycle (SDL) process is that every input must pass 100,000 fuzzing iterations without failure. If a defect is found in the input-processing code, then the module must start the fuzzing process over again after the defect has been fixed and then pass another 100,000 iterations without failure.
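The restart rule is easy to get wrong in a test harness, so a short sketch may help. The function below is a hypothetical illustration of the counting logic only, not the SDL's own tooling: run_one and fix_defect are assumed callbacks that stand in for executing one fuzz iteration and for the (normally manual) step of fixing the defect.

```python
REQUIRED_CLEAN_ITERATIONS = 100_000


def run_until_clean(run_one, fix_defect, required=REQUIRED_CLEAN_ITERATIONS):
    """Fuzz until `required` consecutive iterations pass without failure.

    run_one(n) runs fuzz iteration number n and returns True if the
    program handled the input without failing. fix_defect() stands in
    for fixing the defect that was found. Returns the total number of
    iterations executed, including those discarded by a restart.
    """
    clean = 0
    total = 0
    while clean < required:
        total += 1
        if run_one(total):
            clean += 1
        else:
            fix_defect()
            clean = 0  # a defect resets the count; the full run starts over
    return total
```

The important detail is that the counter resets to zero after every defect rather than resuming where it left off, because the fix changes the code and the earlier clean iterations no longer say anything about the new build.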
Another good question is, "How much of the data should be malformed?" Let's return to our word processor example. Suppose that the word processor's data-file format includes a checksum to ensure that the file hasn't been corrupted or tampered with. Unless the fuzzer is aware of this and recomputes the checksum of each fuzzed data file accordingly, the vast majority of the fuzz iterations that it creates will immediately be discarded by the word processor due to checksum failure. This is not only a waste of time, but it also leads to a false sense of security, as the input-handling logic isn't tested beyond the very shallow first level (the checksum logic).
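The difference between a naive fuzzer and a checksum-aware one can be sketched in a few lines. The file layout here is an assumption made purely for illustration (a payload followed by a 4-byte big-endian CRC-32 of that payload), and the function names seal, checksum_ok, naive_variant, and aware_variant are hypothetical.

```python
import random
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload (assumed file layout)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def checksum_ok(file_bytes: bytes) -> bool:
    """The shallow first-level check a parser would perform on open."""
    if len(file_bytes) < 4:
        return False
    payload = file_bytes[:-4]
    (stored,) = struct.unpack(">I", file_bytes[-4:])
    return zlib.crc32(payload) == stored


def mutate_payload(payload: bytes, rng: random.Random) -> bytes:
    """Change exactly one payload byte to a different value."""
    buf = bytearray(payload)
    pos = rng.randrange(len(buf))
    buf[pos] ^= rng.randrange(1, 256)  # XOR with nonzero always changes it
    return bytes(buf)


def naive_variant(file_bytes: bytes, rng: random.Random) -> bytes:
    """Fuzz the payload but leave the now-stale checksum in place."""
    return mutate_payload(file_bytes[:-4], rng) + file_bytes[-4:]


def aware_variant(file_bytes: bytes, rng: random.Random) -> bytes:
    """Fuzz the payload, then recompute the checksum so the file is accepted."""
    return seal(mutate_payload(file_bytes[:-4], rng))
```

Every file produced by naive_variant fails checksum_ok and would be thrown away before the interesting parsing code runs, while every file produced by aware_variant passes the check and reaches the deeper input-handling logic. A thorough test plan would likely run both kinds, since the checksum code itself also deserves some fuzzing.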