Don’t Discard Test-Driven Development in the Cloud
Summary: Developing software for the cloud can make test-driven development more complicated, but not impossible. Arin Sime offers advice for continuing good development practices in the face of challenges from cloud hosting and distributed computing.
Writing software for the cloud can be very different from writing software that runs on a single server. It can make test-driven development (TDD) more complicated, but it is still well worth doing. For the purposes of this article, I’ll consider two types of software development in the cloud: cloud hosting and distributed computing.
In cloud hosting, you are still writing the same type of software that you have always written. A simple example is a website developed in PHP, Java, Ruby on Rails, or .NET. You are not developing anything out of the ordinary, and the only impact cloud computing makes on your architecture is that it is easier for you to scale the web UI of your system as traffic grows.
For cloud-hosting scenarios, nothing has changed with regard to TDD. The typical xUnit frameworks will provide all that you need to write solid software using good XP practices.
Distributed computing is different. For the purposes of this article, I will define it as software that is designed to scale horizontally across many servers, whether to improve reliability, to improve speed, or simply to spread the computational requirements of complex algorithms.
The use of clouds for distributed computing is more complicated and less common than the more straightforward cloud hosting scenario. However, more teams are being called on to develop these types of applications, and there are many open source projects that are making it easier to tap into the more advanced powers of cloud computing.
One example of this is Hadoop’s MapReduce project, which is used for processing complicated jobs in parallel across many servers. Or, consider the Katta or SolrCloud projects that parallelize searching and indexing across a cloud. Other examples include NoSQL data stores such as Cassandra and CouchDB.
Each of these examples runs as its own server or Java virtual machine. This makes TDD a little more complicated, because you need to have those servers running and in a known state for each test that you run. Because many of these projects are relatively new, you may find it hard to find solid examples of how to do TDD with them.
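The "known state for each test" requirement can be sketched in plain Java. This is only an illustration under stated assumptions: `InProcessStore` is a hypothetical in-memory stand-in, not the API of Cassandra, CouchDB, or any other real project, and the test methods are hand-rolled rather than written with an xUnit framework so that the block stays self-contained. The point is the shape: each test starts its own store, seeds only the data it needs, and stops the store when it finishes.

```java
import java.util.HashMap;
import java.util.Map;

class KnownStateFixtureExample {
    /** Each test starts its own store and seeds only the rows it needs. */
    static boolean testFindsSeededUser() {
        try (InProcessStore store = InProcessStore.start()) {
            store.put("user:1", "alice");
            return "alice".equals(store.get("user:1")) && store.size() == 1;
        }
    }

    /** A second test sees no data left over from the first one. */
    static boolean testStartsEmpty() {
        try (InProcessStore store = InProcessStore.start()) {
            return store.size() == 0 && store.get("user:1") == null;
        }
    }

    public static void main(String[] args) {
        System.out.println("seeded lookup passes: " + testFindsSeededUser());
        System.out.println("fresh store is empty: " + testStartsEmpty());
    }
}

/** Hypothetical stand-in for a data store server that can run in-process. */
class InProcessStore implements AutoCloseable {
    private final Map<String, String> rows = new HashMap<>();
    private boolean running;

    /** Starts an empty store (assumption: a real embedded server would do its startup work here). */
    static InProcessStore start() {
        InProcessStore store = new InProcessStore();
        store.running = true;
        return store;
    }

    void put(String key, String value) {
        requireRunning();
        rows.put(key, value);
    }

    String get(String key) {
        requireRunning();
        return rows.get(key);
    }

    int size() {
        requireRunning();
        return rows.size();
    }

    private void requireRunning() {
        if (!running) {
            throw new IllegalStateException("store is stopped");
        }
    }

    /** Stops the store and discards its data so nothing leaks into the next test. */
    @Override
    public void close() {
        rows.clear();
        running = false;
    }
}
```

Using try-with-resources means the store is stopped even when a test fails partway through, which is what keeps one test's data from affecting the next.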
Adhering to strong test-driven practices is still very important. It may be even more important with these projects, since you will frequently change your code as you learn the nuances of the library. Having solid test coverage will make your learning experience much more pleasant, but writing unit tests with a standard xUnit library alone is probably not enough to test your code.
Let’s consider a few ways you can do TDD for the cloud. The availability of each of these methods may be different for each package, but these general descriptions should help you to know what to look for as you choose the tools you want to work with.
Embedded servers: Check if the libraries you are using have Maven plugins that will automatically start and stop embedded virtual servers for you. Cassandra is an example of a NoSQL project with excellent Maven plugins available. This is perhaps the ideal solution, because you can “spin up” a new server for each unit test and populate it with only the minimum data you need to test your functionality. One possible downside is that the overhead of starting up and shutting down these embedded servers may slow down your unit tests.
xUnit-based libraries: There may be an xUnit-style library already written for the tools you are using. For example, there is an MRUnit library for writing tests of Hadoop MapReduce jobs. In the case of MRUnit, there is no real Hadoop cluster behind it; the library essentially provides a mocked interface that you can test against. This is more efficient than spinning up actual embedded servers, but keep in mind that testing against a mocked interface rather than a real cluster may limit the usefulness of your tests.
Mock the interfaces: If no xUnit library exists for your toolset, consider mocking the behavior yourself using existing tools like JMock. This can be a simple way to at least test that you are providing the expected information to your cloud toolset’s interfaces. But keep in mind that you may only end up reaffirming your own assumptions about what that toolset will do with your inputs, since you are only testing against your own mocks.
Call a local testing cluster: If all else fails, your team may need to maintain a local cluster specifically for running development and continuous-integration tests. This is the least desirable solution, since a shared testing cluster may introduce version control problems between developers. Each developer could also run a local copy of the tools on their own computer, which removes the shared dependency problem. But it is still harder to maintain than the previously described solutions, which are designed to integrate smoothly into the test-driven paradigm. So, you will have to be more diligent about maintaining test-driven practices when you operate outside that paradigm.
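To make the mock-the-interfaces option above more concrete, here is a minimal sketch. It uses a hand-written recording mock in place of JMock so that it runs with nothing but the JDK. Every name in it (`IndexClient`, `ArticleIndexer`, `RecordingIndexClient`) is hypothetical and invented for illustration; none of them belongs to Katta, SolrCloud, or any other real toolset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class MockInterfaceExample {
    /** Hypothetical slice of a cloud toolset's client API that our code depends on. */
    interface IndexClient {
        void addDocument(String id, String text);
    }

    /** Code under test: normalizes a title before handing it to the index. */
    static class ArticleIndexer {
        private final IndexClient client;

        ArticleIndexer(IndexClient client) {
            this.client = client;
        }

        void index(String id, String title) {
            client.addDocument(id, title.trim().toLowerCase(Locale.ROOT));
        }
    }

    /** Hand-written mock that records calls instead of contacting a cluster. */
    static class RecordingIndexClient implements IndexClient {
        final List<String> calls = new ArrayList<>();

        @Override
        public void addDocument(String id, String text) {
            calls.add(id + "=" + text);
        }
    }

    /** Runs the code under test against the mock and returns what the mock saw. */
    static List<String> runExample() {
        RecordingIndexClient mock = new RecordingIndexClient();
        new ArticleIndexer(mock).index("42", "  TDD in the Cloud ");
        return mock.calls;
    }

    public static void main(String[] args) {
        System.out.println(runExample()); // prints [42=tdd in the cloud]
    }
}
```

The check here only confirms what the code sent to the interface. It says nothing about what a real index would do with that input, which is exactly the limitation described above.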
One simple way to figure out the best TDD practices for the open source project you are working with is to see what committers to that project are required to do. Download the project’s source, and check out how they are writing their tests. You can often base your tests on their examples. Make sure you are looking at examples from the same version as you are working with, since the TDD methods available in the project may change over time.
No matter which option you choose, keep in mind that these are methods for functional unit testing of your code. Load testing and scaling exercises are a different matter and likely will require you to set up large server (or virtual server) clusters that mimic your production environment.
That sort of testing is outside the scope of most TDD, which is focused more on functional unit testing. Keep your TDD focused on validating features and covering code with tests before you refactor it.
Due to the rapid changes in cloud computing and the open source nature of these projects, you may find the lack of documentation, examples, and completed libraries frustrating. Just keep in mind that you are on the bleeding edge of software development, and be proud of that. You definitely should not let those factors become an excuse not to keep up with good development practices like TDD.