What do your tests say about your application?
Software teams often wonder if a feature is worth testing or if a manual test is worth automating. However, engineers rarely ask themselves what a test teaches others about the application. It’s important that test authors keep in mind the inherent authority their tests possess. An application’s tests are sometimes the first lines of code a new developer will read when acclimating to a new codebase. Always remember that tests have the power to guide, but they also have the power to mislead.
Even a well-written test can be damaging when written for the wrong reasons. I was doing a code review for a coworker recently when I caught something in one of their integration tests that seemed out of place. The review was for a new API endpoint: a simple GET method that returned a list of all saved messages for a user. The test that caught my eye was titled “Should return a 404 if the user has no saved messages.” This seemed odd to me.
Assuming the parent entity existed (in this case, the user), I expected any API that returned a collection on success to return an empty collection if no items were found, not a 404. My coworker agreed with my intuition but said that the acceptance criteria for the story were vague and the endpoint currently returned a 404. There was a story in the backlog to update the endpoint, but for now they were testing that it worked as implemented. They argued that the test should validate current functionality and that they could go back and fix it later. The tests ran automatically, so if the functionality changed, we would find out when they started failing. And besides, if anyone had questions about the API, they could talk to the team.
Their approach made sense at face value, so I approved the code and moved on. It wasn’t until a week later that we started paying the price for this testing approach.
Fast-forward to the following sprint. A different developer was working on a new endpoint for the same service. I noticed that this developer had added a similar negative test titled “Should return a 404 if the user has no new messages.” When I asked them for their reasoning, they pointed me to the test I had approved the week before and said they were just following the convention for this service. They had even updated their acceptance criteria to include returning a 404 on no results, to match what was already in place. A single bad test had defined a bad convention.
Always ask yourself, “What do I want an unknown person to take away from reading these tests?” Tests have a way of escaping their original sphere of influence, and it happens more often than you think.
Another time, I was part of an engagement where several distributed teams were working on a single codebase. We were building a portal where users could create small communities to manage the needs of their organization. One evening after a production release, my team got an urgent message from one of the client’s product managers: Admin invitations were broken in production! That sounded bad, but I had no idea what that meant. I’d never heard of the feature, and my team had no hand in implementing it.
This feature was implemented by another team in a different time zone, and they’d all left for the day. The client was aware that we had nothing to do with the feature, but we were the only team online at the time. Was there anything we could do to help? Not wanting to leave the client hanging, I offered to take a look.
Early in my investigation I discovered a couple of worrisome things: anyone who knew anything about the inner workings of this feature had left the project, and the issue was being reported by multiple users in production. It turned out that this feature had gone through a few rounds of rework and had been deprioritized. The original product manager had left the company, and the offshore team that wrote the code was now composed of different people. Add to that the urgency that comes with a customer support team fielding calls from frustrated users, and I’m sure you can imagine the level of anxiety this bug was generating.
However, all was not lost. We’d sold the client’s company on a comprehensive approach to automated testing: every team was required to have tests in place to validate all acceptance criteria for any development work it took on. The quality of those tests could vary, but I knew the tests had to be there.
To find the tests related to this feature, I got the component name from the email form, then searched the codebase for any spec files that contained it. Less than fifteen minutes later, I had a good understanding of how this code was supposed to work: a current user with an admin role should have been able to invite new admins via email. The recipient would get an invitation email with a single-use link to join the organization, and clicking that link would take them to a registration flow where they could finish setting up their account. Armed with an understanding of how things were supposed to work, I dove into production to see what was going on.
The issue in production was that users were getting the email invite, but the links all took them to an error page. I was able to verify the buggy behavior in production and moved on to figuring out what had gone wrong. Again, I turned back to the tests.
All of our end-to-end tests ran in the continuous integration (CI) pipeline at three separate gates: on merge to the development environment, on promotion to staging, and after release to production. These tests were passing in dev and staging but were being skipped in prod. The feature seemed to be working correctly in the other two environments, and I had the test logs to prove it. So why weren’t the tests running in prod?
Well, again, the logs had my back. There was a warning message in the logs: “Admin invite test suite disabled—flag admin-invite-landing-page is false.” That was certainly worth looking into.
On this project we used flags to control which features were turned on in each environment. Before each test ran, we checked the feature flag API to see which flags were enabled in that environment. If any of a test’s required flags were disabled, the test was skipped and a non-blocking warning was logged.
These particular tests checked two different flags: the invite form was behind one flag, and the registration flow was behind a separate one. Both flags were turned on in dev and staging, but the registration-flow flag was turned off in prod. I took the invite link I’d sent myself just moments before, appended the query string parameter to enable the invitation landing page, and sure enough, there was my registration flow page.
I let the client know that all they had to do was enable the second flag in production. One flip of a switch later, everything was working again. Crisis averted.
I can’t explain why the feature was divided across two separate flags, but I can guarantee you that this particular client would have been in hot water for at least another day if it hadn’t been for those end-to-end tests. Because the tests ran at predictable steps in the pipeline, and because no story could be closed without demonstrating that each piece of acceptance criteria had been tested, I knew there had to be tests somewhere that described how the feature was supposed to work.
The often-forgotten quality of a well-written test is its ability to quickly and concisely communicate information not just about how things are, but how they should be.
Distributed teams are prone to gaps in communication, and sometimes knowledge handoffs don’t happen. It’s hard to keep documentation up to date in an agile workflow. But if you run your tests automatically as part of your deployment process, you guarantee at least a baseline, living record of how your project is supposed to behave.
I’m not saying tests are the only form of documentation you need—far from it. I need fleshed-out user stories to understand what I should be building and how I should be testing. However, automated tests in a CI environment are definitely the documentation I trust most. You never know who’s going to be reading your tests in the future, so make sure to keep them clean, current, and valid. Tests are documentation, and don’t let anyone tell you otherwise!