We live in the age of microservices and big data, especially when it comes to web services. Monitoring and testing in production are essential practices for running modern software services at scale. But how does monitoring fit into the big picture of testing? What should testers do to utilize monitoring, and how can it help our work?
Discovering monitoring tools had a big impact on my career as a tester. I learned a lot about the product, the users, and even my own processes. It helped me manage my biases and made my testing work much easier.
Imagine you have a website that gets visited by seven million people per day. How do you test it to make sure it works not only on your shiny, new work MacBook, but also for a customer running Windows XP and using an old Internet Explorer browser? Monitoring can help you a lot here.
Here are seven concrete benefits testers get from monitored data that you can use to convince your team to implement monitoring—as well as realize for yourself.
1. Get to know your users
Using monitoring and analytics, I can find a lot of data about the users. For example, what kind of browsers and devices do your product’s consumers use?
I was recently in an engagement where, with the help of analytics, we found out that the top used browser for the product was IE11. I was very much surprised this was true in 2018, especially because our team was secretly hoping to drop support for IE11 altogether. Monitoring helped us discover that our users are not necessarily like us: They use technology differently.
Getting to see analytics like browser and device usage data, or even how users use our products, can be eye-opening. By looking at the way users interact with the product, I can find out what the most used functionality of the product is—and it is not always what I expect it to be. Monitoring helps me quantify and understand the product itself more, along with what is important to the users.
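As a sketch of what such a query can look like, assume a hypothetical `page_views` table with one row per visit (the table name and columns are illustrative, not from any specific tool). Ranking browsers by distinct users is then a one-liner in SQL, shown here against an in-memory SQLite database:

```python
import sqlite3

# Hypothetical page-view events; in a real setup these rows would come
# from your analytics or monitoring backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id TEXT, browser TEXT, device TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("u1", "IE11", "desktop"),
        ("u2", "IE11", "desktop"),
        ("u3", "Chrome", "mobile"),
        ("u4", "IE11", "desktop"),
        ("u5", "Safari", "mobile"),
    ],
)

# Which browsers do our users actually use, ranked by distinct users?
rows = conn.execute(
    """
    SELECT browser, COUNT(DISTINCT user_id) AS users
    FROM page_views
    GROUP BY browser
    ORDER BY users DESC
    """
).fetchall()
for browser, users in rows:
    print(browser, users)
```

The same grouping by `device` (or by screen size, OS version, and so on) gives the rest of the picture of who your users really are.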
2. Visualize KPIs
Business and development teams often don't speak the same language, and that causes frustration. Monitoring helps me build a bridge between these departments.
First, I clarify what the key performance indicators (KPIs) are to the team. Different departments have different ideas of what is important to them. Sometimes I have to question KPIs because certain aspects may not actually show the success of the product, and we shouldn’t blindly follow the numbers. But once the KPIs are clarified, by using monitoring tools, I can work on the visualization of the KPIs by creating dashboards using monitored data with specific time frames.
For example, feature A was the most popular feature, so I created a dashboard to track user activities with this feature, such as the number of users interacting with it during the week. Dashboards about the important feature fostered much smoother communication among the team members and created a unified language.
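A dashboard panel like that boils down to one recurring query over the monitored data. As a minimal sketch, assuming a hypothetical `events` table of feature-usage records with a date column, counting distinct weekly users of feature A could look like this:

```python
import sqlite3

# Hypothetical feature-usage events with ISO dates; a dashboard panel
# would run a query like this on a schedule.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, feature TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("u1", "feature_a", "2018-06-04"),
        ("u2", "feature_a", "2018-06-05"),
        ("u1", "feature_a", "2018-06-06"),
        ("u3", "feature_b", "2018-06-06"),
        ("u4", "feature_a", "2018-06-12"),
    ],
)

# Distinct users of feature A per week -- the KPI behind the dashboard.
weekly = conn.execute(
    """
    SELECT strftime('%Y-%W', day) AS week,
           COUNT(DISTINCT user_id) AS active_users
    FROM events
    WHERE feature = 'feature_a'
    GROUP BY week
    ORDER BY week
    """
).fetchall()
print(weekly)
```

Plotting `active_users` per `week` over a chosen time frame is exactly the kind of visualization that gives business and development one shared number to talk about.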
3. Prioritize your testing
Understanding what features are really the most important—based on data, not because someone on the team told me so or because I believed them to be—allowed me to prioritize my testing accordingly.
I created queries using monitored data to help me answer these example questions:
- What functionality do users take advantage of most?
- How do users interact with functionalities? (Look at a few full user journeys, filtering for the specific user)
- What browsers cover most of these users?
Once I answer these questions, I can actually act on them. For instance, if mobile Safari covers 80 percent of users, I must make sure that I do extensive testing on it first. The functionalities, browsers, and devices that are used most also should be tested most so that we can be more confident about our product’s quality—and customer satisfaction.
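The browser-coverage question above can be answered directly from session data. As a sketch, assuming a hypothetical `sessions` table, this query computes each browser's share of distinct users so the top entries can drive the testing order:

```python
import sqlite3

# Hypothetical session records; real data would come from monitoring.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (user_id TEXT, browser TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("u%d" % i, "Mobile Safari") for i in range(8)]
    + [("u8", "Chrome"), ("u9", "Firefox")],
)

# Share of users per browser -- the browsers at the top of this list
# deserve the most testing attention.
coverage = conn.execute(
    """
    SELECT browser,
           ROUND(100.0 * COUNT(DISTINCT user_id)
                 / (SELECT COUNT(DISTINCT user_id) FROM sessions), 1) AS pct
    FROM sessions
    GROUP BY browser
    ORDER BY pct DESC
    """
).fetchall()
print(coverage)
```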
4. Observe regressions
When the user base is very big, your tests may not cover all the scenarios for regressions. To make the situation less drastic and scary, you could try dogfooding (testing your own product internally), user testing (releasing only to trusted users who are willing to deal with glitches), or doing canary releases (releasing only to a small percentage of users and monitoring extensively).
When a new release happens, especially if it’s just for a set of users, I follow the release and observe the user interactions in real-time monitoring dashboards that I created. What is great here is that users will cover the combinations of devices and browsers and test the product in the way they’d actually use it.
During a release, I usually first perform a quick manual exploratory test in production myself, and then I check the real-time dashboards that visualize query results to track the user interactions related to the change being released. Sometimes I filter the dashboards by environment or browser. For example, I can check whether mobile Edge users can execute the same workflows and actions as users of other browsers, without loss of functionality, even though I do not own a device with a mobile Edge browser. If the monitoring has real-time data, I also often watch specific time windows around the release.
5. Set up alerts
I am not limited to simply observing the releases. I can also create automatic alerts that get triggered with a certain threshold value.
Once I have the KPI definitions clarified and agreed upon, I can easily set up alerts for those conditions—for example, that logged-in users should make up, on average, 25 percent of all users browsing a certain website in any given hour. Another example could be the number of users of the important functionality.
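The logged-in-share alert above is just a threshold check over a rolled-up query. As a sketch, assuming a hypothetical hourly rollup table (most monitoring tools express this as an alert rule rather than hand-written code), the condition could look like this:

```python
import sqlite3

# Hypothetical hourly traffic rollup; the alert fires for any hour in
# which logged-in users fall below the agreed 25 percent share.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hourly_traffic (hour TEXT, total_users INTEGER, logged_in_users INTEGER)"
)
conn.executemany(
    "INSERT INTO hourly_traffic VALUES (?, ?, ?)",
    [
        ("2018-06-04T10", 400, 120),  # 30% logged in -- fine
        ("2018-06-04T11", 500, 100),  # 20% logged in -- below threshold
        ("2018-06-04T12", 300, 90),   # 30% logged in -- fine
    ],
)

THRESHOLD = 25.0  # agreed KPI: at least 25% of users logged in, per hour

alerts = conn.execute(
    """
    SELECT hour, 100.0 * logged_in_users / total_users AS pct
    FROM hourly_traffic
    WHERE 100.0 * logged_in_users / total_users < ?
    """,
    (THRESHOLD,),
).fetchall()
for hour, pct in alerts:
    print("ALERT: only %.1f%% logged-in users at %s" % (pct, hour))
```

In practice the alert rule would also send the notification email mentioned below; the query is the part worth agreeing on with the team.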
Alerting is most beneficial for crucial functionalities that carry the business logic. I have certain alerts set up that also send me an email so that I can immediately investigate why the behavior happened. Then, if it's confirmed as a real issue, we are able to acknowledge and fix it as soon as possible.
When it comes to testing web services, they may be up 24/7 and traffic may affect alerting, so it's important to make sure that the issue is actually there. Sometimes I find false alarms, but that also sparks a discussion about whether a KPI is reliable enough if it's flaky.
6. Investigate issues using patterns
What I love most about monitoring is the power it gives me as a tester to understand the issues that occur. Using simple querying languages like SQL, I am able to track down the workflow of a certain issue, which helps me spot patterns and identify more users experiencing the same issue.
For example, I once found that I could not use an important functionality of our product in a certain workflow. I could not remember the exact steps I took, but monitoring helped me track down my own actions. Once I did that, I could explore the scenario that caused the issue. After reducing the scenario to as few steps as possible, I queried the system for the users who had executed the same scenario. What I found was that not every user performing the very same steps got the same result—for one group of users, the functionality worked. When I looked at the data more closely and checked other user characteristics, it turned out that the issue was related to the region the users were in. Only some countries were affected.
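The pattern-spotting step boils down to grouping the affected runs by a user characteristic. As a sketch, assuming a hypothetical `scenario_runs` table of users who executed the minimal failing scenario along with the outcome they saw:

```python
import sqlite3

# Hypothetical results of the minimal failing scenario, per user, with
# one candidate characteristic (region) to group by.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scenario_runs (user_id TEXT, region TEXT, succeeded INTEGER)")
conn.executemany(
    "INSERT INTO scenario_runs VALUES (?, ?, ?)",
    [
        ("u1", "DE", 0), ("u2", "DE", 0), ("u3", "DE", 0),
        ("u4", "US", 1), ("u5", "US", 1), ("u6", "FR", 0),
    ],
)

# Same steps, different results: break the outcomes down by a user
# characteristic until the pattern shows itself.
by_region = conn.execute(
    """
    SELECT region,
           SUM(succeeded = 0) AS failed,
           SUM(succeeded = 1) AS worked
    FROM scenario_runs
    GROUP BY region
    ORDER BY failed DESC
    """
).fetchall()
print(by_region)
```

If region doesn't explain the split, the same query can be rerun grouped by browser, plan type, or any other attribute the monitored data records.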
It can be exciting as well as revealing to learn more about why issues have been happening. And the information gained from monitored data can help your development team a lot in fixing the bugs.
7. Quantify the importance of bugs
Do you know the impact of the bug you just reported? Because I have monitoring set up, I am able to answer this question, and it adds an extra superpower to my work as a tester. Instead of just shaking their heads at yet another bug report, members of my team started mentioning me on bugs reported by others to ask what their impact is. Having a thorough monitoring system in place allows me to recreate the scenarios using SQL on monitored data, so I can get an exact number of how many users are affected in production.
Of course, this works only on features that are already in production, but that can be very helpful. It was useful for me to realize that some of the bugs I thought were critical ended up affecting only 1 percent of the users. The ability to quantify the impact of bugs makes development and prioritization way easier.
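Counting the affected share is a single aggregate once the buggy scenario is expressed as a query. As a sketch, assuming a hypothetical event table where the scenario shows up as a `failed` outcome:

```python
import sqlite3

# Hypothetical event log; the buggy scenario is recreated as a query so
# we can count exactly how many production users hit it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE export_events (user_id TEXT, outcome TEXT)")
conn.executemany(
    "INSERT INTO export_events VALUES (?, ?)",
    [("u%d" % i, "ok") for i in range(99)] + [("u99", "failed")],
)

# Affected users vs. all users who touched the feature.
affected, total = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN outcome = 'failed' THEN user_id END),
           COUNT(DISTINCT user_id)
    FROM export_events
    """
).fetchone()
print("%d of %d users affected (%.1f%%)" % (affected, total, 100.0 * affected / total))
```

A number like "1 percent of users" attached to a bug report changes the prioritization conversation far more than a severity label does.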
Implement Monitoring on Your Team
Many organizations have no monitoring practices whatsoever, and the ones that do often lack certain aspects that would be useful to you as a tester. You have to explore your product and see what API requests, parameters, and so on could be recorded. Then, push for recording these items in one central place where you can check the monitored data. It could be a monitoring tool with a web UI, or just a database you can query if you have your own monitoring systems set up.
Do not be afraid to go the extra mile to get something monitored. The information will likely be enlightening—not only for you, but for the whole team.