How to Incorporate Data Analytics into Your Software Process

April 20, 2014

Summary

Big data isn’t just a buzzword; it lives in your software. With millions of possibilities to leverage analytics, how do you pick what’s right for your organization? Robert Cross provides some insight into how to start incorporating data analytics into your software process and management plan.

As a provider of independent software quality and security assessments, my company has the unique privilege of examining in great detail our clients’ source code for all different types of risks. Many of our new relationships start off with our clients finding themselves in a “code-red” situation and in need of perspective. This might make some of you giggle and others cry, but we’ve all been part of projects forced to release software in less time than process said was possible (i.e., code bending).

The first face-to-face meeting we hold with our clients is usually the same across our accounts. The testing, development, and management teams look very tired. If they still allowed smoking in office buildings, everyone would be lighting up in the conference room. An argument typically ensues amongst the groups about the root cause of the problem and ends with the executive decision maker looking at us and saying, “Help us figure it out and throw everything you’ve got at it!” This can be translated to mean “Measure anything and everything that has a heartbeat of a chance of getting us out of this ditch, and, by the way, we needed it yesterday.”

We then ask if the teams are subject to any particular standards, whether driven internally, by the industry, or by the customer. Where standards are in place, we have found that the engineers know about them but did not have time for due diligence because of schedule compression. It’s rarely a case of the team not knowing “how” or “what”; rather, it’s the “when” that is the main culprit of chaos.

The process for analyzing software that your team has not authored is tedious and measures thousands of “things,” including quality risks (null pointers), defensive programming risks (exception handling), security risks (buffer overflows), and metrics (cyclomatic complexity), to name a few broad categories. This type of process incorporates numerous technologies to generate a large software risk-data profile of the client’s system. From this, experts analyze and distill the data down to true positive findings and document them in various report formats, tailored to the audience, to explain the system’s specific and unique risk signature.

The key is balancing the various data sets between what’s relevant and what’s nice to have, which adheres to the “count what counts” principle. I specifically remember one particular customer where one of the senior executives proclaimed with absolute certainty, prior to our analysis of his system, “We have no default-less switches in our code, no way, no how! This is in our standards and our engineers follow strict guidance!” Well, wouldn’t you know that we identified that his system had over 16,000 of them. We were all shocked that day because of the certainty of this executive’s statement. That meeting was a tough meeting to sit through, because the manager’s entire belief system was based on the assumption that his team members were following and not deviating from their defined software process.

However, there was no mechanism in place for anyone on the team, including management, to count what counts. Is measuring the number of default-less switches important? If it’s important to your program and to you, then it should be counted. Find a way to collect the data, correlate it to specific risk factors, and collaborate on these results across your teams to establish expectations and communicate the importance of this measurement. There are thousands of opinions out there on what should and shouldn’t be counted.
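To make “find a way to collect the data” concrete, here is a minimal sketch of how a team might count default-less switches in C-style source. It is a rough text-scanning heuristic, not the tooling we use in assessments: the function name count_defaultless_switches, the helper _body_span, and the SAMPLE snippet are all made up for illustration, and a real program would want a proper parser or static analyzer.

```python
import re

# Heuristic patterns (assumptions, not a full C grammar): strip comments and
# string/char literals so keywords inside them are not counted.
_NOISE = re.compile(
    r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'',
    re.DOTALL,
)
_SWITCH = re.compile(r'\bswitch\b')
_DEFAULT = re.compile(r'\bdefault\s*:')


def _body_span(code, start):
    """Return (open, close) indexes of the first {...} block at or after start."""
    open_idx = code.find('{', start)
    if open_idx == -1:
        return None
    depth = 0
    for i in range(open_idx, len(code)):
        if code[i] == '{':
            depth += 1
        elif code[i] == '}':
            depth -= 1
            if depth == 0:
                return open_idx, i
    return None  # unbalanced braces


def count_defaultless_switches(source):
    """Return (total_switches, switches_without_default) for C-style source."""
    code = _NOISE.sub(' ', source)
    total = missing = 0
    for match in _SWITCH.finditer(code):
        span = _body_span(code, match.end())
        if span is None:
            continue
        total += 1
        body = code[span[0] + 1:span[1]]
        # Keep only text at the switch's own nesting level, so a default
        # label belonging to a nested switch is not credited to the outer one.
        depth = 0
        top_level = []
        for ch in body:
            if ch == '{':
                depth += 1
            elif ch == '}':
                depth -= 1
            elif depth == 0:
                top_level.append(ch)
        if not _DEFAULT.search(''.join(top_level)):
            missing += 1
    return total, missing


# Hypothetical sample: the outer switch has no default, the other two do.
SAMPLE = """
int f(int x) {
    switch (x) {          // no default here
        case 1: return 10;
        case 2:
            switch (x + 1) {
                case 3: break;
                default: break;
            }
            break;
    }
    switch (x) { case 0: break; default: return -1; }
    return 0;
}
"""

if __name__ == "__main__":
    print(count_defaultless_switches(SAMPLE))  # prints (3, 1)
```

Run over a whole source tree and tracked per build, even a crude count like this gives the team and management the same number to look at, which is the point of counting what counts.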

After analyzing billions of lines of code as an independent, my opinion is to start small and count a couple of key metrics. More important is to learn how these metrics correlate to business risks relevant to both engineers and executives; this allows you to create a common language framework so that, when brought together, the consumers of this information aren’t talking past one another. Once your organization gets its arms around a small set, expand it to incorporate other metrics, again collecting data, correlating it to risk, and collaborating on it from top to bottom in your organization.
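As one possible way to start “correlating” a metric to risk, the sketch below computes a plain Pearson correlation between a per-module metric and a per-module risk indicator. Everything specific here is an assumption for illustration: the pearson helper, the module names, and the numbers are invented, and the choice of field defects as the risk measure is just one option. A correlation on a handful of modules is a conversation starter, not proof of cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: module -> (default-less switches, defects found in the field).
MODULES = {
    "parser": (42, 9),
    "network": (17, 4),
    "ui": (5, 1),
    "storage": (28, 5),
}

if __name__ == "__main__":
    metric = [m for m, _ in MODULES.values()]
    defects = [d for _, d in MODULES.values()]
    print(f"correlation: {pearson(metric, defects):.2f}")
```

A value near 1 would suggest the metric tracks the risk you care about and is worth reporting to both engineers and executives; a value near 0 suggests looking for a different metric to count.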

Why is the “count what counts” notion so important? Here is the truth. Schedule compression happens every day to all of us no matter what process you’re following. Either the customer, the market, upper management, or the competition causes us to make decisions to take on exceptional risk. The art of bending space, time, and code will never go away; it’s the blessing and curse of software. However, other business disciplines have systems and measures in place so that when they shortcut process, there is an audit trail of data proactively alerting them to the consequences of this decision. They have transparency into their technical debt, whereas software is still catching up and has a ways to go.

One of the joys of being in this aspect of the business is enabling teams to have that “Ah Ha!” moment by realizing the power of data analytics when done properly. Having an opportunity to unify a sometimes-fractured team by focusing the conversation on data and not opinions is truly an amazing and fun experience. Of course, hindsight is always twenty/twenty, but if it leads to customers changing their strategy to focus on important data rather than just tools or buying the newest widget, then it’s a big movement in changing our industry from being reactive to proactive.

Topics:

collaboration development security software engineering teams

About The Author

Rob Cross

Rob Cross has been in the software development industry for over 25 years in various capacities. He has worked for several start-up businesses. These companies have been focused on providing software quality, security, and performance data to organizations leveraging state-of-the-art technologies, having analyzed over 8 billion lines of code as an independent software assessment company on products ranging from military systems, medical devices, and satellite systems to video games and Wall Street exchanges. Rob currently works at Synack, a Pentesting as a Service company.

Rob and his family reside in New Jersey where his time out of the office is spent cycling and staying in shape, supporting his three (3) fantastic kids and wonderfully supportive wife. Rob is a University of Cincinnati graduate with two degrees in business and engineering. Hoagies from Whitehouse Subs in Atlantic City, Skyline Chili, Extreme Hot Sauces and Graeters Ice Cream are sure ways to get Rob's attention on any subject matter.

User Comments

1 comment

Sunny Patel

September 26, 2023 - 4:26am EDT

To integrate data analytics into your software development process, start by defining clear data objectives, collect relevant data, and implement analytics tools or libraries. Develop data-driven features and dashboards to monitor application performance and user behavior. Continuously refine and optimize your analytics pipeline, leveraging insights to inform software improvements and decision-making throughout the development lifecycle.
