Finding the Information inside Your Data

Summary:
Data analysts have to know a lot about diverse business areas so that our reports provide usable information, not just data. We can use this awareness of the value of information to merge different data sets in order to answer new questions, and even help our users make better decisions. But in order to do this, we need to present not just the data, but the information value represented in that data.

As a data analyst, I spend a lot of time helping our customers gain a better understanding of their operations and business issues. This requires me to also learn a lot about diverse business areas, so that our reports provide usable information, not just data.

We can use this awareness of the value of information to merge different data sets in order to answer new questions, and even provide new information to help our users make better decisions.

My company supplies electricity to most of New Mexico, which is a big place. We service over 500,000 customers and have close to 100,000 transformers and 15,000 miles of transmission and distribution lines.

As with most electric providers, we have outages. We work hard to keep them as short and infrequent as possible, but the wind blows, trees grow, and, for some reason, our poles are surprisingly attractive to cars. All of these things, plus a number of other factors, mean that outages are inevitable. In addition to the outages where a root cause is pretty obvious, we also have some where the cause is less clear. There is a lot of interest in reducing the non-weather-related outages that may be preventable.

We collect a lot of information on our outages and look for ways to improve our service level, with the goals of preventing outages before they occur and, when they do occur, keeping them as short as possible. Outages are tracked by when they start, when they end, which customers were impacted, and what caused the outage. This data is saved and viewed in different ways in hopes of improving reliability.

On the demand side of the process, we collect a lot of information on where our customers are, how much power they use, and when they use it. We also track how we deliver power to them, which takes a complex series of connections. Every customer is linked to a transformer and then to a line called a feeder; the feeders are organized into divisions; and the list goes on. Having all this information on 500,000 customers generates a large volume of data to be considered.

I recently had the opportunity to merge the supply and demand perspectives on usage and outage data and “make” new information to better manage our systems. Making information from data is the process of sharing a new perspective on existing data that allows for better decisions or reduces risk. This new information will allow us to see if some outages are linked to unexpected demand changes, both the type where demand grows faster than expected (such as many people getting new air conditioners) and the type where demand shrinks (such as a customer installing solar panels and starting to supply us with electricity).

This project came from talking with the operations team that deals with outages and gaining an understanding of their concerns about the percentage of outages that are unpredictable. I combined this data with my previous experience with our customer data to make the leap (more of a hop) to a hypothesis: transformers that have higher-than-expected demand or are asked to run backward (due to solar customers) might have a higher-than-expected rate of failure. The idea is plausible, and we are now working to validate it with more investigation.
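To make the comparison concrete, here is a minimal sketch of that kind of merge in Python. It is not the method used on the project (which relied on SQL, Access, and Excel); every field name, record, and threshold below is hypothetical, invented purely for illustration. It joins outage records to transformer records, groups transformers by whether observed demand exceeded what was expected or power flowed backward, and computes non-weather outages per transformer for each group.

```python
from collections import defaultdict

# Hypothetical sample records; field names and values are made up.
transformers = [
    {"id": "T1", "expected_peak_kw": 50.0, "observed_peak_kw": 62.0, "reverse_flow": False},
    {"id": "T2", "expected_peak_kw": 50.0, "observed_peak_kw": 31.0, "reverse_flow": False},
    {"id": "T3", "expected_peak_kw": 25.0, "observed_peak_kw": 12.0, "reverse_flow": True},
    {"id": "T4", "expected_peak_kw": 75.0, "observed_peak_kw": 40.0, "reverse_flow": False},
]
outages = [
    {"transformer_id": "T1", "cause": "equipment"},
    {"transformer_id": "T1", "cause": "unknown"},
    {"transformer_id": "T3", "cause": "unknown"},
    {"transformer_id": "T4", "cause": "weather"},
]


def classify(transformer):
    """Assign a transformer to a demand group (assumed, simplified rules)."""
    if transformer["reverse_flow"]:
        return "reverse_flow"
    if transformer["observed_peak_kw"] > transformer["expected_peak_kw"]:
        return "over_expected"
    return "normal"


def outage_rate_by_group(transformers, outages, exclude_causes=("weather",)):
    """Return average outages per transformer for each demand group."""
    # Count outages per transformer, skipping causes we are not studying.
    counts = defaultdict(int)
    for outage in outages:
        if outage["cause"] not in exclude_causes:
            counts[outage["transformer_id"]] += 1

    # Accumulate [number of transformers, number of outages] per group.
    totals = defaultdict(lambda: [0, 0])
    for transformer in transformers:
        group = classify(transformer)
        totals[group][0] += 1
        totals[group][1] += counts[transformer["id"]]

    return {group: n_out / n_tr for group, (n_tr, n_out) in totals.items()}


rates = outage_rate_by_group(transformers, outages)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f} outages per transformer")
```

With real data, a gap between the groups would only be a starting point; it would still need the validation described above before anyone acts on it.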

As a data analyst, you have both the data and the knowledge of the information value represented in the data. You can add new value by not just responding to report requests, but also by going to visit your customers, taking the time to learn their operations, and asking about gaps not being addressed. Everyone has some issue where more information would help them do their job better, so think about your customers and look for places where combining or sharing data in different ways could reveal new information. This is not easy, and it will require you to think beyond the day-to-day job and look at the larger operational picture.

I would expect that your company, like mine, has similar types of information: supply and demand, vendors and customers. You need to talk with your data customers and listen for missed opportunities or gaps in your company’s understanding of why something is happening or not happening. Recognizing the gap in the current process is often hard, as users may have built workarounds to address the issue (or may not even see it as an issue), but this is an important step toward finding new information.

At a previous company, I once had a supply issue where the warehouse kept running out of a key material. Upon investigation, we found an error in the scrap-counting logic of the press operation that dramatically overcounted the scrap it generated. The production team did not see the excess scrap as an issue, but it drove the supply team nuts because the recorded inventory of the material kept going negative. Their workaround was to cycle-count the inventory weekly and adjust it up by the “found” inventory. This really hurt the productivity score for the plant and caused conflicts among the teams, including accounting. Once the issue was investigated and changes were made, the teams began to work together, and it is a much better plant today.

When you have identified the opportunity where a new perspective on data would help the decision process, identify your target data sources and, if you can, collect the data needed.

As you are using data in a new way, be sure to check that you are not collecting data that shouldn’t be collected. This could include customer information (credit card numbers, email addresses, etc.). You will not make friends if you suddenly create a new data security issue with personally identifiable information. The same is true for anything the company considers confidential (cost, profit, etc.). The reports you currently supply go to users already authorized to see them, but your new information may require new permissions.
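One lightweight way to run that check is to screen column names before anything is merged. The sketch below is only an illustration in Python: the list of sensitive names and the sample columns are hypothetical, and a real check would follow your company's own data classification rather than a hard-coded set.

```python
# Hypothetical set of column names treated as sensitive or confidential.
SENSITIVE = {"credit_card_number", "email", "cost", "profit"}


def split_sensitive(columns, sensitive=SENSITIVE):
    """Return (safe, flagged) lists of column names.

    Names are compared case-insensitively, ignoring surrounding spaces.
    """
    safe, flagged = [], []
    for name in columns:
        if name.strip().lower() in sensitive:
            flagged.append(name)
        else:
            safe.append(name)
    return safe, flagged


safe, flagged = split_sensitive(
    ["customer_id", "Email", "transformer_id", "monthly_kwh", "cost"]
)
print("Keep:", safe)
print("Review before using:", flagged)
```

A name-based screen will miss sensitive values hiding in free-text or oddly named fields, so treat it as a first pass, not a replacement for asking the data owners.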

Now merge and assimilate the data using tools you already have. To do my project, I used SQL Studio, Access, Excel, and, finally, PowerPoint to summarize the information in a way users could consume.

You may have to make several attempts as you work through your idea before the manipulated data gives you accurate information. Saving your data at each transformation can save you a lot of time, because you can go back to the data at any stage and take a new direction from there. Take the time to document your conversion steps once you finally get it right, because if you have found something valuable, it’s likely that people will ask for changes or ask you to use a different data set.
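If you script your transformations, saving each stage can be as simple as writing a numbered file after every step. Here is a minimal sketch in Python using only the standard library; the helper names `save_stage` and `load_stage`, the file naming scheme, and the sample rows are all assumptions made for illustration, and the same habit works just as well with Excel tabs or Access tables.

```python
import csv
import tempfile
from pathlib import Path


def save_stage(rows, folder, step, name):
    """Write a non-empty list of dicts to a numbered CSV file and return its path."""
    path = Path(folder) / f"{step:02d}_{name}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def load_stage(path):
    """Read a saved stage back as a list of dicts (all values come back as strings)."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))


# Hypothetical two-step cleanup; a temporary folder stands in for a project folder.
workdir = tempfile.mkdtemp()
raw = [
    {"transformer_id": "T1", "kwh": "120"},
    {"transformer_id": "T2", "kwh": ""},
]
stage1 = save_stage(raw, workdir, 1, "raw")

cleaned = [row for row in raw if row["kwh"] != ""]
stage2 = save_stage(cleaned, workdir, 2, "dropped_blank_usage")

# If step 2 turns out to be the wrong approach, restart from the saved stage 1.
restart = load_stage(stage1)
```

The step number and short description in each file name double as a rough record of the conversion steps, which makes the documentation easier to write later.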

Now that you have information, it is time to display it in a way that people can understand. You may be very familiar with your data and what it all means, but don’t assume your audience has the same level of knowledge or passion.

I always start my presentation on new information with a problem statement that explains what I was trying to understand or the customer issue being supported. Team up with the customer: Sharing your new information gives you credibility and builds a relationship that can pay dividends.

Once I have the problem established, I don’t bore the audience with how I got the answer; I show what we found and what it means. A graph or chart really helps here. You may be asked how you got to this point, but it happens less often than I would like.

Now that you have shown the answer and explained what your information tells the company, provide details on what the next steps could be. It is likely that there are factors in place that will make the options for actions more complicated than the data can show. The great thing is your new information can be used to monitor the impacts of any changes you all decide to make.

StickyMinds is a TechWell community.