If you're an analyst who uses data to build insights and perform analysis, this should come as no surprise: studies have found that 80% of an analyst's time is spent finding, massaging, aggregating, and preparing data for analysis and visualization. Recently, articles in the New York Times and Forbes highlighted this issue.
In both pieces, data preparation is described as 'the least enjoyable' part of the job and 'a major hurdle to insights'. I can relate from personal experience. In my time as an analyst, my partners (both internal and external) would ask, 'How long does it take to build a report?' Talk about a loaded question! Many variables have to be considered when building a report or performing analysis, but I found the majority of the time is spent ensuring the data is staged in a way that makes analysis feasible. I usually go through the following steps:
- Identify the data needed for analysis. Is it available? If not, how do we get it?
- Structure the data appropriately. Join data sources, aggregate, manipulate, etc.
- Fine-tune the data. Optimize the data extraction process.
- Automate. Use automation systems to take out the manual effort.
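As a concrete illustration, the structuring and aggregation steps above might look like the following sketch in Python with pandas. The table names, columns, and values here are hypothetical, purely to show the shape of the work:

```python
import pandas as pd

# Hypothetical source extracts: an orders table and a customer reference table
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "amount": [100.0, 50.0, 75.0, 20.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["East", "West", "East"],
})

# Structure the data: join the sources on a shared key
staged = orders.merge(customers, on="customer_id", how="left")

# Aggregate to the grain needed for analysis
summary = (
    staged.groupby("region", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_amount"})
)
print(summary)
```

Once a staging script like this exists, the 'automate' step is often just scheduling it, which is exactly where the manual effort disappears.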
Before we dive into ways to improve the data process, let's pause to discuss how analysts fall into the data quagmire. Humans are creatures of habit, and performing our jobs is no different. One reason people fall into the data abyss is that they perform tasks a certain way because 'it's the way it's always been done'. In my consulting experience, this answer came up time and time again when I asked clients why they go through the painstaking task of gathering data through highly manual efforts. The person performing the tasks learned from a manager or peer, who in turn performed the tasks as they had been taught. A fear of 'breaking' the process then sets in; understandably, no one wants to be responsible for downtime or reports being delivered late. This leads to another reason the situation persists: 'I didn't know there was another way!' The fact of the matter is, many analysts are so focused on the task at hand that they're unable to get their head above water to think of alternative solutions, let alone become acquainted with new technologies and tools that may help. There is also a comfort level in working with the processes and technologies currently in place. Now that we have some reasons why this behavior occurs, let's move on to the downstream impacts of this complacency.
If you're an analyst or a manager, you might be asking, 'Why should I change?' The effects of inefficient data collection and availability may not be apparent at the surface, but if you dig a little deeper you might find that you have one or more of the following problems:
- Multiple versions of the truth. Analyst A may pull data differently than Analyst B, resulting in conflicting analyses.
- Loss of credibility. You have one chance to make a first impression; it's best to get it right the first time.
- Time spent correcting errors. This is valuable time that could be better spent performing analysis.
- Unhappy employees. A common complaint of analysts is being a 'data janitor', endlessly cleaning and maintaining data.
Now you may be asking yourself, 'What's in it for me?' Having one version of the truth establishes credibility, accelerates troubleshooting, and gives analysts confidence that the analysis they're performing is accurate. With proactive error handling in place, downstream processes are protected and the time spent tracking down and correcting errors is greatly reduced. Reducing the mundane, redundant tasks of gathering, aggregating, and manipulating data results in more engaged employees. Most analysts want to spend more time on advanced analysis and critical thinking. Focusing on these tasks not only gives analysts confidence in their work but also ties directly to the company's bottom line: better analysis enables better business decisions, and better employee engagement reduces the time and dollar costs associated with turnover.
Staging data so that it is readily accessible, easily maintained, and flexible enough to meet your business's needs can be a challenging endeavor. That is why Data Illuminations is here to be your trusted partner through the process. The key benefits of working with Data Illuminations include our ability to:
- Standardize disparate data sources. We can use your data in nearly any format, enrich it with third-party data, and gather data from API services.
- Automate data processes. This includes proactive error handling and identifying data anomalies before they affect downstream processes.
- Establish one version of the truth. Certified, trustworthy data staged to be flexible to your analytic needs.
- Offer a variety of deployment methods. Our engine runs on your systems, on your networks; hosted options are available for a quick start.
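To make the idea of proactive error handling less abstract: it can be as simple as validating an extract before it feeds downstream reports, so problems are caught at the source rather than in a finished dashboard. A minimal sketch in Python (the field names and rules here are hypothetical, not part of any particular product):

```python
def validate_extract(rows, required_fields, min_rows=1):
    """Fail fast: flag an extract that is empty or missing required values."""
    errors = []
    if len(rows) < min_rows:
        errors.append(f"expected at least {min_rows} rows, got {len(rows)}")
    for i, row in enumerate(rows):
        # Collect any required fields that are absent or null in this row
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            errors.append(f"row {i}: missing {missing}")
    return errors

# A clean extract passes; a flawed one is flagged before downstream use
good = [{"customer_id": 1, "amount": 100.0}]
bad = [{"customer_id": None, "amount": 100.0}]
print(validate_extract(good, ["customer_id", "amount"]))
print(validate_extract(bad, ["customer_id", "amount"]))
```

Running a check like this at the start of an automated pipeline is what turns error handling from reactive firefighting into something proactive.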