Every second, data is being generated by around 4.5 billion people on this planet, who on average spend around 6.5 hours a day on the internet. With the advent of the IoT and other connected devices, even machines are generating data. Storing all of it has become extremely easy: terabytes of data now fit on a small hard disk. Organizations use this data to generate insights about their customers, improve business efficiency, and make better business decisions. These insights shape product development, targeted advertising, predictive maintenance, operational efficiency, fraud identification, and customer experience. Organizations are looking to embrace the power of data to drive genuine value for themselves.
All these use cases need different data to be analyzed and studied statistically. Not only that, every organization has different data needs. The data Netflix would analyze to deliver a better customer experience would be starkly different from the data a retail bank would consider for the same goal. Similarly, an elevator company looking for data points for predictive maintenance would consider starkly different data points from a luxury automobile manufacturer attempting a similar result. What is clear from these examples is that both the data to analyze and the utility that analysis can deliver vary widely. So how do you ensure that your data initiative delivers meaningful, powerful, and purposeful insights?
There is a plethora of use cases that may make sense in an organization, and they will vary across the lines of business (LOBs). It is usually prudent to start small and see the results that emerge. Driven by those results, the initiative can be extended to other use cases. This could be called keeping the tree as well as the forest in mind. The learnings are incorporated into the extensions, and each one benefits from the experience of previous implementations. The scale, scope, and projected impact grow organically, while stumbling blocks are identified and eliminated at each iteration. That’s a strategy for growing the data initiative meaningfully within the organization and generating tangible, measurable, and scalable outcomes.
But what about the data itself?
Once the initial use case is identified, it is imperative to identify the correct data set. For example, if you are an OTT media services provider and you want to see where users are dropping off, or what kind of content a user is most likely to view next, your data requirements would be different from those of an eCommerce organization that wants to draw insights about the behavior of its customers. One may want to know about the journey of the customer on the site, the other about the journey the customer has traversed before coming to the site. You would have to carefully define the specific data that will contribute to the insights you are looking for. Of course, technically speaking, different types of data would reside in different technological systems, and extracting them would require different tools, tactics, and skills.
Data initiative tools
This brings us to the tools. From an IT standpoint, data analytics consists of three major elements: extraction, management, and consumption of data. Each has its tools, both open-source and commercial off-the-shelf (COTS). Let’s say one wants to extract some data. Depending on the business requirement, they may want to source it in real time or in batches, and different tools are appropriate for each. Similarly, there are tools to help manage the data. These play a key role since much of this data resides in different systems. The data from these systems has to be pooled into a common data warehouse or data lake and scrubbed to become ready for analytics.
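To make the pooling and scrubbing step concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the two source systems (a CRM and a billing system), the field names, and the sample rows are hypothetical, and a real initiative would typically rely on dedicated extraction and warehouse tooling rather than a hand-written script.

```python
# Hypothetical rows from two separate source systems (all names are assumptions).
crm_rows = [
    {"customer_id": " C001 ", "email": "Ana@Example.com"},
    {"customer_id": "C002", "email": "bo@example.com"},
    {"customer_id": "C002", "email": "bo@example.com"},   # duplicate record
    {"customer_id": "", "email": "nobody@example.com"},   # missing key
]
billing_rows = [
    {"cust": "c001", "monthly_spend": "49.90"},
    {"cust": "C002", "monthly_spend": "n/a"},             # unparseable value
]


def scrub_id(raw):
    """Normalize a customer key so both systems agree on its format."""
    return raw.strip().upper()


def pool_and_scrub(crm, billing):
    """Pool both sources into one table keyed by the cleaned customer id."""
    pooled = {}
    for row in crm:
        cid = scrub_id(row["customer_id"])
        if not cid:
            continue  # drop rows that cannot be identified
        # Re-inserting under the same key collapses duplicate records.
        pooled[cid] = {
            "customer_id": cid,
            "email": row["email"].strip().lower(),
            "monthly_spend": None,
        }
    for row in billing:
        cid = scrub_id(row["cust"])
        if cid not in pooled:
            continue  # ignore billing rows with no matching customer
        try:
            pooled[cid]["monthly_spend"] = float(row["monthly_spend"])
        except ValueError:
            pass  # leave the value as None when it cannot be parsed
    return [pooled[key] for key in sorted(pooled)]


clean_rows = pool_and_scrub(crm_rows, billing_rows)
for row in clean_rows:
    print(row)
```

The design choice worth noting is that the sketch settles on one cleaned key format before merging, so that records from different systems line up, and it keeps unparseable values as explicit gaps instead of guessing at them.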
Once the data set is ready, it’s the turn of the software folks. Algorithms are coded in an appropriate language, such as Python or R, and statistical analysis is run on the data set. Once the data has been analyzed and insights generated, those insights need to be presented. The audience that consumes the analysis in the form of reports varies, and that determines how the insights are presented. Reporting and visualization tools drive the consumption of the data. The more accessible the reports, the more widely they can be shared. At this stage, it’s about supporting decision-making where those decisions could have the most impact.
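As a rough illustration of this analysis step, the sketch below uses only Python's standard library to summarize a cleaned data set and measure how strongly two variables move together. The numbers are made up, and the variable names (weekly usage hours and monthly spend) are assumptions chosen for the example, not data from any real organization.

```python
from statistics import mean, stdev

# Made-up observations for five hypothetical customers.
weekly_hours = [2.0, 4.0, 6.0, 8.0, 10.0]
monthly_spend = [10.0, 14.0, 19.0, 22.0, 30.0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# A small summary that a reporting or visualization tool could consume.
summary = {
    "mean_spend": mean(monthly_spend),
    "stdev_spend": stdev(monthly_spend),
    "hours_vs_spend_correlation": pearson(weekly_hours, monthly_spend),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A correlation close to 1 in such a summary would suggest that heavier users tend to spend more, which is the kind of insight that then has to be presented in a form its audience can act on.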
To bring all this together, organizations often reach out to a partner with the expertise to use the data, apply advanced analytics solutions, and then provide the relevant insights. The partner needs domain expertise, IT capability, and statistical knowledge to be able to create specialized solutions that can supercharge the organization’s business outcomes. The domain expert speaks the language of the company and acts as the bridge between the organization and the partner. The IT capability is necessary to drive the coding and tooling. The statistical knowledge is required for crunching the data.
That apart, becoming a data-driven organization is as much a cultural change as anything else. In such a scenario the committed involvement of the top management becomes pivotal in the grand scheme of things.
Lastly, knowing the why and having a clear goal in mind may well be the most important first step before the blocks start falling into place.
And those are the steps on the journey to creating a meaningful data initiative for your organization.