Roadmap for Implementing an Intelligent Data Lake

10.22.2021, Kyle Hirsch

An intelligent data lake is a central, structured, and integrated database from which you can draw insights through analysis and then make informed decisions. For marketers, it addresses the challenge of delivering a complete customer experience by answering several questions:

• How do I get a complete overview of my customer’s journey?

• How can I forecast and optimize my marketing results?

• How do I ensure personalized experiences across all touchpoints?

The intelligent data lake stores data from various sources over the long term. This enables companies to organize, analyze, and activate data centrally across marketing channels without running their own hardware or data center.

Getting Started: The Organizational Aspect

To implement an intelligent data lake successfully, the organizational aspect is just as important as the technical setup. Before starting technical implementation, organizations should create a roadmap that sets out the goal of the implementation and the approach to be adopted.

The Approach

To ensure successful implementation, organizations need to go through several phases. The first phase is to establish a clear scope for the project; the final phase is a pilot that demonstrates the value of the platform. The whole process can be broken down into five phases:

Scoping

The scoping phase, which consists of several joint sessions, determines what the team wants to achieve with the intelligent data lake. It’s important to state your business case clearly and to establish a plan that distinguishes between short-term, medium-term, and long-term projects. It’s also essential at this point to build broad support and involve people from different layers of the organization.

Data Setup

The data setup phase consists of identifying, auditing, and linking the data sources required to implement your identified use cases successfully. Your organization must have a clear picture of the available data, data flows, and the way in which data is organized. To manage this effectively, it is necessary to identify the different data silos and determine who needs to work with the data in the future. Once this is clear, you can determine who needs to be involved at this stage.
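As an illustration only, the audit described above could start as a simple inventory of sources. The sketch below is a minimal example under assumed names: the DataSource fields, the audit helper, and the sample sources and use cases are all hypothetical, not part of any specific platform. It reports which use cases have no data source yet, which sources lack a key that could link them to other data, and which teams own the silos and may need to be involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataSource:
    """One entry in a hypothetical data source inventory."""
    name: str
    owner: str                      # team that maintains the source (the silo)
    join_key: Optional[str]         # identifier that could link it to other sources
    use_cases: List[str] = field(default_factory=list)


def audit(sources: List[DataSource], required_use_cases: List[str]) -> Dict[str, List[str]]:
    """Summarize gaps in the inventory for the identified use cases."""
    covered = {uc for s in sources for uc in s.use_cases}
    return {
        # use cases that no source supports yet
        "missing_use_cases": sorted(set(required_use_cases) - covered),
        # sources with no identifier to link them to the rest of the data
        "unlinkable_sources": [s.name for s in sources if s.join_key is None],
        # teams that own a silo and may need to be involved
        "owners_to_involve": sorted({s.owner for s in sources}),
    }


# Made-up example inventory, for illustration only
sources = [
    DataSource("crm", "sales", "customer_id", ["customer_journey"]),
    DataSource("web_analytics", "marketing", "cookie_id",
               ["customer_journey", "personalization"]),
    DataSource("offline_sales", "finance", None, ["forecasting"]),
]
report = audit(sources, ["customer_journey", "forecasting", "personalization", "churn"])
print(report)
```

A real audit would cover far more than this (data quality, refresh frequency, consent), but even a small inventory like this makes the gaps and the people to involve explicit.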

Solution Design

Once the right data is centrally available across your organization, you can choose the way that new, incoming data is structured. This requires determining what the desired output is and what models are appropriate for the defined use cases.

Preparing the data and selecting the right models requires a close-knit team with the relevant knowledge. Defining the desired output also requires some knowledge of the platforms on which the data is activated. At this stage, it may be necessary to bring in external parties that have the right knowledge and experience. It is also important in this phase to create a schedule and delegate responsibilities.

Solution Development

From here, start preparing the data, then build and validate models. At this stage, a test-and-learn culture is key. The goal is not to build the perfect model in one go, but to test different options in a short period of time and converge on the best outcome. The focus here is technical refinement and implementation.
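A test-and-learn loop can be very small. The sketch below is one possible way to compare a few candidate models quickly, assuming a simple forecasting use case. The candidate functions, the backtest helper, and the weekly figures are hypothetical and chosen only for illustration. Each candidate predicts the next value from the history so far, and the one with the lowest mean absolute error on past data is selected.

```python
from statistics import mean


def last_value(history):
    """Predict that the next value equals the most recent one."""
    return history[-1]


def moving_average(history, window=3):
    """Predict the mean of the last `window` values."""
    return mean(history[-window:])


def overall_mean(history):
    """Predict the mean of everything seen so far."""
    return mean(history)


def backtest(model, series, start):
    """Mean absolute error of one-step-ahead predictions from index `start` on."""
    errors = [abs(model(series[:t]) - series[t]) for t in range(start, len(series))]
    return mean(errors)


# Hypothetical candidates; real projects would test richer models
CANDIDATES = {
    "last_value": last_value,
    "moving_average": moving_average,
    "overall_mean": overall_mean,
}

# Made-up weekly conversion counts, for illustration only
weekly_conversions = [100, 110, 120, 130, 140, 150, 160, 170]

scores = {name: backtest(fn, weekly_conversions, start=3)
          for name, fn in CANDIDATES.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The point of a loop like this is speed: adding another candidate is one more entry in the dictionary, so several options can be compared on the same data in a single run.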

Pilot and Implementation

Once the infrastructure is ready, the first tests can begin. In technical projects such as a data lake, many organizations adopt an approach that puts the technology first. You can build a great platform this way; however, its actual value depends on how thoroughly and thoughtfully it is used.

At this stage, it’s important to keep the time to value short. Start with targeted use cases, and do not try to solve every data challenge immediately. As you roll out your data lake further, you can fix problems as they arise.
