OK. You’ve convinced management to support a Data Products initiative. Maybe you want to do AI and not be one of the 95% that fail or underperform. Maybe you have an existing analytical environment that’s like the house at the end of the street that hasn’t been lived in for several years and whose yard has become overgrown and weedy. Maybe your analytics environment just isn’t as efficient or as effective as you’d like it to be.
It doesn’t matter why. Management support gives you a very small window of time to make progress and to demonstrate business value. It is imperative that you take advantage quickly. Otherwise, when you’re asked what you’ve been able to contribute to the bottom line through your efforts and you say that your team drew a really great conceptual data model, that window will slam shut.
Don’t get me wrong, a Conceptual Data Model is a great and useful thing. I love them. But when those who aren’t completely sold on the whole Data Product concept take a chance and provide support for your effort, you need to give them something to validate their confidence in you.
You need to think about quick wins. It’s trite, I know.
Trite, but true.
But where to find a quick win with business impact? Start at the end.
Ask your business users or management about their pain points, and offer to make their life better.
Everybody complains about something. Where is the absence of quality data or well understood data causing headaches? Those could probably be fixed with Data Products. Two benefits of approaching the question from this direction:
- First, you will make a meaningful impact, directly solving a business problem, making someone’s life better, and generating quantifiable business benefit. The line from your efforts to that benefit is straight and clear.
- Second, when you are successful, you will have created another advocate for your Data Product approach.
Once you select the pain point, the next step is to identify the root cause(s) of the problem.
They may or may not be obvious. Is the data inconsistent? Does this report not balance with other reports? Does it take a long time to gather the data? Is the summary hard to maintain? Are users unable to get support when something goes wrong, or to get questions answered when something is unclear? You may discover problems in process, ownership, data quality, support, understanding, etc. It could be any one (or more) of a million things.
Break it down. Attack one piece at a time.
This is particularly true if you’re targeting a Composed Data Product and the Foundational Data Products from which it is generated. And I highly recommend this approach. Especially if you’re just beginning your Data Product journey and are looking to prioritize domains. It will point you in the right direction.
You could build Foundational Data Products simply because they need to be built. That’s one way to do it, but it’s risky. The business benefit of delivering some Foundational Data Products may be apparent to you, and you are no doubt correct. Rationalizing master data is tremendously beneficial, but the benefit may not be apparent to the users and to management. And they are the ones that have to be convinced.
By starting with a Composed Data Product, you know that those constituent Foundational Data Products you choose will directly support business benefit.
Collect Data Product information as you’re searching for the root cause, but you don’t have to gather everything all at once.
Yes, you need ownership details, process details, data details, security details, technical details, and more. It’s a big fill-in-the-blank exercise where along the way you’ll discover that understanding isn’t complete, processes aren’t defined, and responsibilities aren’t clear.
But deploying Data Products is equal parts research exercise, culture shaping, and technical implementation. It’s also equal parts art and science, improvisation and algorithm.
A good place to start is with provenance and ownership. Who is responsible for the data, summary, or report? Who is the decision-maker for its content? Those folks probably have (or know who has) a lot of the other information you need. Where does the data come from? How does it get to where it’s going?
If there’s a problem with Data Quality, gather that information. Who defines the data quality requirements? What are the expected content, definitions, actual content, existing quality measurement processes, and quality expectations?
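If it helps to see the fill-in-the-blank exercise written down, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `DataProductNotes` name, the field list, and the example values are hypothetical, not a standard Data Product schema. The idea is only that you record what you know and let the empty fields tell you which questions to ask next.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataProductNotes:
    """Hypothetical working notes for one Data Product (illustrative fields only)."""
    name: str
    pain_point: str
    owner: Optional[str] = None                       # who is responsible for the data or report
    decision_maker: Optional[str] = None              # who decides on its content
    source_systems: Optional[str] = None              # where the data comes from
    delivery_path: Optional[str] = None               # how it gets to where it is going
    quality_requirements_owner: Optional[str] = None  # who defines the quality requirements
    quality_expectations: Optional[str] = None        # what "good" is expected to look like


def open_blanks(notes: DataProductNotes) -> list:
    """Return the names of the fields that have not been filled in yet."""
    return [f.name for f in fields(notes) if getattr(notes, f.name) is None]


# Example values are made up for the sketch.
notes = DataProductNotes(
    name="Monthly Sales Summary",
    pain_point="Does not balance with finance reports",
    owner="Sales Operations",
)
remaining = open_blanks(notes)
print(remaining)
```

The blanks that remain are your next conversations. You don’t have to close them all before you start delivering.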
The project can go in so many different directions.
Let the problem lead you.
It will tell you what needs to be done next. Let it. I know that sounds weird, but it’s true. Find a rough spot. Keep picking at it until you know what needs to be done to smooth it out. Find the next one. Rinse and repeat.
As you’re filling in all the blanks, it may be useful to use a metadata repository, Data Catalog, or Data Contract repository. It is worthwhile to have someplace to organize and store the data; however:
Do not wait until you have your metadata repository set up to start collecting Data Product information.
Don’t waste the precious little time when you have management support doing something like this. Gather the information and put it somewhere you can find it again and keep moving. Besides, you’re only going to have a couple of Data Products right now. A spreadsheet or document or workbook or anything will do. It doesn’t take long to search through a small number of them.
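To show how little tooling “anything will do” really requires, here is a rough sketch that keeps the notes in a plain CSV file and searches them with Python’s standard `csv` module. The column names, the file name, and the example rows are all assumptions made up for illustration. Use whatever columns your problem leads you to.

```python
import csv
import os
import tempfile

# Hypothetical columns; adjust to whatever you are actually collecting.
FIELDS = ["name", "owner", "pain_point", "source"]


def save_notes(path, rows):
    """Write the Data Product notes to a CSV file any spreadsheet tool can open."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def find_notes(path, term):
    """Return every row in which any column contains the search term (case-insensitive)."""
    term = term.lower()
    with open(path, newline="", encoding="utf-8") as f:
        return [
            row for row in csv.DictReader(f)
            if any(term in (value or "").lower() for value in row.values())
        ]


# Example rows are invented for the sketch.
rows = [
    {"name": "Monthly Sales Summary", "owner": "Sales Operations",
     "pain_point": "Does not balance with finance reports", "source": "order system"},
    {"name": "Customer Master", "owner": "",
     "pain_point": "Duplicate customers", "source": "CRM"},
]

path = os.path.join(tempfile.mkdtemp(), "data_products.csv")
save_notes(path, rows)
matches = find_notes(path, "finance")
print([m["name"] for m in matches])
```

With only a handful of Data Products, a linear search like this is more than enough.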
Someday, though, your efforts will gain momentum. You’ll finish two or three or four or more Data Products. At that point getting the information into a properly organized metadata repository will be driven by the business need to be able to easily find your Data Products and information about them. The benefit will become self-evident. The project will follow.
The information you’re collecting is more important than where you’re putting it, but information stored in a repository is not the deliverable.
I can’t stress that enough. Gathering Data Product information is a means to a business benefit end. Again, neither management nor the business will care that you have collected a whole bunch of metadata into a repository. Well, they’ll nod approvingly, then toss your effort onto the heap of past disappointments, wondering when they’ll see some business benefit themselves.
You might not even want to mention loading metadata into the repository as part of the executive updates. You might proudly say, “Last week we had 40,000 objects loaded. This week we have 50,000!” To which management will reply (or worse, think), “OK. That helps me how?” Focus on the business problem being solved, how it is being solved, and progress toward the solution. The real deliverable is business pain relief.
The implementation of the Data Product is not in the repository.
This is a corollary to the previous point. You need to have collected the information, but the Data Product has to be actionable. Development, maintenance, data quality collection and analysis, and support processes need to be in place. Those don’t live in the repository. They live in the standard operating procedures of the participating groups.
And they require doing.
No more “you can do anything you want as long as you don’t impact anybody else.” You are going to impact somebody else. You’re not going to have a Data Product if you don’t impact somebody else. So, by all means, don’t impact anybody else. Just don’t expect to have a Data Product.
Management support at the beginning of a Data Product initiative is very valuable. Take maximum advantage of it. Solve a business problem. Maintain a direct line of sight between your Data Product efforts and solving that business problem.
Do that with the first one.
That’s how you’ll get a second.