I’ve covered Data Product implementation requirements at length in several past articles. You need Information, Infrastructure, and Process. Without any portion of any of those you don’t have a Data Product. No metadata? Not a Data Product. No lifecycle management? Not a Data Product. And so on.
Data Product development encompasses all the activities that should be a standard part of information asset deployment.
Organizations that were diligent and disciplined in their development processes may discover that they already have or are already close to having a collection of Data Products. For most of us, though, probably not.
So, which requirements did you not implement? Chances are you did pretty well with the Infrastructure part. After all, you need to have someplace to put the data, a way to access it, and processes to provide ongoing care and feeding of it. Obviously, the data part is covered as well.
But what about metadata? What about lifecycle management? What about data content support? If your organization is like most, those are lacking. Why? Again, if your organization is like most, it was because those activities weren’t prioritized at the time. You got the data. You made it available to your users. Everybody was happy. There was no time, and there were no people, to spend on this peripheral stuff. Everyone had more important things to do.
The gap between information assets and Data Products is Data Debt.
One aspect of Data Products that I haven’t discussed yet is their usefulness when considering Data Debt. I describe Data Debt as the accumulation of unresolved data-related issues and unfinished information management activities. It is made up of the shortcuts taken in the interest of expedience that later lead directly to increased delivery times, defects, costs, and security vulnerabilities. And to not having Data Products.
Like financial debt, you end up having to repay it one way or another. Principal and interest. Development projects take longer to complete. Users make bad decisions based on incorrect assumptions. AI and ML models behave unexpectedly. The costs continue to add up.
To be clear, experiencing the negative consequences of accruing Data Debt does nothing to repay it.
I guess you could consider it like interest-only payments. You pay forever but the debt never diminishes. Or perhaps it’s like not being able to take out a car loan or having to pay a higher interest rate because your credit score is low. It is the ongoing cost of your Data Debt.
Data Products provide a framework for actually paying down your Data Debt.
It doesn’t matter whether you come at this exercise from the perspective of deploying Data Products or reducing Data Debt. The approach, endpoints, and benefits are all the same.
Illuminate Existing Data Debt:
It is possible that organizational leadership knew that it was incurring Data Debt when project prioritization decisions were made. Over time, though, even the fact that such decisions were made is forgotten. The starting point is to complete a gap analysis between your current in-scope information assets and your future state Data Products. For each asset, the analysis lists the Data Product activities that were never completed. The failure to complete those activities when the asset was implemented in the first place is the source of the Data Debt.
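A gap analysis like this can be sketched as a simple set difference. The snippet below is only an illustration: the checklist in `REQUIRED_COMPONENTS` is a hypothetical one assembled from the examples mentioned in this article, and the asset names are made up. Your own Data Product definition will have its own list.

```python
# Hypothetical checklist of what a Data Product needs. The names are an
# assumption drawn from the examples in this article, not a definitive list.
REQUIRED_COMPONENTS = {
    "data", "storage", "access", "operations",
    "metadata", "lifecycle_management", "data_content_support",
}


def gap_analysis(assets):
    """Return, for each asset, the sorted list of required components it lacks."""
    return {
        name: sorted(REQUIRED_COMPONENTS - set(implemented))
        for name, implemented in assets.items()
    }


# Made-up inventory: what each existing information asset already has.
assets = {
    "sales_mart": {"data", "storage", "access", "operations"},
    "customer_master": {"data", "storage", "access", "operations", "metadata"},
}

gaps = gap_analysis(assets)
for name, missing in gaps.items():
    print(name, "is missing:", missing)
```

Whatever shows up in the "missing" list for an asset is that asset's share of the Data Debt.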
Quantify the Data Debt:
The next step is to quantify the work required to complete the activities identified in the gap analysis. In theory this will be the amount of work that was deferred in service of some aspect of the original delivery. Maybe resources. Maybe prioritization or discipline. Probably a date. In reality, it is almost always costlier to go back and remediate than to have taken care of it at the time. Maybe that’s a better analogue for interest on the Data Debt. When you put together the project plan, you’ll have the time, resources, and cost. This is the Data Debt principal that you’ll be repaying.
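As a rough sketch of that arithmetic, the snippet below adds up effort estimates for the missing activities and converts them to a cost. Everything in it is an assumption for illustration: the per-activity estimates, the daily rate, and especially `remediation_factor`, which is a hypothetical multiplier standing in for the observation that going back is costlier than doing the work at the time.

```python
from dataclasses import dataclass


@dataclass
class DebtEstimate:
    asset: str
    person_days: float
    cost: float


def quantify_debt(asset, missing, effort_days, daily_rate, remediation_factor=1.0):
    """Estimate the Data Debt principal for one asset.

    effort_days maps each activity to the effort it would have taken
    originally; remediation_factor (hypothetical) scales that up to reflect
    the extra cost of remediating after the fact.
    """
    days = sum(effort_days[activity] for activity in missing) * remediation_factor
    return DebtEstimate(asset=asset, person_days=days, cost=days * daily_rate)


# Illustrative numbers only; substitute figures from your own project plan.
effort_days = {"metadata": 10, "lifecycle_management": 15, "data_content_support": 5}
missing = ["metadata", "lifecycle_management", "data_content_support"]

estimate = quantify_debt(
    "sales_mart", missing, effort_days, daily_rate=800, remediation_factor=1.5
)
print(estimate)
```

With a factor of 1.0 you get the work that was originally deferred; the difference between that and the remediated figure is one way to picture the interest.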
Repay the Data Debt:
Completing the transformation of that information asset into a Data Product is repaying Data Debt. You’re reducing the amount of Data Debt accrued to your organization. If your company treats information as an asset from a financial perspective, maybe you can treat the cost of these activities as debt service for tax or balance sheet purposes. (Disclaimer: These statements are for suggestive and illustrative purposes only. They should not be considered financial or tax advice. Always consult with your Finance or Accounting Department or certified professionals.)
Avoid Data Debt:
Finally, now that you’ve identified, quantified, and experienced what it takes to repay Data Debt, it should be clear that the best approach is to not incur it in the first place. That means deploying new information assets as Data Products. This will ensure that you’re not skipping key activities and accruing Data Debt again. Don’t take shortcuts. Don’t get yourself into more Data Debt. You don’t want to have to repeat this exercise.
Now you know the effort required to make Data Products out of your information assets. Now you know the amount of Data Debt your organization has accrued through the years. Are you shocked by the number? I wouldn’t be surprised.
Don’t let the magnitude of your Data Debt discourage you; you don’t have to repay it all at once.
Consider Dave Ramsey’s “Debt Snowball” approach: find a small debt that you can repay relatively quickly and easily. Start with that one. That gives you experience with the process that can be applied to the next one. And the next one. And the next one. As long as you’re always making progress (and not taking shortcuts elsewhere) you’ll always be reducing your Data Debt.
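Applied to Data Debt, the snowball ordering might look like the sketch below: sort the assets by estimated remediation cost, smallest first, and track how much debt remains after each one is repaid. The asset names and costs are invented for the example, and using cost as the only ordering criterion is a simplifying assumption.

```python
def snowball_plan(debts):
    """Order assets from smallest to largest debt.

    Returns a list of (asset, cost, remaining_debt_after_repaying_it) tuples.
    """
    remaining = sum(debts.values())
    plan = []
    for asset, cost in sorted(debts.items(), key=lambda item: item[1]):
        remaining -= cost
        plan.append((asset, cost, remaining))
    return plan


# Made-up remediation costs per information asset.
debts = {"sales_mart": 36000, "customer_master": 16000, "web_logs": 4000}

plan = snowball_plan(debts)
for asset, cost, remaining in plan:
    print(f"Repay {asset} ({cost}); remaining Data Debt: {remaining}")
```

The smallest item goes first so that the team gets through the full process once, quickly, before taking on the larger ones.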