I hate paying more than I have to for something. And I really hate paying interest, whether it’s on a credit card … or a loan … or Data Debt. 

Data Debt is the accumulation of unresolved data-related issues and unfinished information management activities.

Think shortcuts taken in the interest of expedience. Shortcuts that directly lead to increased delivery times, defects, costs, and security vulnerabilities.

And it happens everywhere. Large examples and small. Here’s one where I’ve been guilty. How about you? I pull together some information into a spreadsheet, run some analysis, save the file, and quit. A couple months later, I need to run the same analysis and remember that spreadsheet. I open it. And I’m confronted with a wall of unlabeled numbers. I think they came from the transaction table joined with one of the product tables. The not very helpfully named Sheet2 has fewer numbers on it. A summary as I recall.

How long would it have taken to add column headers? How long would it have taken to copy and paste the query I used to retrieve the data? How long would it have taken to write a couple sentences describing the process? I can think of more than one instance where I could have saved myself a day of reverse engineering and rework if I had only invested two more minutes of effort at the time.

Data Debt sprouts from the root of most debt: capitulation to immediate pressures (or desires) by compromising what we know should be done.

We almost always know what should be done. We almost always recognize it when it’s happening. We almost always know the right thing to do and we still don’t do it. And then we have the nerve to complain that, for example, Data Scientists spend up to 80% of their time finding, understanding, and preparing the data to spend 20% of their time actually analyzing it. 

But it’s always the same: not enough time or people or funding, or the right data modeler isn’t available, or why do we need to waste time on this information management stuff, or whatever. 

We console ourselves with the promise that we’ll fix it later. 

The later always happens. The fix it, not so much. 

It’s an admission that we’re more interested in doing something now than doing something right, and that we’re willing to live with the result.

It seems odd, though, that we so often make the decision to accept the consequences of accruing Data Debt without even considering or quantifying that future effort. We are willing to live with the consequences, whatever they are. Usually that means expending disproportionately more effort or expense to correct whatever it is we didn’t feel like doing at the time.

It’s challenging to get specific examples of data debt because most companies either don’t realize that they’re accruing it, or they don’t want to air out their dirty laundry.

But a couple months ago, HFS Research in partnership with Syniti published a white paper entitled “Don’t drown in data debt; champion your data first culture.” They surveyed more than 300 leaders across the Global 2000 to learn what enterprises can do to improve data management and realize their strategic ambitions. It’s a very interesting report and I highly recommend reading it.

The overwhelming majority of respondents recognize that effective data management has a significant impact on their business and drives value, but fewer than a third believe that their enterprise data can satisfy their business objectives. The survey found that data quality deficiencies rendered up to 40% of respondents’ data unusable, impacting nearly every organizational success metric by 25-35% (e.g., net promoter scores, decision-making processes, employee productivity, compliance costs, budget, and revenue).

The report’s bottom-line observation:

Enterprises are drowning in data debt.

Yet fewer than 40% even quantify the impact of bad data on their operations. To most companies, data debt is invisible. They say you can’t manage (or improve) what you don’t measure. And you really can’t manage something whose existence you don’t acknowledge.

Which negligence is worse: recognizing that you’re creating data debt and not doing anything about it, or not recognizing it in the first place?

The Chief Digital and Technology Officer at one of the world’s largest cruise companies said,

“Our company has grown dramatically faster in the last 10 to 15 years, and we have been pretty much focused on acquiring market share and building capacity. We didn’t really pay the right level attention to the data architecture and to the data framework to be in a more competitive position for the future.”

Near-term pressures conflicting with long-term objectives. Sound familiar?

So, what can be done? We all know what needs to be done. We need to decide to eat sensibly, exercise, and lose weight before the heart attack. 

Not after.

Management must first recognize the impact of Data Debt on the enterprise, and hold their teams accountable for the Data Debt that they create, even if it cannot yet be quantified.

Many companies have recently started recognizing the analogous concept of Technical Debt, quantifying it, and incorporating it into their decision-making processes and project plans. The same is now required for Data Debt. Management has a critical role in establishing and supporting a data-first culture and enterprise data strategy. And it’s at least as important for business management as it is for IT management.

Furthermore, to quote from the HFS paper, management must “give data management the respect and talent it deserves. [It] is a critical function that demands a unique blend of technical, business, and industry expertise to be successful.” Despite this, a shortage of specialized talent was consistently reported as a top challenge. Becoming a data-focused enterprise will require cultural, conceptual, and organizational reorientation.

Each of us can play an active role in reducing the accrual of Data Debt by taking responsibility for our own data.

The necessary, prerequisite first step is to understand your data. Monitor its quality. Identify the root causes of bad data and measure its impact. Recognize when Data Debt is happening and capture the details somewhere. Quantify the perceived immediate benefit and the observed future costs. 
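As one small illustration of what “monitor its quality” can look like in practice, here is a minimal sketch in Python that measures how often each column in a data set is missing or blank. The function name `missing_rates`, the column names, and the sample rows are all hypothetical, and missing values are only one of many quality dimensions you might track.

```python
def missing_rates(rows, columns):
    """Return, for each column, the fraction of rows where it is missing or blank."""
    if not rows:
        return {col: 0.0 for col in columns}
    rates = {}
    for col in columns:
        # Treat an absent key, None, and an empty string as "missing".
        missing = sum(1 for row in rows if row.get(col) in (None, ""))
        rates[col] = missing / len(rows)
    return rates


# Hypothetical sample: four customer records with some gaps.
sample_rows = [
    {"customer_id": "C1", "email": "a@example.com", "region": "East"},
    {"customer_id": "C2", "email": "", "region": "West"},
    {"customer_id": "C3", "email": None, "region": ""},
    {"customer_id": "C4", "email": "d@example.com", "region": "East"},
]

rates = missing_rates(sample_rows, ["customer_id", "email", "region"])
for column, rate in rates.items():
    print(f"{column}: {rate:.0%} missing")
```

Running something like this on a schedule and keeping the results gives you a trend line, which is usually more persuasive than a single snapshot.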

I’ve talked before about the misalignment between the Data Product producers who make the investments and the Data Product consumers who benefit from them. The same misalignment exists with Data Debt except it works in the opposite direction. In most companies, the teams creating the Data Debt experience no consequences while the downstream consumers pay the price. I got my project done without doing any of this other stuff. Celebratory chicken lunches all around. 

This makes identifying and linking observed future costs a little more difficult. Actually, the identifying is easy and everybody can do it. Did you have to spend extra time finding a data set because the tables weren’t defined? Observed future cost. Did you have to retrain your models because a field didn’t contain what you thought it contained? Observed future cost. Did you have to call the developer to find out the details of a particular calculation because you couldn’t figure it out just from the numbers? Observed future cost. Accumulate those into the Data Debt incurred by the development team. Maybe when the number gets big enough, management will stop excusing these challenges as part of the normal cost of running analyses.
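Accumulating those observed future costs does not need special tooling. The following is a minimal sketch, assuming each incident is logged with the team that produced the data and a rough estimate of hours lost. The `ObservedCost` record, the `debt_by_team` function, the team names, and the hour figures are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservedCost:
    producing_team: str  # team whose shortcut created the debt (hypothetical names)
    consumer: str        # team that paid the price downstream
    description: str
    hours_lost: float    # rough estimate; precision matters less than capturing it


def debt_by_team(costs):
    """Sum the hours lost downstream, grouped by the team that produced the data."""
    totals = defaultdict(float)
    for cost in costs:
        totals[cost.producing_team] += cost.hours_lost
    return dict(totals)


# Hypothetical ledger entries mirroring the examples above.
ledger = [
    ObservedCost("orders-dev", "analytics",
                 "Tables not defined; extra time finding the data set", 4.0),
    ObservedCost("orders-dev", "data-science",
                 "Field did not contain what we thought; models retrained", 16.0),
    ObservedCost("pricing-dev", "finance",
                 "Called the developer to explain a calculation", 1.5),
]

totals = debt_by_team(ledger)
for team, hours in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{team}: {hours:.1f} hours of observed future cost")
```

Hours are only one possible unit. The point is that the cost gets attributed to where the debt was created, not where it was paid.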

I’ll conclude with the story of two friends at two different companies that were both acquisition targets. Each of the four companies had its own analytics environment. In one case, the data warehouses remained separate. Getting a complete view of customers and their interactions with the combined company required accessing data in two different locations on two different platforms. Eventually the data was consolidated into the same database, but remained in separate tables. The decision not to integrate up front impacted every user every time they accessed the data. This is a very, very common scenario.

Contrast that with the other instance, where the purchased company was given three months to integrate their analytical and operational systems into the parent company’s. Yes, it was a tough quarter. But when they finished, everything was rationalized and consolidated, and everybody could move forward together generating business value. Despite its enormous benefit to the business, this approach requires much more up-front effort and discipline, and remains the exception.

Just like you and your first credit card in college, you’ll find that your cost, in terms of time and effort, to avoid going into debt will be far less than your cost to get out of it.