This is the first in a series of articles that explores the question of why we continue to see overwhelming numbers of analytics, artificial intelligence, machine learning, information management, and data warehouse project failures despite the equally overwhelming availability of resources, references, processes, SMEs, and tools…and what can be done about it.

 

Data is back in the corporate limelight. Again. Seems we’ve been here before. In years past it’s been data warehousing, metadata, big data, and advanced analytics. “Data Driven” has been the “new” buzzword for more than a quarter-century. Now it’s artificial intelligence and machine learning. Management is recognizing that Data Quality is required to produce quality AI and ML models. For data professionals, this is another opportunity to leverage executive attention to drive Information Management progress.

So, we dutifully revisit our Data Governance and Data Quality plans. We get some new books and read some new articles. We package up a comprehensive step-by-step method, get head nods, and start to implement. But problems emerge almost immediately. It’s too hard. It takes too long. We don’t have the resources. So, we ask the boss to help clear the path. And instead of making progress, we get thanks for the good work and are told that right now might not be the right time. The plans return to the shelf and the project team is dispersed. Again. We know the benefits that the effort could have. Again. We know the issues that could be resolved. Again. We know that development could be accelerated and errors reduced, if only…

Again.

This lack of progress is certainly not the result of a lack of knowledge or resources. So many experts, instructors, mentors, and practitioners are willing to share. We have professional organizations, references, vendors, software products, process templates, subject matter experts, consultancies, dozens of books, thousands of articles and white papers, and innumerable PowerPoint presentations and strategic plans. The technology is getting better. The processes are getting better. AI is being applied.

We know what to do and we know how to do it. Most everybody understands that it’s important and recognizes the value. You would think that Information Management would be thriving everywhere. Yet, that’s not the case. We’re still fighting the same battles and we’re still making the same arguments twenty-five years later. And we’re still seeing the same failure rates.

Why?

Before we can sustainably realize the benefits of Information Management, we must first have a basic understanding of the data. And the most basic understanding requires that we know two things:

     1. What the data element means.
     2. The values that it’s supposed to contain.

In other words, its definition and its expected content. Without those, you can’t do anything else, or at least not easily, sustainably, or at scale.

Too often we bypass Data Understanding and jump directly to Data Quality or Data Security or Data Analysis. But Data Quality requires a standard against which to measure variance in the actual data content. Data Security and Data Privacy processes assume that enough is known about the data to assess risk. Artificial intelligence and machine learning are two of today’s most exciting and potentially impactful technologies, yet both take for granted the existence of a foundation of Data Understanding that is often not being built. And without that foundation these efforts are likely to fail. Models trained using misunderstood data will yield unexpected, potentially misleading, and likely incorrect results.
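To make the dependency concrete, here is a minimal sketch in Python of a data element recorded with its definition and expected content, and of a quality check that uses that record as its standard. Everything in it is hypothetical and chosen only for illustration: the `DataElement` class, the `measure_variance` function, the `order_status` element, and its values. It is not drawn from any particular tool or method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical record of the two things we must know about a data element."""
    name: str
    definition: str             # what the data element means
    expected_values: frozenset  # the values it is supposed to contain


def measure_variance(element, observed):
    """Count observed values that fall outside the element's expected content."""
    unexpected = Counter(v for v in observed if v not in element.expected_values)
    total = len(observed)
    rate = sum(unexpected.values()) / total if total else 0.0
    return {"unexpected": dict(unexpected), "unexpected_rate": rate}


# Illustrative element and sample data (assumptions, not real data).
order_status = DataElement(
    name="order_status",
    definition="Current fulfillment state of a customer order.",
    expected_values=frozenset({"OPEN", "SHIPPED", "CANCELLED"}),
)

observed = ["OPEN", "SHIPPED", "open", "CANCELLED", "N/A", "SHIPPED", "N/A", "OPEN"]
report = measure_variance(order_status, observed)
print(report)
```

Without the `expected_values` entry, the check has nothing to compare against: “open” and “N/A” are just more values in the column. The quality measurement only becomes possible once the understanding has been written down.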

In short, the benefits of Information Management lie at the far side of a chasm that many organizations have yet to cross. The results can be seen in the high failure rates of data warehouse, analytics, and AI/ML projects. It is a spectacularly poor track record that is too easily accepted as “normal.”

I do not believe that we must accept that.

Data Understanding is the key to success. It’s the fuel that accelerates application delivery, business and operational analytics, and AI/ML model development. It enables faster responses to changing market conditions. It facilitates communication between development teams and business units.

A company’s most valuable asset is its understanding of its data.

Future installments will explore why implementing and widely adopting Information Management remains so challenging despite the comprehensive foundational work and ample reference resources. Several barriers to Data Understanding will be examined, along with recommendations for building bridges to cross the Data Chasm.