You had blood drawn in the doctor’s office as part of your annual physical and just received a printout with a bunch of numbers. Cholesterol 142. Platelets 203. Glucose 93. And so on. And you probably believe them. Why? Do you even think about it? Consider all of the assumptions you’re making:
- The blood that was tested was actually yours and not somebody else’s.
- The sample was handled properly on the way to the lab.
- The tests measured the target factor correctly.
- The technician ran the tests and interpreted the results properly.
- The results recorded were from your blood and not somebody else’s.
Maybe you believe the results because the doctor said so. After all, doctors generally know which tests are more or less reliable. Maybe the results are consistent with those you’ve received in the past. Maybe the result using one test is similar to the result using a different test.
Maybe it helps to know that the doctor’s office and laboratory have sample labeling and handling procedures, test standards, oversight, and training programs all designed to minimize errors. Maybe it helps to know that the doctor’s office and laboratory are subject to regulatory, accreditation, supplier, and internal quality management. For most of these, a trusted authority sets standards, certifies, and audits. The consequences for failure can be severe.
When you can’t validate the data yourself, you have to rely on the assurances of others and on mechanisms established to promote trust.
Now, instead of blood tests and your physical health, let’s talk about data and your company’s health.
Why do you trust your company’s financial reports?
Probably for many of the same reasons, not the least of which is that the consequences for inaccurate financial disclosures can be severe, especially since the passage of the Sarbanes-Oxley Act in 2002. Standard accounting practices, closing procedures, reconciliation, transaction monitoring, audit teams, analysts embedded in each organization, and a wide array of controls have all been put in place to ensure the accuracy and reliability of the data and the reports.
In other words, an expansive Data Quality effort.
See, companies do care about their data. When the stakes are high enough, resources will be allocated to Data Quality. In a large corporation, dozens or even hundreds of analysts are dedicated either directly or indirectly to Data Quality. The result is financial data that is at least trustworthy enough to report to the government and to the street.
Why do you trust your company’s sales or marketing or inventory or production or operational reports?
Hopefully it’s because your company has a robust Data Governance practice that ensures and enforces Data Quality, allowing you to use your data with confidence. Maybe you’re counting on those who generate the reports to do some Data Validation, balancing figures across multiple sources and tracking trends.
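To make "balancing figures across multiple sources" concrete, here is a minimal sketch of that kind of check. The source names, the monthly totals, and the 1% tolerance are all made up for illustration; a real reconciliation would pull from your own systems and use whatever tolerance your business considers acceptable.

```python
from math import isclose


def reconcile(totals_a, totals_b, rel_tol=0.01):
    """Return the periods whose totals disagree, or that one source is missing.

    totals_a and totals_b map a period label to a total.
    rel_tol is the relative difference we are willing to ignore (assumed 1%).
    """
    mismatches = {}
    for period in sorted(set(totals_a) | set(totals_b)):
        a = totals_a.get(period)
        b = totals_b.get(period)
        # A period missing from either source is itself a finding.
        if a is None or b is None or not isclose(a, b, rel_tol=rel_tol):
            mismatches[period] = (a, b)
    return mismatches


# Hypothetical monthly sales totals from two systems (illustrative numbers only).
orders_system = {"2024-01": 1000.0, "2024-02": 1200.0, "2024-03": 900.0}
finance_ledger = {"2024-01": 1000.0, "2024-02": 1150.0}

for period, (a, b) in reconcile(orders_system, finance_ledger).items():
    print(f"{period}: orders={a} ledger={b}")
```

A check like this only tells you that two sources disagree, not which one is right. Resolving that still takes the kind of Data Understanding discussed here.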
But maybe you just don’t worry about it. After all, the consequences for low-quality data in these scenarios don’t seem as severe. An incorrect business decision here. A faulty AI/ML model there. Development delays and months of bug fixes in applications that made different assumptions about an undocumented data set. Not great, but nobody’s going to jail. Furthermore, the consequences are rarely attributed to the data itself and certainly not back to the root cause of poor Data Understanding.
Data Governance and Data Quality initiatives tend not to have very good track records, which leads to executive skepticism and resistance. So, try this exercise. Ask your CFO whether they would be comfortable standing behind the company’s financial reports if the financial data were managed in the same way as the rest of the corporate data, without their army of analysts and auditors. I doubt they’d go for it.
Reframing the financial data preparation apparatus as a Data Quality effort can be a foot in the door of executive perception.
This allows you to start to close the perception gap.
Your company recognizes the benefits of Data Governance and Data Quality; it just doesn’t call them that.
The data needed to be correct, and this is what it took to do it.
The assumptions required to believe your blood test results and your company’s financial reports are directly analogous to Data Quality dimensions: lineage, accuracy, completeness, consistency, and precision.
Again, if you can’t validate the data yourself, as with the blood test, you have to rely on the assurances of others and on mechanisms established to promote trust. We’ll discuss these mechanisms in the coming months. In the meantime, think about what you would need to know about the data to really trust it. What would a Nutrition Facts label for data have on it? Would you need to see all ten key dimensions of Data Quality or just a few? I look forward to seeing your thoughts.