Literature on Data and Information Quality
The following is truly a remarkable book. I have been influenced by Larry for quite some years, and yet the book is refreshing and contains a lot of new material. For example, the parts on quantifying the problems, the step-by-step checklists, and the subject-area-oriented chapters really contribute seriously to the area. I think this is going to become the reference work on information quality for a number of years. Highly recommended:

Information Quality Applied
Larry P. English
John Wiley & Sons, 2009
ISBN: 978-0470-13447-4

And this is another good book on Data Quality:

Data Quality, The Accuracy Dimension
Jack E. Olson
Morgan Kaufmann Publishers, 2003
ISBN: 1-55860-891-5


International Association of Information and Data Quality

Data Quality Pro

Data and Information Quality

The No 1 Showstopper!
The largest challenge of them all is no doubt the quality of data. Many organisations simply do not know the size of the problem until very late in the project. And by then it may be too late to turn back and fix the problems.

Information Quality

The highest level is that of Information Quality, which deals with the problem as a business issue on the business level. Nobody has described this better than Larry English in his new (2009) book Information Quality Applied - see the details in the sidebar to the right.

What's wrong with data?

Low data quality has consequences: many projects get seriously descoped - or effectively discontinued - because of it.
There are many things to consider. Here are the most important dimensions of data quality:
  • Accuracy (how well is the real world described?)
  • Completeness (how much is missing?)
  • Consistency (e.g. across different systems, redundancies etc.)
  • Correctness (conformance with business rules etc.)
  • Integrity (e.g. compliance with master data, such as the Product Master, and other relationships)
  • Timeliness (are data too old?)
  • Uniqueness (for keys, identifiers etc.)
There are more aspects than these, but the above is sufficient for most purposes. (The international standard ISO/IEC 25012 defines - in detail - these qualities: Accuracy, Completeness, Consistency, Credibility, Currentness, Accessibility, Compliance, Confidentiality, Efficiency, Precision, Traceability, Understandability, Availability, Portability, Recoverability.)
The amount of "bad" data is most often larger than people think. As a rule of thumb, maybe 5% of all transactions are problematic in one or more aspects, but it can be much worse than that.
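To make the dimensions above a bit more concrete, here is a minimal Python sketch that scores a handful of records on five of them. Everything in it is an assumption made for illustration: the order records, the field names, the product master, the business rule (quantity must be positive) and the one-year age limit are all hypothetical, and a real assessment would use your own rules and data.

```python
from datetime import date

# Hypothetical sample records; field names and values are illustrative only.
orders = [
    {"order_id": 1, "product": "P-100", "quantity": 2, "updated": date(2024, 5, 1)},
    {"order_id": 2, "product": None, "quantity": 1, "updated": date(2024, 5, 3)},
    {"order_id": 2, "product": "P-200", "quantity": -4, "updated": date(2019, 1, 15)},
    {"order_id": 4, "product": "P-999", "quantity": 5, "updated": date(2024, 4, 20)},
]
product_master = {"P-100", "P-200", "P-300"}  # assumed master data


def share(flags):
    """Return the fraction of True values in a list of booleans."""
    return sum(flags) / len(flags) if flags else 0.0


def quality_report(rows, master, as_of, max_age_days=365):
    """Score the rows on five dimensions, each as a fraction between 0 and 1."""
    ids = [r["order_id"] for r in rows]
    return {
        # Completeness: is the product field filled in?
        "completeness": share([r["product"] is not None for r in rows]),
        # Correctness: assumed business rule, quantity must be positive.
        "correctness": share([r["quantity"] > 0 for r in rows]),
        # Integrity: does the product exist in the master data?
        "integrity": share([r["product"] in master for r in rows]),
        # Timeliness: was the row updated within the allowed age?
        "timeliness": share([(as_of - r["updated"]).days <= max_age_days for r in rows]),
        # Uniqueness: distinct key values relative to the number of rows.
        "uniqueness": len(set(ids)) / len(ids) if ids else 0.0,
    }


report = quality_report(orders, product_master, as_of=date(2024, 6, 1))
for dimension, score in report.items():
    print(f"{dimension}: {score:.0%}")
```

Accuracy is deliberately left out: checking how well the real world is described requires comparing against the real world (or a trusted source), which cannot be done from the data alone.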

Why are data wrong?

There are a number of factors contributing to these types of problems - and they reinforce each other. Some of them are:
  • Lack of data validation in ERP-systems
  • Older technologies do not support e.g. drop-down lists
  • The precise business rules are not known to many
  • People are busy
  • Psychology: Correct data are not important to the user who registers them, e.g. the car salesperson
  • Data may be loaded / corrected in one-off batch runs
  • The rules have changed, but users are not aware of it
  • And much more...
Basically, it is a business problem. If data were treated like any other asset, this would not happen to the extent that it actually does. This is part of what I call Information-driven Business Analysis, and you can learn more from these eLearning offerings.

What can we do about it?

Many organisations do not invest in disciplines such as Data Governance and Master Data Management, maybe because they do not understand what these really are. They are asset management. Simple as that. Information is one of your most important assets and should be on your balance sheet. Think about the impact of losing all or even half of your data overnight...
So Data Governance really puts in place some Information Owners and some "Information Controllers". Master Data Management does for your shared data (such as customers, products and so forth) what a standard chart of accounts, standard cost center structures and standard bookkeeping dimensions do for your financial assets. Why is it that many companies do not treat information as an asset?

Do not build on sand!

Fortunately, software tools are now available to help the business detect and monitor data quality problems, even to the extent of implementing repeatable business processes to manage the area. If you use tools such as these, you can scope and size projects much more precisely than ever before. And you can measure the data quality, also over time. See the Data Profiling page for more details.
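As a rough idea of what such tools do at their simplest, the following Python sketch profiles a single column. It is not how any particular product works; the function name `profile_column` and the sample country codes are made up for illustration, and missing values are assumed to be either `None` or an empty string.

```python
from collections import Counter


def profile_column(values):
    """Summarise one column: row count, missing values, distinct values, top values."""
    # Assumption: None and the empty string both count as missing.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


# Hypothetical country codes as they might appear in a customer table.
country = ["DK", "DK", "SE", "", "dk", None, "DK", "Denmark"]
profile = profile_column(country)
print(profile)
```

Even this small profile shows typical findings: two missing values and three different spellings of the same country. Storing one such profile per load, together with the date, is one simple way to follow the quality of a column over time.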

Controlling the Information Asset

Quite a few people, including myself, suggest that you treat this area as a core business process and manage it using balanced scorecards and key performance indicators. See the Data Quality Scorecard page for details.