A data quality solution? I can't even see a problem
January 2005
Graham Rhind

“If you think education is expensive, try ignorance” (Derek Bok)

One of the issues faced by any data quality practitioner whose field is at all specialized is how to persuade data owners that they have a problem with their data.

Many data owners realize that they have a problem with data quality, but the form that problem takes is often less distinct in their minds. Some kinds of data quality issue are more easily recognized than others. A sales system that reports sales of 1 million while the accounts system reports sales of half a million clearly has a data quality problem. Engineers building a bridge from both banks that fails to meet in the middle know they have a data quality problem. But when the issue lies in a subject area that requires deep knowledge of the topic, it is much more difficult to recognize, and is therefore not given priority by the data owners.

My own specialization is international personal name and address data management, and this is one of those topics. With over 6000 languages, each written in one of tens of different scripts, over a hundred different address formats and around forty different personal name formats, there’s a lot to know and learn. In a world where an alarming number of people cannot even locate the country they live in on a map, general knowledge of this topic is shallow.