IAIDQ Publications

Connecting Entity Resolution and Information Quality
April 2011
John Talburt

The term Entity Resolution (ER) has been in use for only a few years, but the concept is as old as information systems themselves. Sometimes called record de-duplication, record matching, record linking, merge-purge, or the co-reference problem, ER is the process of determining whether two references to real-world objects refer to the same object or to different objects.

ER is an important tool for achieving Entity Identity Integrity, a fundamental data quality rule that should hold in any information system. In his book Data Quality Assessment, Arkady Maydanchik describes Entity Identity Integrity in the context of a database system as strict adherence to the following conditions:

  • Each row (entity reference) in an entity table corresponds to one, and only one, real-world entity; and
  • Distinct rows in the table correspond to distinct real-world entities.
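
The two conditions above can be checked mechanically when the true entity behind each row is known, which in practice usually means a hand-verified sample. The following is only a minimal sketch under that assumption; the function name, the row identifiers, and the entity labels are hypothetical and do not come from Maydanchik's book.

```python
from collections import defaultdict


def check_entity_identity_integrity(row_to_entities):
    """Check both Entity Identity Integrity conditions.

    row_to_entities maps each row id to the set of real-world entity ids
    the row is known to refer to (hypothetical ground-truth labels).
    Returns (ambiguous_rows, duplicated_entities).
    """
    # Condition 1: each row corresponds to one, and only one, entity.
    ambiguous_rows = sorted(
        row for row, entities in row_to_entities.items() if len(entities) != 1
    )

    # Condition 2: distinct rows correspond to distinct entities.
    rows_by_entity = defaultdict(list)
    for row, entities in row_to_entities.items():
        if len(entities) == 1:
            rows_by_entity[next(iter(entities))].append(row)
    duplicated_entities = {
        entity: sorted(rows)
        for entity, rows in rows_by_entity.items()
        if len(rows) > 1
    }
    return ambiguous_rows, duplicated_entities


# Illustrative data only: r1 and r3 describe the same customer,
# and r4 mixes two customers into one row.
table = {
    "r1": {"alice"},
    "r2": {"bob"},
    "r3": {"alice"},
    "r4": {"carol", "dave"},
}
ambiguous, duplicated = check_entity_identity_integrity(table)
print("Rows violating condition 1:", ambiguous)
print("Entities violating condition 2:", duplicated)
```

A table satisfies Entity Identity Integrity, in the sense of this sketch, only when both returned collections are empty.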

Entity Identity Integrity is another way of stating the Fundamental Law of Entity Resolution: two entity references should be linked (merged or integrated) if, and only if, they are equivalent (i.e., both refer to the same real-world entity).
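
Because the law is an "if, and only if" statement, it can be violated in two directions: references that are linked but not equivalent, and references that are equivalent but not linked. The sketch below counts both kinds of violation for a set of pairwise links. It assumes the true entity of every reference is known, and the names `law_violations`, `truth`, and `links` are illustrative choices, not terminology from the book.

```python
from itertools import combinations


def law_violations(truth, links):
    """Compare ER links against known equivalence.

    truth maps each reference to its true entity id (assumed known).
    links is a set of frozenset pairs that the ER process linked.
    Returns (false_links, missed_links) as lists of reference pairs.
    """
    false_links = []   # linked, but the references are not equivalent
    missed_links = []  # equivalent, but the references were not linked
    for a, b in combinations(sorted(truth), 2):
        linked = frozenset((a, b)) in links
        equivalent = truth[a] == truth[b]
        if linked and not equivalent:
            false_links.append((a, b))
        elif equivalent and not linked:
            missed_links.append((a, b))
    return false_links, missed_links


# Illustrative data only: four references to two real-world entities.
truth = {"ref1": "E1", "ref2": "E1", "ref3": "E2", "ref4": "E2"}
links = {frozenset(("ref1", "ref2")), frozenset(("ref2", "ref3"))}
false_links, missed_links = law_violations(truth, links)
print("Linked but not equivalent:", false_links)
print("Equivalent but not linked:", missed_links)
```

The law holds for a given set of references exactly when both lists are empty.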

A more complete discussion of the Fundamental Law of ER and other ER principles can be found in my book Entity Resolution and Information Quality (2011, Morgan Kaufmann Publishers).