July 2006 meeting reading guide
Titles
Enterprise Knowledge Management: The Data Quality Approach by David Loshin, published by Morgan Kaufmann, © 2001, ISBN 0-12-455840-2
Three chapters will be discussed:
- 5 — Dimensions of Data Quality
- 9 — Measurement and Current State Assessment
- 12 — Rule-based Data Quality
Reading Intent
This reading is designed to introduce the important issues and challenges of defining and measuring data quality. The rules-based approach and rules engines are also briefly discussed as means to define, measure, and assure data quality.
Reading Group Questions
The following questions are designed to stimulate and enhance our understanding and analysis of the reading materials. Though we will use some of them during our discussions, their only function is to provide a useful and common starting point. They are not meant to limit our discussions.
Chapters 5 and 9
- What are the five categories used by Loshin to group the many DQ dimensions?
- How many DQ dimensions are included in each category? What do you think of the distribution?
- In each category, which dimension is most useful to you? Which dimension is easiest to measure?
- Are there any categories or dimensions missing? Which ones?
- What synonyms for "DQ dimension" are also used in the literature?
- How do Loshin's five categories echo the IDQ definitions and improvement approaches used by English, Redman, and Wang (from the June readings)?
- According to Loshin, why is it important to understand and use the notion of DQ dimensions?
- How does Loshin's concept of "sentinel rules" help mitigate the risks of using too many or too few DQ dimensions, or of using too few categories?
- What DQ dimensions are used in your organization, and how are they measured?
Chapter 12
- What is Loshin's definition of "business rule"? Of "rules engine"?
- What are some of the attributes of a well-formed business rule?
- Does Loshin differentiate between DQ rules and business rules? Do you agree with him?
- State an example of a specific DQ expectation, phrased as an assertion or derivation, for two or three DQ dimensions your organization uses.
- Which DQ dimensions can easily be measured using a rules-based engine? Which ones cannot?
- According to Loshin, what are the benefits of the rules-based approach? What are its limitations?
- In what ways can rules-based systems and engines help assure data quality?
- What enhancements can you make to your organization's DQ definition and measurement practices, based on the concepts in these three chapters?
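As a possible starting point for the Chapter 12 questions, here is a minimal sketch of how a DQ expectation might be phrased as an assertion and then measured. It is not Loshin's notation or any particular rules engine: the names Rule, conformance, report, and the sample records are all hypothetical, and the two rules (one for completeness, one for validity) are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A hypothetical DQ rule: an assertion that each record should satisfy."""
    name: str
    dimension: str                      # the DQ dimension the rule is meant to measure
    check: Callable[[Record], bool]     # returns True when the record conforms


def conformance(rule: Rule, records: List[Record]) -> float:
    """Fraction of records that satisfy the rule (1.0 for an empty data set)."""
    if not records:
        return 1.0
    passed = sum(1 for record in records if rule.check(record))
    return passed / len(records)


def report(rules: List[Rule], records: List[Record]) -> Dict[str, float]:
    """Conformance score per rule, keyed by rule name."""
    return {rule.name: conformance(rule, records) for rule in rules}


# Illustrative rules only; a real organization would define its own.
RULES = [
    Rule(
        name="customer_id is populated",
        dimension="completeness",
        check=lambda r: r.get("customer_id") not in (None, ""),
    ),
    Rule(
        name="state is a two-letter code",
        dimension="validity",
        check=lambda r: isinstance(r.get("state"), str)
        and len(r["state"]) == 2
        and r["state"].isalpha()
        and r["state"].isupper(),
    ),
]

# Made-up sample data for the sketch.
SAMPLE = [
    {"customer_id": "C001", "state": "NY"},
    {"customer_id": "", "state": "CA"},       # fails the completeness rule
    {"customer_id": "C003", "state": "Calif"},  # fails the validity rule
    {"customer_id": "C004", "state": "TX"},
]

if __name__ == "__main__":
    for rule_name, score in report(RULES, SAMPLE).items():
        print(f"{rule_name}: {score:.0%}")
```

Dimensions such as completeness and format validity fit this per-record assertion style fairly naturally. It may be worth discussing which dimensions in your organization would not reduce to a check of this kind.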
