Quality of Derived Data.
Part 1: Quality of Steps and Calculations Along the Path
Almost always, an item of raw data (especially without the context of other data) is not as useful as a cluster of observations or as derived data. Derived data is created when we take a raw fact and adjust it (some call this "normalizing") for factors that may influence it, and for the broader context. Derived data may be the sum of many observations, or it may be the result of division by some other general factor.
Derived data becomes even more valuable when expressed along a dimension, and the most common dimension is time. When contemplating any large number (such as the U.S. federal deficit), I always want to see it in two contexts: previous values of the same statistic, and other measures of the total economy.
The quality of such data depends on three factors: the quality and consistency of the raw data collected; the quality and consistency of the data used to normalize it (e.g., Gross Domestic Product, or GDP); and the quality and appropriateness of the calculations. With automated methods, the calculations are generally reliable. But almost all derived data (especially data drawn from certain kinds of aggregates) is vulnerable to inconsistencies in definitions and measurements along any dimension.
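The derivation described above (dividing a raw statistic by a general normalizing factor, and expressing the result along the time dimension) can be sketched in a few lines. This is a minimal illustration only; the figures below are hypothetical placeholders, not actual deficit or GDP data:

```python
# Hypothetical raw data, in billions of dollars, keyed by year.
# Real analysis would draw these from authoritative sources.
deficit = {2020: 3_000, 2021: 2_800, 2022: 1_400}
gdp     = {2020: 21_000, 2021: 23_000, 2022: 25_500}

# Derived data: the raw fact normalized by a broader measure (GDP),
# expressed along the time dimension so each year has context.
deficit_pct_gdp = {year: deficit[year] / gdp[year] * 100 for year in deficit}

for year in sorted(deficit_pct_gdp):
    print(f"{year}: deficit = {deficit_pct_gdp[year]:.1f}% of GDP")
```

Note that the quality of the derived series now depends on both inputs: an inconsistency in how either the deficit or GDP is defined across years silently distorts every ratio.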
This first article in a two-part series addresses the process of data derivation and the introduction of other data sources for purposes of normalization and adjustment. The next article will discuss the ambiguities of nearly any aggregate measure in society and the economy.