From the IAIDQ President
christian [dot] walenta [AT] iaidq [dot] org
IQ Journal Vol 8 Issue 2. 2012
Dear fellow IAIDQ member,
“Big Data” and Business Analytics have become the new frontier in information management. They are driven by a new set of technologies that can process and match huge volumes of data of many different types, then run that data through algorithms that look for new insights or valuable patterns to leverage for business and competitive advantage. An exciting world of new capabilities has emerged, with proven successes and results!
As Big Data goes mainstream, we information quality professionals have to ask ourselves what’s needed to govern, to manage, and to ensure the quality of Big Data. Let me offer some thoughts for discussion:
We need sound quality management principles for Big Data. More than ever, we live in a digital world where unstructured information and data from sensors and devices are becoming as important as the structured data in our databases. Many companies are already using these technologies to create a “smarter” world. It’s just a matter of time before your organization adopts these technologies. If we are to use this data to make better business decisions or drive business actions, we must carefully manage it and apply sound quality management principles.
Data from automated devices requires processes that monitor quality. When data is generated by automated sensors, the “human error” aspect of information quality may become less of a concern. However, we still cannot blindly trust the data simply because its generation has been automated. We must know the source of the data, track changes to its format and structure, and create appropriate data management processes to judge and understand the data’s relevance and quality, whether it originates from within or from outside of our organization.
New challenges arise from the sheer data volumes and the timing of feeds and data. Big Data will, to a great extent, be processed in streams and in near real time. The volume of data may be so overwhelming that it cannot be physically stored or retained in data repositories. While we have experience with statistical sampling in our field, I suspect that we will need a new set of approaches to sample and validate the quality of Big Data.
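As one illustration of sampling a stream that is too large to retain, here is a minimal sketch in Python of reservoir sampling, an established technique that keeps a fixed-size uniform random sample while reading each record only once. It is offered as a starting point for discussion, not as a recommended method. The function names, the simulated sensor feed, and the validity rule (a temperature range) are all hypothetical and chosen only for the example.

```python
import random
from typing import Callable, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Keep a uniform random sample of up to k records from a stream
    of unknown length, without storing the stream itself."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, record in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Pick a slot in 0..i (inclusive); the new record replaces
            # a kept one only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = record
    return reservoir


def estimated_defect_rate(sample: List[T],
                          is_valid: Callable[[T], bool]) -> float:
    """Share of sampled records that fail a validity check."""
    if not sample:
        return 0.0
    return sum(1 for r in sample if not is_valid(r)) / len(sample)


def simulated_sensor_feed(n: int, rng: random.Random) -> Iterator[float]:
    """Hypothetical feed: temperature readings with occasional bad values."""
    for _ in range(n):
        if rng.random() < 0.02:
            yield 9999.0  # assumed sentinel for a faulty reading
        else:
            yield rng.uniform(-10.0, 40.0)


if __name__ == "__main__":
    rng = random.Random(2012)
    sample = reservoir_sample(simulated_sensor_feed(1_000_000, rng), 500, rng)
    # Assumed validity rule: a plausible temperature range in degrees Celsius.
    rate = estimated_defect_rate(sample, lambda t: -50.0 <= t <= 60.0)
    print(f"Estimated defect rate from {len(sample)} sampled readings: {rate:.1%}")
```

The memory used depends only on the sample size, not on the length of the stream, which is what makes the approach usable when the full feed cannot be retained. The estimate carries ordinary sampling error, so the sample size would still need to be chosen with the usual statistical care.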
We will need to broaden the Stewardship accountabilities for Big Data to reflect the new data types and the variety and speed of usage.
Despite these differences, I believe that the IQ principles that have served us well remain relevant; they are fundamental principles, after all. We still need to understand how the information consumer will use this new data, how business processes and decision-making will be affected, where the data comes from and what it means, what the critical quality dimensions are, and how best to measure them. And we’ll certainly need processes and roles in place to manage the Big Data, and address and fix quality issues. We will, however, need to add new and revised methods and approaches to embrace Big Data in our Information Quality world.