IDQ Webinars: with Gian Di Loreto


Topic:

An Overview of Data Profiling Tools Available From DataFlux, Talend and Trillium

Abstract:

Among the scores of commercially available data profiling software applications, we have chosen three to demonstrate during this webinar. DataFlux, Talend and Trillium have been generous enough to allow us to use their products for this purpose, so we have the unique opportunity to see (and play with) these enterprise tools side by side. We are grateful to have the latest versions of Trillium and Talend for this webinar.

Using a large database populated with HR data (with scrambled personal information), we will walk through a data profiling exercise and discuss each application's strengths and weaknesses. We will undertake similar exercises with each application and perform as fair a comparison as we can.

However, before we begin the actual hands-on data profiling, we will discuss data profiling as a concept. We will introduce the audience to some modern profiling definitions and concepts, with the goal of seeing these concepts in action during the live demonstration.

We will discuss basic data profiling techniques and, as time allows, dig into more advanced data profiling concepts, including subject-level data profiling and state transition analysis. Subject-level data profiling is an integral part of modern best-practice data quality techniques; we believe its inclusion in this webinar will be invaluable.

Webinar recordings are available to IQ International Members only

Join or renew your membership now to gain full access.

Recording:

Speaker:

About the Author

Gian Di Loreto

Gian Di Loreto is one of the USA's leading authorities on human resource data quality and is currently Senior Consultant at Profisee Group.

Gian holds a Ph.D. in particle physics from Michigan State University. He began his career as an experimental physicist at Chicago's world-renowned Fermi National Accelerator Laboratory (Fermilab), where he spent several years performing statistical analyses and identifying errors in massive databases generated by proton/anti-proton collisions. After his tenure at Fermilab, he leveraged the analytical skills he developed in a scientific context as a software developer for a firm that reconciled and corrected data generated by General Motors' pension funds.

He has worked at Loreto Services & Technologies, a consulting and IT outsourcing firm whose mission is to help companies understand what is in their databases, identify errors and discrepancies, integrate disparate databases and, in the process, eliminate historical and ongoing mission-critical data errors using proven scientific and statistical techniques.

Moderator:

Robin Rappaport

Robin Rappaport is the Data Quality Team Leader responsible for delivery of the Data Quality Initiative for Research Databases at the Internal Revenue Service (IRS). Her work and that of her team contributed to the IRS being awarded a Computerworld Honor and a Government Computer News (GCN) Gala Award. She has over 25 years of experience as a Data Quality practitioner. Her undergraduate degree was in Economics with Computer Science. Her graduate work was in Operations Research with a concentration in Mathematical Modeling in Information Systems. She has worked in both private (6 years) and public sectors (since June 1990). Her positions include Computer Programmer, Systems Analyst, and Operations Research Analyst.

In addition to IQ International, the International Association for Information & Data Quality, she is a member of the Institute for Operations Research and Management Science (INFORMS). She was Chairman, Individual Membership for the Washington, D.C. chapter from 1987 to 1990. She was then elected Secretary and served from 1990 to 1991.

Contact Robin by email at robin [dot] rappaport [AT] iaidq [dot] org
