Informatica Data Quality Offering - AI to the fore

Content Copyright © 2023 Bloor. All Rights Reserved.
Also posted on: Bloor blogs

Informatica has long been a major vendor in the data quality space, going back to its acquisition of DQ vendor Similarity Systems in 2006. Data quality is complementary to Informatica’s traditional strength in data integration, and also to its master data management capability, which was initially built on its acquisitions of Siperian in 2010 and Heiler in 2012. Informatica undertook a major rewrite of its code base to move to the cloud, and now offers a full suite of data quality, data integration, master data management, governance and catalog software on a multi-tenant cloud basis. Indeed, for some time all new Informatica sales have been cloud-based. The company now generates over 1.5 billion dollars in annual revenue.

Informatica’s data quality offering uses its “CLAIRE” machine learning engine to enhance its traditional capabilities in profiling, matching, data cleansing and enrichment. For example, the software can use profiling statistics to detect likely anomalies in data, highlight these to business users, and suggest and generate new data quality rules based on how these anomalies are dealt with. Similarly, machine learning is used in matching to help suggest possible duplicates, and again new rules can be suggested to improve the level of automation of duplicate detection in the future. The vendor uses the term “predictive data intelligence” to describe these capabilities. The software provides different interfaces for different defined user categories, such as data stewards, data analysts and data consumers. Amongst customer case studies, one bank in Brazil accelerated its credit decisions by 30% after implementing the software, while NYC Health and Hospitals used the software to improve data quality and consistency across 25 separate data sources, bringing them together into a business glossary of 1,200 medical definitions and KPIs that is accessible by all clinical staff.
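To make the idea of profiling-driven anomaly detection and rule suggestion more concrete, here is a minimal sketch in Python. It is purely illustrative and is not how CLAIRE or any Informatica product is implemented: the function names, the use of the median absolute deviation as the profiling statistic, the threshold of 5 and the sample data are all assumptions chosen for the example.

```python
from statistics import median

def profile_column(values):
    """Compute simple profiling statistics for a numeric column."""
    mid = median(values)
    # Median absolute deviation: a measure of spread that resists outliers.
    mad = median(abs(v - mid) for v in values)
    return {"median": mid, "mad": mad}

def detect_anomalies(values, profile, threshold=5.0):
    """Return values lying more than `threshold` MADs from the median."""
    spread = profile["mad"] or 1.0  # fall back to 1.0 if the spread is zero
    return [v for v in values
            if abs(v - profile["median"]) / spread > threshold]

def suggest_rule(values, anomalies):
    """Propose a range rule covering the values not flagged as anomalous."""
    accepted = [v for v in values if v not in anomalies]
    return {"rule": "range", "min": min(accepted), "max": max(accepted)}

# Hypothetical "age" column containing two suspicious entries.
ages = [34, 41, 29, 38, 45, 52, 31, 430, 36, -1]
profile = profile_column(ages)
anomalies = detect_anomalies(ages, profile)
rule = suggest_rule(ages, anomalies)
print(anomalies)  # [430, -1]
print(rule)       # {'rule': 'range', 'min': 29, 'max': 52}
```

In a real tool the suggested rule would be shown to a data steward for approval rather than applied automatically, which is the human-in-the-loop step the paragraph above describes.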

Informatica has a regular cycle of three major and three minor software releases per year. Recent examples of new functionality include profiling for SAP HANA data, allowing data consumers to see data quality indicators via a “Cloud Data Marketplace”, and the ability to generate data quality rules via a natural language interface.

Informatica competes with other data quality vendors such as Talend and Ataccama, as well as the data quality offerings of Collibra and others such as IBM. With its modern cloud-based platform, Informatica is clearly a major contender for customers who have a need for data quality improvement, particularly if this need is linked to master data management or data governance initiatives. Data quality is not a new field, and despite past investments, only 27% of data practitioners actually have complete trust in their data. There is therefore plenty of opportunity for vendors to improve the automation of data quality processes, whether this is working on data that is on-premises, in one or more private or public clouds or, more likely, a combination of these places.