Ataccama
Last Updated:
Analyst Coverage: Andy Hayler, Philip Howard and Daniel Howard
Founded in 2007 in the Czech Republic, Ataccama initially focused on data quality before branching out into master data management (MDM) and data governance. It now has a broad suite of data management tools. The company has grown to ten locations and over 500 staff, and in June 2022 it gained a $150 million investment from Bain Capital for further expansion. In the last year it achieved 50% year-on-year revenue growth. Ataccama has 75 corporate partners including Snowflake, and 95% of new deals are subscription based. Customers include Aviva, T-Mobile, GSK and Heineken. Half of the company’s revenue is now from the USA, with just over a third from Europe and the rest from Asia-Pacific, the fastest growing region. Financial services is the largest vertical, but the company has plenty of pharma and life science customers, and quite a broad range of customers in other verticals.
Version 14 of the software platform “Ataccama ONE” was released in early 2023. The latest release includes improved reporting of data quality status, specific support for the Snowflake data warehouse, improved collaboration features and a new module that allows you to take any set of data and put it within a managed framework, allowing built-in data quality, version management etc. The new release contains enhancements to the artificial intelligence it uses to suggest potential data matches, based on observation of human domain experts. The release builds on the Ataccama data catalog for compliance, which includes support for data governance and associated workflow.
Ataccama now has a well-rounded and quite complete data management solution competing effectively with vendors such as Informatica, Talend and IBM. Its data quality offering includes everything you would expect: profiling, matching, cleansing, anomaly detection, monitoring and reporting. Customers note its good performance, ease of implementation and high-quality professional services staff. It has always been a company based on strong product engineering, and its recent large cash injection should enable it to market more actively and bring the product to a wider audience.
Ataccama and Data Fabric
Last Updated: 23rd February 2024
Mutable Award: Gold 2024
Ataccama has its roots in the data quality market, from which it later expanded into master data management and data governance. It has a data catalogue, a core component of a data fabric architecture, in which data assets are mapped and presented to business users, for example as a knowledge graph, through a semantic layer rather than as a physical structure. A user might be interested in something like “customer”, which may actually be stored in several underlying systems such as ERP, a sales force automation system and perhaps some marketing systems too. Because Ataccama’s core strength has been data quality and master data management, it is well positioned to present a clear representation of data to the customer. Its technology already has comprehensive data quality capabilities that can be applied at source. It also has the necessary survivorship rules within its master data management hub to derive a “golden copy” of a customer record from the multiple versions that may exist out in the source systems.
The Ataccama software goes beyond this, though. It has an active metadata layer that can access the metadata of underlying source applications like SAP or Salesforce, and can detect changes in these and update its catalogue automatically. In this way the catalogue remains current. Furthermore, it has an AI layer, begun in 2016 and based around machine learning, that was used for transformations and anomaly detection, as well as a natural language interface to its metadata. This software was trained on user interactions in order to generate better data quality rules. More recently it has incorporated the OpenAI generative AI technology ChatGPT 4. This can do things like populate descriptions, and generate business rule logic based on business descriptions. Since December 2023 this generative AI capability, including the generation of SQL queries, has been part of the core product. Ataccama has built several safeguards around this, for example validating any generated code before applying it. Unlike a public AI, which will just about always give you an answer to a question even if that involves making something up (an issue known as “hallucination”), the Ataccama AI layer will simply state that it cannot provide an answer if the generated output fails the internal validation checks.
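The safeguard described above, validating generated code and declining to answer when validation fails, can be illustrated with a minimal sketch. This is not Ataccama's implementation, which is not documented here. It assumes Python's standard sqlite3 module as a stand-in query engine, and the function name, schema and queries are hypothetical.

```python
import sqlite3

def validate_generated_sql(sql, schema_ddl):
    """Return the SQL if it is a single SELECT that compiles against the
    schema; otherwise return None, meaning "cannot provide an answer"."""
    statement = sql.strip().rstrip(";")
    # Conservative checks: one statement only, and it must be a SELECT.
    # (A query containing ';' inside a string literal is also rejected.)
    if ";" in statement or not statement.lower().startswith("select"):
        return None
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)        # build an empty copy of the schema
        conn.execute("EXPLAIN " + statement)  # compile the query; unknown tables or columns raise
        return statement
    except sqlite3.Error:
        return None
    finally:
        conn.close()

# Hypothetical schema and generated queries, for illustration only.
SCHEMA = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);"

ok = validate_generated_sql("SELECT name FROM customer WHERE country = 'CZ'", SCHEMA)
invented_column = validate_generated_sql("SELECT revenue FROM customer", SCHEMA)
destructive = validate_generated_sql("DROP TABLE customer", SCHEMA)
```

In this sketch a query that refers to a column that does not exist, the SQL equivalent of a hallucination, yields None rather than being passed on for execution.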
Customer Quotes
“Other vendors are transactional in their behaviour. With Ataccama there is a genuine belief of shared responsibility of success that we feel within T-Mobile.”
Daniel West, Data Management Lead, T-Mobile
“Ataccama ONE really is a one-stop-solution… that means everyone is capable of fully understanding our data asset inventory to power the next stage of our growth and ambitions.”
Catherine Yoshida, Head of Data Governance, Teranet
Ataccama provides a data catalogue with active metadata support as described earlier. It uses this to ingest metadata from source systems and can manage data pipelines. Its data quality firewall ensures that all data that it deals with is subject to data quality rules that have been established. Its master data management capability means that data records that may be duplicated in source systems are matched and merged into a golden record based on business rules and survivorship rules applied to the sources of data. This is a genuine strength of Ataccama compared to many vendors in the fabric space.
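To make the idea of survivorship rules concrete, the following is a minimal sketch of one common approach: for each attribute, keep the non-empty value from the most trusted source, breaking ties by the most recent update. The source ranking, record layout and function names are assumptions made for illustration and are not taken from Ataccama's product.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SourceRecord:
    source: str    # system the record came from
    updated: date  # last modification date in that system
    fields: dict   # attribute name -> value

# Hypothetical trust ranking: a lower number means a more trusted source.
SOURCE_PRIORITY = {"erp": 0, "crm": 1, "marketing": 2}

def golden_record(records):
    """Merge already-matched records into one golden record."""
    ranked = sorted(
        records,
        key=lambda r: (SOURCE_PRIORITY.get(r.source, 99), -r.updated.toordinal()),
    )
    golden = {}
    for rec in ranked:
        for name, value in rec.fields.items():
            # The first non-empty value seen (from the best-ranked record) survives.
            if value not in (None, "") and name not in golden:
                golden[name] = value
    return golden

matched = [
    SourceRecord("marketing", date(2024, 1, 10),
                 {"name": "J. Smith", "email": "jsmith@example.com", "phone": ""}),
    SourceRecord("erp", date(2023, 6, 1),
                 {"name": "John Smith", "email": None, "phone": "555-0100"}),
    SourceRecord("crm", date(2023, 12, 5),
                 {"name": "John A. Smith", "email": "john.smith@example.com", "phone": None}),
]
golden = golden_record(matched)
```

Here the name and phone number survive from the ERP record, while the email address, missing in ERP, is taken from the next most trusted source. Real survivorship rules are usually configurable per attribute; this sketch applies one rule to all of them.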
The knowledge graph of Ataccama has been around since 2016 and has several ways of being represented in order to show business users their data landscape in a way that makes sense in business terms. Ataccama can also generate queries to retrieve data as needed from source systems based on its understanding of the data structures stored in its data catalogue. It does this by pushing down queries to the underlying source systems, such as Snowflake or other sources. While it can cache certain regularly used data, it is not a database, and any data that it stores is transient other than master data and metadata. While the vendor does not describe itself as having an optimiser, it does in fact do many things that a database optimiser would do in terms of deciding the best way to satisfy a user query, including a cost-based optimisation approach.
The vendor does not pretend to provide an all-encompassing solution to all data management activities, and has links to a variety of partner technologies such as data visualisation and analysis tools.
Ataccama can provide many elements of a modern data fabric architecture. While many vendors have a data catalogue, few have an in-depth background in data quality and master data management that ensures that data served up to business users is accurate, complete and timely. The vendor has seven years of production code incorporating various types of artificial intelligence, and it has a well-thought-out approach to the use of generative AI that is already live within its product rather than just being on a technology roadmap. I have been following Ataccama since soon after its inception in 2008, and its technology stack has always seemed to me very well thought out, which is reflected in its successful implementations in some very demanding customer environments.
The bottom line
Ataccama can provide a significant part of a data fabric architecture. Its modern technology stack has hundreds of customers and its underlying focus on data quality makes it well suited as the basis for serving up data to business users. Ataccama should be seriously considered for customers wanting to implement a data fabric architecture.
Ataccama and Data Quality
Last Updated: 12th March 2024
Mutable Award: Gold 2024
Ataccama started in the data quality space but has gradually expanded its offerings over the years, initially to master data management and then to data catalogue/data governance. In its core data quality market it has a full and rich feature set, as you would expect from a product that has been developing for fifteen years. It covers data profiling, data cleansing, merge/matching, data enrichment, data lineage and more. It has a knowledge graph and has used artificial intelligence to enhance its platform in production since 2016. The product can also connect to a wide range of source systems such as databases and applications like Salesforce, either through its own 40 connectors or via third-party connectors. It has also recently introduced some data preparation and transformation capability. Although there are a lot of features, the product is now licensed as just two separate modules, one for data quality and governance and the other for master data management.
Customer Quotes
“Other vendors are transactional in their behavior. With Ataccama, there’s a genuine belief of shared responsibility of success that we feel within T-Mobile.”
Daniel West, Data Management Lead, T-Mobile
“Ataccama, as a vendor, has been a phenomenal partner: flexibility and understanding of our needs in a pricing structure, system integration needs, deployment support, training and ongoing customer engagement.”
Head of Enterprise Data, Finance Industry
Ataccama ONE covers the full range of data quality functionality. Users can connect to a wide range of data sources and ingest the metadata about these sources, as well as running a range of profiling statistics. These include record counts, recognition of patterns in the data such as postal codes or social security numbers, detection of null values etc. Data quality rules can then be set up and applied to this data, either manually or via the underlying AI in Ataccama suggesting rules to the business users. The AI can also generate draft business descriptions from the data to populate the catalogue. Textual data quality rules can be entered by a business user in normal English, and the AI translates these into Ataccama’s underlying scripting language as business rules. Data quality can also be monitored over time. A new feature, released in January 2024, is data preparation, a way to build data pipelines. This has a visual interface showing the various stages of processing or transformation to be applied, such as “read table”, “remove sensitive data”, “standardise contact” etc. Another example of the use of AI in the product is the internal documentation, where you can chat with an AI that searches the documentation and comes up with answers to questions, saving you from having to search through it yourself.
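The kind of profiling statistics mentioned above (record counts, null detection and pattern recognition) can be sketched in a few lines. This is a generic illustration rather than Ataccama's profiling engine; the function names and the pattern notation (9 for a digit, A for a letter) are assumptions.

```python
import re
from collections import Counter

def shape(value):
    """Reduce a value to its pattern: digits become 9, letters become A,
    and all other characters are kept as they are."""
    digits_masked = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", digits_masked)

def profile_column(values):
    """Compute simple profiling statistics for one column of string values."""
    non_null = [v for v in values if v not in (None, "")]
    patterns = Counter(shape(v) for v in non_null)
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "top_patterns": patterns.most_common(3),
    }

# Hypothetical sample column that is meant to hold social security numbers.
sample = ["123-45-6789", "987-65-4321", None, "", "12345"]
stats = profile_column(sample)
# stats["top_patterns"] is [("999-99-9999", 2), ("99999", 1)]
```

The dominant pattern suggests what the column is meant to contain, and values that follow a rarer pattern, such as the five-digit entry here, are natural candidates for a data quality rule.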
Ataccama’s products can be deployed on-premises, in a public or private cloud, or in a hybrid environment. Ataccama’s data quality suite competes with products like Talend, Informatica and IBM in terms of platform suites, and with more specialist products like Experian and others in certain areas like customer name and address validation. It now competes with Collibra in the data quality space, given the latter’s recent acquisition of the OwlDQ data quality product.
Ataccama customers, who range across industries such as financial services, often have quite challenging environments in terms of size and scale, and this has long been a strong point for Ataccama. The vendor recently commissioned a third-party market research firm to interview its customer base, and this feedback reported a 30-50% productivity increase across customer data management teams, a 10-25% reduction in spending on data management tools, at least a halving of issues related to fraud and compliance breaches, and better outcomes from core business processes due to better data management.
The bottom line
Ataccama has a full-function data quality product with a rich feature set, proven high performance and a substantial customer base that has deployed the product in challenging environments. It should be high on your candidate list if you are looking for a data-quality solution.
Ataccama data governance
Last Updated: 9th November 2023
Mutable Award: Gold 2023
The Ataccama ONE data management platform covers data governance, data quality and master data management. Compared to specialist data catalogue products, it has a broader range of functionality since it has mature and highly functional data quality capabilities, as well as a master data management hub. The product can be deployed on the cloud or on-premises, with most newer customers opting for cloud deployment. Over three hundred customers use the data catalogue, over half of the complete Ataccama customer base. The product competes with specialist data governance products like Collibra and Alation as well as broader data management tools like Informatica. Customers are from a wide range of industries, though the largest verticals are financial services, telecommunications, pharmaceuticals and retail.
One unusual aspect of Ataccama is that it surveys its customers in detail about the benefits they have derived from their projects. In a recent survey, for example, customers reported 30-50% better productivity, a 10-25% reduction in spending (e.g. via quicker finding of defects), two to four times lower risk of fraud and data breaches, and a 2-8% efficiency increase from better outcomes of data management.
In my experience, it is quite rare for software vendors to carry out such in-depth analysis of customer benefits, beyond merely basic customer satisfaction surveys. It is something that should be encouraged, and doubtless gives customers confidence that the vendor is interested in the long-term success of their projects, and does not lose interest once the payment for the software licence has cleared. Customers include T-Mobile, which used Ataccama in a huge data classification project involving 22,000 databases and 5,000 applications. Other customers include RSA, Societe Generale, Avon and Heineken.
Customer Quotes
“Ataccama has been pivotal in helping T-Mobile secure our vision of understanding our customers so well, we know when they have problems almost before they do.”
Daniel West, Data Management Lead, T-Mobile
“Data lineage is one of the most critical elements, … enabling root cause analysis.”
Piotr Pietrzyk, Head of Data Governance, Avon
“The direct contact to the team here to the technicals, you do not always have in other companies.”
John Reimers, Master data program manager, Marti Group
In terms of the data governance area specifically, Ataccama ONE has a data catalogue, business glossary, workflow, data lineage, policy management, data quality (profiling, merge/matching, data enrichment etc), data observability and monitoring etc. In other words, it has pretty much the full range of capabilities that you would expect from a data governance tool. The product has a knowledge graph that can visually show relationships between data, and this is interactive, so for example you can focus on one particular data area and switch to a detailed view to see information like the data quality scores and statistics for that particular piece of data. Data lineage is provided organically to a degree, though there is also a partnership with data lineage specialist Manta. All this range of functionality is delivered via a unified user interface, since Ataccama has built its software from scratch rather than assembling it through acquisition. Other than the data lineage partnership, just about the only OEM of other software is in the niche area of customer name and address validation, which it does via Loqate, software used by almost all data quality vendors these days.
Ataccama provides an integrated platform ranging from data governance through to data quality and master data management. All this capability, other than a couple of very specialist areas such as name and address validation, is provided by software that was developed from the ground up to work together, rather than being patched together from various acquisitions. Consequently, Ataccama ONE is well suited for companies that want to implement a data governance solution that will not just be a stand-alone solution that documents the state of play with their data. Using this technology, they can move beyond this into data quality improvement and remediation, which will be a foundation for initiatives such as digitisation, and enable artificial intelligence initiatives with a sound basis of good quality data.
The Bottom Line
Ataccama ONE is well suited for companies that want to implement a data governance solution, but also want to move beyond this into actual data quality improvement and remediation, all using the same product suite.
Ataccama ONE
Last Updated: 3rd March 2021
Ataccama ONE encompasses data integration, data cataloguing, data profiling and data quality, and both reference and master data management. Specific elements of each of these are illustrated in Figure 1. Note that there is no explicit data governance product or module, though various relevant data governance capabilities are spread across the environment. It is worth commenting that all these capabilities have been developed in-house and all of them share metadata, which makes a pleasant change compared to offerings from some other vendors that consist of products that were not initially designed to work together. There is also a single engine underpinning everything, with an extension for big data processing. The only exception to this rule is for data lineage, for which Ataccama embeds the Manta Unified Lineage Platform. Manta is a leading provider of specialist data lineage solutions.
Ataccama ONE is available to run on all the major cloud platforms (the product supports implementations using Kubernetes and Docker containers) and can be provided as a fully managed service if required. On-premises and hybrid deployments are also supported.
Customer Quotes
“Because we had very high performance expectations, Ataccama was a clear winner.”
Miroslav Umlauf, Data Management Director at Avast
“The cost vs business value is significantly better than competitors and the power of the AI engine and capabilities is industry leading.”
Head of Data in the Finance Industry
The other notable result of the deep integration provided by Ataccama ONE is that the various user interfaces that are provided, which are targeted at different personas, are consistent across all of the platform’s capabilities. As illustrated in Figure 2, this is visually appealing. More notably, this screenshot shows the forthcoming release (2.0) of Ataccama ONE, in which the company is introducing its Knowledge Catalog as a superset of both its Data Catalog and Business Glossary.
More generally, the company describes its platform as “self-driving”, by which it means that it has built significant levels of machine learning and AI into Ataccama ONE. As an example, the data catalogue, which includes full text search and supports natural language processing, will learn from past experience of relevant searches, and come up with notifications if it detects an anomaly or other alert-worthy information, such as “looks like a table with GDPR content”. While on this topic, it is worth noting the company’s support for data masking (both static and dynamic), which includes consistent masking and the ability to preserve referential integrity, amongst other algorithms. In addition, consent management is provided within the MDM module. Unfortunately, there are no features to support the automation of data subject access requests (DSARs). While the company is by no means unique in building machine learning into its platform, we particularly like the fact that it has also built explanations of its AI into the offering. We have not seen this level of explainability from other vendors.
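Consistent masking that preserves referential integrity can be illustrated with a short sketch. The masking algorithms Ataccama actually uses are not described here; this example assumes keyed hashing (HMAC-SHA-256 from Python's standard library), which is one common way to make the same input always map to the same masked token. The key, the token prefix and the sample tables are hypothetical.

```python
import hashlib
import hmac

def mask(value, key):
    """Deterministically replace a value with a token: the same value and key
    always give the same token, so joins on masked columns still work."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:12]

def mask_column(rows, column, key):
    """Return copies of the rows with one column masked."""
    return [{**row, column: mask(row[column], key)} for row in rows]

# Hypothetical secret key and sample tables, for illustration only.
SECRET_KEY = b"example-key-not-for-production"
customers = [{"customer_id": "C001", "name": "Alice"}]
orders = [
    {"order_id": 1, "customer_id": "C001"},
    {"order_id": 2, "customer_id": "C002"},
]

masked_customers = mask_column(customers, "customer_id", SECRET_KEY)
masked_orders = mask_column(orders, "customer_id", SECRET_KEY)

# The foreign key relationship survives masking.
same_customer = masked_orders[0]["customer_id"] == masked_customers[0]["customer_id"]
```

Because the token depends only on the value and the key, the order for customer C001 still joins to the masked customer row, while the original identifier no longer appears in either table.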
Other features of note include the fact that the IDE is Eclipse-based, there is support for REST APIs and, where relevant, jobs can run natively on Spark, Databricks and so on, so that transformations can either be supported in ETL or ELT mode. Beyond this, the company provides a Reference Data Management module, with approval-based workflow capabilities and governance workflows built in to manage and distribute reference data. Data quality rules can be exposed as a web service.
Ataccama ONE is a comprehensive platform for establishing and managing (cloud-based) analytic environments. It has all the fundamental requirements that one could ask for, though from a marketing perspective it might be advantageous to the company if it called out data governance as a module in its own right. That said, spreading governance capabilities across the environment makes a lot of sense so this is a marketing argument rather than a technical one.
Ataccama ONE has the distinct advantage of being a natively integrated, single engine solution, as opposed to a bunch of individual modules that have been stitched together. This is not only relevant when it comes to the performance and scalability of the platform but also to the end-user interface, which is consistent, by persona, across the environment. The product’s “self-driving” approach to automation is also significant and perhaps ahead of some of its competitors. That said, all major market players are investing in AI and automation capabilities, and one area where Ataccama stands out is in its explainability.
The Bottom Line
Ataccama, while by no means one of the biggest names in this space, is a significant contender with a proven track record. It is certainly worth serious consideration as a potential supplier.
DQ Analyzer
Last Updated: 22nd January 2013
Ataccama's data profiling tool is DQ Analyzer (version 8) and it is available as a free download from the company's website. This is compelling for users who want to demonstrate to sceptical organisations that they really do have a data quality problem. Of course the purpose of making this available for free is to entice those users to subsequently license the company's DQ Center, as well as other products, in order to resolve the problems identified by DQ Analyzer.
Ataccama DQ Analyzer as a pure-play profiling tool is decent without being exceptional. On the other hand, given that it is free, it is sensational value for money.
While it isn't fussy about it, Ataccama tends to focus on financial services, insurance, healthcare, telecommunications, government and the public sector. Its customers include both major institutions and small or medium-sized businesses. Ataccama's software is resold by iWay (a division of Information Builders) and iWay has embedded both DQ Center and Master Data Center into its enterprise service bus.
Ataccama has over 100 customers from large multi-national corporations to mid-sized businesses in a variety of industries including Allianz, a number of medium sized banks and insurance companies, Orange, Telefonica and T-Mobile.
The product is Windows-based and supports JDBC for database access. There is also a text file reader, though no comparable facility for Microsoft Excel. There is a special version of DQ Analyzer available for use with Teradata (with which Ataccama is a partner).
As a product DQ Analyzer is clearly in the profiling and not the discovery camp. This is somewhat surprising given that Ataccama markets Master Data Center as an MDM offering: one would expect that features such as the ability to identify matching keys would be a useful ability in support of MDM. Also noteworthy is that DQ Analyzer only provides statistical analysis of your data. Unlike other profiling products it has no ability to monitor data quality on an on-going basis. This is provided through a separate product called DQ Dashboard, which is chargeable. One feature of the product suite that is worth special mention is DQ Issue Tracker, which supports data governance (by tracking your remediation efforts) and is unique in the marketplace.
Apart from training, Ataccama offers data quality consulting, master data management services and information governance. Note that the emphasis is on the company's broader solution set rather than data profiling or discovery per se.