Anomalo was founded in 2018 and is headquartered in Palo Alto, California. Its software first appeared on the market in 2021. The company is venture capital-backed, having raised a $33 million Series B round in January 2024, with investors including SignalFire, Foundation Capital and Two Sigma Ventures. The company has grown rapidly and its customers include BuzzFeed, Casey’s, Discover Financial and Block. It has partnerships with several complementary technology providers including Databricks (whose venture capital arm is also an investor), Snowflake and Alation. At the time of the briefing in March 2024, Anomalo had around 50 employees and annual recurring revenue of just under $10 million, with 130% growth over the previous year.
Figure 1 – For Data Quality monitoring Anomalo offers a unique AI-first approach
Anomalo sits firmly in the data observability sector of the data quality market. It aims to automate data quality monitoring for large-scale enterprise data pipelines. It does this by using machine learning to analyse data in enterprise data warehouses, automatically spotting issues by comparing the data with its historical baselines and trends. For example, a daily feed of new transactions will have typical volumes of data by sales region. The software detects data feeds that differ significantly from what was expected: a particular feed might be missing entirely, or suddenly have null values in a field that is normally populated. In such cases the software alerts business users, and can interface with common ticketing software to aid resolution.
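To make the idea of comparing a feed with its historical baseline concrete, here is a minimal sketch in Python. It is not Anomalo's method, which is described only as machine learning based; it substitutes a simple z-score test on daily row counts per sales region. The function name, the region names, the sample figures and the threshold of three standard deviations are all assumptions made for illustration.

```python
from statistics import mean, stdev

def find_volume_anomalies(history, today, threshold=3.0):
    """Flag regions whose feed is missing or whose volume is far from baseline.

    history: dict of region -> list of past daily row counts (at least 2 each)
    today:   dict of region -> today's row count; an absent region means no feed
    Returns a list of (region, reason) tuples. Illustrative only.
    """
    alerts = []
    for region, counts in history.items():
        observed = today.get(region)
        if observed is None:
            alerts.append((region, "feed missing"))
            continue
        baseline = mean(counts)
        spread = stdev(counts)
        if spread == 0:
            # Perfectly constant history: any change at all is unexpected.
            if observed != baseline:
                alerts.append((region, "volume differs from constant baseline"))
            continue
        z_score = (observed - baseline) / spread
        if abs(z_score) > threshold:
            alerts.append((region, f"volume {observed} vs baseline {baseline:.0f}"))
    return alerts

# Hypothetical sample data: three regions, five days of history each.
history = {
    "EMEA": [1000, 1020, 980, 1010, 990],
    "APAC": [500, 510, 490, 505, 495],
    "AMER": [2000, 2050, 1950, 2020, 1980],
}
today = {"EMEA": 1005, "APAC": 120}  # AMER did not arrive at all

for region, reason in find_volume_anomalies(history, today):
    print(region, "->", reason)
```

A fixed threshold like this is exactly the kind of hand-written rule that becomes hard to maintain at scale, which is why a learned baseline that accounts for trend and seasonality is attractive for large pipelines.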
Moreover, the product goes deeply into root cause analysis, not just highlighting that something is amiss in a data feed but trying to work out why the problem has occurred. It will try to determine whether a particular issue is restricted to a certain subset of the data, for example records associated with a specific location. In this way, it tries to speed up the process of resolving the issue. Workflow within the product can track how quickly reported issues are resolved, and potentially escalate where needed. The aim is to detect issues before they cause major problems where possible. The product is modern, and in the numerous publicly available customer case studies and testimonials that I examined, ease of use was a common thread in the customer comments.
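The idea of narrowing an issue down to a subset of records can be sketched as follows. This is a simplified illustration rather than a description of how Anomalo works: it scans a few candidate dimensions and picks the segment where null values are most concentrated. The function name, the field and dimension names, the sample records and the scoring rule (null rate within the segment multiplied by the segment's share of all nulls) are assumptions.

```python
from collections import defaultdict

def localize_null_issue(records, field, dimensions):
    """Find the segment that best explains null values in `field`.

    records:    list of dicts, one per row
    field:      column being checked for nulls (None)
    dimensions: candidate columns to segment by, e.g. ["store", "channel"]
    Returns (dimension, value, null_rate_in_segment, share_of_all_nulls),
    or None if the field has no nulls. Illustrative scoring only.
    """
    total_nulls = sum(1 for row in records if row.get(field) is None)
    if total_nulls == 0:
        return None

    best = None
    for dim in dimensions:
        rows_per_value = defaultdict(int)
        nulls_per_value = defaultdict(int)
        for row in records:
            rows_per_value[row[dim]] += 1
            if row.get(field) is None:
                nulls_per_value[row[dim]] += 1
        for value, nulls in nulls_per_value.items():
            rate = nulls / rows_per_value[value]   # how broken is this segment
            share = nulls / total_nulls            # how much of the problem it holds
            score = rate * share
            if best is None or score > best[0]:
                best = (score, dim, value, rate, share)
    return best[1:]

# Hypothetical sample: every null amount comes from one store.
records = [
    {"store": "Leeds", "channel": "web", "amount": None},
    {"store": "Leeds", "channel": "pos", "amount": None},
    {"store": "York", "channel": "web", "amount": 10.0},
    {"store": "York", "channel": "pos", "amount": 12.5},
    {"store": "Bath", "channel": "web", "amount": 7.0},
    {"store": "Bath", "channel": "pos", "amount": 3.0},
]
print(localize_null_issue(records, "amount", ["store", "channel"]))
```

In this sample the nulls are spread evenly across channels but all come from a single store, so the store segment scores highest. That is the kind of pointer that shortens the time to resolution.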
Customer Quotes
“With Anomalo, we’re able to automatically detect data issues as soon as they appear in our information; it helps us understand the root cause before business users get impacted so that we can resolve things quickly.” Prakash Jaganathan, Senior Director of Enterprise Data Platforms, Discover
“We were in need of two core platform competencies, we didn’t need ten. We wanted those things to be best of breed at what they did – it’s a great benefit that these two things integrate with each other so seamlessly.” Cliff Miller, Enterprise Data Architect, Keller Williams
Figure 2 – Rich, visual alerts optimised for data quality insights
As well as understanding what the product does, as outlined above, it is important to understand what it does not do. Anomalo is aimed at machine-operated data pipelines, so it does not have functionality for traditional deduplication of human-entered data, customer name and address validation and enrichment, or merging and matching of duplicate records. It leaves this kind of thing to more traditional data quality and master data management tools. In the future the company is likely to explore additional functionality such as automatic detection of personally identifiable data, and monitoring the data quality of unstructured as well as structured data. Unstructured data is a hot topic at the moment as enterprises explore training large language models on their own data in order to deploy artificial intelligence projects. In many cases the data that such models will be trained on is unstructured, and such data has just as many quality issues as structured data. AI models are sensitive to the quality of the data that they are trained on, so any product that can significantly help in this area is likely to have a bright future.
Certain types of data situations are hard to deal with if you rely on manually entered data quality rules. One Anomalo customer has five million tables that it needs to monitor, and estimated that it would take them over 1,700 person-years of effort to manually code all the data quality rules that it would need. In such large-scale cases, some form of automated approach is the only practical way forward.
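As a rough sanity check on those figures, the short Python calculation below works out the per-table effort they imply. The 2,000 working hours per person-year is an assumption that is not in the customer's estimate, and since the estimate was "over" 1,700 person-years the result is a lower bound.

```python
TABLES = 5_000_000             # tables the customer needs to monitor (from the text)
PERSON_YEARS = 1_700           # customer's estimate, stated as a minimum
HOURS_PER_PERSON_YEAR = 2_000  # assumption: roughly 50 weeks of 40 hours

def minutes_per_table(tables, person_years, hours_per_year):
    """Average minutes of rule-writing effort implied per table."""
    return person_years * hours_per_year * 60 / tables

print(round(minutes_per_table(TABLES, PERSON_YEARS, HOURS_PER_PERSON_YEAR), 1))
```

Under that assumption the estimate works out at roughly 40 minutes of rule-writing per table, a modest amount for any single table that becomes unmanageable when multiplied across five million of them.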
Anomalo competes directly with Monte Carlo and with more traditional data quality software suites from vendors like Informatica and Ataccama. It focuses primarily on data that is machine-captured and high in volume, where manual data quality rule creation is impractical to scale. Traditional data quality vendors tend to focus on areas like customer name and address validation or product information management, where data like a new customer is typically entered by a human being, with all the potential risks for error and duplication that implies. Anomalo is particularly suited to situations where the number of records in the data pipeline is huge, and where automation is the most practical way of detecting data issues rather than manually coding rules. One product differentiator is the ability of the product to not just monitor data pipeline flows but also dig into the data itself to highlight unusual data values, something that not all its competitors can do at present.
The bottom line
Anomalo is a fast-growing vendor in the emerging data observability space within the broader data quality market. If you have a high volume of machine-collected data to deal with and monitor, then Anomalo should be on your shortlist of vendors to consider.