CluedIn is a master data management vendor founded in 2015 and based in Copenhagen. It has over a hundred corporate customers and over 80 staff. Its product is cloud-native and written to take advantage of Microsoft’s Azure cloud platform. CluedIn is unusual (though not unique) in using a graph database, Neo4j, as its base rather than a relational database.
The company grew 34% last year and had revenues of around $10 million. Customers include Sega, Svevia, Nykredit, Gallagher and Bayer.
Fig 1 - CluedIn, bring the business into the supply-chain of data
CluedIn is a modern master data management (MDM) product aimed firmly at business users rather than technology professionals. It includes data quality features such as data enrichment, as well as support for data governance and metadata management. The software was designed from the ground up for the cloud, and in particular, there is a deep relationship between CluedIn and the Microsoft Azure platform. Microsoft themselves are a customer, using CluedIn for internal MDM purposes, as indeed is SAP.
Customer Quotes
“As a Graph-based system, we have experienced firsthand how easy it is to ingest and map data, and how simple it is to define our data model.” Robert Brown, Enterprise Architect, GuardRisk
“We chose CluedIn for its powerful AI capabilities and seamless Azure-native integrations to empower our business and technical teams with data that’s ready for insights & AI-driven innovation.” Felix Baker, Head of Data Services, Sega
Fig 2 - The World’s First Master Data Management and AI Integration
CluedIn uses AI heavily for matching and merging, and has a natural language interface in addition to the usual drag-and-drop interface. CluedIn has connectors to data sources, but in practice many customers move data into CluedIn from a data lake, avoiding the need for complex ETL processing and data integration.
CluedIn is based on the Neo4j graph database rather than a relational database. This underpinning means that there is no need to set up a data model or formal schema, and a graph database is by its nature well suited to handling relationships between data. In principle this flexibility can come at the expense of performance, but that depends on the use case. Graph databases are very efficient at complex queries because they store relationships natively and so avoid JOIN operations, and MDM queries are frequently complex. Graph databases tend to be slower than relational databases at writing records, but that is of limited relevance in the case of master data. One CluedIn customer has over 350 million records, and most master data management projects deal with millions of records rather than billions, since they hold data such as customers and products rather than vast numbers of transactions.
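
The point about relationships being stored natively can be illustrated with a minimal sketch. It uses plain Python dictionaries, not Neo4j or any CluedIn API, and the record names and the `related` helper are hypothetical. Each record holds direct references to its neighbours, so finding everything connected to a customer is a traversal along those references rather than a series of table joins.

```python
from collections import deque

# Hypothetical master data. Each record lists the records it points to,
# which mimics how a graph store keeps relationships next to the data.
edges = {
    "customer:acme": ["contact:jane", "product:widget"],
    "contact:jane": ["address:copenhagen"],
    "product:widget": ["supplier:globex"],
    "address:copenhagen": [],
    "supplier:globex": [],
}

def related(graph, start, max_hops):
    """Return {record: hops} for records reachable from `start` within `max_hops`."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]  # the starting record is not "related" to itself
    return seen

print(related(edges, "customer:acme", 2))
```

In a relational model the same two-hop question would typically need one join per hop across link tables, which is why deeply connected queries tend to favour the graph approach.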
CluedIn makes heavy use of artificial intelligence (AI). Unlike some competitors, which use machine learning for matching and merging, CluedIn uses large language models (LLMs). It uses a blend of standard LLMs such as ChatGPT and smaller, locally hosted language models such as Llama from Meta, Mistral and even DeepSeek. In one customer example, games company Sega receives purchase transactions from broader gaming industry platforms, covering everything from complete games (Sega owns 800 game titles) to accessories used within games. Sometimes the stock-keeping unit (SKU) is not present on the transaction, just the name of the item purchased. That name may not even mention the game, and the game itself may be referred to in different local languages such as Korean or Japanese. Formerly, a team of staff had to assess such transactions manually and allocate each ambiguous transaction to a particular game so that the commission was paid correctly. They now use an LLM called from within CluedIn to assign the transaction record to the correct game. In practice the LLM has proved very effective at this, though its recommendations are checked by human beings for quality control purposes. Even so, this use of AI has dramatically reduced the time taken for this work.
In general, using an LLM for matching and merging does raise some practical issues, such as limited context windows for bulk transactions and inconsistent results, since LLMs are inherently probabilistic. Extensive prompt engineering can improve the results, though some inconsistency will always remain. However, an LLM can take a first pass and refer problematic cases to a human expert, which still saves time.
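
The "first pass, then refer problematic cases to a human" pattern can be sketched as follows. This is an illustration under stated assumptions, not a description of how CluedIn or Sega implement it: the `triage` function, the `classify` callable (a stand-in for whatever LLM call is used), and the `runs` and `min_agreement` parameters are all hypothetical. The idea is to ask the model the same question several times and treat disagreement between its answers as a signal that the case needs human review.

```python
from collections import Counter
from typing import Callable, Optional, Tuple

def triage(item_name: str,
           classify: Callable[[str], str],
           runs: int = 5,
           min_agreement: float = 0.8) -> Tuple[Optional[str], bool]:
    """Ask the model `runs` times and return (title, needs_human_review).

    `classify` is a hypothetical stand-in for an LLM call that maps an
    item name to a game title. If the most common answer does not reach
    `min_agreement`, no title is assigned and the case is flagged.
    """
    votes = Counter(classify(item_name) for _ in range(runs))
    title, count = votes.most_common(1)[0]
    if count / runs >= min_agreement:
        return title, False
    return None, True

# Usage with a deterministic stub in place of a real model.
print(triage("Blue Hedgehog Plush", lambda name: "Example Game"))
```

The threshold trades reviewer workload against risk: a higher `min_agreement` sends more cases to people, while a lower one accepts more of the model's answers unchecked.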
One of the issues for large corporations is how to implement MDM effectively. MDM solutions have been around since the early 1990s, but implementations have typically had mixed success, with limited business engagement in many cases. This has been due partly to the need for data modelling skills and partly to the complexity of some aspects of the technology. CluedIn has aimed at business users from the start and, with its unusual database underpinning and extensive use of AI, can credibly claim to bring MDM to end users rather than to technologists. The product is a plausible alternative to more traditional MDM solutions such as those from Informatica and Profisee.
The bottom line
CluedIn is an innovative MDM vendor, with an intuitive user interface aimed at business users and the ability to circumvent formal data modelling thanks to its graph database underpinnings. With over a hundred corporate customers, heavy use of AI and strong recent growth, it should be seriously considered by companies that are reviewing their MDM needs, particularly if they want greater business engagement.