Update solution on September 17, 2024

data.world
Mutable Award: Gold 2024

The data.world technology has a semantic database and knowledge graph at its core, on top of which sit a data catalog and a data governance application, as well as an artificial intelligence capability. This is a cloud-native product, and the company partners with a range of complementary technologies, such as Snowflake, Monte Carlo Data and Matillion. The core product is aimed at business users rather than technologists, and it is notable that many of its customers have unusually deep penetration of the software: several customers have thousands of end-users actively using the product, rather than just a small number of super-users and administrators, as can happen in many data catalog implementations. One client, a well-known global consultancy, has 35,000 users of the tool.

The data.world technology competes directly with leading data catalog vendors such as Collibra, Alation and Informatica.

Customer Quotes

“Our business needs to be ready for change during these turbulent times, …we will be able to apply graph analytics to find bottlenecks and achieve operational excellence.”
Luke Slotwinski, VP, Data & Analytics, Prologis

“We are thrilled to have the opportunity to innovate with data.world. By taking full advantage of the knowledge graph capabilities of data.world’s data catalog, we are able to accelerate metadata enrichment and recommend complementary datasets, inspiring the creative uses of data.”
Vip Parmar, Global Head of Data Management, WPP

The data catalog has a fairly complete set of functionality, including a business glossary, connectors to capture metadata, data discovery tools, data lineage and an AI context engine. There is also some data quality and observability functionality, via the company's acquisition of Mighty Canary in May 2023. At the heart of the product is a semantic model and knowledge graph architecture.

Users are presented with a shopping-like experience, with data assets grouped in collections of related material; for example, there might be a collection of data around “customer information”. The product has a full search and discovery capability with keyword search, and can also show the data quality scores of the data being presented, as well as the sources of that data and its relationships to other data. This search capability can be embedded via an API in other tools, so, for example, a user of Tableau could invoke this discovery capability without leaving Tableau. The meaning of phrases like “net revenue” can be accessed from the business glossary within the tool.

The technology uses proprietary AI to allow users to populate descriptions of data objects. Since this AI has been trained on the internal data catalog, its descriptions are more accurate than a general-purpose public large language model would be. The model can explain itself, including showing the SQL that it generated and tables that it accessed in order to build up the basis for its descriptions. There are also tools to automate the creation and import of metadata, and the completeness of the metadata can be actively monitored.
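To make the idea of monitoring metadata completeness concrete, here is a minimal sketch in Python. It is not data.world's API or scoring method: the `CatalogEntry` class, the `REQUIRED_FIELDS` list and both functions are hypothetical names chosen purely for illustration, and the sketch assumes that completeness simply means the share of expected fields that have been filled in.

```python
from dataclasses import dataclass, field

# Hypothetical set of fields a catalog entry is expected to carry (an assumption).
REQUIRED_FIELDS = ("description", "owner", "source")


@dataclass
class CatalogEntry:
    """Illustrative stand-in for a data asset in a catalog."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)


def completeness(entry: CatalogEntry) -> float:
    """Fraction of required fields that are present and not blank."""
    filled = sum(
        1 for key in REQUIRED_FIELDS if entry.metadata.get(key, "").strip()
    )
    return filled / len(REQUIRED_FIELDS)


def incomplete_entries(entries: list[CatalogEntry], threshold: float = 1.0) -> list[str]:
    """Names of entries whose completeness falls below the threshold."""
    return [e.name for e in entries if completeness(e) < threshold]


entries = [
    CatalogEntry(
        "customer_orders",
        {"description": "Orders by customer", "owner": "sales-data", "source": "warehouse"},
    ),
    # Owner is blank and source is missing, so only one of three fields counts.
    CatalogEntry("net_revenue", {"description": "Revenue after returns", "owner": ""}),
]

for entry in entries:
    print(f"{entry.name}: {completeness(entry):.0%} complete")
print("Needs attention:", incomplete_entries(entries))
```

A report like this could be run on a schedule to show which assets still lack descriptions or owners, which is the kind of gap that automated description generation is meant to close.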

As well as software, data.world has built a large open data community with over 2 million users, dedicated to the sharing of publicly available datasets from, for example, governments and NGOs.

data.world has rapidly built up a base of prestigious customers and has achieved a deep level of penetration within many of those customers, something that is unusual in data catalog implementations. Its pioneering of openly available datasets via its active online community is a useful and complementary capability. The technology has a modern, appealing user interface and uses artificial intelligence in a controlled manner in order to improve productivity.

The bottom line

The data.world technology is a modern and differentiated approach to the world of data governance and data catalogs. It has been adopted by some prestigious companies and appears to be widely used within those customers, which is not always the case with data catalog technology. If you are looking for a data governance solution, you should carefully consider data.world as an option.
