Memgraph

Last Updated: 20th October 2023
Analyst Coverage: Philip Howard and Daniel Howard

Memgraph was founded in 2016 and has offices in London and Croatia. While suitable for many environments, the company targets complex graph analytics, often where multiple graph algorithms need to be used in conjunction. Examples include the industrial sector, where complex production networks need real-time performance analysis and optimisation, and the energy sector, where power grids need to be managed in a similar fashion. The company also addresses common real-time problems such as customer intelligence (recommendations), cybersecurity, and fraud detection. More generally, the company has historically focused on making operational graph analytics easier to use, especially for developers and data scientists.

As of 2021, the product – also called Memgraph – is available as open source using a dual licensing model: the Community Edition is freely available under the Apache license, while the Enterprise Edition is proprietary. The two editions use the same codebase, but the Enterprise Edition comes with additional features, such as LDAP, multi-tenancy, and additional disaster recovery options, as well as enterprise support. There is also a cloud-based managed service offering available on AWS.

The company partners with Cambridge Intelligence, Graphileon, FactGem, and Linkurious, amongst others.

Company Info

Headquarters: 20 Ropemaker Street, London EC2Y 9AR

Memgraph

Last Updated: 31st July 2023
Mutable Award: Gold 2023

What is it?

Memgraph is an in-memory, ACID-compliant graph database written in C++. In another context it could easily be described as an HTAP database, since it supports both transactional and analytic processing. It uses a property graph model and emphasises high performance, scalability and, most notably, real-time processing.

Fig 1 - Memgraph Marketecture

Memgraph’s approach to technology is to reuse or integrate with existing standards and market-leading products wherever it can, as illustrated in Figure 1. Thus, Memgraph uses openCypher to query the data (for which the company has built its own cost-based optimiser); integrates tightly with Apache Kafka; leverages LDAP, Active Directory and Kerberos for authentication purposes; and supports containerisation via Docker, as well as both Kubernetes and OpenShift. It also supports machine learning models built using TensorFlow or PyTorch, or written in R, Python or Julia. The company has also developed various graph algorithms that are shipped with the database, such as breadth-first search and weighted shortest path.
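To make the two algorithms named above concrete, here is a minimal pure-Python sketch of breadth-first search and weighted shortest path (using Dijkstra's algorithm). It is an illustration only, not Memgraph's implementation: the toy graph `GRAPH` and the function names `bfs_order` and `weighted_shortest_path` are hypothetical, and edge weights are assumed to be non-negative.

```python
from collections import deque
import heapq

# Hypothetical toy graph: node -> {neighbour: edge weight}.
GRAPH = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}


def bfs_order(graph, start):
    """Return nodes in the order breadth-first search first reaches them."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                visited.append(neighbour)
                queue.append(neighbour)
    return visited


def weighted_shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (total cost, path) or None if unreachable."""
    best = {source: 0}
    previous = {}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best[node]:
            continue  # Stale heap entry: a cheaper route was already found.
        for neighbour, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path
```

On this toy graph, the cheapest route from A to D goes through C and B (cost 4) rather than taking the direct A to B edge (cost 5), which is the kind of result a hop-count-only search such as breadth-first search would miss.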

Customer Quotes

“For analysis of our production networks we apply complex graph analytics. Until we found Memgraph, no other service met our needs in terms of flexibility, performance and custom analytics. Now we are able to integrate complex graph analytics into our internal applications, and deploy them with ease at global scale, and ultimately generate value.”
Fortune 500, Chemical Company

What does it do?

One of the key differentiators for Memgraph is its high performance. There are a number of ways it achieves this. For starters, it is written in C/C++. Consequently, the product enjoys an extremely small footprint: on start-up, it only consumes approximately 30MB of RAM, which means that Memgraph can easily run on edge devices, whether in IoT (Internet of Things) or mobile environments. The fact that Memgraph is an in-memory database is also significant since it will often mean that the entire graph can be held in memory. Not only will this aid performance in general but it will be particularly useful when the database needs to support mixed workloads.

Memgraph’s focus is on algorithm scalability and extensibility. In other words, you can extend and implement high-performance, user-customised algorithms and procedures. This is enabled through integration with the data science and machine learning ecosystem. Specifically, Memgraph allows you to extend its query language and implement your own custom procedures. These procedures are grouped into ‘Query Modules’, which can be loaded on start-up. Although the most performant and scalable way to implement these procedures is by using the Memgraph C Query Module API, Memgraph also exposes a Python Query Module API in an effort to make quick development and iteration possible for data scientists. An embedded Python interpreter inside the database makes it easy for data scientists to leverage libraries like scikit-learn, TensorFlow and PyTorch, and to run analytics directly on data stored inside Memgraph. Finally, Memgraph can be combined with more than 300 graph algorithms from NetworkX and works with machine learning libraries such as StellarGraph (www.stellargraph.io).
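As a rough illustration of what a Python query module can look like, the sketch below defines a read-only procedure that reports the degree of each node. It assumes the `mgp` module that Memgraph's embedded interpreter provides; the helper `count_degrees`, the procedure name `degrees`, and the file name mentioned afterwards are hypothetical. The core logic is kept in a plain function so it can be tried outside the database, where `mgp` is not available.

```python
from collections import Counter

try:
    import mgp  # Supplied by Memgraph's embedded interpreter; absent elsewhere.
except ImportError:
    mgp = None


def count_degrees(edges):
    """Return {node: degree} for an iterable of (source, target) pairs."""
    degrees = Counter()
    for source, target in edges:
        degrees[source] += 1
        degrees[target] += 1
    return dict(degrees)


if mgp is not None:

    @mgp.read_proc
    def degrees(ctx: mgp.ProcCtx) -> mgp.Record(node_id=int, degree=int):
        # Visit each edge exactly once by walking every vertex's outgoing edges.
        # Vertices with no edges at all do not appear in the result.
        edges = [
            (vertex.id, edge.to_vertex.id)
            for vertex in ctx.graph.vertices
            for edge in vertex.out_edges
        ]
        return [
            mgp.Record(node_id=node_id, degree=degree)
            for node_id, degree in count_degrees(edges).items()
        ]
```

Assuming the file were saved as `degree_module.py` in the query modules directory, the procedure would be invoked from openCypher along the lines of `CALL degree_module.degrees() YIELD node_id, degree`.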

Another way in which the product enables high performance is concurrency. Memgraph data structures are lock-free. For concurrency, Memgraph has implemented MVCC (Multi-Version Concurrency Control) with snapshot isolation to ensure that, for example, reads never block writes and writes never block reads. Not only does this contribute to performance, but the snapshotting used within MVCC combines with write-ahead logging to prevent data loss from occurring during system failure, hence providing a guarantee of durability. Together with the company’s extensive investment in testing and test-driven development, this makes for an eminently robust solution.
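The idea behind snapshot isolation can be shown with a toy multi-version store. This is a deliberately simplified, single-threaded sketch and not Memgraph's implementation: the class `VersionedStore` and its methods are hypothetical, and it omits conflict detection, garbage collection of old versions, and write-ahead logging. What it does show is why a reader pinned to a snapshot is unaffected by later writes.

```python
class VersionedStore:
    """Toy multi-version store: each write appends a new version of a key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit timestamp, value)
        self._clock = 0      # timestamp of the most recent commit

    def begin(self):
        """Start a read transaction pinned to the current commit timestamp."""
        return self._clock

    def write(self, key, value):
        """Commit a new version; older versions stay readable."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)

snapshot = store.begin()      # reader starts here
store.write("balance", 250)   # a later write does not disturb the reader

old_view = store.read("balance", snapshot)       # still sees 100
new_view = store.read("balance", store.begin())  # a fresh reader sees 250
```

Because writers only ever append versions, the reader never has to wait for the writer and the writer never has to wait for the reader, which is the property the paragraph above describes.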

Fig 2 - The Memgraph Lab user interface

We should also mention Memgraph Lab, illustrated in Figure 2. This is a lightweight visual user interface for developers, designed to help with openCypher query and graph development. It provides visualisation (of both graphs and schema), exploration capabilities, and the ability to tune queries through query profiling (with diagnostics and query plan details).

High availability replication is available in both the Enterprise and Community editions of the product. This is notable in that you would typically expect this feature to be reserved for the Enterprise Edition. Indeed, this is the case for several of Memgraph’s competitors. The fact that Memgraph does not follow suit helps make the Community Edition a real, and particularly appealing, option for small teams and startups.

Why should you care?

When choosing to use any (graph) database, you should be striving to optimise its performance. This is where Memgraph’s design decisions are relevant. It has been developed using C/C++, which will always outperform (with a smaller footprint) products written in other high-level languages such as Java. It has been designed from the outset to support real-time updates, and it has been designed as a “Smart Graph”.

We should also add that Memgraph is targeting the most complex graph problems, which other vendors typically ignore. Its emphasis on making graph analytics easier for data scientists is also noteworthy.

The Bottom Line

High performance, real-time capabilities, a focus on complex operational graph analytics, support for open standards, and an environment designed to make analytics as easy as possible: this sounds to us like a winning combination.
