One of the key differentiators for Memgraph is its high performance. There are a number of ways it achieves this. For starters, it is written in C/C++. Consequently, the product enjoys an extremely small footprint: on start-up, it only consumes approximately 30MB of RAM, which means that Memgraph can easily run on edge devices, whether in IoT (Internet of Things) or mobile environments. The fact that Memgraph is an in-memory database is also significant since it will often mean that the entire graph can be held in memory. Not only will this aid performance in general but it will be particularly useful when the database needs to support mixed workloads.
Memgraph’s focus is on algorithm scalability and extensibility. In other words, you can implement and extend high-performance, user-customised algorithms and procedures. This is enabled through integration with the data science and machine learning ecosystem. Specifically, Memgraph allows you to extend its query language by implementing your own custom procedures. These procedures are grouped into ‘Query Modules’, which can be loaded on start-up. The most performant and scalable way to implement these procedures is to use the Memgraph C Query Module API but, to make quick development and iteration possible for data scientists, Memgraph also exposes a Python Query Module API. This relies on a Python interpreter embedded inside the database, which makes it easy for data scientists to leverage libraries such as scikit-learn, TensorFlow and PyTorch, and to run analytics directly on data stored inside Memgraph. Finally, Memgraph can be combined with more than 300 graph algorithms from NetworkX and works with machine learning libraries such as StellarGraph (www.stellargraph.io).
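To give a flavour of what a Python query module might look like, the following is a minimal sketch rather than a definitive implementation. The function degree_counts and the procedure name degrees are hypothetical names chosen for illustration. The sketch assumes that the mgp module, which is only importable inside Memgraph's embedded interpreter, exposes the read_proc decorator, ProcCtx and Record as used here. Outside Memgraph the registration step is simply skipped, so the core logic can still be run and tested as ordinary Python.

```python
from collections import Counter


def degree_counts(edges):
    """Pure-Python core: count how many edge endpoints touch each node id."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


try:
    # Assumption: mgp is provided by Memgraph's embedded Python interpreter.
    import mgp
except ImportError:
    mgp = None  # running outside Memgraph, so no procedure is registered

if mgp is not None:

    @mgp.read_proc
    def degrees(context: mgp.ProcCtx) -> mgp.Record(node_id=int, degree=int):
        # Collect (from, to) vertex id pairs for every edge in the graph.
        edges = [
            (edge.from_vertex.id, edge.to_vertex.id)
            for vertex in context.graph.vertices
            for edge in vertex.out_edges
        ]
        # One result row per node id.
        return [
            mgp.Record(node_id=node_id, degree=degree)
            for node_id, degree in degree_counts(edges).items()
        ]
```

Keeping the algorithm in a plain function, separate from the registration code, is a design choice made for this sketch: it allows the logic to be iterated on and tested without a running database.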
Another way in which the product enables high performance is through concurrency. Memgraph’s data structures are lock-free, and it has implemented MVCC (Multi-Version Concurrency Control) with snapshot isolation to ensure that, for example, reads never block writes and writes never block reads. Not only does this contribute to performance, but the snapshotting used within MVCC combines with write-ahead logging to prevent data loss during system failure, thereby providing a guarantee of durability. Together with the company’s extensive investment in testing and test-driven development, this makes for an eminently robust solution.
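The idea behind snapshot reads under MVCC can be illustrated with a toy example. This is a simplified, single-threaded sketch and not Memgraph's actual implementation: the class names VersionedStore and Snapshot are hypothetical, and it omits write conflict detection, garbage collection of old versions and write-ahead logging. It shows only why a reader that started earlier is unaffected by a later write.

```python
import itertools


class VersionedStore:
    """Toy multi-version key-value store with snapshot reads."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = itertools.count(1)
        self._last_commit = 0

    def begin(self):
        """Start a reader that sees only data committed so far."""
        return Snapshot(self, self._last_commit)

    def commit(self, writes):
        """Append new versions rather than overwriting existing ones."""
        ts = next(self._clock)
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def read_as_of(self, key, ts):
        """Return the newest version committed at or before ts."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Snapshot:
    """A read-only view of the store fixed at one commit timestamp."""

    def __init__(self, store, ts):
        self._store = store
        self.ts = ts

    def read(self, key):
        return self._store.read_as_of(key, self.ts)


store = VersionedStore()
store.commit({"balance": 100})
reader = store.begin()           # snapshot taken before the next write
store.commit({"balance": 250})   # the writer does not wait for the reader
old_value = reader.read("balance")
new_value = store.begin().read("balance")
```

Because each commit appends a new version instead of modifying data in place, the earlier reader continues to see the value that was current when its snapshot was taken, while a reader started afterwards sees the newer value.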
Fig 2 - The Memgraph Lab user interface
We should also mention Memgraph Lab, illustrated in Figure 2. This is a lightweight visual user interface for developers, designed to help with openCypher query and graph development. It provides visualisation (of both graphs and schema), exploration capabilities, and the ability to tune queries through query profiling (with diagnostics and query plan details).
High availability replication is available in both the Enterprise and Community editions of the product. This is notable in that you would typically expect this feature to be reserved for the Enterprise Edition. Indeed, this is the case for several of Memgraph’s competitors. The fact that Memgraph does not follow suit helps make the Community Edition a real, and particularly appealing, option for small teams and startups.