Redis Labs is the ‘open source home’ and official sponsor of Redis, a leading in-memory database platform that uses a key-value storage paradigm. It is also the commercial provider of Redis Enterprise, an enhanced version of Redis that provides additional functionality designed for the enterprise. This is available both on-premises and in the cloud (Amazon, Microsoft and Google). Redis itself is often rated as one of the most popular NoSQL databases – if not the most popular – and Redis Labs has more than 7,900 paying customers.
The company also extends the native characteristics of Redis and Redis Enterprise to many popular industry use cases implemented through modules that the company has developed. By “module” the company means functionality embedded into the product as opposed to something tacked on top.
Redis Labs is a privately held company backed by venture capital. It was founded in 2011 and its corporate headquarters are in Mountain View, California. It has additional offices in London, Tel Aviv and Bangalore.
Company Info
Headquarters: 700 E El Camino Real, Suite 250, Mountain View, CA 94040, USA
Telephone: +1 415 930 9666
Redis Enterprise is an in-memory, distributed (automated partitioning), NoSQL database with a key-value store as its underpinning. The core open source and commercial capabilities are shown in Figure 1. However, this description does Redis a disservice because the company’s approach to modules – see Figure 2 – is such that, in reality, Redis is better thought of as a multi-model database that can be used to support document processing, graph traversals, stream processing, machine learning, time-series, search and so forth. It is also possible to write your own modules.
Fig 02 - Redis Modules
From a transactional perspective, Redis Enterprise is ACID compliant within a single cluster that is not geographically distributed. Both synchronous and asynchronous replication are supported and therefore both Active/Active and Active/Passive deployments, in the former case relying on CRDT-based (conflict-free replicated data types) strong eventual consistency.
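The conflict resolution behind Active/Active deployments relies on CRDTs. Redis Enterprise’s implementation is internal to the product, but the core idea can be sketched with a simple grow-only counter (G-counter) in plain Python – the class and method names below are illustrative, not Redis APIs:

```python
# Illustrative G-counter CRDT: each replica increments only its own slot,
# and merge takes the element-wise maximum. Merges are therefore
# commutative, associative and idempotent, so any two replicas that
# exchange state converge to the same value (strong eventual consistency).

class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.counts = [0] * n_replicas

    def increment(self, amount=1):
        self.counts[self.replica_id] += amount

    def merge(self, other):
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)

# Two geo-distributed replicas accept writes independently...
a = GCounter(0, 2)
b = GCounter(1, 2)
a.increment(3)
b.increment(5)

# ...then exchange state in either order and converge on the same total.
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 8
```

Redis Enterprise applies the same principle to richer data types (counters, sets, and so on), which is what allows every geo-distributed replica to accept writes without a coordination round-trip.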
Customer Quotes
“We have a very high concurrency: about 40,000 or more at peak times logging into our system to utilise our services. We’ve had very good results with fetching data in pre-populating forms and the user experience has been very good.” India’s National Informatics Centre
“Redis Labs delivered on its commitments to demonstrate how its application can scale and provide the level of performance needed for high speed, high volume transaction processing. The Redis Labs solutions teams were committed to finding a solution to the problem that we presented and they were dedicated and diligent in following up on providing the solution that we needed.” Fortune 100 Financial Services
From the perspective of time-series data the key issue is how Redis modules work. Note that these are embedded into the database engine and not just layered on top. The relevant options are illustrated in Figure 2 and, of these, it is the time-series module, along with RedisGears and possibly RedisAI, that are of interest here, though Redis Streams may be useful in supporting high ingestion rates.
RedisTimeSeries provides built-in aggregation functions for calculating minima, maxima, sums, averages and so on. More significantly, you can label data – based on timestamps – on either an individual or global basis and then use those labels to support analytics. This will be particularly useful in Internet of Things (IoT) environments where you are doing initial analytics at the edge. Additional capabilities include data compression across different time series, time-series-specific indexing, support for time buckets, programmable retention policies, and downsampling (reducing the sampling rate so that you can determine the granularity – typically of sensor data – that you require). There are also built-in Prometheus and Telegraf interfaces and visualisation support through Grafana.
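The idea behind time buckets and downsampling can be illustrated in plain Python. This is a conceptual sketch of what a RedisTimeSeries compaction rule produces, not the module’s actual API; the sample data is invented:

```python
from collections import defaultdict

def downsample(samples, bucket_ms, agg):
    """Group (timestamp_ms, value) samples into fixed-width time buckets
    and aggregate each bucket -- conceptually what a RedisTimeSeries
    compaction rule (e.g. an average over one-second buckets) yields."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_ms].append(value)
    return {start: agg(vals) for start, vals in sorted(buckets.items())}

# Raw sensor readings taken every ~250 ms, downsampled to 1-second averages.
raw = [(0, 10.0), (250, 12.0), (500, 11.0), (1000, 20.0), (1750, 22.0)]
print(downsample(raw, 1000, lambda v: sum(v) / len(v)))
# {0: 11.0, 1000: 21.0}
```

Swapping the aggregation function (`min`, `max`, a sum) gives the other built-in aggregations, and the bucket width is the knob that controls the granularity you retain.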
All Redis modules are designed to interoperate with one another but RedisGears takes this one step further by facilitating inter-module transformations. It provides a serverless environment and allows you to aggregate data across multiple Redis database instances and react to activity based on pre-defined triggers. In other words, you get event-driven data transformations from one model to another, in real-time, in memory.
Then there is RedisAI. This enables the deployment of machine learning models and model serving, with the database supporting relevant languages such as Python, R and Scala plus deep learning support with the ability to embed TensorFlow, PyTorch and TorchScript models into your analytic workflows.
Finally, Redis is well-known for its performance, not least because it has historically run everything in memory. However, as the company moves away from focusing on caching use cases into wider environments, it can no longer assume that all of its clients can afford the amount of memory that may be required. For this reason, warm (as opposed to hot) data may be stored on SSDs, and the company has been working closely with Intel on its Optane Persistent Memory technology.
Apart from its performance and scalability – which are obviously major factors – the most outstanding thing about Redis Enterprise is the flexibility that its support for different data structures and modules provides. From a time-series perspective the relevant module has some significant features. However, we would like to see more geo-spatial support: the product is limited to latitude and longitude, which may be enough for some IoT applications, but it does not offer the breadth of capability that some other databases in this space do. Conversely, Redis offers many functions that competitive time-series offerings do not.
The Bottom Line
It is interesting to observe how Redis has managed to leverage its initial success as a caching technology into something more general-purpose. It is now a major contender across a range of functionality and we expect that to be true with respect to time-series as well.
RedisGraph
Last Updated: 11th September 2020
Mutable Award: Highly Commended 2020
RedisGraph is the graph database module for Redis where by “module” the company means functionality embedded into the product as opposed to something tacked on top. It is available via a Docker container, downloadable software, and as an optional part of Redis Enterprise. As a relatively young product, it currently lacks some of the advanced features of more mature products (for example, strong consistency is not yet available). That said, it offers a major point of distinction from other graph products: it represents and stores graphs as sparse adjacency matrices instead of adjacency lists. This enables much faster ingestion and query performance.
The product itself is a property graph. Queries are written in (a subset of) the Cypher query language and, further, Redis Labs is participating in the GQL project to create a standardised graph querying language.
Customer Quotes
“We tried several graph database technologies and we really found that RedisGraph is the one that gave us the speed to solve instant real-time problems, yielding a minimum 5x improvement in query speed.” IBM
Fig 01 - A graph and corresponding adjacency matrix
Generally speaking, there are two ways to store and represent graphs. The first of these is the adjacency list, which consists of a list of all the nodes in the graph it represents, each paired with the set of nodes with which it shares an edge. This is effectively the industry standard. The second is the adjacency matrix, which represents its graph as a matrix with one row and column for each node within the graph. Within this matrix, nonzero values indicate the presence of an edge between the nodes represented by the corresponding row and column. An example of an adjacency matrix, alongside the graph it represents, is shown in Figure 1.
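To make the two representations concrete, here is a small hypothetical graph expressed both ways in plain Python. RedisGraph itself implements this in C; this sketch is purely illustrative:

```python
# A small directed graph over nodes 0..3 with four edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency-list representation: each node maps to the nodes it points to.
adj_list = {i: [] for i in range(n)}
for src, dst in edges:
    adj_list[src].append(dst)

# Adjacency-matrix representation: matrix[i][j] is nonzero iff edge i->j exists.
adj_matrix = [[0] * n for _ in range(n)]
for src, dst in edges:
    adj_matrix[src][dst] = 1

# Checking for an edge is a list scan in one case, a direct index in the other.
assert 2 in adj_list[0]
assert adj_matrix[0][2] == 1
```

Note the space trade-off this hints at: the list stores one entry per edge, while the dense matrix stores one cell per *possible* edge, which is why sparse storage (discussed below) matters so much for the matrix approach.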
This method has several advantages in terms of performance. For example, determining whether two nodes share an edge is significantly faster when using matrix representation. Queries can be performed using direct mathematical operations such as matrix multiplication, which is often much faster than the traditional approach using adjacency lists. Performing a self-join, for instance, is simply a matter of multiplying a matrix by itself. Similarly, ingestion rates can be improved using adjacency matrices.
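The "query as matrix multiplication" point can be sketched directly: squaring an adjacency matrix counts two-hop paths, which is the essence of a self-join. This uses naive pure-Python multiplication for clarity; RedisGraph performs the equivalent operation over sparse matrices:

```python
def matmul(a, b):
    """Naive matrix multiplication. Entry (i, j) of A@A counts the
    number of distinct two-hop paths from node i to node j."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Adjacency matrix for a four-node graph: 0->1, 0->2, 1->2, 2->3.
m = [[0, 1, 1, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]

two_hop = matmul(m, m)
print(two_hop[0][2])  # 1: the single two-hop path 0->1->2
print(two_hop[0][3])  # 1: the single two-hop path 0->2->3
```

A traversal that an adjacency-list engine answers by iterating neighbour lists node by node becomes, here, one bulk algebraic operation that a library can optimise as a whole.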
The vast majority of matrices representing real-world graphs are ‘sparse’ – meaning that almost every value is zero – and RedisGraph stores adjacency matrices in Compressed Sparse Row (CSR) format, meaning that it effectively stores only the nonzero values. This almost always results in a very large saving in terms of memory. Notably, storing matrices in CSR format does not impact RedisGraph’s ability to use them in mathematical operations.
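A minimal CSR encoding in Python shows why the saving is large: only the nonzero values and per-row offsets are kept. This mirrors the standard CSR layout, not RedisGraph’s internal code:

```python
def to_csr(dense):
    """Compress a dense matrix to CSR. values holds the nonzeros,
    col_idx their column positions, and row_ptr[i]:row_ptr[i+1]
    delimits row i's slice of the other two arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

# A 4x4 adjacency matrix with 4 edges: 16 cells stored dense,
# but only 4 values (plus their indices) in CSR.
dense = [[0, 1, 1, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
values, col_idx, row_ptr = to_csr(dense)
print(values)   # [1, 1, 1, 1]
print(col_idx)  # [1, 2, 2, 3]
print(row_ptr)  # [0, 2, 3, 4, 4]
```

For a graph with millions of nodes but only a handful of edges per node, the dense form grows with the *square* of the node count while CSR grows only with the edge count, which is where the memory reduction cited later in this paper comes from.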
Fig 02 - Breadth-First-Search via adjacency lists and linear algebra
Going further, RedisGraph implements the GraphBLAS engine. GraphBLAS is an open effort whereby ‘BLAS’ stands for Basic Linear Algebra Subprograms. It provides the ability to use linear algebra running against sparse (compressed) matrices, and this combination optimises and simplifies many different graph queries and algorithms. For example, a comparison between implementations of Breadth-First Search using linear algebra and the standard approach using adjacency lists is shown in Figure 2. It should be clear that the algebraic approach is easier to write and to understand. It is also computationally simpler.
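The algebraic formulation of Breadth-First Search can be sketched as repeated vector-matrix products over a boolean semiring. Plain Python is used here for readability; GraphBLAS runs the same idea over compressed sparse matrices with the multiplication masked by the set of unvisited nodes:

```python
def bfs_levels(adj, source):
    """BFS expressed algebraically: the frontier is a boolean vector and
    one step is a vector-matrix product over the (OR, AND) semiring.
    Returns the BFS level of each node (-1 if unreachable)."""
    n = len(adj)
    level = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    depth = 0
    while any(frontier):
        for i, in_frontier in enumerate(frontier):
            if in_frontier and level[i] == -1:
                level[i] = depth
        # Next frontier = frontier "multiplied" by the adjacency matrix,
        # masked so that already-visited nodes are excluded.
        frontier = [
            level[j] == -1 and any(frontier[i] and adj[i][j] for i in range(n))
            for j in range(n)
        ]
        depth += 1
    return level

adj = [[0, 1, 1, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
print(bfs_levels(adj, 0))  # [0, 1, 1, 2]
```

The whole traversal reduces to a loop around one bulk algebraic operation, which is precisely the property that makes it both simple to express and easy for an engine to parallelise.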
There is one more advantage of the matrix representation that is worth noting. Matrix operations (such as multiplication) are readily and efficiently parallelisable, and this property carries over to queries and graph algorithms based on linear algebra. This means that RedisGraph benefits very significantly from parallelisation. It will, in the future, be able to take full advantage of the massive parallelisation offered by GPU-based processing.
Fig 03 - Graph visualisation in RedisInsight
The 2.0 and 2.2 releases of RedisGraph offer a number of new features over previous versions. These include various performance improvements, enhanced support for Cypher features, full-text search via RediSearch, and graph visualisation leveraging either RedisInsight (see Figure 3), Linkurious or Graphileon (Redis Labs is partnered with the latter two). RedisGraph has also adopted the SuiteSparse implementation of GraphBLAS, which has positive implications for performance, as well as LAGraph, an open source collection of GraphBLAS algorithms developed primarily for academia. A growing number of community-created drivers and connectors are also available.
Firstly, we should mention the recent release of RedisAI and RedisGears, which are Redis modules in the same way that RedisGraph is. As may be imagined, RedisAI is designed to serve ML/DL models that were trained on standard platforms such as TensorFlow and PyTorch, while RedisGears is a fully programmable engine that enables orchestration and data-flow across modules, data structures and cluster shards. In the context of this paper, that means RedisGraph should be able to interoperate with RedisAI and, for that matter, with Redis Streams.
Secondly, RedisGraph’s CSR storage format mitigates the problems with storage and memory usage that the matrix representation of graphs has had in the past, providing a sixty to seventy percent reduction in memory usage.
The Bottom Line
Although RedisGraph has matured significantly since its release in 2018, it is still very much a product in its infancy. That said, while it is still early days in terms of features, the theoretical advantages of using adjacency matrices are considerable. Even if you are not already using Redis, it is certainly worth looking into.