Aerospike
Update solution on June 24, 2019


Fig 01 Systems of record
Aerospike is a distributed, massively parallel NoSQL database with a three-tier architecture, as illustrated in Figure 2. The client layer is cluster-aware and includes open source client libraries, which implement Aerospike APIs, track nodes, and know where data resides in the cluster. The distribution layer manages cluster communications and automates fail-over, replication, cross data center synchronisation, and intelligent re-balancing and data migration. The data layer stores data in schema-free, key-value format, typically with indexes in memory (DRAM) and data on SSDs, though you have the option of putting everything in memory or everything on SSDs if you prefer. Aerospike also supports Intel Optane Persistent Memory technology, which is already available on Google Cloud. This is superior to conventional DRAM. Note that Aerospike was specifically designed, from its inception, to run on SSDs rather than rotating disks.
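To make the idea of a cluster-aware client more concrete, the following is a minimal sketch of how a client library might work out which node holds a given record without asking a coordinator. It is not Aerospike's implementation: the SHA-1 hash, the partition count, the round-robin partition map and the function names are all assumptions chosen for illustration.

```python
import hashlib

# Assumption: a fixed number of partitions, chosen here for illustration.
N_PARTITIONS = 4096


def partition_id(set_name: str, user_key: str) -> int:
    """Map a record key to a partition using a stand-in hash (SHA-1)."""
    digest = hashlib.sha1(f"{set_name}:{user_key}".encode()).digest()
    # Keep the low 12 bits of the first two digest bytes: a value in 0..4095.
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


def build_partition_map(nodes):
    """Hypothetical partition map: partitions are spread round-robin over nodes."""
    return {pid: nodes[pid % len(nodes)] for pid in range(N_PARTITIONS)}


def route(partition_map, set_name: str, user_key: str) -> str:
    """Return the node the client would contact directly for this key."""
    return partition_map[partition_id(set_name, user_key)]
```

With a map built from three hypothetical nodes, `route(build_partition_map(["node-a", "node-b", "node-c"]), "users", "alice")` always returns the same node for the same key, which is what lets a client reach the right server in a single hop.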

Fig 02 Aerospike architecture
“Working with Aerospike helps us analyze data faster than ever – evaluating billions of data points, across 75 million daily transactions, all in real time.”
ThreatMetrix

Fig 03 Aerospike Connect for Spark
From a transactional perspective, Aerospike provides strong consistency guarantees (linearizability) within a single clustered instance of the database. If you have multiple clustered instances in different locations (for geo-location purposes or disaster recovery), then linearizability is only guaranteed within each individual database instance. The company uses XDR (Cross Data Center Replication) to automatically replicate records to remote instances of Aerospike in other locations, and in this case records are shipped asynchronously to the remote instance. More generally, you can choose either strong consistency or high availability, though this is not a completely either/or proposition, since you can mix and match your choices within a single cluster. Note that in a sharded system, which Aerospike provides, it is not even theoretically possible to guarantee both high availability and immediate consistency, so this is something that all NoSQL and NewSQL vendors struggle with. The company is working on providing write (but not read) consistency in conjunction with high availability.
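The practical consequence of asynchronous shipping can be shown with a small simulation. This is a sketch only: the `Cluster` and `AsyncShipper` classes are hypothetical stand-ins, not part of XDR or any Aerospike API, and a real system would ship records in the background rather than on an explicit call.

```python
from collections import deque


class Cluster:
    """Hypothetical stand-in for one clustered database instance."""

    def __init__(self, name):
        self.name = name
        self.records = {}


class AsyncShipper:
    """Hypothetical stand-in for asynchronous cross data center shipping."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = deque()  # records written locally but not yet shipped

    def write(self, key, value):
        # The write completes as soon as the local instance has it.
        self.local.records[key] = value
        self.pending.append((key, value))

    def ship(self, max_records=None):
        """Apply queued records to the remote instance; return how many were shipped."""
        shipped = 0
        while self.pending and (max_records is None or shipped < max_records):
            key, value = self.pending.popleft()
            self.remote.records[key] = value
            shipped += 1
        return shipped
```

Between a call to `write` and the next call to `ship`, a reader at the remote instance sees older data than a reader at the local one, which is why linearizability holds within each instance but not across them.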

Fig 04 Aerospike Connect for Kafka
With respect to supporting analytics, the company provides Aerospike Connect for Spark, as illustrated in Figure 3, though we should comment that the Flink integration is currently in development and is not yet available. As an alternative to using the embedded Aerospike Connect for Spark, users may opt to use any existing Spark instances for analysis purposes. There is also a Kafka connector that supports both inbound and, relatively unusually, outbound traffic. This is illustrated in Figure 4.
As far as language support is concerned, Java, Go, Python and various other languages are supported, but not Scala or R. While the support for Spark is sophisticated, there is no support for either TensorFlow or PMML (Predictive Model Markup Language).
The key thing about Aerospike is the combination of its performance with how it scales, along with single-record consistency guarantees. Note that this wording is deliberate: the point is not just that Aerospike scales but how it does so. It scales extremely efficiently, requiring far fewer servers than rival products to support the same level of scale and performance, which results in a lower total cost of ownership. While there are multiple factors that affect this, one that we have not previously mentioned is that the SSD optimisation designed into Aerospike means that you avoid the complexity and cost of the caching layers that bedevil other solutions.
The Bottom Line
What do you want from a hybrid system that supports both analytics and transaction processing? Low latency, performance, scalability and a low cost of ownership. Aerospike offers all of these. We would like to see support for non-Spark machine learning and analytics but, that caveat aside, Aerospike is impressive.