ScyllaDB
Update solution on December 19, 2019

It is easy to think of ScyllaDB as Apache Cassandra with go-faster stripes. It is written in C++ instead of Java, so of course it is much more performant (and has a smaller footprint). However, while ScyllaDB is and will remain compatible with Cassandra, it is more than just a fast copy. For example, when ScyllaDB was first introduced it had an architecture based on sharding per core, while Cassandra was based on multi-threading. DataStax (which provides a commercial implementation of Cassandra) has since moved to adopt Scylla’s design (though the Apache version has not) but is still limited by the fact that it requires a JVM to run, with all the garbage collection issues that implies. In other words, it is Scylla that is now leading development in this space rather than Apache Cassandra or DataStax.
Moreover, ScyllaDB is proving to be a replacement not just for Cassandra but also for other NoSQL databases such as Redis and Aerospike. This is despite the fact that these products have extensive in-memory capabilities that Scylla has not yet introduced, although they are planned. We understand that only 30% of Scylla’s user base consists of Cassandra refugees.
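To make the shard-per-core idea mentioned above more concrete, here is a minimal sketch in Python. It is not ScyllaDB's actual implementation: the names `shard_for`, `Shard` and `ShardedStore`, the core count and the use of an MD5-based hash are all assumptions chosen for illustration. The point it shows is that each partition key maps to exactly one shard, and each shard owns its data outright, so no locks are needed between shards.

```python
import hashlib

NUM_CORES = 4  # hypothetical core count for this illustration


def shard_for(partition_key, num_shards=NUM_CORES):
    """Map a partition key to exactly one shard using a stable hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Shard:
    """Data owned by a single core; no other shard touches it."""

    def __init__(self, shard_id):
        self.shard_id = shard_id
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class ShardedStore:
    """Routes every request to the one shard that owns the key."""

    def __init__(self, num_shards=NUM_CORES):
        self.shards = [Shard(i) for i in range(num_shards)]

    def _owner(self, key):
        return self.shards[shard_for(key, len(self.shards))]

    def write(self, key, value):
        self._owner(key).write(key, value)

    def read(self, key):
        return self._owner(key).read(key)


if __name__ == "__main__":
    store = ShardedStore()
    for i in range(1000):
        store.write(f"user:{i}", f"value-{i}")
    # Show how the keys spread across the shards.
    for shard in store.shards:
        print(shard.shard_id, len(shard.rows))
```

In a multi-threaded design, by contrast, several threads may touch the same in-memory structures and must coordinate through locks. In this sketch the routing step replaces that coordination.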
ScyllaDB is available, with an open source license, both on premises and as a cloud-hosted option. The company will also be launching a Database as a Service offering in Q3 2018, having recently acquired Seastar.io technology that was focused on this market.
Customer Quotes
“Thanks to Scylla’s superior performance over Cassandra we saved operational and capital costs. More than a year in production, Scylla is serving 600,000 requests per second.”
Mogujie
“In the end, performance improved by 5X using Scylla as compared to MongoDB. It’s clear that Scylla solves constraints and optimizes performance within the operating system and TCP layer, for example, areas where other database architectures punt.”
Snapfish
ScyllaDB, currently in version 2.2 (released in June 2018), is a wide column store that scales both up and out. You access it using CQL (Cassandra Query Language, not to be confused with Contextual Query Language or Common Query Language), which is SQL-like. It runs on Linux-based systems and supports both eventual and immediate consistency. Because everything is asynchronous there is no locking, which is good for performance, but the product has lightweight transaction capabilities so that it remains suitable for transaction processing. The product also supports both secondary indexes and materialised views, which further improve performance, as does the fact that ScyllaDB uses its own caching technology rather than relying on the Linux cache. Another major feature is dynamic self-tuning, which works with the database’s task scheduler. As far as we know, this is the only scheduler in the NoSQL market that knows fine-grained details such as disk speeds, so that queues (one per core) can be prioritised dynamically.
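The choice between eventual and immediate consistency mentioned above is usually made per request in Cassandra-compatible systems, by picking a consistency level for reads and for writes. The sketch below assumes the standard rule of thumb that a read is guaranteed to see the latest acknowledged write when the replicas contacted for the read and for the write must overlap, that is, when R + W > RF. The function names are hypothetical; the level names `ONE`, `TWO`, `THREE`, `QUORUM` and `ALL` are standard CQL consistency levels.

```python
def quorum(replication_factor):
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level]


def reads_see_latest_write(write_level, read_level, replication_factor):
    """True when the read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor for the example
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        print(write_level, read_level, reads_see_latest_write(write_level, read_level, rf))
```

With a replication factor of 3, writing and reading at `ONE` gives eventual consistency only, whereas `QUORUM` for both (2 + 2 > 3) gives the overlapping behaviour, at the cost of waiting for more replicas on each request.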
Other notable features include what the company calls “heat-weighted” load balancing, access security defined down to the object level, and the fact that you can implement JanusGraph on top of ScyllaDB, something that IBM has done.
From a usage perspective, you might choose to use ScyllaDB for much the same reasons that you might select other NoSQL databases such as DynamoDB, MongoDB, Redis or Cassandra. Given the smaller footprint allowed by using C++, it is also reasonable to consider the use of ScyllaDB in edge and other devices within an Internet of Things environment.
However, the main reason why you might select ScyllaDB in place of any competing database is performance, cost, or both. As an example of the former, Figure 1 shows a benchmark comparison with DynamoDB, with ScyllaDB clearly outperforming its rival.

Figure 1 – Benchmark comparison with DynamoDB
However, it is not just about ultimate performance but about the consistency of that performance. Figure 2 shows a comparison between Cassandra and ScyllaDB, conducted by one of the latter’s customers, of the relative latency of the two products.
Finally, there is the question of cost. Because of its architecture, you tend to require fewer nodes in a ScyllaDB implementation than with other possible products. The company quotes a factor of five (and users bear this out): a Cassandra or other NoSQL deployment needs around five times as many nodes as an equivalent ScyllaDB deployment. Thus, you would expect hardware costs to be around one fifth of those of other products. A commensurate reduction in administration costs would also be expected.
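The factor-of-five arithmetic above can be worked through as a short sketch. The function names, the example cluster size and the per-node price are hypothetical, and the calculation assumes that nodes cost the same in both deployments, which may not hold if the ScyllaDB nodes are larger machines.

```python
import math


def nodes_needed(baseline_nodes, consolidation_factor=5.0):
    """Estimate the node count after consolidation, rounding up to whole nodes."""
    return max(1, math.ceil(baseline_nodes / consolidation_factor))


def hardware_cost(nodes, cost_per_node):
    """Total hardware cost, assuming every node costs the same."""
    return nodes * cost_per_node


if __name__ == "__main__":
    baseline = 40          # hypothetical existing cluster size
    price = 1000           # hypothetical cost per node, in any currency
    consolidated = nodes_needed(baseline)
    print("nodes:", baseline, "to", consolidated)
    print("cost:", hardware_cost(baseline, price), "to", hardware_cost(consolidated, price))
```

Because node counts are whole numbers, small clusters see less than the full factor: a 12-node cluster rounds up to 3 nodes rather than 2.4, and replication requirements may set a floor on the node count regardless.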

The Bottom Line
It is difficult to think of a good reason not to prefer ScyllaDB to any of its competitors: it performs better and costs less. You might be able to hunt around and find some particular feature that you desperately need that isn’t in ScyllaDB, but most of the time that won’t be the case.