VoltDB uses a shared-nothing architecture to achieve database parallelism, with both data and processing distributed across all the CPU cores within the servers composing a VoltDB cluster. In other words, the product should scale as the number of cores increases, not just with the number of servers. Horizontal partitioning also enables scalability. SQL access is supported via pre-compiled Java stored procedures. The unit of transaction is the stored procedure, which is Java interspersed with SQL. Further, stored procedures are executed in the partition containing the relevant data, which eliminates round-trip messaging between SQL statements and is another reason for VoltDB’s performance.
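To make the partitioned execution model concrete, the following is a minimal sketch in plain Java. It is not VoltDB’s API: the class `PartitionedStore` and its methods `partitionFor`, `call` and `shutdown` are hypothetical names introduced purely for illustration, and hashing the key is an assumed routing scheme. It shows the general idea only: each partition owns its data and a single thread, and a "procedure" is shipped to the partition holding the relevant key, where it runs to completion without per-statement round trips.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Illustrative only: a toy shared-nothing store, not VoltDB code.
public class PartitionedStore {
    // Each partition has its own private data and its own single thread.
    private final List<Map<String, Long>> partitions = new ArrayList<>();
    private final List<ExecutorService> executors = new ArrayList<>();

    public PartitionedStore(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
            executors.add(Executors.newSingleThreadExecutor());
        }
    }

    // Assumed routing scheme: hash the partitioning key to a partition index.
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Ship the whole procedure to the owning partition and wait for its result.
    // Because one thread owns the partition, the procedure needs no locks.
    public <T> T call(String key, Function<Map<String, Long>, T> procedure) {
        int p = partitionFor(key);
        Map<String, Long> data = partitions.get(p);
        return CompletableFuture
                .supplyAsync(() -> procedure.apply(data), executors.get(p))
                .join();
    }

    public void shutdown() {
        executors.forEach(ExecutorService::shutdown);
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(4);
        // Each call is one "transaction" executed entirely inside one partition.
        store.call("acct-1", data -> data.merge("acct-1", 100L, Long::sum));
        long balance = store.call("acct-1", data -> data.merge("acct-1", -30L, Long::sum));
        System.out.println(balance); // prints 70
        store.shutdown();
    }
}
```

In this sketch, adding partitions (and hence threads) is what lets throughput grow with the number of cores, which mirrors the scaling argument made above. A real system must also handle transactions that span partitions, which this toy example ignores.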
Figure 1: VoltDB exporters
More generally, VoltDB leverages in-memory techniques to improve performance, but also supports spill-to-disk capabilities where memory capacity is insufficient. For data that is merely warm or cold, as opposed to hot, VoltDB provides exporters, as shown in Figure 1. The diagram also shows that VoltDB supports ingestion from streaming sources (Apache Kafka, Amazon Kinesis and various message buses) and export to data lakes and other dedicated analytic environments. From the latter, machine learning models may be imported into VoltDB using PMML (Predictive Model Markup Language), where they are treated as stored procedures. While we applaud the support for PMML as an interoperable standard, it is unfortunately true that many machine learning and AI models are not PMML compatible, so we would like to see VoltDB support models built directly in languages such as R, Python and Scala.
While on the subject of machine learning and stored procedures, it is also worth noting that some types of analytics have fairly static requirements compared to machine learning, which is typically dynamic. An example would be compliance with MiFID II. We comment on this because one of VoltDB’s customers has embedded the Drools rules engine into VoltDB as a stored procedure, precisely as a mechanism to enforce compliance with MiFID II.
Finally, VoltDB supports containers, with Kubernetes used for orchestration. Multi-site active/active deployments are also available.