Amazon Neptune

Updated on September 11, 2020

What is it?

Amazon Neptune is a graph database delivered as a fully managed service, offered by Amazon as part of AWS. As a graph database, it allows you to navigate and query relationships within a connected structure, and thus enables you to create graph applications that would be difficult or impossible to build on top of other types of database (notably relational databases). Compared to other graph databases, Neptune promises high scalability, availability, and affordability. Moreover, Neptune has been designed to offer the chief benefits of both RDF and property graphs, allowing you to store and query data in either model. The company particularly targets use cases involving knowledge graphs and identity graphs. Identity graphs support recommendation engines and similar functionality, while knowledge graphs are often deployed to support fraud detection in financial services and gaming, and to construct digital twins.

Customer Quotes

“Our customers are increasingly required to navigate a complex web of global tax policies and regulations. We need an approach to model the sophisticated corporate structures of our largest clients and deliver an end-to-end tax solution. We use a microservices architecture approach for our platforms and are beginning to leverage Amazon Neptune as a graph-based system to quickly create links within the data.”
Thomson Reuters

What does it do?

Neptune is a durable, ACID-compliant graph engine that offers immediate consistency. As shown in Figure 1, it sits on top of a cloud-native storage service that AWS purpose-built for highly available databases. The storage service supports graphs of up to 64TB. Neptune has a distributed architecture in which it creates a primary/master access node as well as a number of ‘read replicas’ of that node, all of which access a shared storage layer spread across multiple availability zones (see Figure 2). Read replicas are, as the name suggests, read only, so any writes must be performed on the master node. Even so, they enable a large degree of scale out (up to 15 read replicas) by parallelising your reads. Moreover, network traffic is minimised: the master node transfers logged changes, and only logged changes, to each replica as those changes are made. This results in correspondingly low replica lag, which is usually measured at less than 10 milliseconds. Failing nodes are automatically detected and replaced, and if the master node fails a replica can take over, typically in less than a minute. Neptune can therefore provide high availability.
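To make the division of labour between the master node and its read replicas concrete, the following is a minimal client-side sketch, not Neptune's own routing logic. The class name `EndpointRouter`, its methods, and the host names are all hypothetical; the only details taken from the text are that writes go to the master, reads can be spread across up to 15 replicas, and a replica can take over if the master fails.

```python
from itertools import cycle
from typing import List, Optional

MAX_READ_REPLICAS = 15  # limit stated in the article


class EndpointRouter:
    """Hypothetical helper: writes go to the master, reads rotate over replicas."""

    def __init__(self, primary: str, replicas: Optional[List[str]] = None) -> None:
        self.primary = primary
        self.replicas = list(replicas or [])
        if len(self.replicas) > MAX_READ_REPLICAS:
            raise ValueError("at most 15 read replicas are supported")
        self._reset_read_cycle()

    def _reset_read_cycle(self) -> None:
        # With no replicas available, reads fall back to the master node.
        self._reads = cycle(self.replicas or [self.primary])

    def endpoint_for(self, is_write: bool) -> str:
        """Return the host that should serve a read or a write."""
        return self.primary if is_write else next(self._reads)

    def promote(self, replica: str) -> None:
        """Simulate failover: the given replica becomes the new master."""
        self.replicas.remove(replica)
        self.primary = replica
        self._reset_read_cycle()
```

In practice a managed service would normally expose endpoints that hide this routing from the application; the sketch only illustrates why adding replicas increases read capacity but not write capacity.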

In Neptune, both RDF graphs and property graphs are stored in a “quad” representation using a custom data model. They can be accessed as a property graph (à la TinkerPop) via Gremlin or as an RDF graph via SPARQL. In essence, Neptune takes a “best of both worlds” approach to the question of property graph vs. RDF graph. This means that a) although Neptune uses quads for both graphs, it still gives you access to one of the major draws of property graphs (labels); b) any of your graph users will be able to interact with and query it readily, regardless of which type of graph they have experience with; and c) more advanced users will be able to leverage whichever query language they prefer, or whichever is most suitable for the situation they find themselves in.
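As an illustration of the two access paths, the sketch below builds (but does not send) one HTTP request for a Gremlin traversal and one for a SPARQL query that ask a similar question: who does "alice" know? The host name, the `ex:` vocabulary, and the `person`/`knows` labels are invented for the example, and each query assumes the data was loaded in the matching model. The `/gremlin` and `/sparql` paths and port 8182 reflect Neptune's HTTP endpoints as we understand them and should be checked against current AWS documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical cluster address; 8182 is assumed to be the database port.
BASE_URL = "https://my-neptune.example.com:8182"

# Property graph view: a Gremlin traversal over labelled vertices and edges.
FRIENDS_GREMLIN = "g.V().has('person', 'name', 'alice').out('knows').values('name')"

# RDF view: a SPARQL query over triples, using an invented ex: vocabulary.
FRIENDS_SPARQL = """
PREFIX ex: <http://example.com/>
SELECT ?name WHERE {
  ?alice ex:name "alice" .
  ?alice ex:knows ?friend .
  ?friend ex:name ?name .
}
"""


def gremlin_request(traversal: str) -> Request:
    """Build a POST carrying the traversal as a JSON document."""
    body = json.dumps({"gremlin": traversal}).encode("utf-8")
    return Request(
        f"{BASE_URL}/gremlin",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sparql_request(query: str) -> Request:
    """Build a POST carrying the query as a form-encoded 'query' field."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        f"{BASE_URL}/sparql",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would submit it; a real deployment would also need network access to the cluster and, where enabled, request signing, both of which are outside this sketch.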

Finally, Neptune offers encryption of your data at rest via the AWS Key Management Service (KMS) and encryption in transit via TLS (Transport Layer Security) 1.2. It is eligible for compliance certifications with ISO, HIPAA, SOC and PCI/DSS. It offers API endpoints for database and query management operations such as query explanation, bulk loading of data, and query cancellation. You can use services like AWS AppSync to build GraphQL interfaces. It also offers full-text search integration with Elasticsearch, and a Neptune Workbench integrated with Jupyter notebooks. Federated query capabilities are also supported. If you have data in a relational database, Neptune is a destination for the AWS Database Migration Service (DMS). You can specify a mapping in JSON for a property graph, or in the W3C’s R2RML for an RDF graph, and use DMS to do a batch migration of data into Neptune.
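The bulk loading endpoint mentioned above can be sketched as follows. This is a minimal illustration of assembling the JSON body for a load request, assuming it is POSTed to a `/loader` path on the cluster. The field names (`source`, `format`, `iamRoleArn`, `region`, `failOnError`) and format identifiers follow the loader API as we recall it and should be verified against current AWS documentation; the function name, bucket, and role ARN are placeholders.

```python
import json

# Format identifiers as recalled from the loader API (an assumption to verify).
PROPERTY_GRAPH_FORMATS = {"csv"}
RDF_FORMATS = {"ntriples", "nquads", "rdfxml", "turtle"}


def bulk_load_body(
    source: str,
    data_format: str,
    iam_role_arn: str,
    region: str,
    fail_on_error: bool = True,
) -> str:
    """Return the JSON body for a hypothetical bulk load request."""
    if data_format not in PROPERTY_GRAPH_FORMATS | RDF_FORMATS:
        raise ValueError(f"unsupported load format: {data_format!r}")
    return json.dumps(
        {
            "source": source,              # e.g. an S3 URI holding the files
            "format": data_format,         # decides property graph vs. RDF load
            "iamRoleArn": iam_role_arn,    # role allowed to read the source
            "region": region,
            "failOnError": "TRUE" if fail_on_error else "FALSE",
        }
    )


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network.
    print(
        bulk_load_body(
            "s3://example-bucket/graph/",
            "csv",
            "arn:aws:iam::123456789012:role/ExampleLoadRole",
            "us-east-1",
        )
    )
```

The same body shape covers both models: a property graph load would use the CSV format, while an RDF load would pick one of the RDF serialisations.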

Why should you care?

Neptune offers a number of reasons to choose it over other graph databases. The most prominent are its scalability and high availability (enabled by its read-replica-driven storage architecture), its relative affordability (a T3 instance in Neptune is under $0.10 per hour), and its compatibility with both property graph and RDF graph query languages. In particular, as a graph database that eases the anxiety of choosing between a property graph and an RDF graph, that is less expensive than its peers, and that is offered as a fully managed service on top of AWS, it significantly lowers the barrier to entry for the space. This makes it highly appealing.

On the other hand, Neptune does not offer any innate graph visualisation capabilities. Other facilities, including auto-scaling (a significant omission in the current version) and pre-built graph algorithms, are part of the company’s roadmap.

The Bottom Line

Amazon Neptune is a graph database that is offered as a managed service. It features a custom graph data model that can be queried using either Gremlin or SPARQL, and it is scalable, durable, and relatively inexpensive.

Related Company

Amazon
