Blazegraph
Update solution on May 6, 2016
Unlike the majority of vendors in the graph database market, which tend to target operational and hybrid operational/query environments, Blazegraph is squarely focused on graph analytics and query. Typical use cases would include recommendation engines, cyber and other forms of security (including fraud detection), community detection and clustering, drug discovery and genomics, and fault prediction in industrial and IoT (Internet of Things) environments, amongst others.
In addition to the open source Blazegraph Database, the company also markets Blazegraph GPU, an add-on to the database that accelerates graph analytics using NVIDIA graphics processing units (GPUs). The company also offers a product called DASL (pronounced “dazzle”) that supports the development of analytic and statistical algorithms designed specifically to run on GPUs. Further, the company intends to introduce an appliance based on Blazegraph GPU.
The products are available for both cloud and on-premises deployments. Blazegraph GPU is available today via Cirrascale’s GPU Cloud offering. Note that a number of cloud providers (for example, Amazon) do not support GPUs. High availability, failover and load balancing are built in for the product’s cluster-based solution. Single server and embedded editions are also available.
Blazegraph is US-based and does not have offices elsewhere. Its primary go-to-market approach is via direct sales, though the company also has a number of partners apart from NVIDIA, and these give it some presence outside North America, notably in Germany and Russia. We would like to see this partner ecosystem expanded. The funding from DARPA is significant because it gives Blazegraph an introduction to the defence and government markets in the United States.
The company has a number of significant users including DARPA, Yahoo!, The British Museum, Harvard Medical School, Autodesk, Dell (EMC2), and Wikimedia. The last of these is particularly interesting because it has made its evaluation criteria and its assessment of various graph database products available for public consumption. There are significant applications of Blazegraph in Precision Medicine and cancer genomics with customers such as Syapse and Seven Bridges Genomics.
Blazegraph and Blazegraph GPU are specifically targeted at large scale, complex graph analytic environments, especially where relationships are unknown in advance.
Blazegraph is an (extended) RDF graph database that also supports property graphs. It leverages SPARQL 1.1, the TinkerPop/Blueprints and Sesame APIs, and a graph mining API. You can also use the Gremlin graph traversal language, and the product supports OWL (Web Ontology Language). Further, the product supports the development of domain-specific languages whose syntax is converted into SPARQL queries at run-time. With Blazegraph GPU, SPARQL queries are translated by the software into suitable code for the GPUs, so little or no change should be required when upgrading to a GPU-based environment.
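To make the idea of a domain-specific language converted into SPARQL at run-time more concrete, the following is a minimal sketch in Python. It is not Blazegraph's own mechanism: the one-line `find ... where ...` syntax, the `dsl_to_sparql` function and the `http://example.org/` namespace are all hypothetical, chosen only to illustrate the translation step.

```python
import re

# Assumed namespace for illustration only; not a Blazegraph default.
EX = "http://example.org/"

# Hypothetical DSL with a single statement form:
#   find <Type> where <property> = "<value>"
# The value may not contain quotes, backslashes or line breaks, so it can be
# embedded in a SPARQL string literal without further escaping.
_PATTERN = re.compile(r'^find (\w+) where (\w+) = "([^"\\\n\r]*)"$')


def dsl_to_sparql(statement: str) -> str:
    """Translate one hypothetical DSL statement into a SPARQL SELECT query."""
    match = _PATTERN.match(statement.strip())
    if match is None:
        raise ValueError(f"unsupported statement: {statement!r}")
    rdf_type, prop, value = match.groups()
    return (
        f"PREFIX ex: <{EX}>\n"
        "SELECT ?s WHERE {\n"
        f"  ?s a ex:{rdf_type} ;\n"
        f"     ex:{prop} \"{value}\" .\n"
        "}"
    )


if __name__ == "__main__":
    print(dsl_to_sparql('find Gene where symbol = "BRCA1"'))
```

The resulting string is ordinary SPARQL 1.1 and could, in principle, be submitted to any SPARQL endpoint. A real DSL would need a proper grammar and escaping rules rather than a single regular expression.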
Existing Blazegraph implementations, as well as new deployments, may be extended by the use of NVIDIA GPUs. Currently, support is provided for hundreds of GPUs, enabling exploration of hundreds of billions of edges. There are plans to expand this to thousands of GPUs supporting trillions of edges. As a general principle, we would expect approximately an order of magnitude greater cost-effectiveness compared to an all-in-memory approach as offered by some other vendors in the graph market. The next release of NVIDIA processors (Pascal) is expected to improve performance by a further factor of four. We would also expect that adding GPUs to an existing Blazegraph implementation should improve performance by approximately two orders of magnitude.
At present, Blazegraph can only be extended through the use of NVIDIA GPUs. In principle there is no reason why other vendors’ GPUs should not be used, but suppliers in this market tend to use their own proprietary development environments. In the case of NVIDIA this is CUDA (Compute Unified Device Architecture), a parallel computing platform and application programming interface (API) model that allows software developers to use a CUDA-enabled graphics processing unit for general purpose processing. The CUDA platform is a software layer that gives direct access to the GPU’s virtual instruction set and parallel computational elements. However, even though you can program in C, C++ or Fortran, parallel programming is hard. Blazegraph has therefore developed DASL to make this process easier. This is a functional, domain-specific language for developing graph and machine learning algorithms. Potential applications include not just graph algorithms but also recommendation systems using collaborative filtering, neural network techniques, clustering algorithms, and a number of others. DASL leverages both Apache Spark and Scala, and the relevant code is then automatically converted into CUDA, so that developers do not have to understand parallel programming.
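As an illustration of why a functional formulation lends itself to automatic parallelisation, here is a minimal sketch of a level-synchronous breadth-first search written as bulk operations over an edge list. It is plain Python running on the CPU, not DASL syntax and not CUDA; the function name `bfs_levels` and the edge-list representation are assumptions made for this example only.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (source vertex, destination vertex)


def bfs_levels(edges: List[Edge], source: int) -> Dict[int, int]:
    """Return the BFS depth of every vertex reachable from `source`.

    Each iteration is a bulk step over the whole edge list, so no edge
    depends on the result of another edge within the same step.
    """
    level = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        # Map/filter over edges: every edge is tested independently, which is
        # the kind of work that could be assigned to one GPU thread per edge.
        candidates = {dst for (src, dst) in edges if src in frontier}
        # Keep only vertices that have not been reached at an earlier depth.
        frontier = {v for v in candidates if v not in level}
        for v in frontier:
            level[v] = depth
    return level


if __name__ == "__main__":
    graph = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (5, 6)]
    print(bfs_levels(graph, 0))
```

Because each step is expressed as an independent operation per edge followed by a filter, a translator has a clear unit of parallel work to map onto GPU threads, whereas a hand-written queue-based traversal would hide that structure.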
Blazegraph provides developer support, production support, and custom development services for all versions of its open source and enterprise software products.