Big Data - Further Information
This page shows up to 100 pieces of content (newest at the top):
Financial Trading Technology and RUMI from N5
IT is critical for financial services companies and is often the prime or only source of competitive advantage.
Master Data Management (2021)
This report summarises the current state of the master data management (MDM) market at a high level, assessing the leading vendors in the space.
(Cloud) Data Management Platforms
This (Cloud) Data Management report compares platform-based approaches that support data integration to/from cloud-based deployments.
Talend Data Fabric
The basic concept behind the Talend Data Fabric is to allow you to collect, govern, transform, and share your data.
Solix Cloud Management
SOLIXCloud is a cloud data management platform from Solix Technologies that provides four primary solutions.
Qlik Data Integration Platform
The Qlik Data Integration Platform is essentially a melding of the capabilities provided through the acquisitions of Podium Data and Attunity.
Oracle Unified Information Management Platform
Oracle’s data management capabilities span the establishment of (cloud-based) data warehouses and data lakes.
Informatica Intelligent Data Platform (2021)
The Intelligent Data Platform encompasses data integration, quality, governance, MDM, cataloguing, privacy, application integration, and more.
IBM Cloud Pak for Data
IBM Cloud Pak for Data places particular emphasis on deploying, developing and managing AI and machine learning models.
Hitachi Vantara Lumada
The Lumada DataOps Suite from Hitachi Vantara supports everything from ingestion through to analysis and dashboards.
Cloudera Data Platform
Cloudera Data Platform offers data integration, data governance, data cataloguing and transformation capability but not data quality per se.
Ataccama ONE
Ataccama ONE encompasses data integration, data cataloguing, data profiling and data quality, and both reference and master data management.
MarkLogic Data Hub
MarkLogic Data Hub is a fully cloud-native platform for data storage, integration, operationalisation and governance.
N5 Rumi
Rumi is a software platform that enables enterprises to embed rich, real-time analytical data processing directly into their transactional applications.
Cambridge Intelligence KeyLines InBrief
KeyLines is a graph visualisation product that allows you to examine relationships between entities and/or events.
Graph Database (2020)
This is Bloor's fourth Market Update in this space, which discusses the state of the graph database market as of early 2020.
Amazon Neptune
In Amazon Neptune, both RDF graphs and Property Graphs are stored in a “quad” representation using a custom data model.
Cambridge Semantics AnzoGraph (2020)
AnzoGraph is a massively parallel graph database that runs on HDFS, NFS and other big data platforms and is ACID compliant.
DataStax Enterprise (DSE) (Graph Engine)
The DSE Graph Engine is a property graph that is built into DSE and leverages DSE’s capabilities for storage, search and analytics.
AllegroGraph (2020)
AllegroGraph from Franz Inc. is a semantic graph database focused on generating semantic knowledge graphs.
Grakn Core and Grakn KGMS (2020)
Grakn consists of a database, an abstraction layer and a knowledge graph, which is used to organise complex networks of data and make them queryable.
MarkLogic Data Hub Service and MarkLogic Server
MarkLogic Server is a multi-model database that can be used to store documents, relational data via tables, rows and columns, and graph data.
Memgraph (2020)
Memgraph is an in-memory, ACID-compliant, property graph database written in C++ that supports openCypher.
Neo4j (2020)
Neo4j is a property graph database with a native engine that is targeted at operational, hybrid operational/analytic (HTAP) and pure analytic use cases.
Objectivity ThingSpan (2020)
Objectivity/DB has proven scalability and performance credentials in highly demanding environments. ThingSpan, which is built into the database.
Ontotext GraphDB and the Ontotext Platform (2020)
GraphDB from Ontotext is a native RDF database with dynamic indexing that integrates with various search technologies, document stores, and text mining.
RedisGraph (2020)
RedisGraph is the graph database module for Redis where, by “module”, the company means functionality embedded into the product.
Sparsity Technologies Sparksee
Sparsity Technologies Sparksee is a property graph database that focuses on high performance deployment at scale and on embedded systems.
Stardog (2020)
Stardog is an RDF database with strong support for SPARQL and OWL that can be extended to provide labelled property graph capabilities.
TigerGraph (2020)
TigerGraph uses a property graph paradigm and has been designed specifically to support real-time (less than one second) analytics.
Actian Avalanche
Actian Avalanche is a hybrid cloud/on-premises, columnar (with compression) data warehouse offering provided as a managed service when in the cloud.
Cazena
Cazena is a single tenant massively parallel analytics platform primarily targeted at providing a data lake as a service.
Cloudera Data Warehouse
The Cloudera Data Warehouse is based on the Cloudera Data Platform (CDP), which involves more than 30 open source technologies.
Exasol
Exasol is a massively parallel, shared-nothing, columnar (with compression), in-memory data warehousing solution.
Greenplum Database
Greenplum is a massively parallel shared-nothing data warehouse based on a PostgreSQL kernel.
IBM Db2 Event Store
IBM Db2 Event Store is an in-memory database built on top of Apache Spark, intended to support both near real-time and deep analytics on historic data.
Starburst Presto
Starburst Data provides commercial support for open source Presto as well as Starburst Enterprise.
Teradata Vantage (July 2020)
Teradata Vantage effectively consists of a merger between what was previously simply Teradata Database, and Aster Analytics.
Vertica Analytics Platform (2020)
Vertica is a massively parallel, columnar database with advanced compression capabilities.
Yellowbrick Data
Yellowbrick Data Warehouse is a massively parallel data warehouse available on-premises as an appliance or as a multi-cloud option.
Time-Series and Temporal databases and analytics
Time-series databases represent the fastest growing database sector over the last two years.
CrateDB (February 2020)
CrateDB is a NewSQL multi-model database that supports JSON documents, relational data, geo-spatial data, full text and binary large objects (BLOBs).
TrendMiner, a Software AG company
TrendMiner is a self-service analytics solution designed for domain experts within the process manufacturing space.
IBM Informix
IBM Informix is an object-relational database with native support for both time-series and geospatial data.
Teradata Vantage 4D (February 2020)
Teradata Vantage is the only product that we are aware of that has all the 4D capabilities that you might need.
Trendalyze (February 2020)
Trendalyze describes its core capability as the discovery of motifs (micro-trends) and anomalies within time series data.
InfluxDB
InfluxDB is a purpose-built time series database, as opposed to a relational (or other) database that supports time series.
QuasarDB
QuasarDB is a NewSQL column-oriented, time-series, distributed database that uses a peer-to-peer approach to support QuasarDB clusters.
TIBCO Spotfire (for time-series)
Spotfire is a broad, general-purpose analytics offering that encompasses data visualisation, business intelligence, analytics and data preparation.
Kx Systems kdb+ and Kx Technology
Kx is built on the kdb+ database, an in-memory columnar database with both streaming and timeseries capabilities.
Interana
Interana is a self-service platform for “behavioural discovery and analysis” that is intended for use by business analysts.
FaunaDB
FaunaDB is a serverless cloud database that offers global access to data via APIs such as GraphQL without sacrificing data consistency.
VictoriaMetrics (Prometheus)
VictoriaMetrics offers long-term remote storage for Prometheus, which is a Linux Foundation open source time-series database and monitoring system.
TimescaleDB
TimescaleDB is built on top of PostgreSQL as a database intended to specifically support the requirements of time-series data.
Redis Enterprise (February 2020)
Redis Enterprise is an in-memory, distributed (automated partitioning), NoSQL database with a key-value store as its underpinning.
McObject eXtremeDB
McObject eXtremeDB is an embedded (less than 200 KB footprint) hybrid in-memory and persistent database designed specifically to support time-series data.
Dimensional Analytics
This paper explores “dimensional analytics”, by which we mean analytics that requires an understanding of the dimensions of time and space.
MemSQL (2020)
MemSQL is a scale-out distributed database with a lock-free architecture that supports both row and column storage. It targets Fortune 1000 companies.
ScyllaDB
ScyllaDB is a Cassandra-compatible database developed using C++ rather than Java. As a result, it has a smaller footprint and better performance than Cassandra.
IBM Cloud Pak for Data 1.2
Limited, or no, technological capability with respect to AI is holding many companies back. This paper discusses how IBM Cloud Pak for Data can help.
Graph Database Market Update 2019
This is the third Market Update into the graph database market, considering and comparing both property graph and RDF databases.
Software AG Apama and the Internet of Things
While this paper focuses on Software AG Apama it is not a review of Apama per se, but rather of Software AG’s approach to IoT Analytics.
Synerscope Ixiwa
Ixiwa might best be described as a data lake management product that covers everything from automated ingestion, through discovery and cataloguing to data preparation.
Memgraph (2019)
Memgraph is a property graph database targeted primarily at hybrid analytic and transactional environments.
Cray Systems and the Cray Graph Engine
The Cray Graph Engine is an RDF database that runs on a variety of Cray hardware platforms.
ArangoDB
ArangoDB is a multi-model database that supports document (JSON), key-value and property graph capabilities with one database core and one declarative query language.
Cambridge Semantics AnzoGraph (2019)
AnzoGraph is a massively parallel RDF database targeted primarily at large-scale analytic environments.
Neo4j (January 2019)
Neo4j is a labelled, property graph database with a native engine that is targeted at operational and hybrid operational/analytic use cases.
Microsoft Azure Cosmos DB
Cosmos DB is a distributed multi-model database that is provided as a service. It supports key-value, column store, document and property graphs.
Grakn Core and Grakn KGMS (2018)
Grakn is a graph-based platform for developing cognitive and other applications leveraging artificial intelligence.
Data Catalogues
Data catalogues are hot. Why? Why should you care? What can they do for you?
Managing data lakes: building a business case
This is a companion paper to one we published in 2017. We outline a methodology for building a business case in support of implementing suitable data lake management software.
Trendalyze (June 2018)
Trendalyze describes its core capability as the discovery of motifs (and anomalies) within time series data. You can think of a motif as a micro-pattern but it is more accurately a shape. Once a motif of interest is discovered, or…
What’s Hot in Data
In this paper, we have identified the potential significance of a wide range of data-based technologies that impact on the move to a data-driven environment.
SQL Engines on Hadoop
There are many SQL on Hadoop engines, but they are suited to different use cases: this report considers which engines are best for which sets of requirements.
Data Lake Management
There are various factors needed to prevent a data lake becoming a swamp.
Big data and the mainframe - issues and opportunities
The purpose of this paper is to examine the issues that arise when big data implementations transition beyond skunk works and into general-purpose use.
The Chief Data Officer: getting the basics right
Before a CDO can think sensibly about what data the business might want to leverage they must get a handle on the data assets that the company already possesses.
Managing Data Lakes
This paper discusses why data lakes need to be managed and the sorts of capabilities that are required to manage them.
All about graphs: a primer
Over the last few years graph databases have been the fastest growing sector within the database market ...
Graph and RDF databases 2016
This Market Report discusses the latest trends in this market, along with a detailed assessment of the leading vendors in the market.
Graph and RDF databases Market Update 2016
This Market Update discusses the latest trends in this market, along with our assessment of the leading vendors in the market.
IBM Informix and the Internet of Things
This paper discusses the IBM Informix database and its suitability for deployment within Internet of Things (IoT) environments.
Total cost of ownership
TCO should be more important in decision making than either license fees or subscription costs.
DATUM - a value-driven approach to building the digital enterprise
In this paper we will discuss why we believe that understanding the business value of data is fundamental to a successful digital transformation.
All things Hadoop
Discussing the Open Data Platform and Apache Spark
The Internet of Things Reference Model
The World Forum Architecture Committee has published an IoT reference model.
Product Information Management (PIM)
I often get emails from vendors talking about a whitepaper or other sales document. Sometimes these are very useful simple guides to a subject.
IBM: enhanced 360° view
IBM is in the vanguard for what it calls an enhanced 360° view and it is clearly well positioned to capitalise on the future growth of this market.
Extending a 360° view
In this paper we will discuss why we believe that extending the traditional 360° view makes sense, and we will give some uses that demonstrate why the extended view represents an opportunity.
Kdb+ and the Internet of Things/Big Data
Kdb+ is a column-based relational database with extensive in-memory capabilities, developed and marketed by Kx Systems.
Creating confidence in Big Data analytics
There has been some significant criticism of the concept of big data recently, notably in the Harvard Business Review criticising the Google Flu Trends...
Considering the small in big data
Not all of the issues addressed by big data need big data solutions.
Kognitio: clarifying misunderstandings
There are aspects of Kognitio and its offering that are sometimes misunderstood, so I thought I should clear some things up.
Big data security
The third issue for big data is ensuring that the data is secure and compliant. There are also ethical issues.
Big data context
The second issue for big data is understanding the context of the data.
Big data trust
The first issue for big data is how much you trust the data.
TIBCO transforms big data into big opportunity
TIBCO came to London for their user conference (transFORM2013). This year's theme was all about big data and TIBCO's senior executives outlined their strategy for their platform.
Calling a spade a spade
Preventative maintenance and asset optimisation are not the same thing.