Data warehousing update
This being the week of the Data Warehousing Institute (TDWI) annual bash in Las Vegas, there has been, funnily enough, a rash of announcements from data warehouse vendors, including new releases from Vertica (2.0) and Greenplum (3.0), Kognitio’s opening of a US-based operation and new Teradata-based capabilities from DATAllegro. More on all of these in a moment.
One other thing that caught my attention was a new market-leading TPC-H benchmark from EXASOL for 100GB data warehouses. 100GB! That is a little small, but further research indicates that the company, which is German, also holds the record at 1,000GB, beating the previous record held by ParAccel (which had also held the 100GB record). EXASolution (the product) is a cluster-based, shared-nothing data warehousing product running on a version of Linux that has been extended by EXASOL. It makes extensive use of in-memory processing and compression. In its technical details (I have not had a briefing with the company yet) EXASOL refers to its data storage as “automatic (vertical, element-wise in memory access)” and to its compression as using “special in-memory algorithms”. This suggests, though it is not clear, that the company is using columns, at least in memory. If this is the case then it would increase the number of column-based vendors to nine.
As far as the main announcements are concerned, Kognitio’s move into the States comes with all the usual caveats about taking technology from here to over there: success always depends at least as much on marketing as on technical excellence. However, the company has made some headway in the UK, so it will be interesting to see if it can maintain that momentum in the States.
Vertica and Greenplum, on the other hand, are both gathering customers, with the former claiming around 20 to date and Greenplum twice that number. In Greenplum’s case some of these customers are quite small, though others are quite large. In both cases, perhaps the most interesting aspect of their performance is what has been achieved through partners. While both companies partner with various hardware suppliers, each has a special relationship with one vendor: HP in the case of Vertica and Sun (on Thumper) for Greenplum. The figures here are impressive: almost one third of Vertica sales have been through HP resellers, while half of all Greenplum sales (and two-thirds of its revenues) are based on Thumper platforms. Conversely, Greenplum is the biggest driver of Thumper sales. Responding to a recent article on another subject, a reader asked when Oracle was going to buy Netezza. Perhaps more pertinent would be: when (if?) is Sun going to buy Greenplum?
DATAllegro has made a couple of announcements. First, last week it announced a partnership and OEM deal with Coffing Data Warehouse, a company that has hitherto been dedicated to Teradata environments. Coffing will provide migration assistance to anyone moving from Teradata to DATAllegro, and its Nexus query tool will ship with every instance of DATAllegro in future. Following that, this week DATAllegro announced a set of utilities specifically designed to assist in the process of migrating away from Teradata. All of which should make it pretty clear where DATAllegro thinks its target market is.
Finally, not everybody times their release schedule to coincide with TDWI. For example, Calpont has not announced its CNX product here despite the fact that it is scheduled for release this quarter. Conversely, SAND announced SAND/DNA Analytics 5.1 around a month ago. This release is interesting because it supports federated queries across multiple instances of SAND databases, thereby providing a measure of scalability that was not previously possible.
All in all: interesting times. But, as the Chinese proverb implies, that doesn’t make life easier for users. While there are certainly lots of opportunities for doing things that you couldn’t afford to do before, more choices from more vendors make product selection more difficult.