DATAllegro: an update on Version 3
When DATAllegro announced earlier this year that it had partnered with Dell to provide node processing capabilities and with EMC to be its disk provider, I was not particularly impressed. In particular, I heard the words about EMC salespeople earning commission on the disks sold within DATAllegro systems but didn’t understand the subtext. Also, I didn’t appreciate how the company had changed its architecture or how it could maintain margins, and I was concerned that it didn’t seem to be winning any customers.
Anyway, let me start with the Dell partnership. Instead of processing nodes that consisted of Intel processors and memory, put together by DATAllegro itself, the nodes are now complete Dell systems based on quad core processors. These also include (non-EMC) disk storage, albeit of limited capacity, which is used for such things as temporary tables during query processing or for trying out re-partitioning schemes. All the direct attached storage is provided by EMC.
As far as EMC is concerned, this appears to be (potentially) much more significant than I had imagined. I guess that, because I don’t work in that space, I had not thought about how much leverage EMC account managers have with their customers. However, when you discover that some of these people have only a single account and virtually live on site, you can start to appreciate that they do indeed have significant influence: so when they recommend trying a proof of concept with DATAllegro, their customers are inclined to listen.
On the architecture front, the most notable additional feature that the company has introduced is what it calls its landing zone. This is an alternative to streaming data directly into the appliance via a bulk loader into the master processing node. Instead, change data capture and ETL procedures trickle feed data into the landing zone, where it is batched up prior to loading into the warehouse. This has two main benefits: first, it is faster than using a bulk loader (the landing zone updates the warehouse at around 1.2Tb/hour) and, secondly, it insulates the environment from any problems with the loading software.
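To make the trickle-feed-then-batch idea concrete, here is a minimal Python sketch. It is purely illustrative and is not DATAllegro's implementation: the `LandingZone` class, the `load_batch` callback and the `batch_size` parameter are all hypothetical names, and the failure handling is simply one assumed way of reading "insulates the environment from any problems with the loading software".

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LandingZone:
    """Hypothetical staging area: accepts trickle-fed records, loads them in batches."""
    load_batch: Callable[[List[dict]], None]  # assumed stand-in for the warehouse loader
    batch_size: int = 3                       # assumed threshold, for illustration only
    _staged: List[dict] = field(default_factory=list, init=False)

    @property
    def pending(self) -> int:
        """Number of records staged but not yet loaded."""
        return len(self._staged)

    def trickle(self, record: dict) -> None:
        """Accept one change record (e.g. from CDC or an ETL job)."""
        self._staged.append(record)
        if len(self._staged) >= self.batch_size:
            self.flush()

    def flush(self) -> bool:
        """Try to load everything staged so far as one batch.

        Returns True on success (or if nothing was staged). If the loader
        raises, the records stay staged so that a later flush can retry,
        and the upstream feed never sees the error.
        """
        if not self._staged:
            return True
        batch = list(self._staged)
        try:
            self.load_batch(batch)
        except Exception:
            return False
        del self._staged[:len(batch)]
        return True
```

With `batch_size=3`, feeding seven records through `trickle` results in two batch loads, with one record left staged until the next `flush`. A loader that raises leaves the records in the landing zone rather than losing them or failing the feed.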
Finally, what about those margins? Well, DATAllegro supports compression, which is not yet available from any of the other leading appliance vendors (though it is obviously a technology whose time has come, given that it is already in DB2 and Vertica and is coming in a number of other products, including Oracle 11g). This allows the company, at least for the present, to maintain a price advantage over the likes of Netezza, while still retaining a decent margin on sales.
And speaking of sales: one of the reasons why I have not been sanguine about the future of DATAllegro for some time has simply been a lack of observable customers. However, I am now happy that it really does have such customers, and that it has merely been unfortunate in that its users have not wanted to go public about their use of DATAllegro. Moreover, most of these tend to be large deals worth significant sums of money, so the company appears to be (very) healthy from a financial point of view, and it has some major potential deals in the offing. So, far from being down and out, DATAllegro is actually, secretly, thriving.