1010data

1010data is one of the better-kept secrets within the data warehousing community. We all know about the new kids on the block, but 1010data, which has been around for much longer (since the turn of the century), is far less familiar.

There are a couple of reasons for this. The first is that the company’s client base is almost exclusively concentrated on Wall Street (HSBC, Goldman Sachs, JP Morgan, UBS and others), though it does have a couple of European clients as well as customers in the pharmaceutical space (Procter and Gamble) and retail (Pathmark). The second is that, unlike other data warehousing vendors, it does not sell software licenses but delivers its product as software as a service (SaaS).

Actually, this is not quite fair to other warehouse suppliers because 1010data does not just provide data warehousing as a service but also the business intelligence that goes with it, and it is this combined offering, which the company refers to as hybrid BI, that makes the company different.

Before I go on, we had better be clear about the BI capabilities offered. What 1010data provides is an analytic tool for business analysts, delivered through a web-based query interface. It is intended specifically to support ad hoc queries, which you write in the company’s own query language. This is based on XML and is claimed to be much more powerful than SQL. It has more than 100 statistical functions built in and supports time-series analysis, so this does not seem an unreasonable claim.

The one thing you don’t get with 1010data is fancy visualisation. If you want to put a third party tool for that on the front-end you can (there’s support for ODBC), but that’s not what 1010data is about.

At the back-end, 1010data uses a column-based relational database that offers all of the advantages associated with columnar processing, particularly with regard to database size (smaller), tuning requirements (minimal), and query processing (much faster for unpredictable and complex queries). The software will execute queries in parallel on as many processors (commodity hardware) as are available. As is common with columnar databases, there is no requirement to pre-build cubes or aggregates. Data can be loaded (fast) while the warehouse is in use, without interrupting query processing. However, the product supports only batch loads, which can be as frequent as you like; the company does not offer real-time updates.
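To make the columnar point a little more concrete, here is a minimal Python sketch of why a column layout helps with ad hoc aggregates and with database size. The table, the column names and the three functions are all invented for illustration; none of this reflects how 1010data actually stores or queries data.

```python
# Toy comparison of row and column layouts (illustrative assumption only,
# not 1010data's storage engine).

# Row layout: each record holds every field together.
rows = [
    {"symbol": "AAA", "price": 10.0, "volume": 100},
    {"symbol": "AAA", "price": 11.0, "volume": 150},
    {"symbol": "BBB", "price": 20.0, "volume": 200},
]

# Column layout: one list per column, values aligned by position.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_volume_row_store(table):
    # Every whole record is visited even though only one field is needed.
    return sum(record["volume"] for record in table)


def total_volume_column_store(table):
    # Only the "volume" column is read; the other columns are never touched.
    return sum(table["volume"])


def run_length_encode(values):
    # Values within a single column tend to repeat, so simple schemes such as
    # run-length encoding can shrink them considerably.
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded
```

Both totals agree, but the columnar version reads a single list rather than every field of every record, and `run_length_encode(columns["symbol"])` collapses the repeated symbols into runs. On three rows the difference is trivial; the argument for columnar databases is that it grows with the number of rows and columns.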

Finally, 1010data’s value proposition is not just that it enables rapid processing of sophisticated queries but also that it does so on very large datasets. For example, at the New York Stock Exchange (one of 1010data’s customers) over 750 million records are added daily, all of which are available for analysis on a ‘next-day’ basis.

For a company that most of us have not previously heard of, this is impressive stuff, as is the company’s full client list. Definitely worth a look.