IBM, Hadoop, Initiate and other things

I’ve just got back from IBM’s Information on Demand EMEA event. As you would expect, there were several interesting announcements. For me, perhaps the most interesting was that IBM is going to be providing support for Hadoop or, more specifically, a full range of professional services to support the planning, implementation and use of Hadoop. IBM sees Hadoop as a solution to the problem of handling (very) big data but believes that it needs this sort of support from a major organisation if it is to gain serious traction in large commercial enterprises. It is likely that IBM will extend its offering beyond support as Hadoop becomes more popular but exactly what that will be, or when, will depend on take-up.

The second thing that was interesting (perhaps less so for me than for some others, because I had been pre-briefed on it) was the joint IBM/ANTs announcement of really easy portability from Sybase ASE (version 12.x) to DB2. However, this requires a significant discussion, so I will write a separate article on the subject.

Next was the positioning of Initiate versus MDM Server. The official line is that Initiate, while it has sometimes been implemented as a hub, is really a registry-based solution and that MDM Server, while it has sometimes been implemented as a registry, is much better at being a hub. You can see where this is going. There is some truth in it but I think it’s also IBM trying to spin some sort of plausible story out of what is essentially a healthcare play. It raises the question of what you do if you want to start with a registry and migrate to a hub later (do you start with the inferior registry to maintain consistency or do you migrate from one product to the other?) or if you want a registry for financial data and a hub for product data (do you really want two products?). I think the company could make a coherent story out of this but it will take some time.

I had a long chat about entity analytics (EAS) and global name recognition (GNR). These are often thought of as being in the same category as Informatica’s Identity Systems or Infoglide. But they aren’t really: the latter are about name matching (which you get with IBM’s data quality products) whereas the former are really about fraud. This opens up a number of areas of interest and potential integration. For example, it would make sense to integrate this technology with SPSS: then you could look for patterns of potentially fraudulent behaviour AND check to see who these people are and what you know about them. A similar integration with Tivoli SIEM would also make sense, for the same reason, though integration between Tivoli and SPSS would probably be a more sensible first step or, at least, support for PMML (Predictive Model Markup Language). On that note, InfoSphere Streams supports PMML already and it would make sense to integrate EAS and GNR into this too.

There were some things I didn’t get news about, notably whether and when Guardium will be integrated with Tivoli SIEM and what’s happening to Showcase. In general, though, it’s a very worthwhile conference: if you didn’t go this year, you should start planning for 2011.