The evolution of integration platforms
Once upon a time there were ETL (extract, transform and load) tools and then, quite separately, products for data cleansing and matching started to appear. However, it took some time before the vendors of the former realised the synergies that existed with the latter. The (partial) exception was Prism, which developed its Quality Manager product as a tool primarily for monitoring the process of data cleansing as opposed to the actual activity of cleansing. However, when Ardent bought Prism it did so to get its hands on what is now DataStage MVS Edition, rather than for Quality Manager.
Nevertheless, it was the descendants of Ardent (subsequently Informix, then Ascential and now IBM) that were the first to recognise the benefits of an integrated platform comprising both ETL and data quality capabilities, which by this time had been augmented by data profiling and analysis.
This trend has continued ever since, with SAS, Informatica and Business Objects in turn all acquiring major data quality vendors. These companies have been relatively late to follow what is now IBM’s lead, but with reason: SAS and Informatica, in particular, concentrated first on metadata management and a unified (as opposed to merely integrated) environment and only then acquired complementary software, whereas Ascential/IBM did it (is doing it) the other way around.
However, this isn’t the end of the story because data integration has also expanded into, first, support for semi-structured data (SWIFT messages, EDI and so forth) and, more recently, unstructured data. In addition, EII (enterprise information integration) and federated query capabilities have been added into the mix, with Informatica embedding source code from Composite Software, Business Objects acquiring Medience (recently launched as BusinessObjects Data Federator), IBM integrating with WebSphere Information Integrator, Sunopsis building its own facilities, and so on.
This is, effectively, the state of play today (or soon will be) in terms of data integration platforms: ETL, EII, data quality and metadata management in a unified platform. However, things do not stand still and the next step will be to include master data management (MDM) within this platform. As usual, some vendors (notably IBM and SAS) will include their own capabilities while others, at least initially, will partner with third parties. For example, Purisma already runs in conjunction with the Business Objects platform.
Now, if you include MDM along with data integration, I think the result extends beyond what most people think of as data integration, and we might prefer to call it an Information Management Platform (IMP). So, I believe that we will see data integration platforms (DIPs) extended into IMPs.
Having come to this conclusion I am planning research for the remainder of this year that will mirror this trend: we have already published an ETL report this year, and an MDM report (written by Harriet Fryman) will shortly be available, while I am starting work on a Data Quality report, which I will follow with a renewed look at the EII/Data Federation market. Finally, I will put all of that together into a Platforms report that will cover both DIPs and IMPs.