Talend first came to market in August 2005 with the beta version of Talend Open Studio, a data integration tool. This was subsequently extended with new products so that, by 2010, the company was offering a complete Data Management Platform. The company uses an open-core (Apache licenses) business model. Its products are largely home-grown, but it has made a number of acquisitions during its lifetime, most notably Amalto (for master data management, though this is no longer a focus for Talend), Sopera (ESB and SOA development services), Restlet (cloud-based API development and testing) and, most recently, Stitch (cloud-based, self-service data ingestion).
Initially, Talend was privately owned, based in France, and backed by venture capital, but it floated on NASDAQ in 2016. It has nineteen offices worldwide, with seven of these located in Europe, five in Australasia and the remainder in the United States. Its head office is now in Redwood City. The company has more than 4,000 customers spread across all industry sectors, many of which are household names.
Company Info
Headquarters: 800 Bridge Parkway, Suite 200, Redwood City, California 94065, USA
Telephone: +1 (650) 539 3200
Fig 01 - Personas supported by the Talend Data Fabric
The basic concept behind the Talend Data Fabric is to allow you to collect, govern, transform, and share your data, where “you” in this equation may have any one (or more) of the personas illustrated in Figure 1.
As can be seen, the Data Fabric is designed to support multi-cloud, hybrid and on-premises environments, and there is a managed service offering. The tools included in the Data Fabric are all cloud-native. They include the Stitch Data Loader, which is targeted at frictionless (self-service) data loading and is primarily aimed at smaller businesses and the mid-market. For enterprises there is Data Integration and Pipeline Designer (the inheritors of Talend Open Studio); Data Quality; Data Inventory (cloud-only), which allows you to see who is using your data and how; Data Catalog; Data Preparation; Data Stewardship, which enables data governance in conjunction with other relevant modules; and API and Application Integration, which is another cloud-only offering.
Customer Quotes
“We’ve become an e-commerce company that sells pizza. Talend has helped us make that digital transformation.” Domino’s Pizza
“Integrating online and offline data with Talend helps us develop more ways to communicate with our customers across channels. That kind of interaction drives loyalty.” Office Depot Europe
There are several noteworthy features of the individual components within the Talend Data Fabric. Perhaps the most fundamental is that its data integration technology is based on a code-generating engine that produces Java. This has both advantages and disadvantages. On the one hand, it should perform better than using SQL. Moreover, you don’t need to use, or pay for, an intermediate server. On the other hand, it means you need to regenerate your code whenever something changes. And even though you may be able to automate a lot of this process, some manual intervention will still be needed at times.
More generally, Talend is well endowed with native connectors and other components, with more than a thousand of these altogether. However, automation is generally limited, and in-built machine learning capabilities for things like recommendations, metadata discovery and identifying sensitive data are still at an early stage of development: there are some capabilities provided but they need building out, which is what the company plans to do. Indeed, the company’s ultimate goal is to provide a platform that is (almost) completely autonomous, as illustrated in Figure 2.
Fig 03 - Talend Data Fabric Architecture
One innovative idea that the company has introduced is what it calls the “Talend Trust Score”. This is presented as a single score (see Figure 3) that is calculated based on data quality and popularity metrics plus any user-defined criteria. It’s a nice concept to help business users understand the trustworthiness of their data, based on the “5 Ts”: data should be thorough, timely, transparent, tested, and traceable. This is a function of Talend’s Data Inventory module, which runs on either AWS or Azure. We haven’t seen anything else quite like Data Inventory, which provides a “single pane of glass” to support collaboration, self-service, and exploration of datasets. Where other vendors offer comparable features, these tend to be spread across different products.
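To make the idea of a single composite score concrete, here is a minimal sketch of how quality, popularity and user-defined criteria might be combined into one number. This is not Talend's formula, which is not described here: the weighted average, the 0 to 100 scale, and every name in the block (`DatasetMetrics`, `composite_score`, the `certified` criterion and the weights) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetrics:
    """Hypothetical inputs; each value is assumed to be normalised to 0.0-1.0."""
    quality: float                               # e.g. share of valid records
    popularity: float                            # e.g. normalised usage
    custom: dict = field(default_factory=dict)   # user-defined criteria


def composite_score(metrics: DatasetMetrics, weights: dict) -> float:
    """Weighted average of all metrics, scaled to 0-100 (illustrative only)."""
    values = {"quality": metrics.quality,
              "popularity": metrics.popularity,
              **metrics.custom}
    for name, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Metrics without an explicit weight are ignored (weight 0).
    total_weight = sum(weights.get(name, 0.0) for name in values)
    if total_weight <= 0:
        raise ValueError("at least one metric needs a positive weight")
    weighted = sum(values[name] * weights.get(name, 0.0) for name in values)
    return round(100 * weighted / total_weight, 1)


# Example: a well-validated, moderately used dataset with one custom criterion.
example = DatasetMetrics(quality=0.9, popularity=0.5, custom={"certified": 1.0})
example_weights = {"quality": 0.5, "popularity": 0.3, "certified": 0.2}
print(composite_score(example, example_weights))
```

The point of the sketch is only that a single headline figure hides several inputs: changing the assumed weights changes the score, which is why user-defined criteria matter to how such a number should be read.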
Finally, we should mention that the company is planning to implement data masking as a capability alongside data integration (currently it is only available within Data Preparation), and it also intends to build out its policy management capabilities further to support data governance.
Talend Data Fabric is a work in progress. Its primary advantages are that it represents a unified platform (something that is often claimed but rarely delivered), it is complete (even if some features could be improved), and it has a breadth of coverage for different personas that is as broad as (perhaps broader than) any other current offering. We especially like the Data Inventory. Where the Data Fabric loses out is that it is not as advanced as some other platforms in providing automation (through machine learning), though the Data Inventory and Talend Trust Score are arguably exceptions to this statement. In any case, we applaud the company’s vision and like the fact that the Talend Data Fabric will allow you to gradually evolve towards a holistic data management solution at your own pace.
The Bottom Line
Talend has a significant history in providing data management solutions, and it is continuing that tradition with the Talend Data Fabric. Work remains to be done, but it is a laudable undertaking.