K2view
Analyst Coverage: Daniel Howard
K2view (named for the famously difficult-to-climb mountain) was founded in 2009 and has offices in the US, Germany, Israel, and the Netherlands. It is the developer of the K2view Data Product Platform. This platform originated as an ETL tool, but has since grown into a wide-ranging data management offering capable of addressing a variety of enterprise use cases.
The K2view Data Product Platform is a unified data management platform that offers numerous capabilities within a single overarching product. It includes solutions for many different data tasks, including data integration, data governance, data fabric, cloud migration, 360-degree customer view, test data management, and more.
K2view Test Data Management
Last Updated: 2nd April 2024
Mutable Award: Gold 2024
The K2view Data Product Platform is a unified data management platform that offers numerous capabilities within a single overarching product. It includes solutions for a wide range of enterprise data management problems. Key capabilities include data integration, data governance, data fabric, cloud migration, 360º customer view, and more. Most notable, at least for the purposes of this article, is its ability to provide effective test data management (TDM). This subset of the platform’s functionality is referred to as K2tdm, and it includes data subsetting, data masking, sensitive data discovery, and synthetic data generation.
Customer Quotes
“K2view test data management tools provide a self-service approach for our teams to provision test data on demand – without impacting production source systems.”
AT&T
“We’re collaborating closely with K2view in order to evolve our test data management tools for the purpose of driving our business agility, IT velocity, and attention to customer experience.”
Vodafone Germany
K2tdm organises test data management into three key phases: extraction, organisation, and operation (as shown in Figure 1), followed by test data access. Notably, these phases can be repeated – either on demand or on a schedule – to selectively refresh your test data and keep it aligned with changing production environments.
In the extraction phase, K2tdm ingests data from any source, whether structured or unstructured. Business entities (such as customers) are automatically classified within the platform’s data catalogue, aptly named K2catalog. This includes a sensitive data discovery process, in which what qualifies as sensitive data is controlled by a series of customisable rules and parameters. These rules can match against column names and/or the values and format of the data itself. Any discovered sensitive data is automatically masked within the platform, including highly unstructured data such as documents and images. Persistent masking is available at rest and in-flight, as is dynamic masking with role-based access control (RBAC). Masking is always consistent and maintains referential integrity, and a large number of prebuilt, customisable masking functions are provided. Facilities for data compression and versioning are also offered during this phase.
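To make the rule-matching idea concrete, the following is a minimal sketch of how column-name and value-format rules might flag sensitive columns. K2catalog’s actual rule syntax is not public, so the rule model, rule names, and threshold below are illustrative assumptions only.

```python
import re

# Illustrative discovery rules: each rule can match on the column name,
# the format of the values, or both. NOT K2catalog's real rule syntax.
DISCOVERY_RULES = [
    {"name": "email",
     "column_pattern": re.compile(r"e.?mail", re.I),
     "value_pattern": re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")},
    {"name": "us_ssn",
     "column_pattern": re.compile(r"ssn|social", re.I),
     "value_pattern": re.compile(r"^\d{3}-\d{2}-\d{4}$")},
]

def classify_column(column_name, sample_values, match_threshold=0.8):
    """Flag a column as sensitive if its name matches a rule, or if
    enough of its sampled values match the rule's value format."""
    for rule in DISCOVERY_RULES:
        if rule["column_pattern"].search(column_name):
            return rule["name"]
        hits = sum(1 for v in sample_values
                   if rule["value_pattern"].match(str(v)))
        if sample_values and hits / len(sample_values) >= match_threshold:
            return rule["name"]
    return None

# Column name gives nothing away, but the value format does.
print(classify_column("contact_addr", ["bob@example.com", "eve@test.org"]))  # -> "email"
```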
In the next phase, your imported data is organised into a structure that makes test data (sub)sets easy to create and then provision to the target environment. Specifically, it is partitioned into discrete, customisable business entities (determined by your data model, but typically representing customers), with each entity then used to create a unique “Micro-Database” that contains all the data associated with that entity, often centralising data from several different sources in the process. Each Micro-Database can be likened to a miniature data warehouse, offering a 360º view of the specific entity it represents. Notably, the centralised nature of these Micro-Databases ensures that referential integrity is always maintained further down the TDM process, because the Micro-Database itself is always referred back to as the ultimate source of truth for the entity it represents. Therefore, if you change the data in a business entity, those changes will automatically propagate elsewhere in your testing environment.
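As an illustration of the Micro-Database concept, the sketch below keeps one self-contained store per business entity, consolidating that entity’s rows from several sources. The class and method names are our own invention for illustration; they are not K2view’s API.

```python
# Conceptual sketch only: one store per business entity, holding all of
# that entity's data so it remains the single source of truth for it.
class MicroDatabase:
    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.tables = {}  # "source.table" -> list of rows

    def ingest(self, source, table, rows):
        # Keep only rows belonging to this entity when centralising
        # data from a source system.
        key = f"{source}.{table}"
        self.tables.setdefault(key, []).extend(
            r for r in rows if r.get("customer_id") == self.entity_id
        )

crm_orders = [{"customer_id": 42, "order": "A-1001"},
              {"customer_id": 7,  "order": "A-1002"}]
billing = [{"customer_id": 42, "invoice": "INV-9"}]

mdb = MicroDatabase(entity_id=42)
mdb.ingest("crm", "orders", crm_orders)
mdb.ingest("billing", "invoices", billing)
print(mdb.tables)  # a 360º view of customer 42, and only customer 42
```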
The third phase is operation. Now that your source data has been appropriately ingested, masked, compressed, and organised into entities, you can use it to create and provision your test data sets. There are two primary ways to do this in K2tdm: data subsetting and synthetic data generation. For data subsetting, you create a subset of your business entities by filtering them through various customisable business rules, which are created via dropdown menus (in other words, no SQL – or other query language – required) and as a consequence are simple to build. The corresponding Micro-Databases are fetched and combined into your test data set, then provisioned into the target environment. For example, a tester could rapidly select 1,000 customers based on location, purchase history, and loyalty program status.
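The sketch below illustrates the logic behind rule-driven subsetting using that example. In K2tdm these rules are assembled from dropdown menus rather than written as code, so the rule representation here is purely illustrative.

```python
# Illustrative customer records; in K2tdm each would correspond to a
# Micro-Database rather than a plain dict.
customers = [
    {"id": 1, "country": "DE", "orders": 12, "loyalty": "gold"},
    {"id": 2, "country": "US", "orders": 0,  "loyalty": None},
    {"id": 3, "country": "DE", "orders": 3,  "loyalty": "silver"},
]

# Each rule mirrors one dropdown-built business rule.
rules = [
    lambda c: c["country"] == "DE",      # location
    lambda c: c["orders"] >= 1,          # purchase history
    lambda c: c["loyalty"] is not None,  # loyalty programme status
]

# Entities passing every rule form the subset to provision.
subset = [c for c in customers if all(rule(c) for rule in rules)]
print([c["id"] for c in subset])  # -> [1, 3]
```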
For synthetic data generation, there are four methods available. The first is rules-based data generation, in which a series of specific, manually-created business rules are used to generate data sets. The second uses machine learning and generative AI to analyse your production data and create a “lookalike” data set, such that the individual entities are completely fabricated but the overall makeup of the data is very similar to the original. The third leverages data masking techniques to create “new” data, while the fourth (described as data cloning) involves duplicating business entities while changing their identifying features. These techniques vary in sophistication, and we would generally consider the latter two to be relatively ancillary: in most cases, the choice will be between rules-based or machine learning-based data generation, depending on whether fine-grained control or automation is favoured for a particular use case. Even so, the fact that they are all offered within a single product is a point in K2tdm’s favour and empowers you to choose the best technique for each use case. Data sets can also be created by blending different types of test data, such as masked production data and synthetic data.
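To illustrate the first (rules-based) method, the sketch below generates fully fabricated records in which each field is produced by an explicit, manually defined rule. The field names and rules are assumptions for illustration, not K2tdm’s configuration format.

```python
import random
import string

def random_name():
    # Fabricate a plausible-looking name: one capital, six lowercase.
    return (random.choice(string.ascii_uppercase)
            + "".join(random.choices(string.ascii_lowercase, k=6)))

# One explicit rule per field: this is the essence of rules-based generation.
GENERATION_RULES = {
    "customer_id": lambda i: 100000 + i,
    "name":        lambda i: random_name(),
    "country":     lambda i: random.choice(["DE", "US", "NL", "IL"]),
    "balance":     lambda i: round(random.uniform(0, 5000), 2),
}

def generate(n):
    return [{field: rule(i) for field, rule in GENERATION_RULES.items()}
            for i in range(n)]

for row in generate(3):
    print(row)  # entirely synthetic records shaped by the rules above
```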
Various other test data management features are available, including reserving data for individual testers, versioning, performing rollbacks, and so on. The platform’s ETL roots enable further capabilities, including loading and moving test data from any environment to any environment – useful for QA teams performing regression testing – and a wealth of data transformations, including sophisticated masking techniques, data aging, and data enrichment.
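As an example of one such transformation, the following is a minimal sketch of data aging: shifting every date in a test record by a fixed offset so that time-dependent logic (renewals, expiries) can be exercised. The record layout is illustrative only.

```python
from datetime import date, timedelta

def age_record(record, days):
    """Shift every date field in the record forward by `days` days,
    leaving all other fields untouched."""
    offset = timedelta(days=days)
    return {k: (v + offset if isinstance(v, date) else v)
            for k, v in record.items()}

contract = {"id": 42,
            "start": date(2023, 1, 15),
            "renewal": date(2024, 1, 15)}
print(age_record(contract, 365))  # all dates shifted one year forward
```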
Test data is exposed for access via a self-service web portal (see Figure 2) designed to isolate its users from the complexities of test data provisioning. This allows your testers, and potentially other users such as developers, to access your test data without worrying about what is going on under the hood. APIs are also provided, allowing you to integrate K2tdm into your existing CI/CD pipelines (among other things).
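By way of illustration, a CI/CD pipeline step might call such an API along the following lines. The endpoint URL, payload fields, and authentication shown here are hypothetical assumptions, not K2view’s documented interface; consult the product’s API documentation for the real one.

```python
import os
import requests

# Placeholder endpoint - the real K2tdm API paths will differ.
K2TDM_URL = "https://k2tdm.example.com/api/v1/provision"

# Hypothetical request body: which entities to subset, where to put them.
payload = {
    "environment": "qa-regression",
    "entity_type": "customer",
    "filter": {"country": "DE", "min_orders": 1},
    "masked": True,
}

response = requests.post(
    K2TDM_URL,
    json=payload,
    headers={"Authorization": f"Bearer {os.environ.get('K2_TOKEN', '')}"},
    timeout=60,
)
response.raise_for_status()
print(response.json())  # e.g. a job ID the pipeline can poll for completion
```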
K2tdm is a modern TDM solution that has made a big impact on the space in recent years. In particular, the fact that it is a single, unified platform that provides the lion’s share of the test data functionality you could ask for (and plenty of more general functionality besides, such as K2catalog), all built on its core framework, makes for a compelling offering. In addition, the organisation of data into business entities and the corresponding creation and usage of Micro-Databases is a standout feature: few others in the testing space offer anything similar. We also appreciate the emphasis on self-service test data access evident in the design of the product’s web portal.
The bottom line
Between its unified approach and its data masking, subsetting, and synthetic data generation techniques, all underpinned by entity-based data modelling, K2tdm is more than capable of addressing a diverse array of test data management use cases, up to and including those found within complex, enterprise-level data environments. In short, it is a powerful test data management solution that is more than worth your consideration.
Mutable Award: Gold 2024
Using K2view in a Data Fabric
Last Updated: 25th April 2024
Mutable Award: Gold 2024
K2view is a data management vendor whose flagship product is the K2view Data Product Platform. This enables customers to organise their data around business entities like customers and products, helping to unify data from multiple source systems and make it accessible to authorised consumers in near real time. The software provides functionality in the areas of customer 360, test data management, synthetic data generation, data masking, data pipelining and data migration, as well as master data management and retrieval-augmented generation for GenAI. At the core of its platform is a proprietary technology called a Micro-Database™ that unifies and governs everything known about a given business entity. In the case of a customer, for example, this might include phone records, emails, website visits, chatbot conversations, invoices, orders, and corporate master data such as consent information or assessments of the customer’s propensity to churn. The platform extracts this information from various source systems, with rules specifying how often each data element is to be updated. Some data is cached in memory and some may persist in the Micro-Database, which is compressed and encrypted for secure, low-footprint delivery to consuming applications (such as a data warehouse or data lake). You can think of a Micro-Database rather like a filing cabinet with drawers, where each business entity (an individual customer, for example) has its data organised and accessible in its own filing cabinet. Two patents have been granted, including one covering the core technology, issued by the US Patent and Trademark Office in June 2019 (patent number 10,311,022).
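The sketch below illustrates the idea of per-element update rules: each data element declares how stale it may be before it is re-fetched from its source system. The element names and rule model are illustrative assumptions, not K2view’s actual configuration.

```python
import time

# Hypothetical refresh policy: maximum age, in seconds, per data element.
REFRESH_RULES = {
    "phone_records": 3600,   # refresh hourly
    "orders":        60,     # refresh every minute
    "consent":       86400,  # refresh daily
}

last_fetched = {}  # element -> unix timestamp of last sync

def needs_refresh(element, now=None):
    """True if the element has exceeded its allowed staleness."""
    now = now or time.time()
    return now - last_fetched.get(element, 0) >= REFRESH_RULES[element]

for element in REFRESH_RULES:
    if needs_refresh(element):
        print(f"re-syncing {element} from its source system")
        last_fetched[element] = time.time()
```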
Customer Quotes
“We’ve gone from 24+ hours for data updates to having data available in simply minutes or even seconds, boosting the customer experience and better supporting our customer care agents.”
Marc Schmeetz, Program Manager, VodafoneZiggo
“With its unique approach to data management and rich feature set, K2view Customer Data Hub simplified the complexity of integrating and unifying the data from our disparate systems, allowing us to deliver world-class customer experiences.”
Ronen Horowitz, CIO, Pelephone, Yes and Bezeq International
As well as the Micro-Databases, there is a software layer above them providing a core data catalog, data governance, data integration, and a semantic knowledge graph presented to business users. The user interface is modern and graphical, and allows data products to be created and sourced from a wide variety of underlying systems, ranging from SQL databases like Oracle and PostgreSQL to applications like SAP and Salesforce, as well as various other formats including files, documents, and NoSQL databases. The software can read the metadata and catalogs of the various source systems through connectors, and display the structure of the source data in graphical form. Data corresponding to a specific data object, such as a customer, can then be collected together, with rules defining how often the data is to be refreshed. There is a clever data catalog feature that displays different versions of the data structure graphically: if an underlying database has a table added or dropped, or a column changed, the display shows this in an intuitive fashion.

Rules can be defined for each Micro-Database. For example, data can be classified as personally identifiable information that may need masking, and survivorship rules can be defined to allow de-duplication of records, an essential feature given the amount of duplicated information that exists in large corporations. In this way, the data presented to an end user is not only in a business-friendly form but has already been validated. The data product (such as a specific customer record) can then be provisioned to downstream systems such as a data lake or data warehouse. Micro-Databases are created in milliseconds, on the fly, the first time a data object is queried; there is also an option to batch create and populate them. Data can be virtualised or persisted within the Micro-Database for better performance as desired, so customers can decide how virtual or otherwise they want their data to be.
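To make the survivorship idea concrete, the sketch below shows how field-level rules might decide which value “survives” when several source records describe the same customer. The rules and record layout are illustrative assumptions only.

```python
# Two source records describing the same customer, with conflicting values.
records = [
    {"source": "crm",     "email": "ann@example.com",  "updated": 2024},
    {"source": "billing", "email": "ann@corp.example", "updated": 2022},
]

# Hypothetical survivorship rules: one rule per field decides which
# value wins during de-duplication.
SURVIVORSHIP = {
    # Prefer the most recently updated value for email.
    "email": lambda recs: max(recs, key=lambda r: r["updated"])["email"],
}

# Build the single "golden" record presented to consumers.
golden = {field: rule(records) for field, rule in SURVIVORSHIP.items()}
print(golden)  # -> {'email': 'ann@example.com'}
```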
K2view offers a modern and innovative approach to productising data, a core capability that makes it well suited to use cases such as customer 360, an area where traditional master data management tools typically play. It is also suited to preparing data pipelines, test data management, and data masking, as well as supporting synthetic test data generation and retrieval-augmented generation for large language models in generative AI implementations.
The bottom line
K2view offers an innovative product based on its patented Micro-Database technology. It has been proven at some very large corporations, particularly in the telco and financial services sectors. The core technology nicely supports both data mesh and data fabric architectures, with its emphasis on a semantic layer that serves up data products to end users. It is well worth considering if you have decided to go down this route.
Mutable Award: Gold 2024
Commentary
Coming soon.