Database virtualisation in Windocks serves virtualised copies (clones) of databases for testing, either within a Docker container or on a standard database instance. Container delivery is one of the major selling points of the platform, and provides a clear path to Kubernetes-based test pipelines. Virtualisation can be delivered either via storage volumes or via Windows Virtual Hard Drives (popular for removing storage dependencies); in either case, clones are created and delivered in seconds. Moreover, the ability to choose between the two methods is a differentiator within the TDM space.
Fig 1 - Configuring a data subsetting process in Windocks
Windocks’ data subsetting is highly automated, and includes the automatic resolution of circular dependencies, composite keys, and other challenges that would otherwise need to be addressed manually. As you would expect, referential integrity is always maintained. This is all accomplished by a bespoke, patented subsetting algorithm.
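To illustrate why referential integrity and circular dependencies make subsetting awkward to do by hand, here is a minimal Python sketch of a generic foreign-key closure. It is not Windocks’ patented algorithm, whose details are not public; the function name, the in-memory table layout, and the sample schema are all hypothetical and exist only for illustration.

```python
from collections import deque


def subset_with_integrity(tables, foreign_keys, seed_table, seed_keys):
    """Return {table: set of primary keys} closed under foreign-key references.

    tables:       {table_name: {primary_key: row_dict}}   (hypothetical layout)
    foreign_keys: {table_name: [(column, referenced_table), ...]}
    """
    selected = {name: set() for name in tables}
    queue = deque((seed_table, key) for key in seed_keys)
    while queue:
        table, key = queue.popleft()
        if key in selected[table]:
            continue  # already included; this check stops circular references looping
        selected[table].add(key)
        row = tables[table][key]
        for column, parent_table in foreign_keys.get(table, []):
            parent_key = row.get(column)
            if parent_key is not None:
                # Every referenced parent row must also be in the subset.
                queue.append((parent_table, parent_key))
    return selected


# Illustrative schema with a cycle: departments point at a head employee,
# and employees point back at their department and at a manager.
tables = {
    "departments": {
        1: {"name": "QA", "head_id": 10},
        2: {"name": "Dev", "head_id": 20},
    },
    "employees": {
        10: {"department_id": 1, "manager_id": None},
        11: {"department_id": 1, "manager_id": 10},
        20: {"department_id": 2, "manager_id": None},
        21: {"department_id": 2, "manager_id": 20},
    },
}
foreign_keys = {
    "departments": [("head_id", "employees")],
    "employees": [("department_id", "departments"), ("manager_id", "employees")],
}

subset = subset_with_integrity(tables, foreign_keys, "employees", [11])
```

Seeding the subset with employee 11 pulls in department 1 and employee 10 (both the manager and the department head), and nothing from the second department. The visited check is what allows the walk to terminate despite the cycle between the two tables.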
The platform’s synthetic data capability is also highly automated, and can create synthetic data sets that accurately represent your production data in their overall makeup while still being fabricated in the details. This results in essentially realistic data that is nevertheless entirely fake. This data is generated using an in-platform engine as well as external Python libraries, incorporating various AI and machine learning techniques and models. A ‘fast’ synthetic model is available, which accelerates the creation of synthetic data but results in a somewhat less representative data set. In addition, Windocks supports the Synthetic Data Vault open-source project, and can work with your own Python libraries or other custom code. Notably, the product’s synthetic data capability is also highly scalable: it can generate everything from an individual table to an entire database. This allows it to serve a variety of different use cases, even outside of the testing space – supplying training data for AI models, for instance.
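The idea of data that is representative in its overall makeup but fabricated in the details can be sketched in a few lines of standard-library Python. This is a deliberately naive illustration, not Windocks’ engine and not the Synthetic Data Vault API: it models each column independently (so cross-column correlations are lost), and every name in it is hypothetical.

```python
import random
import statistics


def fit_columns(rows):
    """Learn a simple per-column model from a list of row dicts."""
    model = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric column: remember mean and (population) standard deviation.
            model[column] = ("numeric", statistics.mean(values), statistics.pstdev(values))
        else:
            # Categorical column: keep the observed values so that sampling
            # reproduces their relative frequencies.
            model[column] = ("categorical", values)
    return model


def sample_rows(model, count, seed=0):
    """Draw `count` fabricated rows from the fitted per-column model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        row = {}
        for column, spec in model.items():
            if spec[0] == "numeric":
                row[column] = rng.gauss(spec[1], spec[2])
            else:
                row[column] = rng.choice(spec[1])
        rows.append(row)
    return rows


# Illustrative "production" data; the columns are assumptions for this sketch.
production = [
    {"age": 34, "plan": "free"},
    {"age": 41, "plan": "pro"},
    {"age": 29, "plan": "free"},
    {"age": 52, "plan": "pro"},
]
synthetic = sample_rows(fit_columns(production), count=5)
```

No fabricated row is copied from production, yet ages cluster around the production mean and plans appear in roughly the production proportions. A more representative model would also capture relationships between columns, which is broadly the trade-off the ‘fast’ model described above makes in exchange for speed.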
The Windocks platform is accessed primarily through a web UI, which is designed to abstract away many of the complexities of managing Docker containers. Accordingly, manual coding and configuration are kept to an absolute minimum. In fact, only very simple configuration files (Dockerfiles, specifically) are required to build your Docker images, and image builds are one-step processes that can include data subsetting, masking, and synthetic data generation. Integration with source control (Git, for instance) is built in as part of this single step, and no staging server is needed at any point. When combined with the product’s database virtualisation functionality, the result is a versioned, test-ready database repository. The platform also supports incremental image updates, meaning that you can keep your database clones up to date whenever the underlying data changes without needing to completely rebuild and redeploy them.
Data subsetting and synthetic data generation are similarly easy to action. In both cases it is almost literally as simple as telling the platform what data you want to use as your source, where you want to put the result, and how large the result should be as a percentage of the initial data set (see Figure 1). Practically everything else, barring the initial (very minimal) configuration of each database, is automated, resulting in a highly streamlined process for generating test data. Note that this percentage can be set to 100% (useful if, for example, you want to migrate the data set without changing it) or even higher, which allows you to generate a synthetic data set larger than the original (optionally including the original data itself). You can also incorporate SQL statements into this process if you want to, say, add edge cases that may not be covered by your production data.
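The arithmetic behind the percentage setting is simple, and can be sketched as follows. The function names, the rounding rule (rounding up), and the way the original rows are counted towards the target are assumptions made for illustration; they are not taken from Windocks’ documentation.

```python
import math


def target_row_count(source_rows, percent):
    """Size of the output data set as a percentage of the source (rounded up)."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return math.ceil(source_rows * percent / 100)


def synthetic_rows_needed(source_rows, percent, include_original=False):
    """Rows that must be fabricated to reach the target size.

    Assumption: when the original data is included and the target is at least
    as large as the source, the original rows count towards the target.
    """
    target = target_row_count(source_rows, percent)
    if include_original and target >= source_rows:
        return target - source_rows
    return target


# A 1,000-row source table at a few illustrative settings.
subset_size = target_row_count(1000, 10)
fully_synthetic = synthetic_rows_needed(1000, 150)
topped_up = synthetic_rows_needed(1000, 150, include_original=True)
```

With these assumptions, a 10% setting yields 100 rows, while a 150% setting yields 1,500 rows, of which either all 1,500 or only 500 are fabricated, depending on whether the original data is included.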
All that said, the web UI is not the only way to interact with Windocks. The platform also offers a CLI (Command Line Interface) for use by developers (and other more technical users), as well as a RESTful API. In particular, the API provides an open interface through which other products, notably those of the company’s partners, can add functionality to the build process. REST APIs are also provided to integrate outgoing images with various CI/CD pipelines, source control, and orchestration systems, such as Jenkins and Azure DevOps.