Windocks

Last Updated: 6th June 2024
Analyst Coverage: Daniel Howard

Windocks is a test data management (TDM) vendor specialising in database virtualisation. It was formed in 2014 by ex-Microsoft employees, and launched its first public product in 2015.

Windocks includes support for Docker database containers. To wit, it allows you to create virtualised copies of production data attached to cloud-native SQL Server, Oracle, MySQL or PostgreSQL containers, as well as conventional database instances, on standard Windows and Linux VMs. Database images are versioned, and the platform can be managed using a web application, the command line, and/or a RESTful API.

Although Windocks includes data masking functionality, it is not the focus. Instead, Windocks has been built as an open platform designed to readily integrate with other TDM products and thus provide best-of-breed masking, subsetting, synthetic data, and so on, through partners. It has already announced partnerships with IRI and Curiosity Software, and more are reportedly underway.

Moreover, it is storage agnostic, and the containers it deploys feature full compatibility with Active Directory, Windows Authentication, and other infrastructure. The aim is to render them effectively indistinguishable from their non-containerised counterparts, thus removing barriers to their use and enabling you to leverage containers and more traditional database instances both simultaneously and seamlessly.

Company Info

Headquarters: 1130 NE 140th Ave NE, Suite 100C, Bellevue, WA 98005

Windocks

Last Updated: 13th March 2024
Mutable Award: Gold 2024

What is it?

Windocks is a platform for containerised, enterprise-level TDM designed to both support and be supported by AI and machine learning technologies. In addition, it has been built as an open platform designed to readily integrate with other TDM solutions, primarily through its network of partner products.

Its flagship capability is database virtualisation that includes support for Docker database containers. To wit, it allows you to create virtualised copies of production data attached to cloud-native SQL Server, Oracle, MySQL or PostgreSQL containers, as well as conventional database instances, on standard Windows and Linux VMs. Database images are versioned, and the platform can be managed using a web application, the command line, and a RESTful API. The platform is storage agnostic, and the containers it deploys feature full compatibility with Active Directory, Windows Authentication, and various other kinds of infrastructure. The aim is to render them effectively indistinguishable from their non-containerised counterparts, thus removing barriers to their use and enabling you to leverage containers and more traditional database instances both simultaneously and seamlessly.

The product also offers data subsetting, data masking, sensitive data discovery, and synthetic data generation capabilities. These capabilities are all highly automated, and support a diverse range of data sources, including SQL Server, PostgreSQL, MySQL, Snowflake, Azure Managed Instance, Azure SQL, AWS RDS, Aurora, DB2, and Oracle. Moreover, Windocks can help you to migrate data (up to and including complete databases) across or between any of these platforms. This has proven particularly useful for migrating to and from Snowflake, for preventing data from being locked to a single platform, and for moving data – especially test data – to locations that maximise accessibility.

What does it do?

Database virtualisation in Windocks serves virtualised copies (clones) of databases for the purposes of testing, either within a Docker container or within a standard database instance. The former is one of the major selling points of the platform, and enables a clear path to Kubernetes-based test pipelines. Virtualisation in Windocks can be delivered either via storage volumes or via Windows Virtual Hard Disks (popular for removing storage dependencies), and in either case it creates and delivers clones in seconds. Moreover, the ability to choose between the two methods is a differentiator within the TDM space.

Figure 1 - Configuring a data subsetting process in Windocks

Windocks’ data subsetting is very highly automated, and includes the automatic resolution of circular dependencies, composite keys, and other challenges that would otherwise need to be addressed manually. As you would expect, referential integrity is always maintained. This is all accomplished by a bespoke, patented subsetting algorithm.
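To make the referential integrity point concrete, here is a minimal, generic sketch of a subset that keeps foreign keys valid. It is not Windocks' patented algorithm, which is not described in this report. The `customers` and `orders` tables, the `build_demo_db` and `subset` functions, and the "take the first N child rows" sampling rule are all hypothetical, and the sketch does not attempt to resolve circular dependencies or composite keys.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory parent/child schema (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            total REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 11)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, (i % 10) + 1, 10.0 * i) for i in range(1, 101)],
    )
    return conn


def subset(conn: sqlite3.Connection, percent: float):
    """Return (customers, orders) holding about `percent` % of the orders,
    plus every customer row those orders reference."""
    total = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    target = max(1, round(total * percent / 100))

    # Step 1: pick the child rows that go into the subset.
    orders = conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id LIMIT ?",
        (target,),
    ).fetchall()

    # Step 2: pull in exactly the parent rows those children point at,
    # so that no foreign key in the subset is left dangling.
    parent_ids = sorted({row[1] for row in orders})
    placeholders = ",".join("?" * len(parent_ids))
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
        parent_ids,
    ).fetchall()
    return customers, orders


if __name__ == "__main__":
    demo = build_demo_db()
    subset_customers, subset_orders = subset(demo, 5)
    print(len(subset_customers), "customers,", len(subset_orders), "orders")
```

The design choice worth noting is the order of operations: child rows are chosen first and parents are derived from them, which is one simple way to guarantee integrity for acyclic schemas. A cyclic schema would need additional handling that this sketch leaves out.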

The platform’s synthetic data capability is also highly automated, and can create synthetic data sets that accurately represent your production data in their overall makeup while still being fabricated in the details. This results in essentially realistic data that is nevertheless entirely fake. This data is generated using an in-platform engine as well as external Python libraries, incorporating various AI and machine learning techniques and models. A ‘fast’ synthetic model is available, which accelerates the creation of synthetic data but results in a somewhat less representative data set. In addition, Windocks supports the Synthetic Data Vault open-source project, and can work with your own Python libraries or other custom code. Notably, the product’s synthetic data capability is also highly scalable: it can generate everything from an individual table to an entire database. This allows it to serve a variety of different use cases, even outside of the testing space – supplying training data for AI models, for instance.

In terms of access, the Windocks platform is primarily available through a web UI. For starters, this UI is designed to abstract out many of the complexities of managing Docker containers. Accordingly, manual coding and configuration are kept to an absolute minimum. In fact, only very simple configuration files (Dockerfiles, specifically) are required to build your Docker images, and image builds are one-step processes that can include data subsetting, masking, and synthetic data generation. Integration with source control (Git, for instance) is built in as part of this single step, and no staging server is needed at any point. When combined with the product’s database virtualisation functionality, the result is a versioned, test-ready database repository. The platform also supports incremental image updates, meaning that you can keep your database clones up to date whenever the underlying data changes without needing to completely rebuild and redeploy them.

Data subsetting and synthetic data generation are similarly easy to action. In both cases it is almost literally as simple as telling the platform what data you want to use as your source, where you want to put the result, and how large the result should be as a percentage of the initial data set (see Figure 1). Practically everything else, barring the initial (very minimal) configuration of each database, is automated, resulting in a highly streamlined process for generating test data. Note that the aforementioned percentage can be set to 100% (useful if, for example, you want to migrate the data set without changing it) or even over it, which allows you to generate a synthetic data set larger than the original data set (which can, at your option, also include the original data set). You can also incorporate SQL statements into this process if you want to, say, add edge cases that may not be covered by your production data.
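The percentage-based sizing described above, including targets over 100% and the option to keep the original rows, can be sketched as follows. This is an illustrative assumption rather than Windocks' engine: the `synthesise` function and its parameters are hypothetical, and the generator simply resamples each column independently from its observed values, which preserves per-column frequencies but not relationships between columns.

```python
import math
import random


def synthesise(rows, percent, include_original=False, seed=0):
    """Return a data set sized at `percent` % of `rows` (hypothetical helper).

    rows             -- list of dicts that all share the same keys
    percent          -- target size relative to the source; may exceed 100
    include_original -- if True, start from copies of the source rows
    seed             -- seed for the random generator, for repeatable runs
    """
    if not rows:
        return []
    rng = random.Random(seed)
    target = math.ceil(len(rows) * percent / 100)
    columns = list(rows[0].keys())

    # One pool of observed values per column. Sampling each column on its
    # own yields fabricated rows whose individual values look realistic.
    pools = {col: [row[col] for row in rows] for col in columns}

    result = [dict(row) for row in rows] if include_original else []
    while len(result) < target:
        result.append({col: rng.choice(pools[col]) for col in columns})
    # If the target is smaller than the source, originals are truncated.
    return result[:target]


if __name__ == "__main__":
    source = [
        {"city": "Leeds", "plan": "basic"},
        {"city": "York", "plan": "pro"},
        {"city": "Bath", "plan": "basic"},
        {"city": "Hull", "plan": "pro"},
    ]
    bigger = synthesise(source, 150, include_original=True)
    print(len(bigger), "rows generated from", len(source))
```

A production-grade generator would model cross-column structure as well, which is where libraries such as the Synthetic Data Vault mentioned earlier come in.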

All that said, the web UI is not the only way to interact with Windocks. It also offers a command line interface (CLI) for use by developers (and other more technical users) as well as a RESTful API. In particular, it provides an open interface that is accessible to other products, notably the company’s partners, to add additional functionality to the build process. REST APIs are also provided to integrate outgoing images with various CI/CD pipelines, source control, and orchestration systems, such as Jenkins and Azure DevOps.

Why should you care?

Database virtualisation is increasingly in demand within the TDM space, and it’s not difficult to see why: it allows for extremely fast provisioning of entire (virtual) databases for the purposes of testing without needing to worry about whether they are representative (as would be the case for subsets or synthetic data). Moreover, it allows each tester (or developer, for that matter) to rapidly provision their own data sets that they can play around with and modify as they wish without negatively impacting anyone else. The fact that Windocks offers this capability at all is a large point in its favour. What’s more, compared to the other database virtualisation tools we are aware of in the TDM space, it is either significantly more widely applicable or vastly more cost effective.

Containers are also increasingly in demand, not just with regard to TDM but in general. Moreover, database virtualisation and containers form a natural fit: database clones are usually created, delivered, and torn down repeatedly and en masse, a lifecycle that is well suited to (and enhanced by) deployment as a container. To put a fine point on it, Docker is able to deliver virtual database instances in less than 30 seconds and supports up to 100 containers per virtual machine. In other words, it helps to maximise both speed and utilisation. It then almost goes without saying that Windocks, a platform that combines these two technologies, should not be dismissed lightly. To add to this, Windocks claims that its customers have been able to reduce lower-level database environment costs by up to 50% by utilising fewer VMs and minimising storage consumption. This stands in contrast to the other vendors in the space that primarily offer database virtualisation as part of a server/client architecture and hence cannot easily reap the benefits of containerisation.
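As a rough worked example of the density figure quoted above, the following sketch computes the fewest VMs that could host a given number of database containers. The function name and the 250-container workload are hypothetical, and because the figure is an upper bound ("up to 100"), the result is a best case rather than a sizing recommendation.

```python
MAX_CONTAINERS_PER_VM = 100  # upper bound quoted in the text above


def min_vms_needed(containers: int, per_vm: int = MAX_CONTAINERS_PER_VM) -> int:
    """Fewest VMs that can host `containers` at `per_vm` containers each."""
    if containers < 0 or per_vm <= 0:
        raise ValueError("containers must be >= 0 and per_vm must be > 0")
    # Integer ceiling division: a partly filled VM still counts as a VM.
    return (containers + per_vm - 1) // per_vm


if __name__ == "__main__":
    # Hypothetical workload: one container per tester for 250 testers.
    print(min_vms_needed(250))
```

Real capacity depends on database size, memory, and workload, so actual density may be well below the quoted maximum.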

The product’s other capabilities – data subsetting, sensitive data discovery, data masking, and perhaps most of all synthetic data generation – make up the other major features that are often desired in the TDM space, and make Windocks one of the few TDM solutions that can, at least in principle, serve all comers. More than that, its subsetting and synthetic data capabilities are automated to a very impressive degree.

In addition, the product is highly portable and offers an extremely fast time to return. Its web UI is perhaps a little plain, but it is certainly functional, and can be subsumed by other front-ends if necessary.

The bottom line

Windocks is an effective TDM solution that offers several standout features, including database virtualisation, very substantial container support, and highly automated data subsetting and synthetic data generation. In short, it poses a compelling and forward-thinking alternative to many of the players in the space.
