Ab Initio Test Data Management

Solution updated on July 12, 2021


Ab Initio Test Data Management (TDM) is the test data management application within the broader Ab Initio data management platform. The platform can be deployed on-premises or in the cloud; operates on structured, semi-structured, and unstructured data; and features a highly portable ‘build once, run anywhere’ architecture. It also offers solutions for several other areas, including data integration, data governance, and data quality. All of these solutions are extensible and centrally managed.

TDM is presented as a simple, visual flowchart accessed within Ab Initio Express>It (the platform’s web interface, shown in Figure 1). It allows you to choose a data source, augment it with generated data, and then mask, subset and export it to create your test data set. The data source can be anything that the Ab Initio platform as a whole can read: a file, a database, or a data set derived from some other Ab Initio app, for example. You can also leverage Ab Initio’s data generation capabilities to create all of your test data from scratch. All of this can be done in bulk. We should also note that TDM as shown here is essentially a friendly UI layered on top of the platform’s underlying functionality for masking, subsetting and so on. This functionality is not restricted to TDM, and can in fact be used practically anywhere within the Ab Initio platform.

Fig 01 – Test Data Management in Ab Initio Express>It

Test data generation is accomplished in one of two ways. You can manually specify the fields you want to generate, the algorithm to use for each, and the quantity of data to produce. Alternatively, you can read in an Excel file and choose the rows within it that you want to generate data for (again choosing which algorithms to use). In either case, you can add overrides if you want to exert additional control over your generated data, perhaps forcing a field to take a specific value or ensuring that all generated values are unique.
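To make the idea of generation with overrides concrete, here is a minimal Python sketch. It is not Ab Initio code and does not reflect how TDM is implemented: the function `generate_rows`, its parameters, and the example fields are all hypothetical, chosen only to illustrate per-field algorithms, a forced value, and a uniqueness constraint.

```python
import random

def generate_rows(field_generators, count, fixed=None, unique=(), seed=0,
                  max_attempts=10_000):
    """Generate `count` rows from per-field generator functions.

    `fixed` forces a field to a specific value; `unique` lists fields whose
    generated values must not repeat. All names here are illustrative.
    """
    rng = random.Random(seed)
    fixed = fixed or {}
    seen = {name: set() for name in unique}
    rows = []
    for _ in range(count):
        row = {}
        for name, generator in field_generators.items():
            if name in fixed:
                # Override: ignore the generator and use the forced value.
                row[name] = fixed[name]
                continue
            value = generator(rng)
            if name in seen:
                # Override: redraw until the value has not been used before.
                attempts = 0
                while value in seen[name]:
                    attempts += 1
                    if attempts > max_attempts:
                        raise ValueError(f"cannot generate a unique value for {name!r}")
                    value = generator(rng)
                seen[name].add(value)
            row[name] = value
        rows.append(row)
    return rows

# Hypothetical field definitions: one generator ("algorithm") per field.
generators = {
    "customer_id": lambda rng: rng.randint(1000, 9999),
    "country": lambda rng: rng.choice(["GB", "US", "DE"]),
    "balance": lambda rng: round(rng.uniform(0, 5000), 2),
}

rows = generate_rows(generators, count=50,
                     fixed={"country": "GB"}, unique=["customer_id"])
```

In this sketch every row has its `country` forced to `"GB"`, while `customer_id` values are redrawn until they are distinct.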

Fig 02 – Masking rules in Ab Initio Express>It

Masking is rules-based (see Figure 2), allowing you to apply out-of-the-box or user-created masking functions to your data. It is static, format-preserving, irreversible, and consistent across multiple systems and platforms. Reversible encryption is also offered. Instead of masking, or in addition to it, you can shuffle (that is, randomly reassign) the values within your data set, optionally subject to constraints that maintain consistency or preserve the original distribution. You can also apply rules after shuffling.
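The properties described above can be illustrated with a short Python sketch. This is not Ab Initio’s masking implementation: `mask_value` and `shuffle_column` are hypothetical helpers, and the keyed-hash approach is simply one common way to obtain masking that is consistent (the same input always yields the same output), format-preserving (digits stay digits, letters stay letters, punctuation is untouched) and one-way.

```python
import hashlib
import hmac
import random
import string

def mask_value(value, key):
    """Deterministically replace each ASCII letter or digit, keeping the format.

    The replacement characters are derived from an HMAC of the whole value,
    so the original cannot be read back from the output.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for index, char in enumerate(value):
        byte = digest[index % len(digest)]
        if char in string.digits:
            masked.append(string.digits[byte % 10])
        elif char in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[byte % 26])
        elif char in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[byte % 26])
        else:
            masked.append(char)  # separators and other characters are kept
    return "".join(masked)

def shuffle_column(rows, field, seed=0):
    """Randomly reassign one field's values across rows.

    The set of values (and hence their distribution) is unchanged; only the
    association between value and record is broken.
    """
    rng = random.Random(seed)
    values = [row[field] for row in rows]
    rng.shuffle(values)
    return [{**row, field: value} for row, value in zip(rows, values)]

key = b"example-secret-key"  # illustrative only; a real key would be managed securely
masked_account = mask_value("AB-1234", key)

people = [{"id": 1, "salary": 30000}, {"id": 2, "salary": 45000},
          {"id": 3, "salary": 45000}, {"id": 4, "salary": 80000}]
shuffled = shuffle_column(people, "salary")
```

Because the mask depends only on the value and the key, applying it to the same account number in two different systems produces the same masked value, which is what keeps masked data consistent across platforms.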

Data classification takes place within Metadata>Hub, the platform’s data catalogue, via Semantic Discovery, the platform’s data discovery solution. It is used to automatically attach appropriate business terms to your data (including terms that indicate it is PII) and then to mask that data automatically on this basis. Ab Initio also provides access to its data profiler, which can help you decide how to approach masking, subsetting and so on, and can be used to validate these processes (particularly masking) once they have been applied.
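The general pattern of classification-driven masking can be sketched in a few lines of Python. We stress that this says nothing about how Semantic Discovery actually works: the regular expressions, the term names, the `threshold` parameter and both functions are assumptions made purely to show how attaching a term to a column can drive a masking decision.

```python
import re

# Hypothetical business terms and the patterns used to recognise them.
PATTERNS = {
    "email_address": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone_number": re.compile(r"^\+?[\d\s\-()]{7,}$"),
}
# Terms that mark a column as personally identifiable information.
PII_TERMS = {"email_address", "phone_number"}

def classify_column(values, threshold=0.8):
    """Return the first term whose pattern matches enough of the values."""
    non_empty = [value for value in values if value]
    if not non_empty:
        return None
    for term, pattern in PATTERNS.items():
        hits = sum(1 for value in non_empty if pattern.match(value))
        if hits / len(non_empty) >= threshold:
            return term
    return None

def columns_to_mask(table):
    """List the columns whose attached term indicates PII."""
    terms = {column: classify_column(values) for column, values in table.items()}
    return sorted(column for column, term in terms.items() if term in PII_TERMS)

table = {
    "contact": ["ann@example.com", "bob@example.org", "cy@example.net"],
    "tel": ["+44 20 7946 0000", "020 7946 0001", "(020) 7946-0002"],
    "city": ["Leeds", "York", "Bath"],
}
to_mask = columns_to_mask(table)
```

Here `contact` and `tel` are flagged for masking while `city` is left alone; a masking rule could then be applied to each flagged column.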

Finally, when subsetting you have the option to create virtual fields (representing curated or amalgamated versions of existing fields) and to flag any data that must be included or that must be excluded. You then have two methods to reduce the size of your data set, which can be used separately or together: sampling your records, by either a percentage or a count, and defining ‘field groups’ that (unsurprisingly) group fields together. Field groups allow you to specify the maximum number of different records that should exist in your subset for each unique combination of values within any given group. This is 1 by default, meaning that each record in your subset will necessarily contain (and hence allow you to test) at least one unique set of field group values. In essence, field groups exist to limit the number of conceptually meaningful combinations of fields (and field values) within your test data. This helps to ensure that all important combinations of values are represented and therefore that your test data set itself is meaningfully representative.
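The subsetting logic described above can be approximated with the following Python sketch. It is our own illustration rather than Ab Initio’s algorithm: the function names and parameters are hypothetical, and the rule for multiple field groups (keep a record if any group still has room for its combination) is an assumption on our part.

```python
import random

def subset_by_field_groups(rows, field_groups, max_per_combination=1,
                           must_include=lambda row: False,
                           must_exclude=lambda row: False):
    """Keep at most `max_per_combination` records per unique combination of
    values in each field group, honouring forced inclusions and exclusions."""
    counts = {}
    kept = set()

    def combinations(row):
        return [(group, tuple(row[field] for field in group))
                for group in field_groups]

    def keep(index, row):
        kept.add(index)
        for combo in combinations(row):
            counts[combo] = counts.get(combo, 0) + 1

    candidates = [(i, row) for i, row in enumerate(rows) if not must_exclude(row)]
    # Forced inclusions go first so that they count towards the caps.
    for index, row in candidates:
        if must_include(row):
            keep(index, row)
    # Then keep any record that fills a combination still under its cap.
    for index, row in candidates:
        if index in kept:
            continue
        if any(counts.get(combo, 0) < max_per_combination
               for combo in combinations(row)):
            keep(index, row)
    return [row for index, row in enumerate(rows) if index in kept]

def sample_rows(rows, percent=None, count=None, seed=0):
    """Sample records by percentage or by absolute count."""
    if count is None:
        if percent is None:
            raise ValueError("give either percent or count")
        count = round(len(rows) * percent / 100)
    return random.Random(seed).sample(rows, min(count, len(rows)))

rows = [
    {"id": 1, "status": "open", "region": "EU"},
    {"id": 2, "status": "open", "region": "EU"},
    {"id": 3, "status": "closed", "region": "EU"},
    {"id": 4, "status": "open", "region": "US"},
    {"id": 5, "status": "closed", "region": "EU"},
    {"id": 6, "status": "open", "region": "US"},
]
subset = subset_by_field_groups(rows, [("status", "region")],
                                must_include=lambda row: row["id"] == 6)
sampled = sample_rows(rows, percent=50)
```

With the default cap of 1, the six example records reduce to three, one for each distinct (status, region) combination, and record 6 is the representative of its combination because it was flagged as required.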

The above outlines the process for generating subsets that pertain to a single data source (a database table, for instance). You can also generate test data sets that span multiple sources (say, an entire database) by grouping these individual processes together as part of a subject area. This allows you to specify a root data set and effectively use it as a driver for creating subsets that are consistent and that maintain referential integrity, foreign key relationships and so on across every subset contained within the subject area. It can also be used to preserve bad data (non-matching key relationships, for example) if that is required for testing. This may well be the case, since bad data is a fact of life, and you will want to know whether your system can handle it appropriately.
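A root-driven subset of this kind can be sketched in Python as follows. Again, this is an illustration under our own assumptions rather than a description of Ab Initio’s implementation: the function `subset_subject_area`, the relationship tuples and the `keep_orphans` flag are all hypothetical names.

```python
def subset_subject_area(tables, root, root_filter, relationships,
                        keep_orphans=False):
    """Subset related tables starting from a root table.

    `relationships` is a list of (parent_table, parent_key, child_table,
    child_key) tuples. Child rows are kept when their key points at a kept
    parent row. With `keep_orphans`, child rows whose key matches no parent
    at all (bad data) are preserved as well.
    """
    subset = {root: [row for row in tables[root] if root_filter(row)]}
    pending = [root]
    while pending:
        parent = pending.pop()
        for parent_table, parent_key, child_table, child_key in relationships:
            if parent_table != parent or child_table in subset:
                continue
            kept_keys = {row[parent_key] for row in subset[parent]}
            all_keys = {row[parent_key] for row in tables[parent]}
            subset[child_table] = [
                row for row in tables[child_table]
                if row[child_key] in kept_keys
                or (keep_orphans and row[child_key] not in all_keys)
            ]
            pending.append(child_table)
    return subset

tables = {
    "customers": [{"id": 1}, {"id": 2}, {"id": 3}],
    "orders": [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 1},
        {"id": 12, "customer_id": 2},
        {"id": 13, "customer_id": 99},  # orphan: no such customer
    ],
    "order_lines": [
        {"id": 100, "order_id": 10},
        {"id": 101, "order_id": 12},
        {"id": 102, "order_id": 13},
    ],
}
relationships = [
    ("customers", "id", "orders", "customer_id"),
    ("orders", "id", "order_lines", "order_id"),
]

clean = subset_subject_area(tables, "customers",
                            lambda row: row["id"] == 1, relationships)
with_bad_data = subset_subject_area(tables, "customers",
                                    lambda row: row["id"] == 1, relationships,
                                    keep_orphans=True)
```

In the `clean` result every order belongs to the selected customer and every order line belongs to a kept order. In `with_bad_data`, the orphaned order 13 and its order line survive as well, so that tests can exercise the system’s handling of broken key relationships.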

As a test data management solution, Ab Initio is competent without being exceptional. It provides everything you need, and most of what you could want, in a flexible and easy-to-use package. On the other hand, it lacks some of the more advanced functionality found in competing products, and setting up its processes (creating your field groups, for example) is less automated than we might like. That said, TDM is only one part of Ab Initio’s much broader platform, and it performs well in that context. TDM’s masking functionality, in particular, can be leveraged directly from other parts of the platform, such as its data integration component.

Taking a broader view, the Ab Initio platform provides powerful and flexible raw ingredients for a wide variety of applications that are, consequently, highly scalable, performant, and customisable. Test data management is no exception to this.

The Bottom Line

If Ab Initio’s overall offering appeals to you, or if you are already an Ab Initio customer, its test data management capability will more than likely meet your needs. Although we would not recommend TDM as a point product (at least to customers who are not already – and do not want to become – more deeply invested in Ab Initio) we would certainly recommend the platform as a whole.
