Discovery (DgSecure Detect) covers structured, semi-structured, and unstructured data, using a range of techniques: pattern recognition (a hundred or so sensitive datatypes are provided out of the box, and you can also add your own), regular expressions, proximity matching, natural language processing, and machine learning. It can discover sensitive data in both on-premises and cloud-based data stores. A major problem with discovering sensitive data is that you can get a lot of false positives and false negatives. Dataguise has addressed the former issue by building machine learning into its product, which learns initially from sample data or from examples of false positives. Remediation workflows for false positives are provided, and the company also provides features to reduce the number of false negatives. Facilities include support for industry- and customer-specific ontologies.
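To make two of the techniques named above more concrete, the following is a minimal sketch of how regular-expression matching can be combined with proximity matching to reduce false positives. It is not Dataguise's implementation: the pattern, the keyword list, the `window` size and the function name `find_candidates` are all assumptions chosen for illustration.

```python
import re

# Hypothetical pattern for a US Social Security number (illustrative only).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical context keywords; a hit near one of these is more likely genuine.
CONTEXT_KEYWORDS = ("ssn", "social security")


def find_candidates(text, window=40):
    """Return (matched_text, has_context) for every pattern hit in text.

    has_context is True when a context keyword appears within `window`
    characters on either side of the match (a simple proximity check).
    """
    results = []
    lowered = text.lower()
    for match in SSN_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        nearby = lowered[start:end]
        has_context = any(keyword in nearby for keyword in CONTEXT_KEYWORDS)
        results.append((match.group(), has_context))
    return results


if __name__ == "__main__":
    print(find_candidates("Employee SSN: 123-45-6789 on file."))
    print(find_candidates("Order ref 987-65-4321 shipped yesterday."))
```

In this sketch a hit without supporting context is still reported, but flagged, so that it could be routed to a review step rather than silently dropped (which would risk a false negative).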
As far as DgSecure Protect is concerned, encryption (both AES and format-preserving encryption, FPE) and decryption (which is role-based) are available in addition to masking. Both full and partial redaction are possible, and the product supports masking for both structured and unstructured data. For dynamic masking, the company has historically leveraged native capabilities. However, it has recently released a Privacy on Demand (POD) library that is accessible via an API. This not only supports dynamic masking by providing access to Dataguise’s masking algorithms (around 35 of them) but also supports masking within streaming environments such as Kafka and StreamSets.
The Audit and Monitor capabilities provide policy-driven, real-time monitoring and recording of who accessed data, when, where, and what they did with it. Many relevant policies (for GDPR, CCPA, PCI, HIPAA and so on) are provided out of the box, but you can also create your own. Alerts are all actionable. Display is via a persona-based dashboard, an example of which is shown in Figure 3. Monitoring capabilities include breach reporting.
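The difference between full and partial redaction mentioned for DgSecure Protect can be illustrated with a short sketch. This is a generic example, not one of Dataguise's masking algorithms and not format-preserving encryption; the function name `redact` and its parameters are hypothetical.

```python
def redact(value, keep_last=0, mask_char="*"):
    """Mask letters and digits in value, leaving the last keep_last visible.

    Separators such as '-' or spaces are left untouched so the masked
    value keeps the same layout as the original.
    keep_last=0 gives full redaction; keep_last>0 gives partial redaction.
    """
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(0, total - keep_last)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    card = "4111-1111-1111-1234"
    print(redact(card))               # full redaction
    print(redact(card, keep_last=4))  # partial redaction
```

Unlike encryption, redaction of this kind is not reversible, which is why role-based decryption applies only to the encryption options.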
Finally, DgSecure DSAR is based on finding identities. Identity discovery is typically run as a background task, creating an indexed inventory of individuals. For the actual processing of requests there is a scheduling facility that allows these to be run on a batch basis. There is an API-based interface to OneTrust. One other key feature of DgSecure DSAR is that it allows both hard and soft delete options for right-to-erasure requests.
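The distinction between hard and soft delete can be sketched as follows. This is a generic in-memory illustration of the two options, not DgSecure DSAR's data model; the `Record` and `SubjectStore` classes and their fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Record:
    subject_id: str
    email: str
    deleted_at: Optional[datetime] = None  # set when soft-deleted


class SubjectStore:
    """Hypothetical store keyed by data-subject identifier."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def add(self, record: Record) -> None:
        self._records[record.subject_id] = record

    def soft_delete(self, subject_id: str) -> bool:
        """Flag the record as deleted but keep it in storage."""
        record = self._records.get(subject_id)
        if record is None:
            return False
        record.deleted_at = datetime.now(timezone.utc)
        return True

    def hard_delete(self, subject_id: str) -> bool:
        """Physically remove the record from storage."""
        return self._records.pop(subject_id, None) is not None

    def visible(self) -> List[Record]:
        """Records that have not been soft-deleted."""
        return [r for r in self._records.values() if r.deleted_at is None]

    def stored_count(self) -> int:
        """All records still physically held, including soft-deleted ones."""
        return len(self._records)
```

In this sketch a soft-deleted record disappears from normal reads but remains stored, whereas a hard delete removes it entirely.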