The technology, or technique, used to enable this functionality is known as data masking. Data masking is a data security technique in which a dataset is copied but with sensitive data obfuscated. The replica is then used instead of the authentic data for testing or training purposes. Data masking does not simply replace sensitive data with blanks. It creates characteristically intact, but inauthentic, replicas of personally identifiable data or other highly sensitive data in order to preserve the complexity and unique characteristics of the data. In this way, tests performed on properly masked data will yield the same results as they would on the authentic dataset.
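As a toy illustration of "characteristically intact" masking, the sketch below replaces each digit with a random digit and each letter with a random letter of the same case, leaving separators and length untouched. This is a simple substitution for demonstration, not a production-grade masking algorithm:

```python
import random
import string

def mask_preserving_format(value: str) -> str:
    """Replace each digit with a random digit and each letter with a
    random letter of the same case, keeping punctuation and length,
    so the masked value retains the shape of the original."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(random.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(random.choice(pool))
        else:
            out.append(ch)  # keep separators like '-' and '@' intact
    return "".join(out)

masked = mask_preserving_format("123-45-6789")
print(masked)  # e.g. '582-19-0347': same ###-##-#### shape, randomized digits
```

Because the shape survives, validation logic in a test suite (length checks, separator positions, digit-only fields) behaves the same on the masked copy as on the original.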
Maintain critical data relationships within databases, across different databases, between different database platforms, and over time. Many techniques, including deterministic and random approaches, are used to ensure consistency in how sensitive elements are masked and to enable a repeatable masking process.
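A deterministic approach can be sketched with a keyed hash: the same input always yields the same masked value, so joins and foreign-key relationships still line up across tables, databases, and repeated masking runs. The key and the nine-digit output format here are illustrative choices:

```python
import hmac
import hashlib

# Hypothetical key for illustration; in practice this would come
# from a secrets store and never be hard-coded.
SECRET_KEY = b"masking-key"

def deterministic_mask(value: str, digits: int = 9) -> str:
    """Derive a stable pseudonym from a value: identical inputs always
    produce identical masked outputs, preserving referential integrity
    across databases and masking runs."""
    digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return str(int(digest, 16))[-digits:].zfill(digits)

# The same customer ID masks identically wherever it appears:
assert deterministic_mask("CUST-1001") == deterministic_mask("CUST-1001")
assert deterministic_mask("CUST-1001") != deterministic_mask("CUST-1002")
```

A purely random approach would break these cross-table joins, which is why deterministic masking is preferred wherever masked values are used as keys.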
Mask large volumes of data quickly and easily. Data masking needs to be engineered to meet the demands of your data-driven business. Platform-specific optimizations enable efficient and scalable masking regardless of database platform or dataset size, and an open architecture allows easy adaptation to your enterprise environment and existing automation tools.
Classify sensitive data
Automatically locate and categorize sensitive data in your databases. Integrated data classification uses heuristics and statistical analysis to locate personally identifiable information (PII) like name, email, date of birth, SSN and more. Leverage classification results to configure data masking rules.
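A minimal sketch of such classification might sample a column's values and label it when most of them match a PII pattern. The patterns and the 80% threshold below are illustrative; real classifiers combine regexes with column-name heuristics and statistical analysis, as described above:

```python
import re

# Illustrative patterns for a few common PII formats.
PII_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date_of_birth": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def classify_column(values):
    """Label a column as a PII type if at least 80% of sampled values
    match that type's pattern; return None if nothing matches."""
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if values and hits / len(values) >= 0.8:
            return label
    return None

print(classify_column(["123-45-6789", "987-65-4321"]))          # ssn
print(classify_column(["alice@example.com", "bob@example.org"]))  # email
print(classify_column(["hello", "world"]))                       # None
```

The resulting labels can then drive which masking rule is applied to each column.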
Data masking is essential in many regulated industries where personally identifiable information must be protected from overexposure. By masking data, the organization can expose the data as needed to test teams or database administrators without compromising the data or getting out of compliance. The primary benefit is reduced security risk.
Data masking is going to be of primary concern for the chief security officer (CSO), who should be able to mandate the use of appropriate data masking techniques for all relevant projects. In addition, data governance policies may require that certain data (such as financial data) is masked even where that is not required by law. Thus data masking will be of concern to the data governance council, data stewards, and all others who have responsibility for data. Further, those responsible for developing, testing, or migrating applications should be aware of the need for data masking when relevant data is being used. This especially applies when development is outsourced, where companies will need to practice due diligence to ensure that partners are masking the data in an appropriate fashion.
Static Data Masking is usually performed on the golden copy of the database, but can also be applied to values in other sources, including files. In database environments, production DBAs will typically load table backups to a separate environment, reduce the dataset to a subset that holds the data necessary for a particular round of testing (a technique called "subsetting"), apply data masking rules while the data is at rest, apply necessary code changes from source control, and push the data to the desired environment.
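The subset-then-mask steps can be sketched as follows. The table, column, and rule names are illustrative; the point is that masking rewrites sensitive columns in the test copy while the golden copy stays untouched:

```python
# Golden copy loaded from a backup (illustrative data).
backup = [
    {"id": 1, "region": "EU", "name": "Alice Smith", "ssn": "123-45-6789"},
    {"id": 2, "region": "US", "name": "Bob Jones",   "ssn": "987-65-4321"},
]

def subset(rows, predicate):
    """Subsetting: keep only the rows needed for this round of testing,
    copying them so the golden copy is never modified."""
    return [dict(row) for row in rows if predicate(row)]

def apply_masking_rules(rows, rules):
    """Overwrite sensitive columns while the data is at rest."""
    for row in rows:
        for column, rule in rules.items():
            row[column] = rule(row[column])
    return rows

test_copy = subset(backup, lambda r: r["region"] == "EU")
apply_masking_rules(test_copy, {
    "name": lambda v: "MASKED NAME",
    "ssn": lambda v: "XXX-XX-" + v[-4:],   # keep last four digits for realism
})
print(test_copy)
# [{'id': 1, 'region': 'EU', 'name': 'MASKED NAME', 'ssn': 'XXX-XX-6789'}]
```

Only the masked test copy is then pushed to the lower environment.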
On-the-Fly Data Masking is the process of transferring data from one environment to another without the data touching the disk on its way. Dynamic Data Masking applies the same technique, but one record at a time.
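The idea can be sketched with a generator that masks records as they stream between environments, so unmasked values are never written out at the target. The record layout and masking rule are illustrative:

```python
def mask_in_transit(records, mask_fn):
    """Yield masked records one at a time as they stream from source
    to target, so unmasked data never lands on disk at the target."""
    for record in records:
        yield mask_fn(record)

def mask_record(record):
    """Illustrative per-record rule: replace the email address."""
    masked = dict(record)
    masked["email"] = "user@masked.example"
    return masked

source = [{"id": 1, "email": "alice@example.com"}]
for row in mask_in_transit(source, mask_record):
    print(row)  # {'id': 1, 'email': 'user@masked.example'}
```

In a real pipeline the source would be a database cursor or change stream rather than an in-memory list, but the no-intermediate-copy property is the same.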
Dynamic Data Masking is similar to On-the-Fly Data Masking, but it differs in that On-the-Fly Data Masking copies data from one source to another so that the latter can be shared. Dynamic data masking happens at runtime, dynamically, and on demand, so there is no need for a second data source in which to store the masked data. Dynamic data masking is attribute-based and policy-driven. Example policies:
- Doctors can view the medical records of patients they are assigned to (data filtering)
- Doctors cannot view the SSN field inside a medical record (data masking).
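The two policies above can be sketched as checks applied at read time: the stored records are never altered, and masking or filtering is decided per request from the caller's attributes. The role and field names are illustrative:

```python
def list_records(records, user):
    """Data filtering: doctors see only the patients assigned to them."""
    return [r for r in records if user["id"] in r["assigned_doctors"]]

def read_record(record, user):
    """Data masking: hide the SSN field from doctors at read time."""
    visible = dict(record)
    if user["role"] == "doctor":
        visible["ssn"] = "***-**-****"
    return visible

records = [
    {"patient": "Alice", "ssn": "123-45-6789", "assigned_doctors": {"d1"}},
    {"patient": "Bob",   "ssn": "987-65-4321", "assigned_doctors": {"d2"}},
]
doctor = {"id": "d1", "role": "doctor"}
for rec in list_records(records, doctor):
    print(read_record(rec, doctor))
# {'patient': 'Alice', 'ssn': '***-**-****', 'assigned_doctors': {'d1'}}
```

In production these decisions would be expressed in a policy language and enforced by a proxy or the database itself rather than hand-coded per field.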
Dynamic data masking can also be used to encrypt or decrypt values on the fly, especially when using format-preserving encryption. Several standards have emerged in recent years to implement dynamic data filtering and masking. For instance, XACML policies can be used to mask data inside databases.
Data masking and the cloud
Organizations increasingly develop their new applications in the cloud, regardless of whether the final applications will be hosted in the cloud or on-premises. Current cloud solutions allow organizations to use Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). There are various modes of creating test data and moving it from on-premises databases to the cloud, or between different environments within the cloud. Data masking invariably becomes part of these processes in the SDLC, as the development environments' SLAs are usually not as stringent as the production environments' SLAs, regardless of whether the application is hosted in the cloud or on-premises.