What is Data Scrambling?

The term data scrambling is commonly used in the database industry. Prudent Cloud describes data scrambling, sometimes called data masking or data obfuscation, as “a technique used to mask critical data sets and attributes so the critical data is not visible to the users of the cloned/non-production database copied from production.”

According to Oracle, “data scrambling is a process to obfuscate or remove sensitive data. This process is irreversible so that the original data cannot be derived from the scrambled data. Data scrambling can be utilized only during the cloning process.”

Data scrambling is typically undertaken when a database must be cloned for use by others. Cloning is necessary when you need to create a production support environment or a custom development environment, or when you need to perform volume testing or integration testing. In most cases, the database will contain some sensitive data, such as HR data, payroll information or customer data, that should be scrambled when the clone is made.
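As a rough illustration of scrambling sensitive columns while a clone is made, the following Python sketch replaces every letter and digit in the chosen columns with a random one, so the original value cannot be derived from the result. The function names, column names and sample rows are hypothetical and chosen only for this example; it is not the procedure of Oracle or any other vendor, and a real cloning tool would work against the database itself.

```python
import secrets
import string


def scramble_value(value):
    """Replace each letter and digit with a random one.

    Length, letter case and punctuation are kept so the scrambled value
    still looks like the original. Note that the result is always a
    string, even if the input was a number.
    """
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append(secrets.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(secrets.choice(pool))
        else:
            out.append(ch)  # keep separators such as '-' or ' '
    return "".join(out)


def scramble_rows(rows, sensitive_columns):
    """Return copies of the rows with the sensitive columns scrambled."""
    scrambled = []
    for row in rows:
        clone = dict(row)  # copy, so the production row is left untouched
        for column in sensitive_columns:
            if clone.get(column) is not None:
                clone[column] = scramble_value(clone[column])
        scrambled.append(clone)
    return scrambled


# Hypothetical production rows, for illustration only.
production_rows = [
    {"id": 1, "name": "Ann Lee", "ssn": "123-45-6789", "dept": "HR"},
    {"id": 2, "name": "Bob Roy", "ssn": "987-65-4321", "dept": "IT"},
]
clone_rows = scramble_rows(production_rows, ["name", "ssn"])
```

Because the replacement characters are drawn at random and no mapping is stored, the step cannot be reversed. The trade-off is that relationships between values, such as two rows sharing the same name, are lost in the clone.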

Methods commonly used for data scrambling today include cryptographic data scrambling and network security data scrambling. However, neither of these methods is completely foolproof, and IT researchers have recently proposed a stronger method called Nearest Neighbor Data Substitution (NeNDS). NeNDS offers significantly more robust privacy protection as well as the ability to preserve data clusters.
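To give a feel for why substituting a value with a nearby one can preserve data clusters, here is a deliberately simplified Python sketch. It is not the published NeNDS algorithm and carries none of its privacy guarantees: it assumes a single numeric column, groups values that sit next to each other in sorted order, and rotates the values within each group. The function name, the group_size parameter and the sample salaries are all hypothetical.

```python
def neighbor_substitute(values, group_size=3):
    """Replace each value with a nearby value from the same column.

    Simplified illustration only: values are grouped by sorted order and
    rotated within each group, so every position receives the value of a
    close neighbor.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")

    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Split the ordered indices into groups of neighboring values.
    groups = [order[i:i + group_size] for i in range(0, len(order), group_size)]

    # Fold a short trailing group into the previous one so that no value
    # is left alone without a neighbor to take a substitute from.
    if len(groups) > 1 and len(groups[-1]) < group_size:
        tail = groups.pop()
        groups[-1].extend(tail)

    result = list(values)
    for group in groups:
        for j, index in enumerate(group):
            donor = group[(j + 1) % len(group)]  # next neighbor in the group
            result[index] = values[donor]
    return result


# Hypothetical salaries forming two clusters (around 50,000 and 98,000).
salaries = [52000, 51000, 98000, 50000, 99000, 97000]
scrambled = neighbor_substitute(salaries)
# → [50000, 52000, 99000, 51000, 97000, 98000]
```

In this example every salary moves to a different row, yet each row still holds a value from its original cluster, so aggregate analysis on the clone remains meaningful. With duplicate values in a group, a row may end up with the same value it started with.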