How We Anonymize Your Data

Direct Identifier Suppression

Suppressing direct identifiers is the first and most obvious step in anonymizing data. Privacy Dynamics offers several suppression methods, which you can configure when setting up your Project or Dataset.

Redacting

Redacting is the default suppression method. When direct IDs are Redacted, the column containing a direct identifier is dropped from the Destination Dataset. This is the fastest method, and the one that creates the smallest Destination Dataset.
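To make the effect of redacting concrete, here is a minimal sketch that drops direct-identifier columns from rows represented as Python dictionaries. The function name redact_columns and the sample column names are hypothetical, chosen only for illustration; this is not the Privacy Dynamics implementation.

```python
def redact_columns(rows, direct_id_columns):
    """Return copies of the rows with the direct-identifier columns removed."""
    drop = set(direct_id_columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


# Hypothetical source rows; "name" and "phone" are treated as direct identifiers.
source = [
    {"name": "Ada Example", "phone": "+19995551234", "plan": "pro"},
    {"name": "Bo Example", "phone": "+19995550000", "plan": "free"},
]

destination = redact_columns(source, ["name", "phone"])
print(destination)  # prints [{'plan': 'pro'}, {'plan': 'free'}]
```

Because the columns are removed rather than rewritten, no per-value work is needed and the output has fewer columns than the input, which is consistent with redacting being the fastest method and producing the smallest Destination Dataset.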

Masking

Masking is the process of obfuscating values. Privacy Dynamics supports static masking by substitution: sensitive values are replaced with a placeholder, and those placeholders are written to the Destination Dataset.

There are several flavors of masking. To illustrate the differences, we will mask the example phone number +19995551234.

Safe Masking

Safe masking suppresses all information from the source data. When data is safe-masked, it is replaced with a fixed-length string whose length does not depend on the length of the source string. When safe-masked, our example phone number becomes ######.
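As a minimal sketch of this behavior, the function below returns the same placeholder for every input. The name safe_mask and the constant SAFE_MASK are assumptions made for this example, and the six-character placeholder is taken from the example above.

```python
SAFE_MASK = "######"  # fixed placeholder; its length is unrelated to the input


def safe_mask(value):
    """Replace any value with the same fixed-length placeholder."""
    return SAFE_MASK


print(safe_mask("+19995551234"))  # prints ######
print(safe_mask("555-1234"))      # prints ######
```

Since every input maps to the same output, not even the length of the original value can be inferred from the masked data.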

Full Masking (Coming Soon!)

Full Masking substitutes every character in the input string with a placeholder character. It differs from Safe Masking in that it preserves the length of the input string. When fully masked, our example phone number becomes ############.
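Since this feature is not yet released, the following is only a sketch of the idea described above, not the product's behavior. The function name full_mask and its placeholder parameter are hypothetical.

```python
def full_mask(value, placeholder="#"):
    """Replace every character with the placeholder, preserving length."""
    return placeholder * len(value)


print(full_mask("+19995551234"))  # prints ############
```

The example phone number has 12 characters (including the leading +), so the masked output has 12 placeholder characters.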

Partial Masking (Coming Soon!)

Partial Masking substitutes only some characters in the input string and preserves the others. For example, a partially-masked phone number could retain the country and area codes, and a partially-masked credit card number commonly reveals the last four digits. Our example phone number, when partially masked, could be +1999#######.
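One way to sketch partial masking is to keep a chosen number of leading and trailing characters and mask everything in between. The function partial_mask and its keep_prefix and keep_suffix parameters are assumptions for illustration; the released feature may be configured differently.

```python
def partial_mask(value, keep_prefix=0, keep_suffix=0, placeholder="#"):
    """Mask the middle of a string, keeping a prefix and/or suffix visible."""
    n = len(value)
    # Design choice for this sketch: if the visible parts would cover the
    # whole string, mask everything rather than reveal the full value.
    if keep_prefix + keep_suffix >= n:
        return placeholder * n
    masked_len = n - keep_prefix - keep_suffix
    return value[:keep_prefix] + placeholder * masked_len + value[n - keep_suffix:]


# Keep the country and area codes of the example phone number.
print(partial_mask("+19995551234", keep_prefix=5))      # prints +1999#######

# Keep the last four digits of a (well-known test) card number.
print(partial_mask("4111111111111111", keep_suffix=4))  # prints ############1111
```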

Faking (Substituting Realistic Values)

Faking is another form of suppression. When faking values, we substitute values in the source data with randomly-generated but format-consistent values. Faking is the slowest form of suppression, and generates the largest Destination Dataset, but can be extremely powerful for Development and Testing use cases (especially testing data migrations).
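To illustrate what "randomly-generated but format-consistent" can mean, the sketch below replaces each digit with a random digit and each letter with a random letter of the same case, leaving punctuation in place. This is a deliberately simple stand-in: the function fake_like is hypothetical, and a production faker would more plausibly use type-specific generators (for example, drawing names from name lists) rather than character-level substitution.

```python
import random
import string


def fake_like(value, rng=None):
    """Return a random string with the same character-class layout as value."""
    if rng is None:
        rng = random.Random()
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(letters))
        else:
            out.append(ch)  # keep separators such as "+", "-", "@", "."
    return "".join(out)


# Passing a seeded generator makes the output repeatable, which helps in tests.
fake_phone = fake_like("+19995551234", random.Random(0))
print(fake_phone)
```

The result still looks like a phone number (a plus sign followed by eleven digits), so downstream code that validates formats keeps working, but the digits no longer correspond to the source value.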

Currently the following types of Direct Identifiers can be faked:

Identifier
Person Name (Full, First, Middle, Last)
Email Address
Phone Number
Mailing Address (US Street Only or Street plus City/State/Zip)
Credit Card Number
US Social Security Number (SSN)

If you are interested in faking other types of PII, please reach out.

Hashing (Coming Soon!)

Hashing is the process of transforming input data using a one-way cryptographic function. The function produces deterministic output (the same input always yields the same hash) but makes it computationally impractical to recover the input data from the output. Hashing can be valuable because hashed values can be compared with each other for equality, and can even be used as join keys.
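Since this feature is not yet released, the following is only a sketch of the general technique using Python's standard hashlib and hmac modules. The function hash_identifier and the use of a secret key are assumptions for this example; a keyed hash is shown because short, predictable values such as phone numbers can otherwise be guessed by hashing every candidate.

```python
import hashlib
import hmac


def hash_identifier(value, key):
    """Return a deterministic, keyed SHA-256 digest of value as hex text."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"example-secret-key"  # hypothetical key; keep a real one out of source code

a = hash_identifier("+19995551234", key)
b = hash_identifier("+19995551234", key)
c = hash_identifier("+19995550000", key)

print(a == b)  # prints True: equal inputs give equal hashes
print(a == c)  # prints False: different inputs give different hashes
```

Because equal inputs always produce equal digests under the same key, two tables hashed this way can still be joined on the hashed column without exposing the original values.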
