Distortion Metrics


Any privacy treatment applied to a Dataset inherently removes information from the Source data. How much utility the treated data retains, however, depends heavily on the specifics of the treatment. The Privacy Dynamics anonymizer uses a proprietary process designed to minimize the distortion of treated data, and therefore maximize its utility, while still achieving privacy targets.

We use Distortion as a proxy for utility. After each treatment, we measure the Distortion of the treated data and include the resulting Distortion metrics in the Dataset Report.

Metric Definitions

  • Rows Treated: A row is considered "treated" if any field on that row was changed.
  • Cells Treated: The overall share of cells (instances of a field on a record) that were changed.
  • Distribution of Values: The profile of the values in the selected field, before and after treatment.
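To make these definitions concrete, here is a minimal Python sketch that computes them for a small table. It is an illustration only, not the Privacy Dynamics implementation: the function names (distortion_metrics, value_distribution), the list-of-rows layout, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def distortion_metrics(source, treated):
    """Compare two tables (lists of rows) that have the same shape, cell by cell."""
    if len(source) != len(treated):
        raise ValueError("source and treated must have the same number of rows")

    rows_changed = 0
    cells_changed = 0
    total_cells = 0
    for src_row, out_row in zip(source, treated):
        if len(src_row) != len(out_row):
            raise ValueError("each row must keep the same number of fields")
        changed = sum(1 for a, b in zip(src_row, out_row) if a != b)
        cells_changed += changed
        total_cells += len(src_row)
        if changed:
            # A row counts as treated if any field on it was changed.
            rows_changed += 1

    return {
        "rows_treated": rows_changed / len(source) if source else 0.0,
        "cells_treated": cells_changed / total_cells if total_cells else 0.0,
    }


def value_distribution(rows, field_index):
    """Count how often each value appears in one field."""
    return Counter(row[field_index] for row in rows)


# Hypothetical sample data: age, ZIP code, gender.
source = [
    ["34", "98101", "F"],
    ["35", "98102", "M"],
    ["51", "98101", "F"],
    ["52", "98109", "M"],
]
treated = [
    ["34", "98101", "F"],
    ["35", "98101", "M"],
    ["51", "98101", "F"],
    ["52", "98101", "M"],
]

metrics = distortion_metrics(source, treated)
before = value_distribution(source, 1)
after = value_distribution(treated, 1)
print(metrics)
print(before, after)
```

In this sample, two of the four rows and two of the twelve cells differ, and the ZIP code field collapses from three distinct values to one.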

The Impact of k

Because Privacy Dynamics uses a group-based micro-aggregation approach, the value of k affects the Distortion of the Destination dataset. A higher k means larger groups, more privacy, and more distortion. A lower k (the minimum is 2) provides less privacy but less distortion, so it may be appropriate for applications that are particularly sensitive to distortion.
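The trade-off can be seen in a toy example. The sketch below uses a simple textbook form of micro-aggregation on a single numeric field: values are sorted, split into groups of at least k, and each value is replaced by its group mean. This is an assumption chosen for illustration and is not the proprietary Privacy Dynamics process; the function names and the sample ages are hypothetical.

```python
def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k sorted neighbors."""
    n = len(values)
    if k < 2 or k > n:
        raise ValueError("k must be between 2 and the number of values")

    # Indices of the values in ascending order, so results can be written
    # back to their original positions.
    order = sorted(range(n), key=lambda i: values[i])
    treated = [0.0] * n

    start = 0
    while start < n:
        end = start + k
        if n - end < k:
            # Too few values left for another full group: fold them into this one.
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            treated[i] = mean
        start = end
    return treated


def mean_abs_change(values, treated):
    """Average absolute difference between original and treated values."""
    return sum(abs(a - b) for a, b in zip(values, treated)) / len(values)


# Hypothetical ages, used only to illustrate the effect of k.
ages = [21, 23, 30, 34, 45, 47, 62, 70]
changes = {k: mean_abs_change(ages, microaggregate(ages, k)) for k in (2, 4, 8)}
for k, change in changes.items():
    print(f"k={k}: mean absolute change = {change}")
```

On this sample the average change per value grows as k increases from 2 to 4 to 8, because each value is averaged with more, and more distant, neighbors. Real datasets and the production anonymizer may behave differently.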
