Changelog

2023 Updates

December 22

Support for PostgreSQL hstore type

The PostgreSQL hstore data type is a specialized feature designed for storing sets of key/value pairs within a single column. We added support for hstore columns, which means all key/value pairs will be anonymized during treatment.
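
In hstore terms, treatment keeps each key and anonymizes its value. A minimal sketch of that per-value behavior (the `mask` helper and the dict representation are hypothetical illustrations, not the product's actual treatment logic):

```python
import hashlib

def mask(value: str) -> str:
    """Hypothetical anonymizer: replace a value with a stable short hash."""
    return hashlib.sha256(value.encode()).hexdigest()[:12]

def anonymize_hstore(pairs: dict) -> dict:
    # Keys are preserved; every value in the hstore map is treated.
    return {key: mask(value) for key, value in pairs.items()}

# A parsed hstore value such as '"email"=>"a@example.com", "phone"=>"555-0100"'
row = {"email": "a@example.com", "phone": "555-0100"}
print(anonymize_hstore(row))
```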

December 5

Improved categorical PII classifier

PII data is often categorical in nature (e.g., religion, nationality), and we added a cardinality-based heuristic that dramatically improves PII classification during treatment.
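
The intuition behind a cardinality heuristic: a column whose distinct-value count is small relative to its row count is likely categorical. A minimal sketch (the 5% threshold and function name are assumptions for illustration, not the classifier's actual parameters):

```python
def looks_categorical(values: list, max_ratio: float = 0.05) -> bool:
    """Flag a column as categorical when the ratio of distinct values
    to total rows falls at or below an assumed threshold."""
    if not values:
        return False
    return len(set(values)) / len(values) <= max_ratio

# A nationality column repeats a handful of values across many rows.
column = ["US", "FR", "US", "DE", "US", "FR"] * 1000
print(looks_categorical(column))  # True
```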

November 7

Register webhooks from the UI

Webhooks are now a first-class citizen within the application. Previously, webhooks had to be registered by calling the REST API, but we added a page in the settings section that makes it easy to register webhook URLs from the UI.
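
For reference, registering a webhook through the REST API might look like the sketch below; the endpoint path, payload shape, and auth header are hypothetical placeholders, so consult the API documentation for the actual contract:

```python
import requests

API_BASE = "https://app.example.com/api/v1"  # hypothetical base URL
TOKEN = "your-api-token"                     # placeholder credential

# Hypothetical payload: the URL to call and the events to subscribe to.
payload = {"url": "https://hooks.example.com/job-complete", "events": ["job.completed"]}

resp = requests.post(
    f"{API_BASE}/webhooks",
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```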

October 12

Easily navigate to job run details from the dataset card

It's now easier to navigate to the job runs page from the project page to view logs and runtimes across groups of jobs. A link was added to dataset cards that points to the job run that created the dataset, making it possible to compare job run times across datasets in the same or different projects.

Support writing PostgreSQL ARRAY types

PostgreSQL can be used to store a number of unique data types, among them variable-length, multidimensional ARRAY columns. This release adds support for anonymizing string and numeric ARRAY columns.
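
Conceptually, treating an ARRAY column means anonymizing each element while preserving the array's shape, including for multidimensional values. A minimal sketch (the `treat_element` callback stands in for whatever treatment the column is configured with):

```python
def treat_array(value, treat_element):
    """Recursively anonymize every element of a (possibly nested) array,
    preserving the array's dimensions."""
    if isinstance(value, list):
        return [treat_array(item, treat_element) for item in value]
    return treat_element(value)

# A two-dimensional text[] value keeps its shape; only the elements change.
emails = [["a@example.com", "b@example.com"], ["c@example.com"]]
print(treat_array(emails, lambda v: "xxx@example.com"))
```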

Improved MySQL writing performance

Performance for writing large, anonymized tables back to MySQL was significantly improved, reducing overall job run times for large MySQL tables.
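
One common way to speed up bulk writes to MySQL is batching rows instead of issuing one INSERT per row; the sketch below shows that general technique and is an illustration, not necessarily the exact optimization shipped in this release:

```python
import mysql.connector  # assumes the mysql-connector-python package

conn = mysql.connector.connect(
    host="localhost", user="writer", password="placeholder", database="anon"
)
cur = conn.cursor()

rows = [("u1", "xxx@example.com"), ("u2", "yyy@example.com")]

# executemany lets the driver batch rows rather than round-tripping each one.
cur.executemany("INSERT INTO users (id, email) VALUES (%s, %s)", rows)
conn.commit()
```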

September 27

View detailed logs for job runs

From an individual job run, you can view all of the individual tasks that were executed. The task log is viewable from the slide-out on the right side of the screen on the job runs page.

September 8

Read From SFTP and Write to S3 or GCS

A common customer use case is to anonymize data at multiple points in a data pipeline. This might mean removing DIDs early in the pipeline and treating QIDs much later in the process. With this release, it's now possible to use an SFTP source to anonymize data and write the treated output to blob storage (AWS S3 and GCP GCS).
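
In pipeline terms, this wires a read-from-SFTP step to a write-to-object-storage step. A rough sketch of the data path using paramiko and boto3 (hosts, paths, and the treatment step are placeholders; the product performs this flow internally):

```python
import io
import boto3
import paramiko

# Read the source file from the SFTP server.
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("sftp.example.com", username="etl", key_filename="/home/etl/.ssh/id_rsa")
raw = ssh.open_sftp().open("/exports/users.csv").read()

treated = raw  # placeholder for the anonymization ("treatment") step

# Write the treated output to S3; GCS works analogously.
boto3.client("s3").upload_fileobj(io.BytesIO(treated), "treated-bucket", "users.csv")
```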

Add support for identity columns for Postgres & MySQL

With this release, identity columns (auto-increment) in PostgreSQL and MySQL are preserved in the anonymized output. For MySQL, AUTO_INCREMENT is set in an ALTER TABLE statement during the write_indices_and_constraints step.
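
The statement issued during that step looks roughly like the following sketch (table, column, and type are illustrative; the real DDL depends on the source schema):

```python
def auto_increment_ddl(table: str, column: str, col_type: str = "BIGINT") -> str:
    """Build the MySQL ALTER TABLE that restores AUTO_INCREMENT on the
    destination table after its rows have been written."""
    return f"ALTER TABLE {table} MODIFY {column} {col_type} NOT NULL AUTO_INCREMENT"

print(auto_increment_ddl("users", "id"))
# ALTER TABLE users MODIFY id BIGINT NOT NULL AUTO_INCREMENT
```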

July 31

Add Run Project & Re-Run Dataset to Project Page

Improved connection tests for SFTP

The "Test connection" button on the new data connection screen was improved to more thoroughly and quickly verify the connection credentials for the SFTP server.

July 21

Disable Anonymize radio button for Primary Keys

In the dataset configuration settings, the option to anonymize is disabled for primary and foreign key columns. Disabling the anonymization option ensures that column values are not adjusted by mistake and table references are preserved during data treatment.

July 12

Add tabs to filter by treat & passthrough

When anonymizing databases with lots of tables, it's easy to lose track of the treatment plans for each table. This release adds tabs that filter tables by treatment plan: treat or passthrough.

June 14

Referential integrity for PostgreSQL, MySQL, and Snowflake

During the anonymization process, the application will duplicate primary and foreign key constraints from source tables to destination tables, maintaining data integrity and ensuring the consistency and accuracy of the anonymized data. This is a new feature, and PostgreSQL, MySQL, and Snowflake are supported.
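
For PostgreSQL, constraint definitions can be read from the system catalogs and replayed against the destination. A sketch of that idea with psycopg2 (connection strings and the table name are placeholders):

```python
import psycopg2

src = psycopg2.connect("dbname=prod")
dst = psycopg2.connect("dbname=anon")

with src.cursor() as cur:
    # pg_get_constraintdef() renders each PK ('p') or FK ('f') as executable DDL.
    cur.execute(
        """SELECT conname, pg_get_constraintdef(oid)
           FROM pg_constraint
           WHERE conrelid = 'users'::regclass AND contype IN ('p', 'f')"""
    )
    constraints = cur.fetchall()

with dst.cursor() as cur:
    for name, definition in constraints:
        cur.execute(f"ALTER TABLE users ADD CONSTRAINT {name} {definition}")
dst.commit()
```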

Support for Google Cloud Storage (GCS)

Data connections to S3-compatible APIs, including GCS, are now supported; they can be used as both source and destination connections when anonymizing data. Initial functionality must be enabled via the REST API, and UI support will be available in the next release.
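
GCS exposes an S3-compatible XML API, so a standard S3 client pointed at the GCS endpoint with HMAC credentials can read and write buckets. A sketch with boto3 (bucket name and keys are placeholders):

```python
import boto3

# GCS interoperability: an S3 client with the GCS endpoint and HMAC keys.
gcs = boto3.client(
    "s3",
    endpoint_url="https://storage.googleapis.com",
    aws_access_key_id="GOOG-hmac-access-key",  # placeholder
    aws_secret_access_key="hmac-secret",       # placeholder
)

for obj in gcs.list_objects_v2(Bucket="treated-data").get("Contents", []):
    print(obj["Key"])
```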

June 2

SSH support for data connections

This release adds support for SSH tunneling, allowing SaaS customers to securely connect to their data stores. SSH connection details, like host and private key, can be added to each data connection's properties.
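
Under the hood, an SSH tunnel forwards a local port through a bastion host to the private database. A sketch of the equivalent setup with the sshtunnel package (hostnames, key path, and credentials are placeholders):

```python
import psycopg2
from sshtunnel import SSHTunnelForwarder

# Forward a local port through the bastion to the private database host.
with SSHTunnelForwarder(
    ("bastion.example.com", 22),
    ssh_username="tunnel",
    ssh_pkey="/home/tunnel/.ssh/id_rsa",
    remote_bind_address=("db.internal", 5432),
) as tunnel:
    conn = psycopg2.connect(
        host="127.0.0.1", port=tunnel.local_bind_port,
        dbname="prod", user="reader", password="placeholder",
    )
```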

Copy source indices and constraints to destination tables

Particularly in development and testing scenarios, it can be critical that anonymized database copies exactly match production database indices and constraints. This release is a first step in copying over all table metadata. The new functionality copies metadata one table at a time, so information at the table level is brought over. Relationship information, specifically foreign key definitions, is not supported at this time.

  • Copy indices and constraints from origin tables and apply them to destination tables during the treatment process (see the sketch below).
  • Column NOT NULL constraints cannot be carried over at this time.
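
For PostgreSQL, per-table index definitions are available in the pg_indexes view and can be replayed on the destination. A sketch (psycopg2, with a placeholder table name):

```python
import psycopg2

def index_ddl_for(conn, table: str) -> list:
    """Read the full CREATE INDEX statement for each of a table's indices."""
    with conn.cursor() as cur:
        cur.execute("SELECT indexdef FROM pg_indexes WHERE tablename = %s", (table,))
        return [row[0] for row in cur.fetchall()]

# Executing these statements against the destination recreates the indices.
for ddl in index_ddl_for(psycopg2.connect("dbname=prod"), "users"):
    print(ddl)
```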

May 10

Assessment-only projects

Sometimes you just need to scan databases or buckets to determine whether they contain sensitive data, without actually treating it. This is a fast way to inventory risk across an entire company or organization. We added an assessment-only option to the project wizard; enabling assessment-only mode means that destination locations don't need to be provided and assessments can be created quickly.

February 21

Preserve values of primary and foreign key columns

The project wizard will automatically detect columns that contain primary or foreign keys and preserve their values by default, ensuring that data remains the same in source and destination tables and that table relationships are preserved. Locking column values is only a suggestion by the application and can be overridden within the UI.
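
Detection like this typically relies on the database's information schema. A sketch of a primary-key lookup for PostgreSQL (psycopg2, placeholder table name; the wizard's actual detection may differ):

```python
import psycopg2

PK_QUERY = """
SELECT kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
  ON kcu.constraint_name = tc.constraint_name
 AND kcu.table_schema = tc.table_schema
WHERE tc.constraint_type = 'PRIMARY KEY' AND tc.table_name = %s
"""

# Columns returned here would be locked (passed through) by default.
with psycopg2.connect("dbname=prod").cursor() as cur:
    cur.execute(PK_QUERY, ("users",))
    print([row[0] for row in cur.fetchall()])
```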
