Connecting to Google Cloud Storage

Privacy Dynamics can connect to your data lake hosted in Google Cloud Storage (GCS). This guide helps you authenticate and authorize Privacy Dynamics to access your data in GCS.

Requirements

To complete this guide, you will need the following:

  • Two GCS buckets (one to read data from, the other to write data to).
  • An IAM user with Administrator privileges to create service accounts and assign them storage roles at the Project level.
  • A Privacy Dynamics account.

Instructions

Privacy Dynamics connects to GCS using its S3-compatible XML API.

Before you can connect to GCS in Privacy Dynamics, create a new IAM Service Account for Privacy Dynamics to use.

Configure IAM

  1. Create a new service account in your relevant Google Cloud Project for Privacy Dynamics.
  2. Grant the service account the Storage Admin role for the Project by selecting "Edit principal" on the IAM page for your Project. This Project-level access is required for our service to list the buckets in your Google Cloud Project.
  3. Create an HMAC key for your newly created service account. Privacy Dynamics will use this HMAC key to authenticate with GCS. Save the Access Key and Secret Access Key in a password vault; you will need them later, and any time you edit the Privacy Dynamics connection.

     Screenshot showing how to create an HMAC key
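
If you prefer to script step 3 rather than use the Cloud Console, the following is a minimal sketch. It assumes the `google-cloud-storage` Python library is installed and that your local credentials are allowed to create HMAC keys. The account name `privacy-dynamics` and project ID `my-project` are placeholders, not values from this guide.

```python
from typing import Tuple


def service_account_email(account_name: str, project_id: str) -> str:
    """Build the email address Google Cloud assigns to a user-created service account."""
    return f"{account_name}@{project_id}.iam.gserviceaccount.com"


def create_hmac_key(account_name: str, project_id: str) -> Tuple[str, str]:
    """Create an HMAC key for the service account and return (access_key, secret)."""
    # Imported lazily so the helper above can be used without the library installed.
    # Assumption: google-cloud-storage is installed and default credentials are configured.
    from google.cloud import storage

    client = storage.Client(project=project_id)
    metadata, secret = client.create_hmac_key(
        service_account_email=service_account_email(account_name, project_id),
        project_id=project_id,
    )
    # The secret is only returned at creation time, so store it in your vault right away.
    return metadata.access_id, secret


# Placeholder names for illustration only.
print(service_account_email("privacy-dynamics", "my-project"))
```

Calling `create_hmac_key("privacy-dynamics", "my-project")` would return the Access Key and Secret Access Key that you enter later in the connection form.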

Add the GCS Connection in Privacy Dynamics

  1. Sign in to your Privacy Dynamics account.

  2. Go to the Connections page.

  3. Select Add Connection.

  4. Choose S3 and select Next. Note: because Privacy Dynamics uses the XML API, the setup modal is shared with Amazon S3.

    Screenshot showing that you must select "S3" to create a GCS Connection

  5. Enter the connection details:

    • Name - A name that identifies the connection.
    • Is S3 compatible? - Check this box.
    • AWS Key ID - The Access Key from the HMAC key you created for the IAM Service Account above.
    • AWS Secret Access Key - The Secret Access Key from the same HMAC key.
    • Region Name - The name of the Google Cloud region where your bucket is located (e.g., us-west1).
    • Endpoint URL - Enter the value https://storage.googleapis.com.

    Screenshot of Connection Details page

  6. Select TEST CONNECTION to verify the credentials.

  7. Select ADD CONNECTION. The connection is saved if there are no errors.
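
If TEST CONNECTION fails, it can help to check the HMAC key independently. The sketch below shows how the connection fields above map onto a generic S3-compatible client. It assumes the `boto3` library, and it is not a description of how Privacy Dynamics itself connects. The function names are hypothetical.

```python
from typing import Dict, List

# Value from the "Endpoint URL" field in the connection details.
GCS_ENDPOINT_URL = "https://storage.googleapis.com"


def build_client_kwargs(access_key: str, secret_key: str, region: str) -> Dict[str, str]:
    """Map the connection form fields onto S3 client settings."""
    return {
        "endpoint_url": GCS_ENDPOINT_URL,
        "aws_access_key_id": access_key,      # "AWS Key ID" field
        "aws_secret_access_key": secret_key,  # "AWS Secret Access Key" field
        "region_name": region,                # "Region Name" field, e.g. us-west1
    }


def list_bucket_names(access_key: str, secret_key: str, region: str) -> List[str]:
    """List the buckets visible to the HMAC key through the XML API."""
    # Assumption: boto3 is installed. Imported lazily so the mapping above
    # can be inspected without it.
    import boto3

    client = boto3.client("s3", **build_client_kwargs(access_key, secret_key, region))
    response = client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]
```

If `list_bucket_names` returns your origin and destination buckets, the key and its Project-level role are probably set up as this guide expects.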

Create a Project Using Google Cloud Storage

  1. Select the Anonymize button on the top nav bar.
  2. On the "Choose Data" screen, select the new GCS connection as the Origin Connection, then select the Origin Bucket and Destination Bucket.
  3. Optionally, enter a Bucket Prefix to filter the list of objects below. If the prefix is a folder, include the trailing slash.
  4. To use the same IAM Service Account to write the treated data to GCS, select Destination Connection: Same as Origin. Select the bucket and optionally a prefix (or folder, with a trailing slash) to prepend to the object name.

     Screenshot of project details page
  5. Select the files from the list of datasets, and then click Next to configure your dataset.
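
The trailing slash in step 3 matters because object storage treats a prefix as a plain string match on the object name, not as a directory. The short illustration below uses made-up object names. The `filter_by_prefix` helper is hypothetical and only mimics prefix matching locally.

```python
from typing import Iterable, List


def filter_by_prefix(object_names: Iterable[str], prefix: str) -> List[str]:
    """Keep only the object names that start with the given prefix."""
    return [name for name in object_names if name.startswith(prefix)]


# Made-up object names for illustration.
objects = [
    "exports/users.csv",
    "exports/orders.csv",
    "exports-archive/users.csv",
    "readme.txt",
]

# With the trailing slash, only the "exports" folder matches.
print(filter_by_prefix(objects, "exports/"))
# Without it, "exports-archive/" is matched as well.
print(filter_by_prefix(objects, "exports"))
```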

Other Configuration

If you have network access controls in place that limit connections to Google Cloud Storage, you will need to add the Privacy Dynamics IP addresses to your allowlist. You can find those IP addresses in this public JSON file.
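
To confirm that an address is covered once you have updated your rules, a check along the following lines may be useful. It is a sketch under two assumptions: that you have already extracted the addresses from the JSON file into a list of strings (the file's schema is not shown in this guide), and that entries are plain IP addresses or CIDR ranges. The addresses below are reserved documentation ranges used as placeholders, not real Privacy Dynamics addresses.

```python
import ipaddress
from typing import Iterable


def is_allowed(address: str, allowlist: Iterable[str]) -> bool:
    """Return True if the address falls inside any allowlist entry."""
    ip = ipaddress.ip_address(address)
    # strict=False accepts entries such as 198.51.100.7/24 that have host bits set.
    return any(ip in ipaddress.ip_network(entry, strict=False) for entry in allowlist)


# Placeholder entries (documentation ranges), standing in for the published list.
allowlist = ["203.0.113.10/32", "198.51.100.0/24"]

print(is_allowed("198.51.100.7", allowlist))
print(is_allowed("192.0.2.1", allowlist))
```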
