Connecting to Amazon S3

Privacy Dynamics can connect to your data lake hosted in Amazon S3. This guide helps you authenticate and authorize Privacy Dynamics to access your data in S3.

Requirements

To complete this guide, you will need the following:

  • An S3 bucket.
  • An IAM user with Administrator privileges to create users (and optionally policies).
  • A Privacy Dynamics account.

Instructions

Before you can connect to S3 in Privacy Dynamics, create a new IAM User and, optionally, IAM Policies for Privacy Dynamics to use.

Configure IAM

Tip

The fastest way to get started is to assign a new IAM user the permissive AmazonS3FullAccess policy. If that is too broad for your security posture, see "Advanced Configuration: Creating a Minimal IAM Policy" below for a more restrictive setup.

  1. Log in to the AWS Console as a user with the Administrator policy (or, at a minimum, a user that can add users and attach policies).
  2. Visit the IAM Users page and click Add Users. [Screenshot: IAM Users page]
  3. Provide a User name (we like svc_pvcy, but this can be anything), check the box for Access key - Programmatic access, and then click Next. [Screenshot: IAM User Details]
  4. Select "Attach existing policies directly", type "S3" into the filter box, and select the AmazonS3FullAccess policy. Click Next. Note: This policy allows the user to list, read from, and write to all folders in all buckets on the account! For a more restrictive policy, see below. [Screenshot: IAM Attach Policy page]
  5. Add any desired tags for this user, then click Next.
  6. The user has now been created. Leave this window open or copy the Access key ID and Secret access key, since you will need them in the next step. [Screenshot: IAM User Success page]
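
If you script your AWS setup rather than using the console, the steps above could be approximated with the AWS SDK for Python. The following is a minimal sketch, not the documented Privacy Dynamics procedure. It assumes boto3 is installed and administrator credentials are configured; the function name `create_service_user` is hypothetical, and the `iam` argument is expected to behave like `boto3.client("iam")`.

```python
# ARN of the AWS-managed AmazonS3FullAccess policy selected in step 4.
S3_FULL_ACCESS_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"


def create_service_user(iam, user_name="svc_pvcy", policy_arn=S3_FULL_ACCESS_ARN):
    """Create an IAM user, attach one managed policy, and return its new keys.

    `iam` is assumed to be a boto3 IAM client, or any object exposing the
    same create_user, attach_user_policy, and create_access_key methods.
    """
    iam.create_user(UserName=user_name)
    iam.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # AWS returns the secret only once, at creation time; store it securely.
    return key["AccessKeyId"], key["SecretAccessKey"]
```

With real credentials this might be called as `create_service_user(boto3.client("iam"))`. The returned pair corresponds to the Access key ID and Secret access key shown on the console's success page.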

Tip

If your bucket is encrypted with a KMS-managed key, you will need to add a policy to the KMS key that grants this user access to that key. See the AWS Documentation for more details.
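
As an illustration of what such a key policy statement might look like, the sketch below builds one in Python. It assumes the bucket uses SSE-KMS, where reading objects generally requires `kms:Decrypt` and writing requires `kms:GenerateDataKey`. The helper name, the Sid, and the account ID `111122223333` are placeholders, and the AWS documentation remains the authority on which actions your setup needs.

```python
import json


def kms_statement_for_user(user_arn):
    """Build a key policy statement letting one IAM user use the key.

    Hypothetical helper: adjust the action list to match your encryption setup.
    """
    return {
        "Sid": "AllowPrivacyDynamicsUseOfKey",
        "Effect": "Allow",
        "Principal": {"AWS": user_arn},
        # Decrypt covers reads; GenerateDataKey covers writes of new objects.
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


# Placeholder account ID and user name; substitute your own.
statement = kms_statement_for_user("arn:aws:iam::111122223333:user/svc_pvcy")
print(json.dumps(statement, indent=4))
```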

Add the S3 Connection in Privacy Dynamics

Note that a connection must be created for each IAM User that Privacy Dynamics will use to authenticate with your AWS Account. A single Connection can have many buckets.

  1. Sign in to your Privacy Dynamics account.

  2. Go to the Connections page.

  3. Select Add Connection.

  4. Choose S3 and select Next.

  5. Enter the connection details:

    • Name - A name to identify the connection.
    • AWS Key ID - The Access Key ID of the IAM user you created above.
    • AWS Secret Access Key - The Secret Access Key for that IAM user.
    • Region Name - The name of the AWS region where your bucket is located (e.g., us-east-1).

    [Screenshot: Connection Details page]

Tip

If visible, do not select the checkbox labeled "Is S3 Compatible?"

  6. Select TEST CONNECTION to verify the credentials.
  7. Select ADD CONNECTION. The connection is saved if there are no errors.

Create a Project Using S3

  1. Select the Anonymize button on the top nav bar.
  2. On the "Choose Data" screen, select the new S3 connection as the Origin Connection.
  3. Optionally, enter a Bucket Prefix to filter the list of objects below. If the prefix is a folder, enter the trailing slash.
  4. To use the same IAM User to write the treated data to S3, select Destination Connection: Same as Origin. Select the bucket and optionally a prefix (or folder, with a trailing slash) to prepend to the object name. [Screenshot: project details page]
  5. Select the files from the list of datasets, and then click Next to configure your dataset.
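
The trailing slash in step 3 matters because S3 prefixes are plain string prefixes on object keys, not directory paths. The sketch below illustrates this with made-up keys. It assumes the Bucket Prefix field behaves like a standard S3 key prefix; `filter_by_prefix` is a hypothetical helper, not part of Privacy Dynamics.

```python
def filter_by_prefix(keys, prefix=""):
    """Return the object keys that start with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# Made-up object keys for illustration.
keys = [
    "raw/users.csv",
    "raw/orders.csv",
    "raw_backup/users.csv",
    "treated/users.csv",
]

# Without the trailing slash, the prefix also matches raw_backup/.
print(filter_by_prefix(keys, "raw"))
# ['raw/users.csv', 'raw/orders.csv', 'raw_backup/users.csv']

# With the trailing slash, only the raw/ folder matches.
print(filter_by_prefix(keys, "raw/"))
# ['raw/users.csv', 'raw/orders.csv']
```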

Advanced Configuration: Creating a Minimal IAM Policy

To anonymize your data, the Privacy Dynamics User must be able to:

  1. Read from a bucket or folder containing raw data.
  2. Write to a bucket or folder containing anonymized data.
  3. (Optional) List buckets for the Privacy Dynamics user to select during Project configuration. Note: Without the ListAllMyBuckets privilege, you must type the name of the bucket into the UI, instead of selecting from a list.

To that end, we recommend creating two new IAM Policies:

  1. S3ReadRawSensitiveBucket: A policy to allow us to list and read from the bucket containing raw data.
  2. S3WritePrivacySafeBucket: A policy to allow us to write to the anonymized bucket.

You can then attach both policies to the svc_pvcy IAM user, or, if you prefer, you can create two IAM Users with one policy each, and then set up two connections in Privacy Dynamics, one for each IAM User.

Creating the List and Read Policy

  1. Visit the IAM Policies Page and click Create Policy. [Screenshot: IAM policies page]
  2. Using the Visual Editor, select S3 as the service. Under Actions, select List > ListAllMyBuckets, List > ListBucket, and Read > GetObject. Under Resources, select Specific and then define the specific bucket(s) and folder(s) that you would like to grant List and Read access to (note that ListAllMyBuckets will not be affected by the Resource restrictions). Then select Review Policy to continue. [Screenshot: read policy's visual editor screen after config]

Tip

If your bucket uses ACLs, you may need to add the "Permissions > GetObjectAcl" permission to this policy.

  3. Alternatively, select the JSON tab and enter the following document, substituting the ARN of your resources:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadingFromRawBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::<YOUR-BUCKET>",
                    "arn:aws:s3:::<YOUR-BUCKET>/<YOUR-FOLDER>/*"
                ]
            },
            {
                "Sid": "AllowListingAllBuckets",
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": "*"
            }
        ]
    }

  4. Give the policy a good name and description. Ensure there are no warnings about ineffective policies, and then select Create Policy. [Screenshot: create policy screen]
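
If you manage several buckets, you may prefer to generate the read policy document rather than edit the ARNs by hand. The sketch below is one way to do that with the Python standard library. `build_read_policy` and the bucket and folder names are hypothetical, and the Sid values are kept alphanumeric because IAM does not accept spaces in a Sid.

```python
import json


def build_read_policy(bucket, folder=None, list_all_buckets=True):
    """Build a list-and-read policy document for one bucket (and folder)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    # Restrict object access to one folder when given, else the whole bucket.
    if folder:
        object_arn = f"{bucket_arn}/{folder.strip('/')}/*"
    else:
        object_arn = f"{bucket_arn}/*"
    statements = [
        {
            "Sid": "AllowReadingFromRawBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [bucket_arn, object_arn],
        }
    ]
    if list_all_buckets:
        # Optional: lets the bucket appear in a selection list in the UI.
        statements.append(
            {
                "Sid": "AllowListingAllBuckets",
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": "*",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


# Example bucket and folder names are placeholders.
print(json.dumps(build_read_policy("my-raw-bucket", "exports/"), indent=4))
```

The printed JSON can be pasted into the JSON tab of the policy editor.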

Creating the Write Policy

  1. Visit the IAM Policies Page and click Create Policy.
  2. Using the Visual Editor, select S3 as the service. Under Actions, select Write > PutObject. Under Resources, select Specific and then define the specific bucket(s) and folder(s) that you would like to grant Write access to. Then select Review Policy to continue. [Screenshot: write policy's visual editor screen after config]

Tip

If your bucket uses ACLs, you will need to add the "Permissions > PutObjectAcl" permission to this policy.

  3. Alternatively, select the JSON tab and enter the following document, substituting the ARN of your resources:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::<YOUR-BUCKET>/<YOUR-FOLDER>/*"
            }
        ]
    }

  4. Give the policy a good name and description. Ensure there are no warnings about ineffective policies, and then select Create Policy. [Screenshot: create policy screen for the write policy]
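
The write policy can be generated the same way, with the ACL permission from the tip above as an option. This is a sketch under the same assumptions as the JSON document: `build_write_policy`, its `uses_acl` flag, the Sid, and the example names are hypothetical.

```python
import json


def build_write_policy(bucket, folder=None, uses_acl=False):
    """Build a write-only policy document for one bucket (and folder)."""
    actions = ["s3:PutObject"]
    if uses_acl:
        # Needed only when objects are written with an ACL.
        actions.append("s3:PutObjectAcl")
    if folder:
        resource = f"arn:aws:s3:::{bucket}/{folder.strip('/')}/*"
    else:
        resource = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowWritingToPrivacySafeBucket",
                "Effect": "Allow",
                "Action": actions,
                "Resource": resource,
            }
        ],
    }


# Example bucket and folder names are placeholders.
print(json.dumps(build_write_policy("my-safe-bucket", "treated", uses_acl=True), indent=4))
```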

Attaching the Policies to the User

  1. Visit the IAM Users page, and either select Create User (follow instructions above) or select the svc_pvcy user from the list.
  2. Select Add Permissions.
  3. Select Attach Existing Policies Directly and then select the policies created in the last step before selecting Next: Review. [Screenshot: attach policies screen]
  4. Select Add permissions.
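
For scripted setups, attaching the policies and confirming the result might look like the sketch below. It assumes `iam` behaves like `boto3.client("iam")`; `attach_policies` is a hypothetical helper, and the policy ARNs you pass in are those of the policies you created above.

```python
def attach_policies(iam, user_name, policy_arns):
    """Attach managed policies to a user; return any ARNs not attached.

    `iam` is assumed to be a boto3 IAM client or a compatible object.
    """
    for arn in policy_arns:
        iam.attach_user_policy(UserName=user_name, PolicyArn=arn)
    # Read back the attachments to confirm. This reads only the first page
    # of results, which is enough for a user with a handful of policies.
    response = iam.list_attached_user_policies(UserName=user_name)
    attached = {p["PolicyArn"] for p in response["AttachedPolicies"]}
    return [arn for arn in policy_arns if arn not in attached]
```

An empty list back from `attach_policies` means every requested policy is now attached to the user.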

Other Configuration

If you have network access controls in place that limit connections to S3, you will need to add Privacy Dynamics' IP addresses to your Allowlist. You can find those IP addresses in this public JSON file.
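
If your allowlist is kept as a list of CIDR ranges (for example, in an `aws:SourceIp` condition on a bucket policy), merging in new addresses could be sketched as below. This is an illustration only: `merge_allowlist` is a hypothetical helper, the structure of the public JSON file is not assumed here, and the addresses are reserved documentation ranges rather than real Privacy Dynamics IPs.

```python
import ipaddress


def merge_allowlist(existing, additions):
    """Return existing CIDRs plus new ones, validated and de-duplicated.

    Raises ValueError if any entry is not a valid IP address or network.
    """
    merged = []
    for entry in list(existing) + list(additions):
        # Normalizes bare addresses to CIDR form, e.g. 198.51.100.7/32.
        cidr = str(ipaddress.ip_network(entry))
        if cidr not in merged:
            merged.append(cidr)
    return merged


# Placeholder ranges from the reserved documentation blocks.
current = ["203.0.113.0/24"]
new_ips = ["198.51.100.7", "203.0.113.0/24"]
print(merge_allowlist(current, new_ips))
# ['203.0.113.0/24', '198.51.100.7/32']
```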
