Brett Westover

Self-hosted developer platform with Coder and Privacy Dynamics

Your development team can have an excellent developer experience with a reproducible workspace containing the app code, an anonymized snapshot of production data, and all the tools they need to build, test, and ship.

A snippet of workspace template code

What is Coder?

A remote development environment, or Cloud Development Environment (CDE), is a customizable space where developers can edit and run code using compute resources in the cloud or another hosted environment, instead of on their local machine. Coder is a self-hosted remote development platform. You choose the cloud provider, resources, and tools that define a development environment, and developers can then get straight to work in their own workspaces with minimal setup or delay.

A Coder workspace screen

Why Privacy Dynamics?

Privacy Dynamics unlocks production data by creating and maintaining a PII-free replica of the data environment, a privacy best practice known as Data Minimization. Using Privacy Dynamics and Coder together is a great option for teams looking to build an effective development platform that is private, secure, and, most importantly, available on demand.

Privacy Dynamics data anonymize wizard

Production Data, for Development

Realistic data in the development environment helps when building new features: design and testing more closely resemble actual app usage because the data comes from production. Real data also helps catch real bugs before you get into real trouble.

There are a few ways to unlock the benefits of production data in development. We recommend a simple loop: de-identify -> snapshot -> develop (repeat).

Diagram of the loop: de-identify, snapshot, develop

  • De-identify: Using a tool, like Privacy Dynamics, read data securely from production and remove all unwanted Personally Identifiable Information (PII).
  • Snapshot: Make the de-identified data broadly available for developers to use again and again as part of a development platform.
  • Develop: Use the de-identified data, and the developer platform, to build and ship new features. Using real data, deliver value to your customers in less time and with more trust.

Running this loop and updating data at the right cadence is important. Updating too frequently introduces churn and instability into your development process. Updating too rarely means your environments won't represent recent production data. We recommend syncing weekly and adjusting as needed.
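
One way to keep an eye on the weekly cadence is to check the age of the most recent snapshot file. The sketch below is illustrative only and is not part of the example repo: the function name `snapshot_is_stale`, the seven-day default, and the reliance on GNU `stat` (found on typical Linux workspaces) are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when a snapshot file is missing or
# at least MAX_AGE_DAYS old. The default of 7 mirrors a weekly cadence.
snapshot_is_stale() {
  file="$1"
  max_age_days="${2:-7}"

  # A missing snapshot always counts as stale.
  [ -f "$file" ] || return 0

  now=$(date +%s)
  mtime=$(stat -c %Y "$file")   # GNU stat: modification time in epoch seconds
  age_days=$(( (now - mtime) / 86400 ))

  [ "$age_days" -ge "$max_age_days" ]
}
```

For example, `snapshot_is_stale anonymize_demo_snap.sql && echo "time to refresh"` prints a reminder once the file is a week old. Tune the threshold to whatever cadence suits your team.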

Example: Using Coder and Privacy Dynamics to build a simple app containing user information

We will use Privacy Dynamics to de-identify the production data for our demonstration app, create a backup of that data, and then make that data available in a Google Cloud VM-based environment automatically using Coder workspaces.

The Coder template, the demonstration app and the (simulated) production data are all available in a GitHub repo: pvcy/anonymize-with-coder

We have a production PostgreSQL database that contains PII (simulated for demo purposes). The database is not directly connected to the internet, but we can connect to it from Privacy Dynamics via SSH tunneling. In order to snapshot the data we will create a Coder workspace called "Intermediate Target." It's not necessary to use Coder for the snapshotting steps, but it's a handy way to get an environment for all kinds of purposes, even ad-hoc demonstrations like this. Once we've written the de-identified data to the "Intermediate Target" environment we can create a backup of the database for use in development environments.

Prerequisites:

  • A production copy of the example data (you can use a Coder environment for this, or import the data to a suitable hosted Postgres database)
  • Privacy Dynamics (you can sign up for a free trial at https://www.privacydynamics.io/)
  • Coder, Google Cloud, and the custom workspace template

    • Note: You will need to edit the template to use a GCS bucket in your account, and you may need to add authentication config to Terraform in order to import the template into Coder. For details on setting up Coder with Google Cloud, see the Coder docs.

De-identify and snapshot

  1. Create an "Intermediate Target" Coder workspace from the custom template
  2. Start the demo app:

    $ cd repos/anonymize-with-coder
    $ docker compose up
  3. Create a Privacy Dynamics Connection to both:

    • Production database
    • The new, empty database in the "Intermediate Target" Coder environment
  4. Create a Privacy Dynamics project to de-identify the "users" table and write it to the new database
  5. Back up the database and upload it to a GCS bucket.

    Note: In this example we're using a bucket called "anonymize-demo-snapshots", but you should use a bucket that exists in your Google Cloud account.

    In a new terminal in the "Intermediate Target" Coder workspace, we can use the already-running database container to perform the backup.

    $ docker exec anonymize-with-coder-db-1 pg_dump --dbname=postgresql://postgres:$DB_PASSWORD@127.0.0.1/postgres > anonymize_demo_snap.sql

    And since the Coder workspace is running in a GCP VM we have built-in Google Cloud tools and authentication. Use this to upload the file to the GCS bucket:

    $ gsutil cp anonymize_demo_snap.sql gs://anonymize-demo-snapshots/
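
The backup and upload commands in step 5 can be combined into one script that stops if the dump fails. This is a minimal sketch under stated assumptions: the container, bucket, and file names mirror the ones used in this post, while the `snapshot_and_upload` function and the `DRY_RUN` switch are hypothetical additions for illustration.

```shell
#!/bin/sh
# Sketch: dump the de-identified database, then upload the dump to GCS.
# Defaults mirror the names used in this post; override them via environment.
CONTAINER="${CONTAINER:-anonymize-with-coder-db-1}"
BUCKET="${BUCKET:-anonymize-demo-snapshots}"
SNAPSHOT="${SNAPSHOT:-anonymize_demo_snap.sql}"

snapshot_and_upload() {
  # DRY_RUN=1 (hypothetical switch) prints the commands instead of running them.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker exec $CONTAINER pg_dump --dbname=postgresql://postgres:***@127.0.0.1/postgres > $SNAPSHOT"
    echo "gsutil cp $SNAPSHOT gs://$BUCKET/"
    return 0
  fi

  # Dump to a temporary file so a failed dump never replaces a good snapshot.
  # No -t flag: a TTY would rewrite line endings in the redirected output.
  docker exec "$CONTAINER" \
    pg_dump "--dbname=postgresql://postgres:${DB_PASSWORD}@127.0.0.1/postgres" \
    > "${SNAPSHOT}.tmp" || { rm -f "${SNAPSHOT}.tmp"; return 1; }

  # Refuse to upload an empty dump.
  [ -s "${SNAPSHOT}.tmp" ] || { rm -f "${SNAPSHOT}.tmp"; return 1; }

  mv "${SNAPSHOT}.tmp" "$SNAPSHOT"
  gsutil cp "$SNAPSHOT" "gs://${BUCKET}/"
}
```

Running `DRY_RUN=1 snapshot_and_upload` previews the two commands, and calling `snapshot_and_upload` with `DB_PASSWORD` set performs the real dump and upload.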

Develop using the snapshot of de-identified production data

  1. Start a new Coder workspace using the custom template.

    On startup it will automatically download the snapshot.

  2. Start the demo app:

    $ cd repos/anonymize-with-coder
    $ docker compose up

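    The Docker Compose configuration mounts the snapshot into the DB container's /docker-entrypoint-initdb.d directory. Assuming the container uses the official postgres image, scripts in that directory run when the database is first initialized with an empty data directory, which imports the data.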

  3. On your local machine, forward port 5000, either from the Coder UI or with the Coder CLI:

    $ coder port-forward YOUR-USER/WORKSPACE-NAME --tcp 5000:5000
  4. Visit http://localhost:5000 in your local browser.

You should see the de-identified data snapshot in the UI.
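
Because the database imports the snapshot on startup, the app may take a moment to respond after `docker compose up`. The helper below is a rough sketch for polling the forwarded port before opening the browser. The function name `wait_for_app`, the retry count, and the delay are assumptions, and it presumes `curl` is installed.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers successfully or the
# attempts run out. Defaults assume the forwarded port from step 3.
wait_for_app() {
  url="${1:-http://localhost:5000}"
  attempts="${2:-30}"
  delay="${3:-2}"

  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f treats HTTP errors as failures; --max-time bounds each request.
    if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      echo "app is responding at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "app did not respond at $url after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_app` with no arguments checks http://localhost:5000 for up to about a minute. Pass a different URL, attempt count, or delay if your setup differs.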

Real data in your development platform with Privacy Dynamics

We've demonstrated how to de-identify, snapshot, and develop with production data in a repeatable, easy-to-use environment. Using real data in your development process can help reduce bugs and ship high-quality software faster. If you need a self-hosted development platform, check out Coder. To learn more about Privacy Dynamics and how we can help you unlock production data while reducing privacy risk, give us a shout or sign up for a free trial at https://signup.privacydynamics.io.