Brett Westover
Your development team can have an excellent developer experience with a reproducible workspace containing the app code, an anonymized snapshot of production data, and all the tools they need to build, test, and ship.
A remote development environment, or Cloud Development Environment (CDE), is a customizable space where developers can edit and run code using compute resources in the cloud or another hosted environment, instead of on their local machine. Coder is a self-hosted remote development platform. You choose the cloud provider, resources, and tools that define a development environment, and developers can then get straight to work in their own workspaces with minimal setup or delay.
Privacy Dynamics unlocks production data by creating and maintaining a PII-free replica of the data environment, a privacy best practice known as Data Minimization. Using Privacy Dynamics and Coder together is a great option for teams looking to build an effective development platform that is private, secure, and most importantly, available on-demand.
Using realistic data in the development environment is tremendously helpful when building new features because your design and test processes can be more similar to actual app usage, since the data is real. Real data also helps catch real bugs, before you get into real trouble.
The benefits of production data in development can be unlocked in a few different ways. We recommend a simple loop: de-identify -> snapshot -> develop (repeat).
Running this loop and updating data at the right cadence is important. Syncing too frequently introduces churn and instability into your development process. Syncing too rarely means your environments won't represent recent production data. We recommend syncing weekly and adjusting as needed.
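As a rough illustration of the cadence check, the sketch below decides whether an existing snapshot is old enough to refresh. The `MAX_AGE_DAYS` setting and the `snapshot_is_stale` helper are hypothetical names introduced here, not part of Privacy Dynamics or Coder; the default of 7 days simply mirrors the weekly suggestion above.

```shell
#!/bin/sh
# Hypothetical setting: refresh when the snapshot is older than this many days.
MAX_AGE_DAYS="${MAX_AGE_DAYS:-7}"

# snapshot_is_stale SNAPSHOT_EPOCH [NOW_EPOCH]
# Succeeds (exit status 0) when the snapshot is older than MAX_AGE_DAYS.
# Both arguments are Unix timestamps in seconds; NOW_EPOCH defaults to the current time.
snapshot_is_stale() {
  _snap_epoch="$1"
  _now_epoch="${2:-$(date +%s)}"
  [ $(( $_now_epoch - $_snap_epoch )) -gt $(( $MAX_AGE_DAYS * 86400 )) ]
}
```

On a Linux workspace you could feed it a file's modification time, for example `snapshot_is_stale "$(stat -c %Y anonymize_demo_snap.sql)"`, and only rerun the loop when it succeeds.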
We will use Privacy Dynamics to de-identify the production data for our demonstration app, create a backup of that data, and then make that data available in a Google Cloud VM-based environment automatically using Coder workspaces.
The Coder template, the demonstration app and the (simulated) production data are all available in a GitHub repo: pvcy/anonymize-with-coder
We have a production PostgreSQL database that contains PII (simulated for demo purposes). The database is not directly connected to the internet, but we can connect to it from Privacy Dynamics via SSH tunneling. In order to snapshot the data we will create a Coder workspace called "Intermediate Target." It's not necessary to use Coder for the snapshotting steps, but it's a handy way to get an environment for all kinds of purposes, even ad-hoc demonstrations like this. Once we've written the de-identified data to the "Intermediate Target" environment we can create a backup of the database for use in development environments.
Pre-requisites:
Coder, Google Cloud, and the custom workspace template
Start the demo app:
$ cd repos/anonymize-with-coder
$ docker compose up
Create a Privacy Dynamics Connection to both:
Back up the database and upload it to a GCS bucket
Note: In this example we're using a bucket called "anonymize-demo-snapshots", but you should use a bucket that exists in your Google account.
From a new terminal in the "Intermediate Target" Coder workspace, we can use the already-running database container to perform the backup. The -it flags are omitted because allocating a terminal can alter line endings in redirected output.
$ docker exec anonymize-with-coder-db-1 pg_dump --dbname=postgresql://postgres:$DB_PASSWORD@127.0.0.1/postgres > anonymize_demo_snap.sql
And since the Coder workspace is running in a GCP VM we have built-in Google Cloud tools and authentication. Use this to upload the file to the GCS bucket:
$ gsutil cp anonymize_demo_snap.sql gs://anonymize-demo-snapshots/
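If you repeat the snapshot step regularly, the two commands above can be wrapped in one small function. This is a minimal sketch, not part of the demo repo: it assumes `DB_PASSWORD` is set in your shell, reuses the container, file, and bucket names from this walkthrough as overridable defaults, and adds a temporary file (an addition of this sketch) so that a failed or empty dump never replaces a good snapshot.

```shell
#!/bin/sh
# Names taken from the walkthrough; override them for your own setup.
DB_CONTAINER="${DB_CONTAINER:-anonymize-with-coder-db-1}"
SNAPSHOT_FILE="${SNAPSHOT_FILE:-anonymize_demo_snap.sql}"
SNAPSHOT_BUCKET="${SNAPSHOT_BUCKET:-gs://anonymize-demo-snapshots/}"

# snapshot_and_upload: dump the database, then copy the dump to the bucket.
snapshot_and_upload() {
  _tmp_file="${SNAPSHOT_FILE}.tmp"

  # No terminal is allocated here because the output is redirected to a file.
  docker exec "$DB_CONTAINER" pg_dump \
    --dbname="postgresql://postgres:${DB_PASSWORD}@127.0.0.1/postgres" \
    > "$_tmp_file" || { rm -f "$_tmp_file"; return 1; }

  # Refuse to publish an empty dump.
  [ -s "$_tmp_file" ] || { rm -f "$_tmp_file"; return 1; }

  mv "$_tmp_file" "$SNAPSHOT_FILE" || return 1
  gsutil cp "$SNAPSHOT_FILE" "$SNAPSHOT_BUCKET"
}
```

Calling `snapshot_and_upload` from the "Intermediate Target" workspace then performs both steps and returns a non-zero status if either fails.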
Start a new Coder workspace using the custom template.
On startup it will automatically download the snapshot.
Start the demo app:
$ cd repos/anonymize-with-coder
$ docker compose up
The Docker Compose configuration mounts the snapshot at the DB container's /docker-entrypoint-initdb.d path, where it runs on startup and imports the data.
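One way to confirm the import ran is to list the tables in the database container. This is a sketch under a few assumptions: that the container name matches the one used earlier (anonymize-with-coder-db-1), that the `postgres` user and database from the connection string above apply here too, and that `psql` is available inside the container. The `list_tables` helper is a hypothetical name.

```shell
#!/bin/sh
# Assumed container name, taken from the backup step; override if yours differs.
DB_CONTAINER="${DB_CONTAINER:-anonymize-with-coder-db-1}"

# list_tables: print the tables visible in the postgres database.
list_tables() {
  docker exec "$DB_CONTAINER" psql --username=postgres --dbname=postgres --command='\dt'
}
```

If the list comes back empty and the demo uses the official PostgreSQL image, keep in mind that the image only runs init scripts when the data directory is empty, so a leftover volume from an earlier run can cause the import to be skipped.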
On your local machine, forward a port from the UI or using the Coder CLI:
$ coder port-forward YOUR-USER/WORKSPACE-NAME --tcp 5000:5000
Then open http://localhost:5000 in your local browser. You should see the de-identified data snapshot in the UI.
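If the page does not load right away, the app may still be starting. The sketch below polls the forwarded port with `curl` until it responds. The `wait_for_app` helper and the `APP_URL` variable are hypothetical names for illustration; the URL default is the one from this walkthrough.

```shell
#!/bin/sh
APP_URL="${APP_URL:-http://localhost:5000}"

# wait_for_app [ATTEMPTS]
# Polls APP_URL once per second, up to ATTEMPTS times (default 10).
# Succeeds as soon as the app answers with a non-error HTTP status.
wait_for_app() {
  _attempts="${1:-10}"
  while [ "$_attempts" -gt 0 ]; do
    if curl --silent --fail --output /dev/null "$APP_URL"; then
      echo "app is reachable at $APP_URL"
      return 0
    fi
    _attempts=$(( $_attempts - 1 ))
    sleep 1
  done
  echo "app did not respond at $APP_URL" >&2
  return 1
}
```

Run `wait_for_app 30` in a second local terminal while the port forward is active, then open the URL once it reports success.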
We've demonstrated how to de-identify, snapshot, and develop with production data in a repeatable, easy-to-use environment. Using real data in your development process can help reduce bugs and ship high-quality software faster. If you need a self-hosted development platform, check out Coder. To learn more about Privacy Dynamics and how we can help you unlock production data while reducing privacy risk, give us a shout or sign up for a free trial at https://signup.privacydynamics.io.