Securing Postgres clusters in privacy-led architecture

Kids Web Services (KWS) enables developers to create, build, and manage COPPA/GDPR-K compliant apps and sites aimed at under-13s. As part of operating the KWS SaaS platform within SuperAwesome, we frequently create new databases for our customers, as well as manage several large Postgres databases on RDS.

Data security is a fundamental success criterion in kidtech, and ensuring that data is safely isolated in different databases, properly encrypted, and accessible only by the authorised services with their respective encryption keys is just the start of the journey.

As an infrastructure engineer, I want to share with you some of the techniques and procedures we use that can help you protect your Postgres cluster.

Using stored functions as IaC

Terraform is our IaC tool of choice for setting up the majority of our infrastructure, but it doesn’t fit all the use cases.

We found stored functions are as close to Infrastructure as Code as we can get within a SQL database – especially when we need to create databases from a service on the fly, and have them available almost instantly.

A connection to a Postgres cluster is bound to a specific database, meaning that it’s bound to that database’s context and schema within the default search path. This means all functions and operations will apply to the database to which we are currently connected.

We leveraged this feature and created a ‘utility’ database: an empty database containing only stored functions that are executed by a utility user to securely create other databases in the cluster. This, of course, comes with its own security challenges.
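As a rough sketch of this bootstrap (the names `utility`, `utility_fns` and `utility_user` are illustrative, not necessarily the ones we use), run once by a superuser:

```sql
-- Run as a superuser: an empty 'utility' database
-- and a locked-down login role that will call into it.
CREATE DATABASE utility;
CREATE ROLE utility_user LOGIN PASSWORD 'change-me';

-- Then, connected to the 'utility' database:
-- keep the stored functions in their own schema, and make sure
-- the utility role starts with no privileges beyond connecting.
CREATE SCHEMA utility_fns;
REVOKE ALL ON DATABASE utility FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM PUBLIC;
GRANT CONNECT ON DATABASE utility TO utility_user;
```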

Restricting the ‘utility’ user permissions

Invoking stored functions must be restricted to standard users with zero permissions to any resources within the Postgres cluster aside from the function itself. Luckily Postgres comes to the rescue with ‘SECURITY DEFINER’, as described in the Postgres documentation:

“SECURITY DEFINER specifies that the function is to be executed with the privileges of the user that created it.”

Using this feature of Postgres we can make the function run using the permissions of the function owner (a superuser) without ever logging in as that user or passing its credentials.

With this in place, all we have to do is give the utility user role ‘USAGE’ access to the utility schema and permissions to execute the function.
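Sketched out (function, schema and role names are illustrative), that looks something like the following, created by the superuser who will own the function:

```sql
-- Owned by a superuser; SECURITY DEFINER means it runs with the
-- owner's privileges no matter which role invokes it.
CREATE OR REPLACE FUNCTION utility_fns.create_customer_db(db_name text)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
-- Pin the search path so the function can't be hijacked by
-- objects in a caller-controlled schema.
SET search_path = utility_fns, pg_temp
AS $$
BEGIN
  -- body elided: the actual database creation runs over a dblink session
  RAISE NOTICE 'creating database %', db_name;
END;
$$;

-- The utility user needs only USAGE on the schema and EXECUTE
-- on the function -- nothing else in the cluster.
REVOKE ALL ON FUNCTION utility_fns.create_customer_db(text) FROM PUBLIC;
GRANT USAGE ON SCHEMA utility_fns TO utility_user;
GRANT EXECUTE ON FUNCTION utility_fns.create_customer_db(text) TO utility_user;
```

Revoking EXECUTE from PUBLIC matters: by default any role can execute a newly created function, which would defeat the point of the restricted utility user.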

Securely running stored functions in different databases

Dblink enables us to create connections to other databases within the cluster.

We require an active dblink session to create databases and revoke or grant permissions to them.

The majority of the operations that we run require superuser permissions, and in order to create a dblink session we have to log in with a username and password.

We never want an application to log in as a superuser, so we use ‘USER MAPPING’, which allows us to run dblink as a privileged user without ever logging in from the utility account.
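A sketch of that setup (the server name `loopback` and role names are illustrative; the foreign server points back at the local cluster):

```sql
CREATE EXTENSION IF NOT EXISTS dblink;

-- A foreign server that points back at this cluster.
CREATE SERVER loopback
  FOREIGN DATA WRAPPER dblink_fdw
  OPTIONS (host 'localhost', dbname 'postgres', port '5432');

-- Map the function owner (a privileged role) to its credentials.
-- The password lives in the mapping, never in a function body.
CREATE USER MAPPING FOR admin_role
  SERVER loopback
  OPTIONS (user 'admin_role', password 'change-me');

-- Inside a SECURITY DEFINER function owned by admin_role,
-- connecting by server name picks up the mapping automatically:
--   PERFORM dblink_connect('conn', 'loopback');
--   PERFORM dblink_exec('conn', 'CREATE DATABASE customer_db');
--   PERFORM dblink_disconnect('conn');
```

A useful side effect: ‘CREATE DATABASE’ cannot run inside a transaction block, but a dblink session is its own connection, so it can be issued from within a function this way.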

In the above snippet, we use SERVER and USER MAPPING within a function to pass credentials without adding them to the body of the function or exposing them to any other users.

Securing the new database

We want to make sure that only the database owner has access to the newly created database.

Thanks to the Postgres RBAC system this is surprisingly easy. Here is an example of the function we wrote to secure the newly created database:
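A sketch of such a function (names are illustrative; `loopback` is a foreign server with a user mapping for the function owner, so no password appears anywhere in the body):

```sql
-- Lock down a freshly created database so that only its owner
-- can connect to it.
CREATE OR REPLACE FUNCTION utility_fns.secure_database(
  db_name text, owner_role text)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = utility_fns, pg_temp
AS $$
BEGIN
  -- Connecting by server name uses the USER MAPPING for credentials.
  PERFORM dblink_connect('secure_conn', 'loopback');
  -- Nobody can connect by default...
  PERFORM dblink_exec('secure_conn',
    format('REVOKE ALL ON DATABASE %I FROM PUBLIC', db_name));
  -- ...except the owner.
  PERFORM dblink_exec('secure_conn',
    format('GRANT ALL ON DATABASE %I TO %I', db_name, owner_role));
  PERFORM dblink_disconnect('secure_conn');
END;
$$;
```

Using `format()` with the `%I` specifier quotes the database and role names as identifiers, which guards against SQL injection through the function arguments.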

As you may have already spotted, this function uses “dblink_connect” without ever passing a password.

Running this function at the end of our database creation process revokes access permissions from all users. This means that even a superuser has to modify the database permissions to access it, which can then be tracked in the audit log.

Ensuring safety through idempotence

At SuperAwesome, we pride ourselves on having all of our infrastructure committed to Git. This gives us easy change management and good visibility of what is deployed in our AWS estate, as well as enabling us to iterate quickly and change our environments through automation tools.

Not all Postgres operations behave in the same way. Certain operations, such as ‘CREATE DATABASE’ and ‘CREATE ROLE’, aren’t idempotent: they will always fail if the database or role already exists. ‘ALTER’, ‘REVOKE’ and ‘GRANT’, however, are idempotent by design.

This disparity in behaviour can prevent our automation tools from recovering easily in case of failure, and might pose some risks to existing databases, roles and permissions.

We overcame this problem by compensating for the lack of idempotence: operations running within a dblink session are wrapped in PL/pgSQL control logic.
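For example, a non-idempotent ‘CREATE DATABASE’ can be made safe to re-run by checking the catalog first (a sketch; names such as `ensure_database` and `loopback` are illustrative):

```sql
-- Make CREATE DATABASE behave idempotently: skip if it already exists.
CREATE OR REPLACE FUNCTION utility_fns.ensure_database(db_name text)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = utility_fns, pg_temp
AS $$
BEGIN
  IF EXISTS (SELECT 1 FROM pg_database WHERE datname = db_name) THEN
    RAISE NOTICE 'database % already exists, skipping', db_name;
    RETURN;
  END IF;
  -- CREATE DATABASE can't run in a transaction block, but the
  -- dblink session is its own connection, so it succeeds here.
  PERFORM dblink_connect('ensure_conn', 'loopback');
  PERFORM dblink_exec('ensure_conn',
    format('CREATE DATABASE %I', db_name));
  PERFORM dblink_disconnect('ensure_conn');
END;
$$;
```

Re-running the function after a partial failure is now harmless, which is exactly the property the automation tooling needs to recover on retry.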

Even though we take securing our databases very seriously, our ultimate goal is always to minimise data capture altogether and aim for a zero-data architecture.

Want to be part of this journey? SuperAwesome are hiring. To be part of our mission to make the internet safer for kids, click here.

Steve Hill is a DevOps Engineer at SuperAwesome.