
We have the following Dockerfile:

FROM debian:stable-slim

ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get full-upgrade -y && apt-get install -y --no-install-recommends \
    systemd \
    procps \
    apt-utils \
    build-essential \
    postgresql-server-dev-all \
    postgresql-plpython3-13 \
    pgxnclient \
    libc6-dev \
    wget \
    python3 \
    python3-pip \
    python-is-python3 \
 && rm -rf /var/lib/apt/lists/*

# Import source files into the image
RUN mkdir /docker
WORKDIR /docker
COPY . /docker

# Install PostgreSQL Faker
RUN python3 -m pip install --upgrade pip && pip3 install Faker==6.1.1 && \
    pgxn install postgresql_faker

# Install anon extension
#RUN cd src && make && make install
RUN pwd
RUN ls -lsa

# init script
RUN mkdir -p /docker/docker-entrypoint-initdb.d
COPY ./init_anon.sh /docker/docker-entrypoint-initdb.d/init_anon.sh

# Alternative Entrypoint
COPY ./anon.sh /docker/anon.sh
COPY ./init_anon.sh /docker/init_anon.sh

RUN /docker/docker-entrypoint-initdb.d/init_anon.sh

ENTRYPOINT ["/docker/docker-entrypoint-initdb.d/anon.sh"]

...and the following docker-compose.yml:

version: '3'

services:
  PostgreSQLAnonymizer:
    container_name: PostgreSQLAnonymizer
    hostname: PostgreSQLAnonymizer
    image: postgresql_anonymizer
    build: .
    #image: registry.gitlab.com/dalibo/postgresql_anonymizer
    ports:
      - "5432:5432"
    environment:
      - HOSTNAME=PostgreSQLAnonymizer
      - POSTGRES_DB=postgres
      - POSTGRES_PASSWORD=secret
      - PGUSER=postgres # required for `make installcheck`
      - POSTGRES_USER=postgres
    networks:
      - postgres-network
    #command: /usr/bin/postgres -c shared_preload_libraries='anon'
    command: /bin/systemctl restart postgresql && su postgres -c "/usr/bin/psql -U postgres -p 5432 -h PostgreSQLAnonymizer -c shared_preload_libraries='anon' postgres"
    working_dir: /tmp/source
    volumes:
      - $PWD:/tmp/source
    #- $PWD/anon:/usr/share/postgresql/10/extension/anon
    #
networks:
  postgres-network:

The custom entrypoint (init_anon.sh) looks like the following:

#!/bin/sh

set -e

# Perform all actions as $POSTGRES_USER
export PGUSER="$POSTGRES_USER"

# this is simpler than updating shared_preload_libraries in postgresql.conf
echo "Loading extension for all further sessions"
psql -h PostgreSQLAnonymizer -p 5432 -U postgres --dbname="postgres" -c "ALTER SYSTEM SET session_preload_libraries = 'anon';"
psql -h PostgreSQLAnonymizer -p 5432 -U postgres --dbname="postgres" -c "SELECT pg_reload_conf();"

echo "Creating extension inside template1 and postgres databases"
SQL="CREATE EXTENSION IF NOT EXISTS anon CASCADE;"
psql -h PostgreSQLAnonymizer -p 5432 -U postgres --dbname="template1" -c "$SQL"
psql -h PostgreSQLAnonymizer -p 5432 -U postgres --dbname="postgres" -c "$SQL"

And the anon.sh:

#!/usr/bin/env bash

export PGDATA=/var/lib/postgresql/data/
export PGDATABASE=postgres
export PGUSER=postgres

{
mkdir -p $PGDATA
chown postgres $PGDATA
gosu postgres initdb
gosu postgres pg_ctl start
gosu postgres psql -c "ALTER SYSTEM SET session_preload_libraries = 'anon';"
gosu postgres psql -c "SELECT pg_reload_conf();"

cat | gosu postgres psql
} &> /dev/null

bin/pg_dump_anon.sh -U postgres

We are trying to set up this project for anonymization of PII, PHI and similar data.

The problem is that when we run docker-compose build, the following error appears:

> [13/13] RUN /docker/docker-entrypoint-initdb.d/init_anon.sh:
#0 0.312 Loading extension for all further sessions
#0 0.375 psql: error: could not translate host name "PostgreSQLAnonymizer" to address: Name or service not known

We tried setting up a custom network, which makes no sense for a single container, but we did it anyway, to no avail. We also tried localhost, 127.0.0.1 and 0.0.0.0 as the host in the connection string, and that didn't work either. Any clue how to fix this?

Yes, we had to modify the original code a bit because it did not work as shipped. Among other things, its Python version was 3.5, which does not support the required Faker version (only up to 5.0.0, if I recall correctly), hence these changes.

We have to use PostgreSQL 9.3 by the way.

Gerald Schneider
Munchkin
  • Just add the missing entry to the hosts file inside the container? – Gerald Schneider May 24 '22 at 12:31
  • Another problem would be that you seem to try to connect to the postgresqld without starting it first. – Gerald Schneider May 24 '22 at 12:35
  • @GeraldSchneider isn't that done automatically, we specified the hostname? I think it should be started?.. We wrote ` /bin/systemctl restart postgresql ` as well just for this case. – Munchkin May 24 '22 at 12:43
  • `isn't that done automatically` yes, when you actually start a container. But you run the script `init_anon.sh` during build phase, where this is not the case. At this point the postgresqld isn't running either. – Gerald Schneider May 24 '22 at 12:47
  • @GeraldSchneider oooh that's a good point, how do we need to initialize it then? – Munchkin May 24 '22 at 12:49

1 Answer

The problem is that you are running init_anon.sh during the build phase (the RUN instruction in the Dockerfile). At that point the container's hostname is not in /etc/hosts, and the PostgreSQL server is not running.
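
To make the second point concrete: anything that calls psql has to wait until the server accepts connections. Here is a minimal sketch of a retry helper in POSIX sh. The name wait_for and the attempt count are hypothetical and not part of the project.

```shell
#!/bin/sh
# wait_for ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: retries COMMAND, sleeping one second between tries,
# and gives up (return 1) after ATTEMPTS failed tries.
wait_for() {
  attempts="$1"
  shift
  tries=0
  until "$@"; do
    tries=$((tries + 1))
    if [ "$tries" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example use at container start (assumes pg_isready is on the PATH):
#   wait_for 30 pg_isready -q -h localhost -p 5432
```

pg_isready ships with PostgreSQL 9.3 and later. Any other command that fails while the server is down would work in its place.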

You can either add the hostname during the build phase and start the daemon before you run the script, or run the init script on the first start of the container, for example from the entrypoint script:

if [ ! -f /initialized ]; then
  /docker/docker-entrypoint-initdb.d/init_anon.sh
  touch /initialized
fi
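
A variant of this guard, as a sketch only: wrapping it in a function lets a failed init leave the marker absent, so the script is retried on the next start. The helper name run_once is hypothetical, and the marker path is simply the one used above.

```shell
#!/bin/sh
# run_once MARKER COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if MARKER does not exist yet.
# MARKER is created only when COMMAND succeeds.
run_once() {
  marker="$1"
  shift
  if [ -f "$marker" ]; then
    return 0
  fi
  "$@" || return 1
  touch "$marker"
}

# Example use in the entrypoint, once the server is up:
#   run_once /initialized /docker/docker-entrypoint-initdb.d/init_anon.sh
```

A marker under / lives in the container's writable layer, so a recreated container runs the init script again. Put the marker on a volume if the database files are persisted there.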
Gerald Schneider