Szymon Halastra

Copy CSV to Postgres database inside Docker

With Docker containers, it is easy to run services like databases without installing them locally. But how do you populate the database with raw data from a CSV file?

In this article, I want to explain that. Let’s start with the project structure.

Project structure

data/
  datasource.csv
db/
  scripts/
    1_init.sql
    2_copy.sql
  Dockerfile
docker-compose.yml
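
For illustration, assume datasource.csv contains a few user records. The columns below are hypothetical; substitute your own data:

id,first_name,last_name,email
1,Ada,Lovelace,ada@example.com
2,Alan,Turing,alan@example.com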

The Dockerfile is based on the postgres image and copies all *.sql files from the scripts folder to /docker-entrypoint-initdb.d/. The entrypoint executes these files in alphanumerical order, which is why the *.sql files start with digits. Note that the scripts run only when the container initializes an empty data directory, i.e. on the first start. Finally, port 5432, the default Postgres port, is exposed.

FROM postgres:alpine
# Init scripts in /docker-entrypoint-initdb.d/ run in alphanumerical order
# on the container's first start (when the data directory is empty).
COPY scripts/*.sql /docker-entrypoint-initdb.d/
# Make sure the entrypoint can read the scripts.
RUN chmod a+r /docker-entrypoint-initdb.d/*
EXPOSE 5432
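
If you want to sanity-check the image before wiring up Compose, you can build it on its own (the csv-postgres tag is arbitrary):

docker build -t csv-postgres ./db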

1_init.sql creates the table; its columns have to line up with the columns in the CSV file (same order, compatible types).

CREATE TABLE table_name
(
   -- column definitions
);
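
For the hypothetical datasource.csv shown earlier, the statement could look like this (column names and types are assumptions; adjust them to your data):

CREATE TABLE users
(
   id         INTEGER PRIMARY KEY,
   first_name TEXT NOT NULL,
   last_name  TEXT NOT NULL,
   email      TEXT NOT NULL
);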

2_copy.sql is responsible for copying data from the CSV file into Postgres.

COPY table_name FROM '/data/datasource.csv' DELIMITER ',' CSV HEADER;
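
The HEADER option tells Postgres to skip the first line of the file. For the hypothetical users table above, spelling out the column list makes the expected column order explicit:

COPY users (id, first_name, last_name, email)
FROM '/data/datasource.csv'
DELIMITER ',' CSV HEADER;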

The CSV file is located in the data folder inside the project. docker-compose.yml builds the Dockerfile from the db folder and makes the database accessible through host port 5431, mapped to the container's 5432. Basic Postgres properties are passed as environment variables. Finally, the data folder with the CSV file is mounted into the container as a volume, so 2_copy.sql can find it under /data.

version: '3.3'
services:
  db:
    build: ./db
    container_name: postgres
    ports:
      - "5431:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=db_name
    volumes:
      - ./data:/data

The only thing left is to run docker-compose:

docker-compose up -d
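
Once the container is up, a quick way to verify the import is to run a count inside the container (assuming the table and database names from above):

docker exec -it postgres psql -U postgres -d db_name -c "SELECT COUNT(*) FROM table_name;"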