Copy CSV to Postgres database inside Docker
Docker containers make it easy to run services like databases without installing them locally. But how do you populate such a database with raw data from a CSV file?
In this article, I want to show how. Let’s start with the project structure.
Project structure
data/
  datasource.csv
db/
  scripts/
    1_init.sql
    2_copy.sql
  Dockerfile
docker-compose.yml
Dockerfile
extends the official postgres image and copies all *.sql scripts into /docker-entrypoint-initdb.d/. On the container’s first start, every file in that directory is executed in alphanumerical order, which is why the *.sql files start with digits. Finally, port 5432, the default Postgres port, is exposed.
FROM postgres:alpine
COPY scripts/*.sql /docker-entrypoint-initdb.d/
RUN chmod a+r /docker-entrypoint-initdb.d/*
EXPOSE 5432
1_init.sql
creates the database table; its column names have to match the header of the CSV file.
CREATE TABLE table_name
(
--statement body
);
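As an illustration, suppose datasource.csv holds city data with the header id,name,population. The table definition could then look like this (the table name, column names, and types are assumptions, adjust them to your own CSV):

```sql
-- Hypothetical schema for a CSV with header: id,name,population
CREATE TABLE cities
(
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    population INTEGER
);
```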
2_copy.sql
is responsible for copying the data from the CSV file into Postgres. The CSV HEADER option tells Postgres to treat the first line as column names rather than data.
COPY table_name FROM '/data/datasource.csv' DELIMITER ',' CSV HEADER;
The CSV file is located in the data folder inside the project. docker-compose.yml builds the Dockerfile from the db folder and makes the database accessible on port 5431. Basic Postgres settings are passed as environment variables. Finally, the data folder with the CSV file is mounted into the container as a volume, so the COPY statement can read it from /data/datasource.csv.
version: '3.3'
services:
  db:
    build: ./db
    container_name: postgres
    ports:
      - "5431:5432"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=db_name
    volumes:
      - ./data:/data
The only thing left is to run docker-compose:
docker-compose up -d
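To check that the import worked, you can query the database inside the running container. This is a sketch; it assumes the container name postgres from docker-compose.yml and the placeholder table name table_name from the scripts above:

```shell
# Run psql inside the container started by docker-compose and
# count the rows imported from the CSV file.
docker exec postgres psql -U postgres -d db_name \
  -c "SELECT COUNT(*) FROM table_name;"
```

If the count matches the number of data rows in datasource.csv, the initialization scripts ran successfully.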