Postgres
Install dlt with PostgreSQL
To install the dlt library with PostgreSQL dependencies, run:
pip install "dlt[postgres]"
Setup Guide
1. Initialize a project with a pipeline that loads to Postgres by running:
dlt init chess postgres
2. Install the necessary dependencies for Postgres by running:
pip install -r requirements.txt
This will install dlt with the postgres extra, which contains the psycopg2 client.
3. After setting up a Postgres instance and psql / query editor, create a new database by running:
CREATE DATABASE dlt_data;
Add the dlt_data database to .dlt/secrets.toml.
4. Create a new user by running:
CREATE USER loader WITH PASSWORD '<password>';
Add the loader user and <password> password to .dlt/secrets.toml.
5. Give the loader user owner permissions by running:
ALTER DATABASE dlt_data OWNER TO loader;
You can set more restrictive permissions instead (e.g., give the user access to a specific schema only).
6. Enter your credentials into .dlt/secrets.toml. It should now look like this:
[destination.postgres.credentials]
database = "dlt_data"
username = "loader"
password = "<password>" # replace with your password
host = "localhost" # or the IP address of your database
port = 5432
connect_timeout = 15
You can also pass a database connection string similar to the one used by the psycopg2 library or SQLAlchemy. The credentials above will look like this:
# keep it at the top of your toml file! before any section starts
destination.postgres.credentials="postgresql://loader:<password>@localhost/dlt_data?connect_timeout=15"
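If the password contains characters such as "@" or "/", the connection string has to be percent-encoded to stay parseable. Below is a minimal sketch of assembling such a string from the individual fields. The helper name build_postgres_url is hypothetical and not part of dlt, and the sketch assumes the consumer decodes percent-encoding, as URL-style connection strings normally require.

```python
from urllib.parse import quote, urlencode


def build_postgres_url(username, password, host, database, port=5432, **params):
    """Assemble a postgresql:// URL from its parts (hypothetical helper, not a dlt API)."""
    # Percent-encode user and password so "@", "/" or ":" do not break the URL.
    auth = f"{quote(username, safe='')}:{quote(password, safe='')}"
    url = f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"
    if params:
        # Extra options such as connect_timeout go into the query string.
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    print(build_postgres_url("loader", "p@ss/word", "localhost", "dlt_data", connect_timeout=15))
```

The resulting string can be placed in .dlt/secrets.toml in the same position as the example above.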
To pass credentials directly, you can use the credentials argument passed to the dlt.pipeline or pipeline.run methods.
import dlt
pipeline = dlt.pipeline(pipeline_name='chess', destination='postgres', dataset_name='chess_data', credentials="postgresql://loader:<password>@localhost/dlt_data")
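The database setup in steps 3 to 5 can also be scripted rather than typed into psql. The sketch below generates the same three statements and runs them through psycopg2. The function names and the admin connection string are assumptions made for illustration, and the quoting helpers are deliberately simple (they assume PostgreSQL's default standard_conforming_strings setting).

```python
from typing import List


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote an SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_setup_statements(database: str, user: str, password: str) -> List[str]:
    """Return the statements from steps 3 to 5, in order."""
    return [
        f"CREATE DATABASE {quote_ident(database)};",
        f"CREATE USER {quote_ident(user)} WITH PASSWORD {quote_literal(password)};",
        f"ALTER DATABASE {quote_ident(database)} OWNER TO {quote_ident(user)};",
    ]


def apply_setup(admin_dsn: str, statements: List[str]) -> None:
    """Run the statements over an admin connection (hypothetical helper)."""
    import psycopg2  # installed by the postgres extra

    conn = psycopg2.connect(admin_dsn)
    try:
        # CREATE DATABASE cannot run inside a transaction block,
        # so each statement is committed on its own.
        conn.autocommit = True
        with conn.cursor() as cur:
            for statement in statements:
                cur.execute(statement)
    finally:
        conn.close()


if __name__ == "__main__":
    # Print the statements only; call apply_setup(...) with a real admin DSN to run them.
    for statement in build_setup_statements("dlt_data", "loader", "<password>"):
        print(statement)
```

Printing the statements first makes it easy to review them before running apply_setup against a real server.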
Write disposition
All write dispositions are supported.
If you set the replace strategy to staging-optimized, the destination tables will be dropped and replaced by the staging tables.
Data loading
dlt will load data using large INSERT VALUES statements by default. Loading is multithreaded (20 threads by default).
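To illustrate what a "large INSERT VALUES statement" means, here is a minimal sketch that groups rows into multi-row INSERT statements with psycopg2-style %s placeholders. It is not dlt's actual implementation. The function name, table name, and batch size are made up for the example.

```python
from typing import Any, Iterator, List, Sequence, Tuple


def batched_insert_statements(
    table: str,
    columns: Sequence[str],
    rows: Sequence[Sequence[Any]],
    batch_size: int = 2,
) -> Iterator[Tuple[str, List[Any]]]:
    """Yield (sql, params) pairs, each inserting up to batch_size rows at once."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    column_list = ", ".join(columns)
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One VALUES group per row: VALUES (%s, %s), (%s, %s), ...
        values = ", ".join([row_placeholder] * len(batch))
        sql = f"INSERT INTO {table} ({column_list}) VALUES {values}"
        # Flatten the rows so the parameters line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params


if __name__ == "__main__":
    rows = [(1, "magnus"), (2, "hikaru"), (3, "fabiano")]
    for sql, params in batched_insert_statements("players", ["id", "name"], rows):
        print(sql, params)
```

Sending many rows per statement cuts down on round trips to the server compared with one INSERT per row.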
Supported file formats
- insert-values is used by default.
Supported column hints
postgres will create unique indexes for all columns with unique hints. This behavior may be disabled.
Additional destination options
The Postgres destination creates UNIQUE indexes by default on columns with the unique hint (for example, _dlt_id). To disable this behavior:
[destination.postgres]
create_indexes=false
dbt support
This destination integrates with dbt via dbt-postgres.
Syncing of dlt state
This destination fully supports dlt state sync.
Additional Setup guides
- Load data from HubSpot to PostgreSQL in python with dlt
- Load data from GitHub to PostgreSQL in python with dlt
- Load data from Zendesk to PostgreSQL in python with dlt
- Load data from Pipedrive to PostgreSQL in python with dlt
- Load data from Chess.com to PostgreSQL in python with dlt
- Load data from Jira to PostgreSQL in python with dlt
- Load data from Airtable to PostgreSQL in python with dlt
- Load data from Notion to PostgreSQL in python with dlt
- Load data from MongoDB to PostgreSQL in python with dlt
- Load data from Google Analytics to PostgreSQL in python with dlt
- Load data from Google Sheets to PostgreSQL in python with dlt
- Load data from Stripe to PostgreSQL in python with dlt
- Load data from Slack to PostgreSQL in python with dlt
- Load data from Salesforce to PostgreSQL in python with dlt
- Load data from Shopify to PostgreSQL in python with dlt
- Load data from AWS S3 to PostgreSQL in python with dlt
- Load data from Mux to PostgreSQL in python with dlt