postgresql
Allows SELECT and INSERT queries to be performed on data that is stored on a remote PostgreSQL server.
Syntax
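The call shape can be sketched from the argument list that follows. The bracket notation is an approximation: square brackets mark optional arguments, and the `|` separates the positional form from the named-collection form.

```sql
postgresql({host:port, database, table, user, password[, schema[, on_conflict]] | named_collection[, option=value [, ...]]})
```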
Arguments
| Argument | Description |
|---|---|
| host:port | PostgreSQL server address. |
| database | Remote database name. |
| table | Remote table name, or a query passed to PostgreSQL as is (see Passing a query instead of a table name). |
| user | PostgreSQL user. |
| password | User password. |
| schema | Non-default table schema. Optional. |
| on_conflict | Conflict resolution strategy. Example: ON CONFLICT DO NOTHING. Optional. |
Arguments can also be passed using named collections. In this case, host and port should be specified separately. This approach is recommended for production environments.
Returned value
A table object with the same columns as the original PostgreSQL table.
In an INSERT query, you must use the keywords FUNCTION or TABLE FUNCTION to distinguish the table function postgresql(...) from a table name followed by a list of column names. See the examples below.
Implementation Details
SELECT queries on the PostgreSQL side run as COPY (SELECT ...) TO STDOUT inside a read-only PostgreSQL transaction, with a commit after each SELECT query.
Simple WHERE clauses such as =, !=, >, >=, <, <=, and IN are executed on the PostgreSQL server.
All joins, aggregations, sorting, IN [ array ] conditions and the LIMIT sampling constraint are executed in ClickHouse only after the query to PostgreSQL finishes.
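The split between the two servers can be illustrated with a single query. This is a sketch only: the connection details, the table `test`, and its columns `int_id` and `str` are assumed for illustration.

```sql
-- Hypothetical connection details and table; adjust to your environment.
SELECT str, count() AS cnt
FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password')
WHERE int_id >= 1 AND str != ''   -- simple comparisons: evaluated on the PostgreSQL server
GROUP BY str                      -- aggregation: evaluated in ClickHouse
ORDER BY cnt DESC                 -- sorting: evaluated in ClickHouse
LIMIT 10;                         -- LIMIT: applied in ClickHouse
```

Only the rows that satisfy the WHERE clause should cross the network; everything else happens after the PostgreSQL query has finished.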
Passing a query instead of a table name
Instead of a table name, the third argument can be a SELECT query that is passed to PostgreSQL as is. The structure of the resulting table is inferred from the query result. The query can be written either as a subquery, or wrapped into the query function:
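A minimal sketch of both forms, assuming a hypothetical PostgreSQL database `test` with a table `test(int_id, str)` and placeholder credentials:

```sql
-- Subquery form: the third argument is a parenthesized SELECT.
SELECT *
FROM postgresql('localhost:5432', 'test',
                (SELECT int_id, str FROM test WHERE int_id > 0),
                'postgresql_user', 'password');

-- query() form: the third argument wraps the query text in the query function.
SELECT *
FROM postgresql('localhost:5432', 'test',
                query('SELECT int_id, str FROM test WHERE int_id > 0'),
                'postgresql_user', 'password');
```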
This is useful to push down joins, aggregations or any other processing to PostgreSQL. Such a table is read-only: INSERT into it is not allowed. The same syntax is supported by the PostgreSQL table engine.
The subquery form (SELECT ...) is parsed by ClickHouse and re-serialized in the PostgreSQL dialect (PostgreSQL identifier quoting and string-literal escaping) before being sent to the server. It must therefore be valid ClickHouse SQL. To pass PostgreSQL-specific syntax that ClickHouse does not parse, use the query('...') form, whose text is sent to PostgreSQL verbatim.
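As an illustration of the verbatim form, the sketch below passes a query that uses the PostgreSQL array containment operator `@>`, which is assumed here not to be parseable as ClickHouse SQL. The table `items` and its columns are hypothetical.

```sql
-- The text inside query('...') is sent to PostgreSQL unchanged,
-- so PostgreSQL-only operators such as @> can be used.
SELECT *
FROM postgresql('localhost:5432', 'test',
                query('SELECT id, tags FROM items WHERE tags @> ARRAY[1, 2]'),
                'postgresql_user', 'password');
```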
Any outer WHERE, LIMIT, aggregation, etc. of the surrounding ClickHouse query is not pushed down into the passed query; it is applied in ClickHouse after the full query result is fetched. To restrict the data read from PostgreSQL, put the filter inside the passed query. With external_table_strict_query = 1, an outer filter that cannot be pushed down is rejected with an exception instead of being applied locally.
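The difference between an inner and an outer filter can be sketched as follows. The connection details and the table `test(int_id, str)` are assumptions made for the example.

```sql
-- Filter inside the passed query: PostgreSQL returns only the matching rows.
SELECT *
FROM postgresql('localhost:5432', 'test',
                (SELECT int_id, str FROM test WHERE int_id = 1),
                'postgresql_user', 'password');

-- Filter outside the passed query: the full result is fetched first,
-- then filtered in ClickHouse.
SELECT *
FROM postgresql('localhost:5432', 'test',
                (SELECT int_id, str FROM test),
                'postgresql_user', 'password')
WHERE int_id = 1;

-- Same outer filter with the strict setting enabled: per the description above,
-- this is expected to raise an exception instead of filtering locally.
SELECT *
FROM postgresql('localhost:5432', 'test',
                (SELECT int_id, str FROM test),
                'postgresql_user', 'password')
WHERE int_id = 1
SETTINGS external_table_strict_query = 1;
```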
INSERT queries on the PostgreSQL side run as COPY "table_name" (field1, field2, ... fieldN) FROM STDIN inside a PostgreSQL transaction, with auto-commit after each INSERT statement.
PostgreSQL array types are converted into ClickHouse arrays.
Be careful: in PostgreSQL, an array column such as Integer[] may contain arrays of different dimensions in different rows, whereas ClickHouse requires multidimensional arrays to have the same number of dimensions in all rows.
Multiple replicas are supported. They must be listed in the address argument, separated by the | character.
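Two ways of writing a replica list are sketched here: a brace pattern that expands to several hosts on the same port, and a plain list of full addresses. Host names, database, table, and credentials are placeholders.

```sql
-- Brace pattern: postgres1, postgres2 and postgres3, all on port 5432.
SELECT name
FROM postgresql(`postgres{1|2|3}:5432`, 'postgres_database', 'postgres_table', 'user', 'password');

-- Explicit list of host:port pairs.
SELECT name
FROM postgresql(`postgres1:5431|postgres2:5432`, 'postgres_database', 'postgres_table', 'user', 'password');
```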
Replica priority is supported for the PostgreSQL dictionary source. The larger the number in the map, the lower the priority; the highest priority is 0.
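A rough sketch of how priorities might look in a dictionary definition written as DDL. The dictionary name, attributes, hosts, and credentials are hypothetical, and the layout and lifetime are arbitrary choices made only to complete the statement.

```sql
-- Hypothetical dictionary; example01-1 has the smaller priority number,
-- so it should be preferred over example01-2.
CREATE DICTIONARY pg_dict
(
    id UInt64,
    value String
)
PRIMARY KEY id
SOURCE(POSTGRESQL(
    user 'clickhouse'
    password 'qwerty'
    db 'db_name'
    table 'table1'
    replica(host 'example01-1' port 5432 priority 1)
    replica(host 'example01-2' port 5432 priority 2)
))
LAYOUT(HASHED())
LIFETIME(MIN 0 MAX 3600);
```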
Examples
Table in PostgreSQL:
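A table of the following shape could serve as the remote side. The schema and the inserted row are illustrative assumptions; the statements are meant to be run in PostgreSQL, for example through psql.

```sql
-- Run in PostgreSQL, not in ClickHouse.
CREATE TABLE "public"."test" (
    "int_id" SERIAL,
    "int_nullable" INT NULL DEFAULT NULL,
    "float" FLOAT NOT NULL,
    "str" VARCHAR(100) NOT NULL DEFAULT '',
    "float_nullable" FLOAT NULL DEFAULT NULL,
    PRIMARY KEY (int_id)
);

INSERT INTO test (int_id, str, "float") VALUES (1, 'test', 2);

SELECT * FROM test;
```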
Selecting data from ClickHouse using plain arguments:
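A minimal sketch, assuming a PostgreSQL server on localhost:5432, a database and table both named `test`, a string column `str`, and placeholder credentials:

```sql
SELECT *
FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password')
WHERE str IN ('test');
```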
Or using named collections:
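The same query with the connection parameters stored in a named collection. The collection name `mypg` and all values are assumptions; note that host and port are given separately.

```sql
CREATE NAMED COLLECTION mypg AS
    host = 'localhost',
    port = 5432,
    database = 'test',
    user = 'postgresql_user',
    password = 'password';

SELECT *
FROM postgresql(mypg, table = 'test')
WHERE str IN ('test');
```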
Inserting:
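A sketch of an insert followed by a read-back, using the TABLE FUNCTION keywords so that the column list is not mistaken for function arguments. Connection details and column names are assumed. The second statement shows where an on_conflict string might go; passing 'public' as the schema there is an assumption made only to reach the seventh positional argument.

```sql
-- Plain insert through the table function.
INSERT INTO TABLE FUNCTION
    postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password')
    (int_id, float) VALUES (2, 3);

-- Insert with a conflict resolution strategy (schema given explicitly).
INSERT INTO TABLE FUNCTION
    postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password',
               'public', 'ON CONFLICT DO NOTHING')
    (int_id, float) VALUES (2, 3);

SELECT *
FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password');
```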
Using Non-default Schema:
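A sketch in two steps: the first three statements run in PostgreSQL and create a table in a schema other than the default one; the last statement runs in ClickHouse and passes that schema as the sixth argument. The database name `clickhouse`, the schema and table names, and the credentials are hypothetical.

```sql
-- In PostgreSQL:
CREATE SCHEMA "nice.schema";
CREATE TABLE "nice.schema"."nice.table" (a integer);
INSERT INTO "nice.schema"."nice.table" SELECT i FROM generate_series(0, 99) AS t(i);

-- In ClickHouse:
SELECT count()
FROM postgresql('localhost:5432', 'clickhouse', 'nice.table',
                'postgresql_user', 'password', 'nice.schema');
```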
Related
Replicating or migrating Postgres data with PeerDB
In addition to table functions, you can always use PeerDB by ClickHouse to set up a continuous data pipeline from Postgres to ClickHouse. PeerDB is a tool designed specifically to replicate data from Postgres to ClickHouse using change data capture (CDC).