The tcn module provides a trigger function that notifies listeners of changes to any table on which it is attached. It must be used as an AFTER trigger FOR EACH ROW.
This module is considered “trusted”, that is, it can be installed by non-superusers who have CREATE privilege on the current database.
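The extension is installed in the usual way and requires no superuser access:

CREATE EXTENSION tcn;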
Only one parameter may be supplied to the function in a CREATE TRIGGER statement, and it is optional. If supplied, it will be used as the channel name for the notifications. If omitted, tcn will be used as the channel name.
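For example, to send the notifications on a channel named my_channel instead of the default (the table and channel names here are purely illustrative):

CREATE TRIGGER mytable_tcn_trigger
    AFTER INSERT OR UPDATE OR DELETE ON mytable
    FOR EACH ROW EXECUTE FUNCTION triggered_change_notification('my_channel');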
The payload of the notifications consists of the table name, a letter to indicate which type of operation was performed, and column name/value pairs for primary key columns. Each part is separated from the next by a comma. For ease of parsing using regular expressions, table and column names are always wrapped in double quotes, and data values are always wrapped in single quotes. Embedded quotes are doubled.
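As an illustration of these quoting rules, inserting a row whose text primary key column name held the value O'Brien into a hypothetical table users would produce a payload along these lines:

"users",I,"name"='O''Brien'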
A brief example of using the extension follows.
test=# create table tcndata
test-#   (
test(#     a int not null,
test(#     b date not null,
test(#     c text,
test(#     primary key (a, b)
test(#   );
CREATE TABLE
test=# create trigger tcndata_tcn_trigger
test-#   after insert or update or delete on tcndata
test-#   for each row execute function triggered_change_notification();
CREATE TRIGGER
test=# listen tcn;
LISTEN
test=# insert into tcndata values (1, date '2012-12-22', 'one'),
test-#                            (1, date '2012-12-23', 'another'),
test-#                            (2, date '2012-12-23', 'two');
INSERT 0 3
Asynchronous notification "tcn" with payload ""tcndata",I,"a"='1',"b"='2012-12-22'" received from server process with PID 22770.
Asynchronous notification "tcn" with payload ""tcndata",I,"a"='1',"b"='2012-12-23'" received from server process with PID 22770.
Asynchronous notification "tcn" with payload ""tcndata",I,"a"='2',"b"='2012-12-23'" received from server process with PID 22770.
test=# update tcndata set c = 'uno' where a = 1;
UPDATE 2
Asynchronous notification "tcn" with payload ""tcndata",U,"a"='1',"b"='2012-12-22'" received from server process with PID 22770.
Asynchronous notification "tcn" with payload ""tcndata",U,"a"='1',"b"='2012-12-23'" received from server process with PID 22770.
test=# delete from tcndata where a = 1 and b = date '2012-12-22';
DELETE 1
Asynchronous notification "tcn" with payload ""tcndata",D,"a"='1',"b"='2012-12-22'" received from server process with PID 22770.
It is a good idea to save the database server's log output somewhere, rather than just discarding it via /dev/null. The log output is invaluable when diagnosing problems.
The server log can contain sensitive information and needs to be protected, no matter how or where it is stored, or the destination to which it is routed. For example, some DDL statements might contain plaintext passwords or other authentication details. Logged statements at the ERROR level might show the SQL source code for applications and might also contain some parts of data rows. Recording data, events and related information is the intended function of this facility, so this is not a leakage or a bug. Please ensure the server logs are visible only to appropriately authorized people.
Log output tends to be voluminous (especially at higher debug levels) so you won't want to save it indefinitely. You need to rotate the log files so that new log files are started and old ones removed after a reasonable period of time.
If you simply direct the stderr of postgres into a file, you will have log output, but the only way to truncate the log file is to stop and restart the server. This might be acceptable if you are using PostgreSQL in a development environment, but few production servers would find this behavior acceptable.
A better approach is to send the server's stderr output to some type of log rotation program. There is a built-in log rotation facility, which you can use by setting the configuration parameter logging_collector to true in postgresql.conf. The control parameters for this program are described in Section 20.8.1. You can also use this approach to capture the log data in machine readable CSV (comma-separated values) format.
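A minimal postgresql.conf sketch for the built-in collector might look like the following; the file name pattern, rotation settings, and destination are illustrative, not recommendations (see Section 20.8.1 for the full parameter list):

logging_collector = on                           # enable the built-in log collector
log_directory = 'log'                            # log file directory, relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'  # time-based file names
log_rotation_age = 1d                            # start a new file daily
log_rotation_size = 100MB                        # ... or when the current file reaches this size
log_destination = 'stderr'                       # add 'csvlog' here for machine readable CSV output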
Alternatively, you might prefer to use an external log rotation program if you have one that you are already using with other server software. For example, the rotatelogs tool included in the Apache distribution can be used with PostgreSQL. One way to do this is to pipe the server's stderr output to the desired program. If you start the server with pg_ctl, then stderr is already redirected to stdout, so you just need a pipe command, for example:
pg_ctl start | rotatelogs /var/log/pgsql_log 86400
You can combine these approaches by setting up logrotate to collect log files produced by PostgreSQL's built-in logging collector. In this case, the logging collector defines the names and location of the log files, while logrotate periodically archives these files. When initiating log rotation, logrotate must ensure that the application sends further output to the new file. This is commonly done with a postrotate script that sends a SIGHUP signal to the application, which then reopens the log file. In PostgreSQL, you can run pg_ctl with the logrotate option instead. When the server receives this command, the server either switches to a new log file or reopens the existing file, depending on the logging configuration (see Section 20.8.1).
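A sketch of a logrotate configuration along these lines follows; the paths, schedule, and retention are assumptions to be adapted to your installation:

# /etc/logrotate.d/postgresql (illustrative)
/var/lib/pgsql/data/log/*.log {
    weekly
    rotate 10
    compress
    delaycompress
    missingok
    postrotate
        # tell the server to switch to a new log file or reopen the current one;
        # pg_ctl must run as the operating-system user that owns the server
        su - postgres -c '/usr/bin/pg_ctl logrotate -D /var/lib/pgsql/data'
    endscript
}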
When using static log file names, the server might fail to reopen the log file if the max open file limit is reached or a file table overflow occurs. In this case, log messages are sent to the old log file until a successful log rotation. If logrotate is configured to compress the log file and delete it, the server may lose the messages logged in this time frame. To avoid this issue, you can configure the logging collector to dynamically assign log file names and use a prerotate script to ignore open log files.
Another production-grade approach to managing log output is to send it to syslog and let syslog deal with file rotation. To do this, set the configuration parameter log_destination to syslog (to log to syslog only) in postgresql.conf. Then you can send a SIGHUP signal to the syslog daemon whenever you want to force it to start writing a new log file. If you want to automate log rotation, the logrotate program can be configured to work with log files from syslog.
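The relevant postgresql.conf settings might look like this; the facility and identifier shown are the defaults, included here for clarity:

log_destination = 'syslog'    # send log output to syslog only
syslog_facility = 'LOCAL0'    # syslog facility to log under
syslog_ident = 'postgres'     # program name reported to syslog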
On many systems, however, syslog is not very reliable, particularly with large log messages; it might truncate or drop messages just when you need them the most. Also, on Linux, syslog will flush each message to disk, yielding poor performance. (You can use a “-” at the start of the file name in the syslog configuration file to disable syncing.)
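For example, with a traditional syslogd, an entry along these lines routes the facility configured above to its own file, with the leading "-" disabling the per-message sync (the path is an assumption):

# /etc/syslog.conf (illustrative)
local0.*    -/var/log/postgresql.log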
Note that all the solutions described above take care of starting new log files at configurable intervals, but they do not handle deletion of old, no-longer-useful log files. You will probably want to set up a batch job to periodically delete old log files. Another possibility is to configure the rotation program so that old log files are overwritten cyclically.
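As a sketch, a daily cron job along these lines would implement a simple retention policy; the log directory and the 30-day window are assumptions:

# delete rotated server log files older than 30 days
find /var/lib/pgsql/data/log -name '*.log' -mtime +30 -delete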
pgBadger is an external project that does sophisticated log file analysis. check_postgres provides Nagios alerts when important messages appear in the log files, as well as detection of many other extraordinary conditions.
The examples in the previous section illustrated full text matching using simple constant strings. This section shows how to search table data, optionally using indexes.
It is possible to do a full text search without an index. A simple query to print the title of each row that contains the word friend in its body field is:
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
This will also find related words such as friends and friendly, since all these are reduced to the same normalized lexeme.
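You can observe the normalization directly by inspecting the tsvector; assuming the standard english configuration, all three words stem to the same lexeme:

SELECT to_tsvector('english', 'friend friends friendly');
-- result (approximately): 'friend':1,2,3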
The query above specifies that the english configuration is to be used to parse and normalize the strings. Alternatively we could omit the configuration parameters:
SELECT title
FROM pgweb
WHERE to_tsvector(body) @@ to_tsquery('friend');
This query will use the configuration set by default_text_search_config.
A more complex example is to select the ten most recent documents that contain create and table in the title or body:
SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;
For clarity we omitted the coalesce function calls which would be needed to find rows that contain NULL in one of the two fields.
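Written out in full, the same query with the coalesce calls would read:

SELECT title
FROM pgweb
WHERE to_tsvector(coalesce(title, '') || ' ' || coalesce(body, ''))
      @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;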
Although these queries will work without an index, most applications will find this approach too slow, except perhaps for occasional ad-hoc searches. Practical use of text searching usually requires creating an index.
We can create a GIN index (Section 12.9) to speed up text searches:
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', body));
Notice that the 2-argument version of to_tsvector is used. Only text search functions that specify a configuration name can be used in expression indexes (Section 11.7). This is because the index contents must be unaffected by default_text_search_config. If they were affected, the index contents might be inconsistent because different entries could contain tsvectors that were created with different text search configurations, and there would be no way to guess which was which. It would be impossible to dump and restore such an index correctly.
Because the two-argument version of to_tsvector was used in the index above, only a query that uses the 2-argument version of to_tsvector with the same configuration name will use that index. That is, WHERE to_tsvector('english', body) @@ 'a & b' can use the index, but WHERE to_tsvector(body) @@ 'a & b' cannot. This ensures that an index will be used only with the same configuration used to create the index entries.
It is possible to set up more complex expression indexes wherein the configuration name is specified by another column, e.g.:
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));
where config_name is a column in the pgweb table. This allows mixed configurations in the same index while recording which configuration was used for each index entry. This would be useful, for example, if the document collection contained documents in different languages. Again, queries that are meant to use the index must be phrased to match, e.g., WHERE to_tsvector(config_name, body) @@ 'a & b'.
Indexes can even concatenate columns:
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));
Another approach is to create a separate tsvector column to hold the output of to_tsvector. To keep this column automatically up to date with its source data, use a stored generated column. This example is a concatenation of title and body, using coalesce to ensure that one field will still be indexed when the other is NULL:
ALTER TABLE pgweb
    ADD COLUMN textsearchable_index_col tsvector
    GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;
Then we create a GIN index to speed up the search:
CREATE INDEX textsearch_idx ON pgweb USING GIN (textsearchable_index_col);
Now we are ready to perform a fast full text search:
SELECT title
FROM pgweb
WHERE textsearchable_index_col @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;
One advantage of the separate-column approach over an expression index is that it is not necessary to explicitly specify the text search configuration in queries in order to make use of the index. As shown in the example above, the query can depend on default_text_search_config. Another advantage is that searches will be faster, since it will not be necessary to redo the to_tsvector calls to verify index matches. (This is more important when using a GiST index than a GIN index; see Section 12.9.) The expression-index approach is simpler to set up, however, and it requires less disk space since the tsvector representation is not stored explicitly.