Pablowe » Blog Archive » Approximating Postgres Replication Delay

Jay Taylor's notes

back to listing index

Pablowe » Blog Archive » Approximating Postgres Replication Delay

[web search]

Original source (www.pablowe.net)

Tags: database postgres postgresql replication www.pablowe.net

Clipped on: 2013-09-10

Approximating Postgres Replication Delay

August 30th, 2012 Posted in Postgres, Replication

Coming from the MySQL world, I’m used to being able to easily determine the replication delay (in seconds) via the SHOW SLAVE STATUS command:

mysql> show slave status\G
*************************** 1. row ***************************
...
        Seconds_Behind_Master: 0
...
1 row in set (0.00 sec)

Unfortunately, there is no such comparable command in PostgreSQL. The official docs propose the following approach:

$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
 pg_current_xlog_location 
--------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_receive_location 
-------------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_replay_location 
------------------------------
 0/2000000
(1 row)

The problem with this is that it’s not an atomic unit of work and is GUARANTEED to be inaccurate (it can be done quickly, minimizing the margin of error). Also, it involves remote connections to multiple hosts, which I’m not interested in for my use case. If we want to calculate this difference from the master we can simply issue the following command:

postgres=# SELECT pg_stat_replication.*, pg_current_xlog_location() FROM pg_stat_replication;
-[ RECORD 1 ]------------+------------------------------
procpid                  | 15839
usesysid                 | 16384
usename                  | repl
application_name         | postgres
client_addr              | 10.0.0.2
client_hostname          | 
client_port              | 50353
backend_start            | 2012-08-30 00:52:48.957179+00
state                    | streaming
sent_location            | 0/5530528
write_location           | 0/5530528
flush_location           | 0/5530528
replay_location          | 0/5530528
sync_priority            | 0
sync_state               | async
pg_current_xlog_location | 0/5530528

and compare pg_current_xlog_location with the write location. This approach works, but will become a bit more interesting in 9.2, with the addition of the pg_xlog_location_diff function, which calculates the difference between two transaction log locations.

Getting the replication delay from only the slave, though, is a bit more circuitous. We’ll need to use the pg_last_xact_replay_timestamp() function, which gets time stamp of last transaction replayed during recovery. This by itself won’t be terribly interesting, but we can approximate the delay as follows:

postgres=# SELECT NOW() - pg_last_xact_replay_timestamp() AS replication_delay;
 replication_delay 
-------------------
 00:00:00.893494
(1 row)

Here, the replica is less than one second behind the master. In a very busy system, with many writes per second, this number will remain fairly accurate. However, in a system where there are few writes, the “replication_delay” will continually grow because the last replayed transaction timestamp isn’t increasing (this is generally the same limitation as MySQL’s SHOW SLAVE STATUS output). We can get around this, though, by introducing an external heartbeat, which will minimize the drift that will manifest with this approach.

Be Sociable, Share!