JRehkemper.de

PostgreSQL Replication with Failover and Switchback

If you have a Postgres Database that can not afford any downtime, it is a good Idea to setup replication. That way, if your primary Postgres fails you can switch to the secondary and the other way around. That is also very useful to do OS-updates and patches with minimal downtime.

The Problem with Highly Available Databases

Certain services are more easy to scale and make highly available than others. Webserver for example can work independent from each other since they do not store changing data, also known as state.
Databases are far more complicated in that regard, because you need to keep their data in sync. The most common way to deal with this on postgres, is to have one primary server and one or many secondary servers. Usually the primary will accept all connections and will be the single source of truth. That doesn’t distribute the workload but at least makes the database redundant.
The secondary server have a connection to the master as well and will read the WAL Logs (Write ahead logs) to reproduce the changes made to the primary database.
If the primary should fail or needs downtime, you can promote one of the secondaries to be the new primary and disable the old primary. Now the secondaries need to connect to the new primary and sync with it.
Once your downtime for the original primary is done, it can rejoin the cluster as a secondary server and sync his data with the current primary.

If you really need to distribute the load between the servers the only way is to categorize your queries into write-queries and read-queries. All read-queries can then be forwarded to a secondary server and the primary only needs to process the write-queries. If that is still not enough you will have to look for postgres-extensions that allows master-to-master replication or use a replicated database like cockroach db.

Scenario

In my scenario I will create a cluster with two AlmaLinux Servers with streaming replication between them. We will call them postgres-01 and postgres-02.
For your clients to automatically connect to the active database even after a failover, you should use a virtual IP with a tool like keepalived. Keeplived is not part of this article, but can be found here.

Install PostgreSQL

First we need to install the Postgres database on both servers. You can follow this article to do so.

Configure replication

Once the Database is up and running we can prepare the initial replication.

Postgres-01 / Primary

On the Primary Database we need to create a user with replication privileges. This user will be used by the secondary server to connect and read and copy all the data from your primary server.

[tux@postgres-01]$ sudo su - postgres

[postgres@postgres-01]$ psql

postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replica';

To allow the login and replication, we well add the ip of postgres-02 into the pg_hba.conf for replication.

sudo vim /var/lib/pgsql/data/pg_hba.conf

# Add this at the bottom of the file
host    replication     replicator      10.0.1.63/32            scram-sha-256

For that to take affect, we need to reload configuration.

sudo systemctl reload postgresql

Postgres-02 / Secondary

On the secondary we will start by deleting all existing data. Otherwise we would have conflicts between the two databases.

sudo su - postgres
rm -r /var/lib/pgsql/data/*

After that we can initiate the replication with the pg_basebackup command.
-h is the hostname or ip of the primary server -D is your datadirectory -U is the user used for the replication -P will prompt you for a password -v will produce verbose output -R sets this server into secondary mode -X will set the replication type to stream

pg_basebackup -h 10.0.1.62 -D /var/lib/pgsql/data/ -U replicator -P -v -R -X stream
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_2533"
23034/23034 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...  
pg_basebackup: syncing data to disk ...  
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

Check Replication Status

Now the replication is up and running. You can get the status of your replication from the pg_stat_wal_receiver table on the secondary server.

[postgres@postgres-02]$ psql
postgres=# SELECT "status", "last_msg_send_time", "slot_name", "sender_host" FROM pg_stat_wal_receiver;
  status   |      last_msg_send_time       | slot_name | sender_host
-----------+-------------------------------+-----------+-------------
 streaming | 2023-12-20 17:42:17.985544+01 |           | 10.0.1.62
(1 row)

Test Replication

We can test the replication by making changes to our Primary Server.

Postgres-01 / Primary

For demonstration we will create a new database called replication_DB.

[postgres@postgres-01]$ psql
CREATE DATABASE "replication_db";

Postgres-02 / Secondary

Now we will list the databases present on our secondary service and will see that our replication_db is already present.

[postgres@postgres-02]$ psql
postgres=# \l
                                                List of databases
      Name      |  Owner   | Encoding | Collate |  Ctype  | ICU Locale | Locale Provider |   Access privileges
----------------+----------+----------+---------+---------+------------+-----------------+-----------------------
 postgres       | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 replication_db | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 template0      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
                |          |          |         |         |            |                 | postgres=CTc/postgres
 template1      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
                |          |          |         |         |            |                 | postgres=CTc/postgres
(4 rows)

Prepare Failover

Now we have a working replication. The next step is to prepare the failover in both directions. Right now we can only switch from postgres-01 to postgres-02. Not the other way around.
Once postgres-01 is down and postgres-02 promoted to master, we have to give postgres-01 a chance to sync it’s data.
For that we have to allow the replication connection in the pg_hba.conf of postgres-02.

[tux@postgres-02]$ sudo vim /var/lib/pgsql/data/pg_hba.conf

# Add this at the bottom of the file
# 10.0.1.62 is the ip of postgres-01
host    replication     replicator      10.0.1.62/32            scram-sha-256

Execute Failover

Now we can execute our failover.
This requires two steps: we need to stop the database on postgres-01 and promote postgres-02 to be the new primary.
If you are using a virtual ip or a loadbalancer, you need to adjust it to point to the new primary.

Shutdown Primary

First we shut down postgres-01.

[tux@postgres-01]$ sudo systemctl stop postgresql

Promote Secondary

The we promote postgres-02 to be the new primary.

[postgres@postgres-02]$ pgsql
postgres=# SELECT pg_promote();
 pg_promote
------------
 t
(1 row)

Now postgres-02 is the active primary and postgres-01 is down.

Prepare Switchback

The next step is to prepare the switchback. That way postgres-01 can be promoted to be the primary again and postgres-02 can be shut down.

Set Postgres-01 to Standby

If we just restart the database on postgres-01 it will think it is still the master and will accept write queries. That could lead to inconsistent data between postgres-02 and postgres-01. We don’t want that.

To tell postgres to start as a read-only secondary, we can create the standby.signal file.

[postgres@postgres-01]$ touch /var/lib/pgsql/data/standby.signal

Now the postgres would start in secondary mode and will try to replicate from it’s primary.
The next step is to define the connection to the current primary.

[postgres@postgres-01]$ vim /var/lib/pgsql/data/postgresql.conf
# host is postgres-02
primary_conninfo = 'user=replicator host=10.0.1.63 port=5432 password=replica'

Now we can start the database again

[tux@postgres-01]$ sudo systemctl start postgresql

To check if the database is replicating, we can check its logs. The log name may vary in your case, since the last three characters are for the day of the week. Mon in my case means Monday.

sudo tail -f /var/lib/pgsql/data/log/postgresql-Mon.log
2023-12-20 17:59:36.789 CET [2738] LOG:  received SIGHUP, reloading configuration files
2023-12-20 17:59:36.790 CET [2738] LOG:  parameter "primary_conninfo" changed to "user=replicator host=10.0.1.63 port=5432 password=replica"
2023-12-20 17:59:36.805 CET [2836] LOG:  fetching timeline history file for timeline 2 from primary server
2023-12-20 17:59:36.823 CET [2836] LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
2023-12-20 17:59:36.833 CET [2836] LOG:  replication terminated by primary server
2023-12-20 17:59:36.833 CET [2836] DETAIL:  End of WAL reached on timeline 1 at 0/3420D60.  
2023-12-20 17:59:36.834 CET [2836] FATAL:  terminating walreceiver process due to administrator command
2023-12-20 17:59:36.834 CET [2742] LOG:  new target timeline is 2
2023-12-20 17:59:36.849 CET [2837] LOG:  started streaming WAL from primary at 0/3000000 on timeline 2
2023-12-20 17:59:36.894 CET [2742] LOG:  redo starts at 0/3420D60

We can see in the logs, that the replication is running.

Test Replication

If we want to, we can test the replication by creating a record as well.

Postgres-01 / Secondary

[postgres@postgres-01]$ pgsql replication_db
replication_db=# SELECT * FROM test;
ERROR:  relation "test" does not exist

As we can see, the table test does not yet exist.

Postgres-02 / Primary

So lets go to our primary and create it.

[postgres@postgres-02]$ psql replication_db
replication_db=# CREATE TABLE "test" (id int, name text);
CREATE TABLE

Postgres-01 / Secondary

Back on our secondary we can check again and get no error. That means the table exists, but there are no entries. That means the replication works.

[postgres@postgres-01]$ psql replication_db
replication_db=# SELECT * FROM test;

Execute Switchback

Now we are ready to switch back to postgres-01 as our primary database.

Shutdown Primary

Again we start by stopping our primary database

[tux@postgres-02]$ sudo systemctl stop postgresql

Promote Secondary

Then we promote our secondary to be the new master.

[postgres@postgres-01]$ pgsql
postgres=# SELECT pg_promote();
 pg_promote
------------
 t
(1 row)

This also removes the standby.signal file. So if we restart the database, postgres-01 we stay the master.

Set Postgres-02 to Standby

In order to be able to start postgres-02 as a secondary again, we will need to create the standby.signal

[postgres@postgres-02]$ touch /var/lib/pgsql/data/standby.signal

Now we can start Postgres-02 again and it will start to replicate from postgres-01.

[tux@postgres-02]$ sudo systemctl start postgresql

Conclusion

Now we have a working replication and the ability to switch between the two servers with minimal downtime.
It is important to always create the standby.signal file on the secondary database before starting it. Otherwise you will have to independent masters and might get inconsistent data.

profile picture of the author

Jannik Rehkemper

I'm an professional Linux Administrator and Hobby Programmer. My training as an IT-Professional started in 2019 and ended in 2022. Since 2023 I'm working as an Linux Administrator.