If you have a Postgres Database that can not afford any downtime, it is a good Idea to setup replication. That way, if your primary Postgres fails you can switch to the secondary and the other way around. That is also very useful to do OS-updates and patches with minimal downtime.
The Problem with Highly Available Databases
Certain services are more easy to scale and make highly available than others. Webserver for example can work independent from each other since they do not store changing data, also known as state.
Databases are far more complicated in that regard, because you need to keep their data in sync. The most common way to deal with this on postgres, is to have one primary server and one or many secondary servers. Usually the primary will accept all connections and will be the single source of truth. That doesn’t distribute the workload but at least makes the database redundant.
The secondary server have a connection to the master as well and will read the WAL Logs (Write ahead logs) to reproduce the changes made to the primary database.
If the primary should fail or needs downtime, you can promote one of the secondaries to be the new primary and disable the old primary.
Now the secondaries need to connect to the new primary and sync with it.
Once your downtime for the original primary is done, it can rejoin the cluster as a secondary server and sync his data with the current primary.
If you really need to distribute the load between the servers the only way is to categorize your queries into write-queries and read-queries. All read-queries can then be forwarded to a secondary server and the primary only needs to process the write-queries. If that is still not enough you will have to look for postgres-extensions that allows master-to-master replication or use a replicated database like cockroach db.
Scenario
In my scenario I will create a cluster with two AlmaLinux Servers with streaming replication between them. We will call them postgres-01 and postgres-02.
For your clients to automatically connect to the active database even after a failover, you should use a virtual IP with a tool like keepalived. Keeplived is not part of this article, but can be found here.
Install PostgreSQL
First we need to install the Postgres database on both servers. You can follow this article to do so.
Configure replication
Once the Database is up and running we can prepare the initial replication.
Postgres-01 / Primary
On the Primary Database we need to create a user with replication privileges. This user will be used by the secondary server to connect and read and copy all the data from your primary server.
[tux@postgres-01]$ sudo su - postgres
[postgres@postgres-01]$ psql
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replica';
To allow the login and replication, we well add the ip of postgres-02 into the pg_hba.conf for replication.
sudo vim /var/lib/pgsql/data/pg_hba.conf
# Add this at the bottom of the file
host    replication     replicator      10.0.1.63/32            scram-sha-256
For that to take affect, we need to reload configuration.
sudo systemctl reload postgresql
Postgres-02 / Secondary
On the secondary we will start by deleting all existing data. Otherwise we would have conflicts between the two databases.
sudo su - postgres
rm -r /var/lib/pgsql/data/*
After that we can initiate the replication with the pg_basebackup command.-h is the hostname or ip of the primary server
-D is your datadirectory
-U is the user used for the replication
-P will prompt you for a password
-v will produce verbose output
-R sets this server into secondary mode
-X will set the replication type to stream
pg_basebackup -h 10.0.1.62 -D /var/lib/pgsql/data/ -U replicator -P -v -R -X stream
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_2533"
23034/23034 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...  
pg_basebackup: syncing data to disk ...  
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
Check Replication Status
Now the replication is up and running. You can get the status of your replication from the pg_stat_wal_receiver table on the secondary server.
[postgres@postgres-02]$ psql
postgres=# SELECT "status", "last_msg_send_time", "slot_name", "sender_host" FROM pg_stat_wal_receiver;
  status   |      last_msg_send_time       | slot_name | sender_host
-----------+-------------------------------+-----------+-------------
 streaming | 2023-12-20 17:42:17.985544+01 |           | 10.0.1.62
(1 row)
Test Replication
We can test the replication by making changes to our Primary Server.
Postgres-01 / Primary
For demonstration we will create a new database called replication_DB.
[postgres@postgres-01]$ psql
CREATE DATABASE "replication_db";
Postgres-02 / Secondary
Now we will list the databases present on our secondary service and will see that our replication_db is already present.
[postgres@postgres-02]$ psql
postgres=# \l
                                                List of databases
      Name      |  Owner   | Encoding | Collate |  Ctype  | ICU Locale | Locale Provider |   Access privileges
----------------+----------+----------+---------+---------+------------+-----------------+-----------------------
 postgres       | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 replication_db | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            |
 template0      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
                |          |          |         |         |            |                 | postgres=CTc/postgres
 template1      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |            | libc            | =c/postgres          +
                |          |          |         |         |            |                 | postgres=CTc/postgres
(4 rows)
Prepare Failover
Now we have a working replication. The next step is to prepare the failover in both directions. Right now we can only switch from postgres-01 to postgres-02. Not the other way around.
Once postgres-01 is down and postgres-02 promoted to master, we have to give postgres-01 a chance to sync it’s data.
For that we have to allow the replication connection in the pg_hba.conf of postgres-02.
[tux@postgres-02]$ sudo vim /var/lib/pgsql/data/pg_hba.conf
# Add this at the bottom of the file
# 10.0.1.62 is the ip of postgres-01
host    replication     replicator      10.0.1.62/32            scram-sha-256
Execute Failover
Now we can execute our failover.
This requires two steps: we need to stop the database on postgres-01 and promote postgres-02 to be the new primary.
If you are using a virtual ip or a loadbalancer, you need to adjust it to point to the new primary.
Shutdown Primary
First we shut down postgres-01.
[tux@postgres-01]$ sudo systemctl stop postgresql
Promote Secondary
The we promote postgres-02 to be the new primary.
[postgres@postgres-02]$ pgsql
postgres=# SELECT pg_promote();
 pg_promote
------------
 t
(1 row)
Now postgres-02 is the active primary and postgres-01 is down.
Prepare Switchback
The next step is to prepare the switchback. That way postgres-01 can be promoted to be the primary again and postgres-02 can be shut down.
Set Postgres-01 to Standby
If we just restart the database on postgres-01 it will think it is still the master and will accept write queries. That could lead to inconsistent data between postgres-02 and postgres-01. We don’t want that.
To tell postgres to start as a read-only secondary, we can create the standby.signal file.
[postgres@postgres-01]$ touch /var/lib/pgsql/data/standby.signal
Now the postgres would start in secondary mode and will try to replicate from it’s primary.
The next step is to define the connection to the current primary.
[postgres@postgres-01]$ vim /var/lib/pgsql/data/postgresql.conf
# host is postgres-02
primary_conninfo = 'user=replicator host=10.0.1.63 port=5432 password=replica'
Now we can start the database again
[tux@postgres-01]$ sudo systemctl start postgresql
To check if the database is replicating, we can check its logs. The log name may vary in your case, since the last three characters are for the day of the week. Mon in my case means Monday.
sudo tail -f /var/lib/pgsql/data/log/postgresql-Mon.log
2023-12-20 17:59:36.789 CET [2738] LOG:  received SIGHUP, reloading configuration files
2023-12-20 17:59:36.790 CET [2738] LOG:  parameter "primary_conninfo" changed to "user=replicator host=10.0.1.63 port=5432 password=replica"
2023-12-20 17:59:36.805 CET [2836] LOG:  fetching timeline history file for timeline 2 from primary server
2023-12-20 17:59:36.823 CET [2836] LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
2023-12-20 17:59:36.833 CET [2836] LOG:  replication terminated by primary server
2023-12-20 17:59:36.833 CET [2836] DETAIL:  End of WAL reached on timeline 1 at 0/3420D60.  
2023-12-20 17:59:36.834 CET [2836] FATAL:  terminating walreceiver process due to administrator command
2023-12-20 17:59:36.834 CET [2742] LOG:  new target timeline is 2
2023-12-20 17:59:36.849 CET [2837] LOG:  started streaming WAL from primary at 0/3000000 on timeline 2
2023-12-20 17:59:36.894 CET [2742] LOG:  redo starts at 0/3420D60
We can see in the logs, that the replication is running.
Test Replication
If we want to, we can test the replication by creating a record as well.
Postgres-01 / Secondary
[postgres@postgres-01]$ pgsql replication_db
replication_db=# SELECT * FROM test;
ERROR:  relation "test" does not exist
As we can see, the table test does not yet exist.
Postgres-02 / Primary
So lets go to our primary and create it.
[postgres@postgres-02]$ psql replication_db
replication_db=# CREATE TABLE "test" (id int, name text);
CREATE TABLE
Postgres-01 / Secondary
Back on our secondary we can check again and get no error. That means the table exists, but there are no entries. That means the replication works.
[postgres@postgres-01]$ psql replication_db
replication_db=# SELECT * FROM test;
Execute Switchback
Now we are ready to switch back to postgres-01 as our primary database.
Shutdown Primary
Again we start by stopping our primary database
[tux@postgres-02]$ sudo systemctl stop postgresql
Promote Secondary
Then we promote our secondary to be the new master.
[postgres@postgres-01]$ pgsql
postgres=# SELECT pg_promote();
 pg_promote
------------
 t
(1 row)
This also removes the standby.signal file. So if we restart the database, postgres-01 we stay the master.
Set Postgres-02 to Standby
In order to be able to start postgres-02 as a secondary again, we will need to create the standby.signal
[postgres@postgres-02]$ touch /var/lib/pgsql/data/standby.signal
Now we can start Postgres-02 again and it will start to replicate from postgres-01.
[tux@postgres-02]$ sudo systemctl start postgresql
Conclusion
Now we have a working replication and the ability to switch between the two servers with minimal downtime.
It is important to always create the standby.signal file on the secondary database before starting it. Otherwise you will have to independent masters and might get inconsistent data.
