PostgreSQL high availability: multi master replication solution


Author: ibrar Ahmed joined percona in July 2018. Prior to joining percona, ibrar worked as a senior database architect at enterprisedb for 10 years. Ibrar has 18 years of software development experience. Ibrar has written many books on PostgreSQL.

Translator: Wei Bo, executive director of training and certification of PG branch of China, Senior Database Engineer, with more than ten years of experience in database operation and maintenance management and training, mastering PostgreSQL architecture deployment and performance optimization, is committed to promoting the development of PostgreSQL in China.

Due to the huge amount of data, scalability has become one of the most popular topics in the field of database. Scalability can be achieved horizontally or vertically. Vertical scalability means adding more resources / hardware to existing nodes to enhance the database’s ability to store and process more data, such as adding CPU, memory or disk to existing nodes. Each DBMS engine can use the newly added resources more effectively by improving the locking / mutual exclusion mechanism and concurrency, so as to improve the ability of vertical scalability. The database engine provides configuration parameters, which helps to make more efficient use of available hardware resources.

Due to the cost of hardware and the limitation of adding new hardware to existing nodes, it is not always possible to add new hardware. Therefore, horizontal scalability is needed, which means adding more nodes to the existing network nodes instead of enhancing the functions of the existing nodes. Contrary to vertical scalability, horizontal scalability is difficult to implement. This requires more development work.

PostgreSQL provides a rich set of functions for vertical scalability and horizontal scalability. It supports computers with multiple processors and large amounts of memory, and provides configuration parameters to manage the use of these resources. The new function of parallelism in PostgreSQL makes vertical scalability more prominent, but it is not lack of horizontal scalability. Replication is the key pillar of horizontal scalability. PostgreSQL supports one-way master-slave replication, which is enough to meet many use cases.

Key concepts

Database replication
Database replication copies data to other servers and stores it on multiple nodes. In this process, the database instance is transferred from one node to another, and the exact replication is carried out. Data replication is used to improve data availability, which is a key function of HA. Usually there is a complete database instance, or some frequently used or needed objects are copied to another server. Replication provides multiple consistent copies of the database, which not only provides high availability, but also improves query performance.

synchronous copy
When writing data to disk, there are two strategies: synchronous and asynchronous. Synchronous replication means writing data to both the master and slave servers at the same time. In other words, “synchronous replication” means submitting and waiting for the remote side to write / refresh. Synchronous replication is used in high-end transactional environments with immediate failover requirements.

Asynchronous replication
Asynchronous means that data is written to the host first, and then copied to the slave. In the case of a crash, data loss may occur, but asynchronous replication provides little overhead, so it is acceptable in most cases. It doesn’t overburden the host. Compared with synchronous replication, it takes longer to fail over from the primary database to the secondary database.

In short, the main difference between synchronous and asynchronous is when data is written to the master and slave servers.

Single master replication
Single primary replication means that only data can be modified on a single node and those modifications can be replicated to one or more nodes. Data can only be updated and inserted on the primary node. In this case, the application needs to route traffic to the primary server, which increases the complexity of the application. Because only one master node is responsible for writing data, there is no chance of conflict. In most cases, single master replication is sufficient for the application because the configuration and management are not so complex. But in some cases, single master replication is not enough, you need multi master replication.

Multi master replication
Multi master replication means that there are multiple nodes acting as master nodes. Data is replicated between nodes and can be updated and inserted on a set of primary nodes. In this case, there are multiple copies of the data. The system is also responsible for resolving any conflicts between concurrent changes. There are two main reasons for having multiple primary replicates. One is high availability, and the second is performance. In most cases, some nodes are dedicated to intensive write operations, while others are dedicated to some nodes or used for failover.

Advantages and disadvantages of multi master replication


  • If one host fails, the other host can still provide update and insert services.
  • The primary node is located in several different locations, so the probability of failure of all primary nodes is very small.
  • Data can be updated on multiple servers.
  • Applications do not need to route traffic only to a single host.


  • The main drawback of multi master replication is its complexity.
  • It’s very difficult to resolve conflicts because you can write on multiple nodes at the same time.
  • Sometimes human intervention is needed in case of conflict.
  • The possibility of data inconsistency.

As we have discussed, in most cases, single master replication is sufficient, which is strongly recommended, but in some cases, multiple master replication is still required. PostgreSQL has built-in single master replication, but unfortunately there is no multi master replication in the main version of PostgreSQL. There are several multi master replication solutions available, some in the form of applications and some in the form of PostgreSQL branches. These spin offs have their own small communities and are mostly managed by a single company rather than by the PostgreSQL international community.
PostgreSQL high availability: multi master replication solution
There are many categories of these solutions, including open source / closed source, priority, free and paid.

  • BDR (bidirectional replication)
  • xDB
  • PostgreSQL-XL
  • PostgreSQL-XC/PostgreSQL-XC2
  • Rubyrep
  • Bucardo

These are some of the key features of all replication solutions
PostgreSQL high availability: multi master replication solution

1.BDR (bidirectional replication)
BDR is a multi master replication solution with different versions. The early version of BDR is open source, but the latest version is closed source. This solution was developed by 2nd quadrant and is one of the most elegant multi master solutions to date. BDR provides asynchronous multi master logical replication. This is based on the PostgreSQL logic decoding function.    Because BDR application essentially replays transactions on other nodes, if there is a conflict between the transaction being applied and the transaction submitted on the receiving node, the replay operation may fail.
PostgreSQL high availability: multi master replication solution

Enterprisedb has developed its own bi-directional replication solution called XdB with Java. It’s based on its own protocol. Because it is a closed source solution, there is no design information known to the outside world.

  • Developed and maintained by enterprisedb.
  • Develop with Java.
  • Source code is closed source code.
  • XdB replication server contains multiple executables.
  • This is a completely closed source proprietary software.
  • With java development, people will complain about its performance.
  • The time to fail over is not acceptable.
  • The user interface can be used to configure and maintain the replication system.

PostgreSQL high availability: multi master replication solution

3.PostgreSQL XC/XC2
PostgreSQL XC was developed by enterprisedb and NTT. It’s a synchronous replication solution. Postgres XC is an open source project to provide a writable, scalable, synchronous, symmetric and transparent PostgreSQL cluster solution. Over the years, I have never seen a lot of PostgreSQL XC development on enterprisedb and NTT. At present, Huawei is working hard for this. For OLAP, some performance improvements have been reported, but not for TPS.

4.PostgreSQL XL
It is a branch of postgresql-xc and is currently supported by 2nd quadrant. It will lag behind the iteration of the community PostgreSQL version. It is understood that it is based on PostgreSQL 10.6, which is not compatible with the latest version of PostgreSQL, postgresql-12. We know it’s based on postgresql-xc, and when we talk about OLAP, it’s very good, but not very suitable for high TPS.
PostgreSQL high availability: multi master replication solution
Note: all PostgreSQL XC / XC2 / XL are considered as “PostgreSQL derived software”, which is not synchronized with the current development of PostgreSQL.

It is an asynchronous master / Master replication developed by Arndt Lehmann. It claims to have the simplest configuration features and can run across platforms, including windows. It always runs on two servers, called “left” and “right” in rubyrep terminology, respectively. Therefore, it is more suitable to call it “2-master” setting instead of “multi master”.

  • Rubyrep can replicate changes continuously between the left and right databases.
  • Automatically set the necessary triggers, log tables, etc
  • Automatically discover newly added tables and synchronize table contents
  • Automatically reconfigure the sequence to avoid duplicate key conflicts
  • Track changes to primary key columns
  • Master slave replication can be realized at the same time
  • Provides pre established conflict resolution: left / right wins; Early / late change wins
  • Custom conflict resolution can be specified through Ruby code snippets
  • Replication decisions can be selectively recorded in the rubyrep event log table

Note: – as far as development is concerned, the project has not been carried out in the past three years.

Bucardo is a trigger based replication solution developed by Jon Jensen and Greg Sabino Mullane of end point company. Bucardo’s solution has been around for nearly 20 years and was originally designed as an “lazy” asynchronous solution that eventually replicates all changes. There is a Perl daemon that listens for and operates on notify requests. Changes to the table are recorded in the table (bucardo)_ Delta) and notify the daemons. The daemons notify the controller, which starts a child process to synchronize table changes. If there is a conflict, use a standard or custom conflict handler to handle it.

  • Trigger based replication
  • There are conflict resolution strategies
  • Dependency on Perl 5, DBI, DBD:: PG, dbix:: safe.
  • Installation and configuration are complex.
  • Replication is often interrupted and errors occur.

Single host replication is sufficient in most cases, and it has been observed that people are configuring multi host replication and making their designs too complex. It is strongly recommended that you design the system and try to avoid multi master replication and use it only when there is no other way. There are two reasons: first, it makes the system too complex and difficult to debug; Second, because there is no community maintained multi master replication available, you will not be able to get support from the PostgreSQL community.

Link to the original text:

For more wonderful content, please pay attention to the following platforms and websites:

Official account of China Postgre SQL chapter (technical articles and technical activities):
PostgreSQL branch of Open Source Software Alliance

Technical Q & a community of China postgre SQL branch:

Official website of China postgre SQL branch: