MySQL :: MySQL NDB Cluster replication: Circular replication for active-active clusters

MySQL Blog Archive

For the latest blogs go to blogs.oracle.com/mysql

MySQL NDB Cluster replication: Circular replication for active-active clusters

Posted on May 17, 2024 by Magnus Blåudd
Category: NDB
Tags: replication

This is the fourth article in our series about MySQL NDB Cluster replication. We've now got so far into the details that data can not only be replicated from one cluster to the other but also back to the first again, we call this circular replication and it enables locating clusters in disparate places to achieve even higher availability, protection against disaster and locating data closer to users.

In MySQL NDB Cluster replication it’s possible to read and write data from any MySQL Server regardless of which cluster it’s connected to. This is what we refer to as multi-primary or active-active replication where data is replicated in all directions and conflict detection is used to handle when data has occasionally been updated in more than one cluster simultaneously. The most common use case for this is with applications that normally updates only its own data in one cluster and thus conflicts are rare, in the case when there is a need to switch the application to use another cluster - because it's more local or for cluster maintenance - the data will already be there and the application can continue to function without interruption. The benefit of this is that application failover can occur without needing to failover at the database layer.

Active-active and conflict detection are unique features that enhances the MySQL replication mechanism and makes it possible to build more advanced replication configurations spanning multiple NDB clusters.

This article shows two different ways to build clusters with circular replication and builds on concepts described in our previous posts about MySQL NDB Cluster replication:

Circular replication with IGNORE_SERVER_IDS

Diagram showing MySQL NDB Cluster circular replication using IGNORE_SERVER_IDS which is one way to set up geo replication. — Circular replication for active-active clusters using IGNORE_SERVER_IDS = ()

This diagram shows a logical view of the components required for circular replication for active-active clusters using redundant binlog and replica MySQL servers. In this configuration, the loop is terminated using IGNORE_SERVER_IDS = (server_id_list), an option for the CHANGE REPLICATION SOURCE command. The list of server_id's to ignore should contain all the MySQL servers in the replicas own cluster.

The MySQL clients use the frontend MySQL servers to read and write data in NDB.
Both the binlog MySQL servers write the changes from NDB to their binary log.
The replica MySQL Server fetches and applies the changes to the other cluster.
Both the binlog MySQL servers in the second cluster write the replica changes in NDB to their binary log.
The replica MySQL Server on the first cluster is configured to ignore server_ids from its own cluster while fetching changes, only those which are not ignored will be applied. The loop is terminated!

The same thing happens with changes done by the MySQL clients using the MySQL servers on the second cluster, their changes will be written to the two binary logs on the first cluster, and when they are replicated to the second cluster the changes will be written to the binary log on that cluster as well, they will be fetched but not applied again to the first cluster. This data flow is depicted with grey arrows in the diagram.

The advantage of this method to terminate the loop is that the binlogs on both clusters will have the complete record of the changes that have occurred regardless of which cluster the change originated in. Having a complete binary log allows further replication chains to be added from any binlog MySQL Server in the cluster, thus giving freedom to replicate to yet another cluster, standalone MySQL Server or MySQL HeatWave. The IGNORE_SERVER_IDS setting takes effect when the replica fetches the binary log from the source and an example of how this works can be seen in the below pictures where the NDB epoch transaction on the source contains changes from server_id 4 while those changes has been ignored and not written to the replicas relay log.

NDB epoch transaction in binlog with server_id 4 — The NDB epoch transaction has been written to binlog in first cluster, the changes from server_id=4 are included.

Showing relaylog on seconds cluster where the changes from server_id=4 has been filtered away — The same NDB epoch transaction has been transferred to second cluster and written to
relaylog, the changes from server_id=4 have been ignored.

This configuration requires that the replica MySQL Server knows the server_id’s of all binlog MySQL servers connected to the same cluster, in case adding new binlog MySQL servers to the cluster the replication settings need to be updated to ignore those as well. The recomended best practice is to use different server_id's in different clusters, for example 1000 series in first and 2000 series in the second, thus making it easy to distinguish which cluster they belong to.

Scaling out circular replication to more than two clusters

This solution can be scaled out by adding more clusters to the ring and also then each replica needs to be configured to ignore changes originating from its own cluster. When using circular replication each additional cluster extends the time for a change to propagate back to the originating cluster where the loop can be terminated, whether or not this is acceptable depends on the individual application use case. The resource usage is on the other hand constant in the number of MySQL servers required for replication at each cluster. With more clusters in the ring, the redundant binlog and replica MySQL servers reduce the need to stop replication and make it possible to continuously replicate also while performing maintenance like hardware or software upgrades.

Circular replication with –log-replica-updates=0

Another method to close the circular replication loop is to use the --log-replica-updates option to avoid that the binlog MySQL Server writes changes applied by a replica MySQL Server.

Diagram showing logical view of MySQL NDB Cluster circular replication — Circular replication for active-active clusters using –log-replica-updates=0

The preceding diagram shows a logical view of the components required for circular replication for active-active clusters using redundant binlog and replica MySQL servers. The loop is terminated using --log-replica-updates=0 which is a command line argument of the MySQL Server.

The MySQL clients use the frontend MySQL servers to read and write data in NDB.
Both the binlog MySQL servers write the changes from NDB to their binary log.
The replica MySQL Server fetches and applies the changes to the other cluster.
Both the binlog MySQL servers in the second cluster are configured with –log-replica-updates=0 and will NOT write the replica changes in NDB to their binary log. The loop is terminated!

The same thing happens with changes done by the MySQL clients using the MySQL servers on the second cluster, their changes will be written to the two binlogs on the first cluster, but when they have been replicated to the second cluster the changes will not be written to binlog again. This data flow is depicted with grey arrows in the diagram.

As can be seen in this diagram there is quite a short path until the replication loop is terminated, this avoids additional processing which is of course beneficial in most cases. The configuration does not produce a binary log with all changes that have occurred in both clusters and this means that a further replication chain would need to pull changes from the binlog MySQL servers on both clusters to get the full picture.

Scaling out to more clusters

Scaling out this solution to more than two clusters can be done using a replication approach sometimes referred to as merge replication where each individual cluster fetches changes directly from the other clusters. Such a setup reduces the time for data changes to propagate between the clusters as well as reducing the amount of data changes that need to be transferred. In the next article in this series, we will take a look at how this is done with merge replication between four clusters.

Summary

This article explains how circular replication between MySQL NDB clusters works and two different approaches for how to terminate the loop were described. Circular replication is one way to configure a MySQL NDB Cluster for active-active replication which allows data to be updated from any MySQL Server regardless of which cluster it is connected to.

More information

For more details on how to configure replication see:

MySQL NDB Cluster is open-source and can be downloaded both in source and binary form at MySQL Downloads where you find both the newlys released 8.4 LTS version as well as our innovation releases.

The source code is also available at the MySQL Server repository on GitHub and is frequently used by universities when studying the inner details of advanced high-performance distributed databases. Go explore the source yourself and why not submit a pull request with enhancements!