Following an emergency failover, if there is a risk that the transaction sets differ between parts of the ClusterSet, you have to fence the cluster either from write traffic or from all traffic.
If a network partition happens, a split-brain situation is possible, in which instances lose synchronization and cannot communicate correctly to determine the synchronization state. A split-brain can occur, for example, when a DBA forcibly elects a replica cluster to become the primary cluster, which creates more than one primary.
In this situation, a DBA can choose to fence the original primary cluster from:
- Writes.
- All traffic.
Three fencing operations are available:
- <Cluster>.fenceWrites(): Stops write traffic to a primary cluster of a ClusterSet. Replica clusters do not accept writes, so this operation has no effect on them.

  It can be used on INVALIDATED Replica clusters. Also, if run against a Replica cluster with super_read_only disabled, it enables it.

- <Cluster>.unfenceWrites(): Resumes write traffic. This operation can be run on a cluster that was previously fenced from write traffic using the <Cluster>.fenceWrites() operation.

  It is not possible to use cluster.unfenceWrites() on a Replica Cluster.

- <Cluster>.fenceAllTraffic(): Fences a cluster, and all Read Replicas in that cluster, from all traffic. If you have fenced a cluster from all traffic using <Cluster>.fenceAllTraffic(), you have to reboot the cluster using the dba.rebootClusterFromCompleteOutage() MySQL Shell command.

  For more information on dba.rebootClusterFromCompleteOutage(), see Section 8.8.3, “Rebooting a Cluster from a Major Outage”.
Issuing <Cluster>.fenceWrites() on a replica cluster returns an error:

    ERROR: Unable to fence Cluster from write traffic: operation not permitted on REPLICA Clusters
    Cluster.fenceWrites: The Cluster '<Cluster>' is a REPLICA Cluster of the ClusterSet '<ClusterSet>' (MYSQLSH 51616)
Although fencing is primarily used on clusters that belong to a ClusterSet, you can also fence standalone clusters using <Cluster>.fenceAllTraffic().
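The constraints described above can be condensed into a small decision helper. The following is a minimal sketch in JavaScript, not part of AdminAPI: the function names, the `scope` argument, and the "STANDALONE" role label are assumptions made for illustration. Only the method names `fenceWrites` and `fenceAllTraffic` come from the operations listed above.

```javascript
// Sketch only: pickFenceOperation, fence, the `scope` argument, and the
// "STANDALONE" label are hypothetical names invented for this example.
function pickFenceOperation(role, scope) {
  if (scope === "all") {
    // The text describes fenceAllTraffic() for clusters in a ClusterSet
    // and for standalone clusters, so no role check is made here.
    return "fenceAllTraffic";
  }
  if (scope === "writes") {
    if (role === "PRIMARY") {
      return "fenceWrites";
    }
    // fenceWrites() returns an error on REPLICA clusters, and the text only
    // mentions fenceAllTraffic() for standalone clusters.
    throw new Error("fenceWrites() is not applicable to a " + role + " cluster");
  }
  throw new Error("Unknown scope: " + scope);
}

// Call the chosen operation on a Cluster object obtained elsewhere,
// for example with dba.getCluster() in MySQL Shell.
function fence(cluster, role, scope) {
  var operation = pickFenceOperation(role, scope);
  cluster[operation]();
  return operation;
}
```

In a MySQL Shell JavaScript session this might be called as `fence(dba.getCluster(), "PRIMARY", "writes")`, with the role supplied by the operator. Rejecting the REPLICA case up front is a design choice that avoids triggering the error shown above.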
- To fence a primary cluster from write traffic, use the Cluster.fenceWrites command as follows:

      <Cluster>.fenceWrites()

  After running the command:

  - The automatic super_read_only management is disabled on the cluster.
  - super_read_only is enabled on all the instances in the cluster.
  - All applications are blocked from performing writes on the cluster.

      cluster.fenceWrites()
      The Cluster 'primary' will be fenced from write traffic
      * Disabling automatic super_read_only management on the Cluster...
      * Enabling super_read_only on '127.0.0.1:3311'...
      * Enabling super_read_only on '127.0.0.1:3312'...
      * Enabling super_read_only on '127.0.0.1:3313'...
      NOTE: Applications will now be blocked from performing writes on Cluster 'primary'. Use <Cluster>.unfenceWrites() to resume writes if you are certain a split-brain is not in effect.
      Cluster successfully fenced from write traffic

- To check that you have fenced a primary cluster from write traffic, use the ClusterSet status command as follows:

      <ClusterSet>.status()

  The output is as follows:

      clusterset.status()
      {
          "clusters": {
              "primary": {
                  "clusterErrors": [
                      "WARNING: Cluster is fenced from Write traffic. Use cluster.unfenceWrites() to unfence the Cluster."
                  ],
                  "clusterRole": "PRIMARY",
                  "globalStatus": "OK_FENCED_WRITES",
                  "primary": null,
                  "status": "FENCED_WRITES",
                  "statusText": "Cluster is fenced from Write Traffic."
              },
              "replica": {
                  "clusterRole": "REPLICA",
                  "clusterSetReplicationStatus": "OK",
                  "globalStatus": "OK"
              }
          },
          "domainName": "primary",
          "globalPrimaryInstance": null,
          "primaryCluster": "primary",
          "status": "UNAVAILABLE",
          "statusText": "Primary Cluster is fenced from write traffic."
      }

- To unfence a cluster and resume write traffic to a primary cluster, use the Cluster.unfenceWrites command as follows:

      <Cluster>.unfenceWrites()

  The automatic super_read_only management is enabled again on the primary cluster, and super_read_only is disabled on the primary instance of the cluster.

      cluster.unfenceWrites()
      The Cluster 'primary' will be unfenced from write traffic
      * Enabling automatic super_read_only management on the Cluster...
      * Disabling super_read_only on the primary '127.0.0.1:3311'...
      Cluster successfully unfenced from write traffic

- To fence a cluster from all traffic, use the Cluster.fenceAllTraffic command as follows:

      <Cluster>.fenceAllTraffic()

  super_read_only is enabled on the primary instance of the cluster before offline_mode is enabled on all the instances in the cluster:

      cluster.fenceAllTraffic()
      The Cluster 'primary' will be fenced from all traffic
      * Enabling super_read_only on the primary '127.0.0.1:3311'...
      * Enabling offline_mode on the primary '127.0.0.1:3311'...
      * Enabling offline_mode on '127.0.0.1:3312'...
      * Stopping Group Replication on '127.0.0.1:3312'...
      * Enabling offline_mode on '127.0.0.1:3313'...
      * Stopping Group Replication on '127.0.0.1:3313'...
      * Stopping Group Replication on the primary '127.0.0.1:3311'...
      Cluster successfully fenced from all traffic

- To unfence a cluster from all traffic, use the dba.rebootClusterFromCompleteOutage() MySQL Shell command. While the cluster is being restored, rejoin the instances to the cluster by selecting Y when asked if you want to rejoin the instance to the cluster:

      cluster = dba.rebootClusterFromCompleteOutage()
      Restoring the cluster 'primary' from complete outage...
      The instance '127.0.0.1:3312' was part of the cluster configuration.
      Would you like to rejoin it to the cluster? [y/N]: Y
      The instance '127.0.0.1:3313' was part of the cluster configuration.
      Would you like to rejoin it to the cluster? [y/N]: Y
      * Waiting for seed instance to become ONLINE...
      127.0.0.1:3311 was restored.
      Rejoining '127.0.0.1:3312' to the cluster.
      Rejoining instance '127.0.0.1:3312' to cluster 'primary'...
      The instance '127.0.0.1:3312' was successfully rejoined to the cluster.
      Rejoining '127.0.0.1:3313' to the cluster.
      Rejoining instance '127.0.0.1:3313' to cluster 'primary'...
      The instance '127.0.0.1:3313' was successfully rejoined to the cluster.
      The cluster was successfully rebooted.
      <Cluster:primary>
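The status check described above can also be automated by inspecting the status document instead of reading it by eye. The following is a minimal sketch in JavaScript. The helper name `clustersFencedFromWrites` and the trimmed `sampleStatus` object are assumptions made for illustration; the field names and values are taken from the status output shown earlier.

```javascript
// Sketch only: clustersFencedFromWrites is a hypothetical helper, not an
// AdminAPI function. It returns the names of the clusters whose "status"
// field reports FENCED_WRITES in a ClusterSet status document.
function clustersFencedFromWrites(status) {
  var clusters = status.clusters || {};
  return Object.keys(clusters).filter(function (name) {
    return clusters[name].status === "FENCED_WRITES";
  });
}

// Sample data trimmed from the status output shown earlier in this section.
var sampleStatus = {
  clusters: {
    primary: {
      clusterRole: "PRIMARY",
      globalStatus: "OK_FENCED_WRITES",
      status: "FENCED_WRITES"
    },
    replica: {
      clusterRole: "REPLICA",
      clusterSetReplicationStatus: "OK",
      globalStatus: "OK"
    }
  },
  primaryCluster: "primary",
  status: "UNAVAILABLE"
};

// For the sample above, `fenced` contains only the name "primary".
var fenced = clustersFencedFromWrites(sampleStatus);
```

The replica entry in the sample has no "status" field, so the comparison is simply false for it. Whether the object returned by the status command in a live MySQL Shell session can be passed to this helper directly is not verified here; the sketch assumes a plain JavaScript object with the same fields.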