Following an emergency failover, if there is a risk of the transaction sets differing between parts of the ClusterSet, you have to fence the cluster either from write traffic or from all traffic.
If a network partition happens, there is the possibility of a split-brain situation, where instances lose synchronization and cannot communicate correctly to define the synchronization state. A split-brain can occur, for example, when a DBA forcibly elects a replica cluster to become the primary cluster, creating more than one primary.
In this situation, a DBA can choose to fence the original primary cluster from:
- Writes.
- All traffic.
Three fencing operations are available (a short usage sketch follows this list):

- <Cluster>.fenceWrites(): Stops write traffic to a primary cluster of a ClusterSet. Replica clusters do not accept writes, so this operation has no effect on them. It is possible to use this operation on INVALIDATED replica clusters. Also, if it is run against a replica cluster with super_read_only disabled, it enables super_read_only.

- <Cluster>.unfenceWrites(): Resumes write traffic. This operation can be run on a cluster that was previously fenced from write traffic using the <Cluster>.fenceWrites() operation. It is not possible to use <Cluster>.unfenceWrites() on a replica cluster.

- <Cluster>.fenceAllTraffic(): Fences a cluster, and all Read Replicas in that cluster, from all traffic. If you have fenced a cluster from all traffic using <Cluster>.fenceAllTraffic(), you have to reboot the cluster using the dba.rebootClusterFromCompleteOutage() MySQL Shell command. For more information on dba.rebootClusterFromCompleteOutage(), see Section 7.8.3, “Rebooting a Cluster from a Major Outage”.
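The fencing operations are methods of a Cluster object in MySQL Shell's AdminAPI. As a minimal sketch, in JavaScript mode you might obtain the object and fence the cluster as follows; the connection URI and account are assumptions for illustration:

// Connect to a member of the cluster (address and account are example assumptions)
shell.connect('admin@127.0.0.1:3311')

// Retrieve the Cluster object for the cluster the connected instance belongs to
cluster = dba.getCluster()

// Fence the cluster from write traffic; cluster.unfenceWrites() reverses this later
cluster.fenceWrites()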
Issuing <Cluster>.fenceWrites() on a replica cluster returns an error:
ERROR: Unable to fence Cluster from write traffic:
operation not permitted on REPLICA Clusters
Cluster.fenceWrites: The Cluster '<Cluster>' is a REPLICA Cluster
of the ClusterSet '<ClusterSet>' (MYSQLSH 51616)
Even though you primarily use fencing on clusters belonging to a ClusterSet, it is also possible to fence standalone clusters using <Cluster>.fenceAllTraffic().
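For a standalone cluster, a minimal sketch might look like the following; the address and account are assumptions for illustration, and the reboot requirement described above still applies:

// Connect to a member of the standalone cluster (address and account are assumptions)
shell.connect('admin@127.0.0.1:3321')
cluster = dba.getCluster()

// Fence the standalone cluster from all traffic
cluster.fenceAllTraffic()

// Bringing the cluster back requires a reboot from complete outage
cluster = dba.rebootClusterFromCompleteOutage()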
-
To fence a primary cluster from write traffic, use the Cluster.fenceWrites command as follows:
<Cluster>.fenceWrites()
After running the command:
- The automatic super_read_only management is disabled on the cluster.
- super_read_only is enabled on all the instances in the cluster.
- All applications are blocked from performing writes on the cluster.
cluster.fenceWrites()

The Cluster 'primary' will be fenced from write traffic

* Disabling automatic super_read_only management on the Cluster...
* Enabling super_read_only on '127.0.0.1:3311'...
* Enabling super_read_only on '127.0.0.1:3312'...
* Enabling super_read_only on '127.0.0.1:3313'...

NOTE: Applications will now be blocked from performing writes on Cluster 'primary'.
Use <Cluster>.unfenceWrites() to resume writes if you are certain a split-brain is not in effect.

Cluster successfully fenced from write traffic
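If you want to confirm the effect on an individual instance, one option (not part of the fencing procedure itself; the address and account are assumptions) is to query the variable directly from MySQL Shell:

// Connect to any member of the fenced cluster (address and account are assumptions)
shell.connect('admin@127.0.0.1:3312')

// After cluster.fenceWrites(), super_read_only should report 1 on every instance
session.runSql("SELECT @@global.super_read_only").fetchOne()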
-
To check that you have fenced a primary cluster from write traffic, use the <ClusterSet>.status() command as follows:
<ClusterSet>.status()
The output is as follows:
clusterset.status()
{
    "clusters": {
        "primary": {
            "clusterErrors": [
                "WARNING: Cluster is fenced from Write traffic. Use cluster.unfenceWrites() to unfence the Cluster."
            ],
            "clusterRole": "PRIMARY",
            "globalStatus": "OK_FENCED_WRITES",
            "primary": null,
            "status": "FENCED_WRITES",
            "statusText": "Cluster is fenced from Write Traffic."
        },
        "replica": {
            "clusterRole": "REPLICA",
            "clusterSetReplicationStatus": "OK",
            "globalStatus": "OK"
        }
    },
    "domainName": "primary",
    "globalPrimaryInstance": null,
    "primaryCluster": "primary",
    "status": "UNAVAILABLE",
    "statusText": "Primary Cluster is fenced from write traffic."
}
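If you need more detail, for example per-instance information, the status can also be requested with the extended option; the level used here is just an example, and clusterset is assumed to be a ClusterSet object obtained with dba.getClusterSet():

// Request a more detailed report (higher extended levels add progressively more information)
clusterset.status({extended: 1})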
-
To unfence a cluster and resume write traffic to a primary cluster, use the Cluster.unfenceWrites command as follows:
<Cluster>.unfenceWrites()
The automatic super_read_only management is enabled on the primary cluster, and super_read_only is disabled on the primary instance of the cluster:

cluster.unfenceWrites()

The Cluster 'primary' will be unfenced from write traffic

* Enabling automatic super_read_only management on the Cluster...
* Disabling super_read_only on the primary '127.0.0.1:3311'...

Cluster successfully unfenced from write traffic
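As with fencing, you can optionally confirm the change on the primary instance itself; the address and account are assumptions for illustration:

// Connect to the primary instance (address and account are assumptions)
shell.connect('admin@127.0.0.1:3311')

// super_read_only should now report 0 on the primary, so writes are accepted again
session.runSql("SELECT @@global.super_read_only").fetchOne()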
-
To fence a cluster from all traffic, use the Cluster.fenceAllTraffic command as follows:
<Cluster>.fenceAllTraffic()
The super_read_only status is enabled on the primary instance of the cluster, before offline_mode is enabled on all the instances in the cluster:

cluster.fenceAllTraffic()

The Cluster 'primary' will be fenced from all traffic

* Enabling super_read_only on the primary '127.0.0.1:3311'...
* Enabling offline_mode on the primary '127.0.0.1:3311'...
* Enabling offline_mode on '127.0.0.1:3312'...
* Stopping Group Replication on '127.0.0.1:3312'...
* Enabling offline_mode on '127.0.0.1:3313'...
* Stopping Group Replication on '127.0.0.1:3313'...
* Stopping Group Replication on the primary '127.0.0.1:3311'...

Cluster successfully fenced from all traffic
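Because offline_mode disconnects ordinary client sessions, only an account with the CONNECTION_ADMIN (or SUPER) privilege can still connect to inspect a fenced instance. A minimal check, with the address and account as assumptions, might look like this:

// Connect with an account that holds CONNECTION_ADMIN or SUPER (assumption)
shell.connect('admin@127.0.0.1:3312')

// Both variables should report 1 after cluster.fenceAllTraffic()
session.runSql("SELECT @@global.offline_mode, @@global.super_read_only").fetchOne()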
-
To unfence a cluster from all traffic, use the
dba.rebootClusterFromCompleteOutage()
MySQL Shell command. When you have restored the cluster, you rejoin the instances to the cluster by selecting Y when asked if you want to rejoin the instance to the cluster:

cluster = dba.rebootClusterFromCompleteOutage()

Restoring the cluster 'primary' from complete outage...

The instance '127.0.0.1:3312' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: Y

The instance '127.0.0.1:3313' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: Y

* Waiting for seed instance to become ONLINE...
127.0.0.1:3311 was restored.
Rejoining '127.0.0.1:3312' to the cluster.
Rejoining instance '127.0.0.1:3312' to cluster 'primary'...
The instance '127.0.0.1:3312' was successfully rejoined to the cluster.
Rejoining '127.0.0.1:3313' to the cluster.
Rejoining instance '127.0.0.1:3313' to cluster 'primary'...
The instance '127.0.0.1:3313' was successfully rejoined to the cluster.
The cluster was successfully rebooted.
<Cluster:primary>
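After the reboot, the returned Cluster object can be used to confirm that all members are back ONLINE; this final check is optional:

// Confirm that the rebooted cluster reports all members as ONLINE
cluster.status()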