Group Replication's failure detection mechanism is designed to identify group members that are no longer communicating with the group, and expel them as and when it seems likely that they have failed. Having a failure detection mechanism increases the chance that the group contains a majority of correctly working members, and that requests from clients are therefore processed correctly.
Normally, all group members regularly exchange messages with all other group members. If a group member does not receive any messages from a particular fellow member for 5 seconds, it creates a suspicion of that member when this detection period ends. When a suspicion times out, the suspected member is assumed to have failed, and is expelled from the group. An expelled member is removed from the membership list seen by the other members, but it does not know that it has been expelled from the group, so it sees itself as online and the other members as unreachable. If the member has not in fact failed (for example, because it was just disconnected due to a temporary network issue) and it is able to resume communication with the other members, it receives a view containing the information that it has been expelled from the group.
The responses of group members, including the failed member itself, to these situations can be configured at a number of points in the process. By default, the following behaviors happen if a member is suspected of having failed:
When a suspicion is created, it times out immediately (its lifetime is set to 0), so the suspected member is expelled as soon as the expired suspicion is identified. The member could potentially survive for a further few seconds after the timeout because the check for expired suspicions is carried out periodically.
If an expelled member resumes communication and realises that it was expelled, it does not try to rejoin the group and accepts its expulsion.
When an expelled member accepts its expulsion, it switches to super read only mode and awaits operator attention. (The exception is in releases from MySQL 8.0.12 to 8.0.15, where the default was for the member to shut itself down. From MySQL 8.0.16, the behavior was changed to match the behavior in MySQL 5.7.)
These defaults are set to prioritize the correct operation of the group and the correct handling of requests. However, they might be inconvenient in the case of slower networks or networks with a high rate of transient failures, because in these situations there could be a frequent requirement for operator intervention to fix expelled members. They also do not allow for continued operation of the group to be planned in the case of expected network failures or machine slowdowns. You can use Group Replication configuration options to change these behaviors either permanently or temporarily, to suit your system's requirements and your priorities, as follows:
You can use the group_replication_member_expel_timeout system variable, which is available from MySQL 8.0.13, to allow additional time between the creation of a suspicion and the expulsion of the suspect member. You can set the lifetime of the suspicion up to 3600 seconds (one hour) before it times out. (The 5-second detection period before a suspicion is created is not configurable.) Suspect members in this state are listed as UNREACHABLE, but are not removed from the group's membership list.
Bear in mind that while a group has unreachable members, you cannot add or remove any other members or elect a new primary. If you do want to take one of these actions and you cannot make the suspect member active again, you can force the suspicion to time out by changing group_replication_member_expel_timeout on any online member to a value less than the time that has already elapsed since the suspicion was created.
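As a minimal sketch of the setting described above, the following statements extend the suspicion lifetime and then list member states. The value of 30 seconds is purely illustrative, and the query against performance_schema.replication_group_members is an assumption about how you might choose to inspect member states; the text above does not name that table.

```sql
-- Allow a suspected member 30 seconds (illustrative value) after the
-- 5-second detection period before the suspicion times out.
SET GLOBAL group_replication_member_expel_timeout = 30;

-- Suspected members that have not yet been expelled are reported
-- as UNREACHABLE in this member's view of the group.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members;
```

To force an existing suspicion to time out, the same SET GLOBAL statement can be issued on an online member with a value smaller than the time already elapsed since the suspicion was created.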
You can use the group_replication_autorejoin_tries system variable, which is available from MySQL 8.0.16, to make an expelled member that is able to resume communication automatically try to rejoin the group. You can specify a number of attempts that the member makes to rejoin the group, instead of just accepting its expulsion as soon as it resumes communication. When the member's expulsion or unreachable majority timeout is reached, it makes an attempt to rejoin (using the current plugin option values), then continues to make further auto-rejoin attempts up to the specified number of tries. After an unsuccessful auto-rejoin attempt, the member waits 5 minutes before the next try. During the auto-rejoin procedure, the expelled member remains in super read only mode and displays an ERROR state on its view of the replication group.
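For illustration, auto-rejoin could be enabled as follows. The figure of 3 attempts is an arbitrary example rather than a recommendation.

```sql
-- Let an expelled member make up to 3 auto-rejoin attempts
-- (illustrative value) instead of accepting its expulsion at once.
SET GLOBAL group_replication_autorejoin_tries = 3;

-- Confirm the value now in effect on this server.
SELECT @@GLOBAL.group_replication_autorejoin_tries;
```

Because each failed attempt is followed by a 5-minute wait, a setting of 3 keeps the member in super read only mode for up to roughly 15 minutes before the procedure is exhausted.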
Bear in mind that while a member remains in this mode, although writes cannot be made on the member, reads can, with an increasing likelihood of stale reads over time. If you do want to intervene to take the member offline, the member can be stopped manually at any time by using a STOP GROUP_REPLICATION statement or shutting down the server. You can monitor the auto-rejoin procedure using the Performance Schema. While an auto-rejoin procedure is taking place, the Performance Schema table events_stages_current shows the event “Undergoing auto-rejoin procedure”, with the number of retries that have been attempted so far during this instance of the procedure. The events_stages_summary_global_by_event_name table shows the number of times the server instance has initiated the auto-rejoin procedure. The events_stages_history_long table shows the time each of these auto-rejoin procedures was completed.
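The three Performance Schema tables named above could be queried along the following lines. This is a sketch under assumptions: the column names WORK_COMPLETED, COUNT_STAR, and TIMER_END come from the standard definitions of these tables rather than from the text above, the LIKE pattern is just one way to match the auto-rejoin stage, and the relevant stage instruments and consumers may need to be enabled before any rows appear.

```sql
-- Retries attempted so far in the auto-rejoin procedure now running.
SELECT EVENT_NAME, WORK_COMPLETED
FROM performance_schema.events_stages_current
WHERE EVENT_NAME LIKE '%auto-rejoin%';

-- Number of times this server instance has started the procedure.
SELECT EVENT_NAME, COUNT_STAR
FROM performance_schema.events_stages_summary_global_by_event_name
WHERE EVENT_NAME LIKE '%auto-rejoin%';

-- End timestamp of each completed auto-rejoin procedure.
SELECT EVENT_NAME, TIMER_END
FROM performance_schema.events_stages_history_long
WHERE EVENT_NAME LIKE '%auto-rejoin%';
```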
You can use the group_replication_exit_state_action system variable, which is available from MySQL 8.0.12 and MySQL 5.7.24, to choose whether an expelled member that fails to rejoin (or does not try) shuts down MySQL Server or switches itself to super read only mode. As with the auto-rejoin process, if the member goes to super read only mode, there is a probability of stale reads which increases over time. Instructing the member to shut itself down ends this situation and means that you do not need to pro-actively monitor the servers for failures, but it means that the MySQL Server instance is unavailable and must be restarted. Operator intervention is required whatever exit action is set, as an ex-member that has exhausted its auto-rejoin attempts (or never had any) and has been expelled from the group is not allowed to rejoin without a restart of Group Replication.
Important
If a failure occurs before the member has successfully joined the group, the exit action specified by group_replication_exit_state_action is not taken. This is the case if there is a failure during the local configuration check, or a mismatch between the configuration of the joining member and the configuration of the group. In these situations, the super_read_only system variable is left with its original value, and the server does not shut down MySQL. To ensure that the server cannot accept updates when Group Replication did not start, we therefore recommend that super_read_only=ON is set in the server's configuration file at startup, which Group Replication will change to OFF on primary members after it has been started successfully. This safeguard is particularly important when the server is configured to start Group Replication on server boot (group_replication_start_on_boot=ON), but it is also useful when Group Replication is started manually.
If a failure occurs after the member has successfully joined the group, the specified exit action is taken. This is the case if there is an applier error, if the member is expelled from the group, or if the member is set to time out in the event of an unreachable majority. In these situations, if READ_ONLY is the exit action, the super_read_only system variable is set to ON, or if ABORT_SERVER is the exit action, the server shuts down MySQL.
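A short sketch of choosing the exit action and checking the related read-only safeguard follows. ABORT_SERVER is used only as an example; READ_ONLY is the other value named in the text.

```sql
-- Have the member shut down MySQL when it leaves the group
-- involuntarily (use 'READ_ONLY' to switch to super read only mode).
SET GLOBAL group_replication_exit_state_action = 'ABORT_SERVER';

-- Inspect the exit action together with super_read_only, which
-- should be ON in the configuration file at startup as recommended.
SELECT @@GLOBAL.group_replication_exit_state_action AS exit_action,
       @@GLOBAL.super_read_only AS super_read_only;
```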
Note that where group members are at an older MySQL Server release that does not support a relevant setting, or at a release with a different default, they act towards themselves and other group members according to the default behaviors stated above. For example, a member that does not support the group_replication_member_expel_timeout system variable expels other members as soon as an expired suspicion is detected, and this expulsion is accepted by other members even if they support the system variable and have a longer timeout configured.
Members that have not failed might lose contact with part, but not all, of the replication group due to a network partition. For example, in a group of 5 servers (S1,S2,S3,S4,S5), if there is a disconnection between (S1,S2) and (S3,S4,S5) there is a network partition. The first group (S1,S2) is now in a minority because it cannot contact more than half of the group. Any transactions that are processed by the members in the minority group are blocked because the majority of the group is unreachable, so the group cannot achieve quorum. If the servers in the majority group are still online, they can automatically form their own functional partition and continue to function as a replication group. For a detailed description of this scenario, see Section 18.4.5, “Network Partitioning”.
In this situation, the default behavior is for the members in both the minority and the majority to remain in the group, continue to accept transactions (although they are blocked on the members in the minority), and wait for operator intervention. The intervention process, which is described in Section 18.4.5, “Network Partitioning”, involves checking which servers are functioning and forcing a new group membership if necessary.
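When checking which servers are functioning, one possible starting point is to summarize the member states as seen from each server. The query below is a sketch that assumes the performance_schema.replication_group_members table, which the text above does not name.

```sql
-- Count members by state in this server's view of the group.
-- On a member in the minority partition, more than half of the
-- group is expected to be reported as UNREACHABLE.
SELECT MEMBER_STATE, COUNT(*) AS members
FROM performance_schema.replication_group_members
GROUP BY MEMBER_STATE;
```

In the 5-server example, S1 and S2 would each see 2 ONLINE members and 3 UNREACHABLE members, while S3, S4, and S5 would each see a majority of the group as reachable.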
If you do not want to pro-actively monitor for this situation, and want to avoid the possibility of creating a split-brain situation (with two versions of the group membership) due to inappropriate intervention, you can instruct members that find themselves in a minority to exit the group after a timeout period. A system variable sets the number of seconds for a member to wait after losing contact with the majority of group members. After this time, all pending transactions that have been processed by the member and the others in the minority group are rolled back, and the servers in that group move to the ERROR state, then follow the exit action specified by group_replication_exit_state_action.
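The timeout described above might be configured as in the following sketch. The variable name group_replication_unreachable_majority_timeout does not appear in the text above; it is the MySQL system variable that matches this description. The value of 30 seconds is illustrative only.

```sql
-- Make a member that loses contact with the majority wait 30 seconds
-- (illustrative value) before it rolls back pending transactions
-- and leaves the group.
SET GLOBAL group_replication_unreachable_majority_timeout = 30;

-- The exit action that is then followed is controlled separately.
SELECT @@GLOBAL.group_replication_unreachable_majority_timeout AS majority_timeout,
       @@GLOBAL.group_replication_exit_state_action AS exit_action;
```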