Group Replication is designed to ensure that the database service
is continuously available, even if some of the servers that form
the group are currently unable to participate in it due to planned
maintenance or unplanned issues. As long as the remaining members
are a majority of the group they can elect a new primary and
continue to function as a group. However, if every member of a
replication group leaves the group, and Group Replication is
stopped on every member by a STOP
GROUP_REPLICATION
statement or system shutdown, the
group now only exists in theory, as a configuration on the
members. In that situation, to re-create the group, it must be
started by bootstrapping as if it was being started for the first
time.
The difference between bootstrapping a group for the first time
and doing it for the second or subsequent times is that in the
latter situation, the members of a group that was shut down might
have different transaction sets from each other, depending on the
order in which they were stopped or failed. A member cannot join a
group if it has transactions that are not present on the other
group members. For Group Replication, this includes both
transactions that have been committed and applied, which are in
the gtid_executed
GTID set, and
transactions that have been certified but not yet applied, which
are in the group_replication_applier
channel.
The exact point at which a transaction is committed depends on the
transaction consistency level that is set for the group (see
Section 20.5.3, “Transaction Consistency Guarantees”).
However, a Group Replication group member never removes a
transaction that has been certified, which is a declaration of the
member’s intent to commit the transaction.
The replication group must therefore be restarted beginning with the most up to date member, that is, the member that has the most transactions executed and certified. The members with fewer transactions can then join and catch up with the transactions they are missing through distributed recovery. It is not correct to assume that the last known primary member of the group is the most up to date member of the group, because a member that was shut down later than the primary might have more transactions. You must therefore restart each member to check the transactions, compare all the transaction sets, and identify the most up to date member. This member can then be used to bootstrap the group.
Follow this procedure to restart a replication group safely after every member shuts down.
For each group member in turn, in any order:
Connect a client to the group member. If Group Replication is not already stopped, issue a
STOP GROUP_REPLICATION
statement and wait for Group Replication to stop.Edit the MySQL Server configuration file (typically named
my.cnf
on Linux and Unix systems, ormy.ini
on Windows systems) and set the system variablegroup_replication_start_on_boot=OFF
. This setting prevents Group Replication from starting when MySQL Server is started, which is the default.If you cannot change that setting on the system, you can just allow the server to attempt to start Group Replication, which will fail because the group has been fully shut down and not yet bootstrapped. If you take that approach, do not set
group_replication_bootstrap_group=ON
on any server at this stage.Start the MySQL Server instance, and verify that Group Replication has not been started (or has failed to start). Do not start Group Replication at this stage.
Collect the following information from the group member:
The contents of the
gtid_executed
GTID set. You can get this by issuing the following statement:mysql> SELECT @@GLOBAL.GTID_EXECUTED
The set of certified transactions on the
group_replication_applier
channel. You can get this by issuing the following statement:mysql> SELECT received_transaction_set FROM \ performance_schema.replication_connection_status WHERE \ channel_name="group_replication_applier";
When you have collected the transaction sets from all the group members, compare them to find which member has the biggest transaction set overall, including both the executed transactions (
gtid_executed
) and the certified transactions (on thegroup_replication_applier
channel). You can do this manually by looking at the GTIDs, or you can compare the GTID sets using stored functions, as described in Section 19.1.3.8, “Stored Function Examples to Manipulate GTIDs”.Use the member that has the biggest transaction set to bootstrap the group, by connecting a client to the group member and issuing the following statements:
mysql> SET GLOBAL group_replication_bootstrap_group=ON; mysql> START GROUP_REPLICATION; mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
It is important not to store the setting
group_replication_bootstrap_group=ON
in the configuration file, otherwise when the server is restarted again, a second group with the same name is set up.To verify that the group now exists with this founder member in it, issue this statement on the member that bootstrapped it:
mysql> SELECT * FROM performance_schema.replication_group_members;
Add each of the other members back into the group, in any order, by issuing a
START GROUP_REPLICATION
statement on each of them:mysql> START GROUP_REPLICATION;
To verify that each member has joined the group, issue this statement on any member:
mysql> SELECT * FROM performance_schema.replication_group_members;
When the members have rejoined the group, if you edited their configuration files to set
group_replication_start_on_boot=OFF
, you can edit them again to setON
(or remove the system variable, sinceON
is the default).