The group communication engine for Group Replication (XCom, a Paxos variant) includes a cache for messages (and their metadata) exchanged between the group members as a part of the consensus protocol. Among other functions, the message cache is used for recovery of missed messages by members that reconnect with the group after a period where they were unable to communicate with the other group members.
From MySQL 8.0.16, a cache size limit can be set for XCom's
message cache using the
group_replication_message_cache_size
system variable. If the cache size limit is reached, XCom removes
the oldest entries that have been decided and delivered. The same
cache size limit should be set on all group members, because an
unreachable member that is attempting to reconnect selects any
other member at random for recovery of missed messages. The same
messages should therefore be available in each member's cache.
Before MySQL 8.0.16, the cache size was 1 GB, and the default
setting for the cache size from MySQL 8.0.16 is the same. Ensure
that sufficient memory is available on your system for your chosen
cache size limit, considering the size of MySQL Server's other
caches and object pools. Note that the limit set using
group_replication_message_cache_size
applies only to the data stored in the cache, and the cache
structures require an additional 50 MB of memory.
When selecting a
group_replication_message_cache_size
setting, do so with reference to the expected volume of messages
in the time period before a member is expelled. The length of this
time period is controlled by the
group_replication_member_expel_timeout
system variable, which determines the waiting period (up to an
hour) that is allowed in addition to the initial 5-second
detection period for members to return to the group rather than
being expelled. Note that before MySQL 8.0.21, this time period
defaulted to 5 seconds from the member becoming unavailable, which
is just the detection period before a suspicion is created,
because the additional expel timeout set by the
group_replication_member_expel_timeout
system variable defaulted to zero. From 8.0.21 the expel timeout
defaults to 5 seconds, so by default a member is not expelled
until it has been absent for at least 10 seconds.