WL#9038: Group Replication: Support Binary Log Checksums

Affects: Server-8.0   —   Status: Complete

EXECUTIVE SUMMARY
=================

This worklog implements support for binary log checksums in Group
Replication. Once this work is done, users will be able to run
Group Replication without having to explicitly disable binary log
checksums on all members of the group.


RATIONALE
=========

Currently the Group Replication plugin requires that binlog-checksum
is disabled (set to NONE).

The default for the variable is CRC32:
https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_binlog-checksum

In order to use Group Replication, however, you are required to set
it to NONE. This is not advisable since it:
  1. Removes an extra check that data is not corrupted when reading
     it from disk;
  2. Hurts usability, as it requires more custom configuration of
     each mysqld member;
  3. Hurts user perception of the product, as it does not play well
     with other parts of the server.

This worklog will introduce support for binary log checksums in
Group Replication, that is, the required configuration
  --binlog-checksum=NONE
will be dropped.


USER STORIES
============

- As a MySQL DBA I want my servers to automatically compute and
  validate checksums of binary log events, ensuring their integrity.

- As a MySQL DBA I want my binary logs to contain event checksums,
  so that even if I move them around for archiving purposes (different
  storage, different communication infrastructures, etc) or to feed
  other consumers than just MySQL replication, I can still validate
  event integrity when consuming such events.

Functional requirements
=======================
FR1: Group Replication must support all values of the
     --binlog-checksum[1] option.

FR2: When --binlog-checksum is set to CRC32, all binary and relay
     logs, with the exception of the relay logs of the
     group_replication_applier channel, must have checksum validation.

FR3: A member with --binlog-checksum=CRC32 must be able to join a group
     with --binlog-checksum=NONE.

FR4: A member with --binlog-checksum=NONE must be able to join a group
     with --binlog-checksum=CRC32.

[1] https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_binlog-checksum

Non-functional requirements
===========================
  None.

SUMMARY OF THE APPROACH
=======================
This worklog will introduce support for binary log checksums in
Group Replication, that is, the required configuration[1]
  --binlog-checksum=NONE
will be dropped.
All values of --binlog-checksum will be supported.

When --binlog-checksum is set to CRC32, all binary and relay logs
must have checksum validation.

There is one exception: the relay logs of the
group_replication_applier channel cannot follow that rule, since
they contain transactions from multiple sources.
In single-primary mode, they contain transactions from the primary
and the view change log transactions queued by the local member.
In multi-primary mode, they contain transactions from all members
plus the local view change log events.
The replication event checksum design does not consider multiple
sources, and changing that would be too large a change for this
worklog.

Also, checksums are only generated when events are written to the
binary log. In Group Replication, transactions are sent to the group
members before that point: the write to the binary log only happens
once the group agrees on the transaction delivery order and outcome.
To handle this, the checksum implementation would first need to
support multiple sources.

[1] https://dev.mysql.com/doc/refman/8.0/en/replication-options-binary-log.html#option_mysqld_binlog-checksum
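
To illustrate what checksum validation means for a single event, here
is a minimal Python sketch that appends a CRC32 trailer to a byte
string and verifies it later. The function names and the byte layout
(a 4-byte little-endian trailer) are assumptions made for
illustration; this is not the server's event format or its
implementation.

```python
import struct
import zlib

CHECKSUM_LEN = 4  # a CRC32 value occupies four bytes


def append_checksum(event: bytes) -> bytes:
    """Return the event followed by its CRC32 as a little-endian trailer.

    Hypothetical helper: the layout is illustrative only.
    """
    return event + struct.pack("<I", zlib.crc32(event) & 0xFFFFFFFF)


def verify_checksum(event_with_trailer: bytes) -> bool:
    """Recompute the CRC32 of the payload and compare it with the trailer."""
    if len(event_with_trailer) < CHECKSUM_LEN:
        return False
    payload = event_with_trailer[:-CHECKSUM_LEN]
    (stored,) = struct.unpack("<I", event_with_trailer[-CHECKSUM_LEN:])
    return stored == (zlib.crc32(payload) & 0xFFFFFFFF)
```

In this sketch the trailer is computed at the moment the bytes are
"written", which mirrors the point made above: the checksum only
exists once the event reaches the binary log, not while the
transaction is still being sent to the group.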


USER INTERFACE
==============
There are no changes to configuration options.

DBAs can drop --binlog-checksum=NONE from the mysqld configuration.
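
As a convenience, a DBA could locate leftover settings with a small
script. The following Python sketch scans the text of an option file
and reports the lines that still set the option to NONE. The helper
name is hypothetical, and the sketch only does simple line matching:
it does not follow include directives or handle every option-file
syntax.

```python
import re

# Matches "binlog-checksum=NONE" or "binlog_checksum = none", optionally
# followed by a trailing comment. Commented-out lines do not match.
_OVERRIDE = re.compile(
    r"^\s*binlog[-_]checksum\s*=\s*NONE\s*(?:#.*)?$", re.IGNORECASE
)


def find_checksum_overrides(config_text: str) -> list[int]:
    """Return 1-based line numbers that still set binlog-checksum to NONE."""
    return [
        number
        for number, line in enumerate(config_text.splitlines(), start=1)
        if _OVERRIDE.match(line)
    ]
```

Lines reported by such a check are candidates for removal; whether to
remove them remains the DBA's decision.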


SECURITY CONTEXT
================
There are no repercussions.


UPGRADE/DOWNGRADE AND CROSS-VERSION REPLICATION
===============================================
There are no repercussions in such scenarios, since the two
possible values of --binlog-checksum, NONE and CRC32, are
compatible.

Case 1: primary: NONE, secondary: CRC32
The secondary will not validate checksums on the
group_replication_applier channel. The same will happen on recovery,
since the primary will not send checksums.
The secondary will write event checksums to its binary logs.

Case 2: primary: CRC32, secondary: NONE
The secondary will not validate checksums on the
group_replication_applier channel, though it will validate checksums
on other replication channels (group_replication_recovery).
The secondary will not write event checksums to its binary logs.

Case 3: primary: CRC32, secondary: CRC32
The primary generates checksums for its binary log.
The secondary will not validate checksums on the
group_replication_applier channel, though it will validate checksums
on other replication channels (group_replication_recovery) and will
write event checksums to its binary logs.
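
The three cases above can be summarized as a small decision function.
This is a Python sketch of the described behavior only, not server
code; the class and function names are hypothetical. The NONE/NONE
combination is not listed above, so the result for it is an
extrapolation of the same rules.

```python
from dataclasses import dataclass

VALID_VALUES = ("NONE", "CRC32")


@dataclass(frozen=True)
class SecondaryBehavior:
    validates_on_applier: bool        # group_replication_applier channel
    validates_on_recovery: bool       # group_replication_recovery channel
    writes_checksums_to_binlog: bool  # the secondary's own binary log


def secondary_behavior(primary: str, secondary: str) -> SecondaryBehavior:
    """Model how a secondary behaves for a pair of --binlog-checksum values."""
    for value in (primary, secondary):
        if value not in VALID_VALUES:
            raise ValueError(f"unsupported --binlog-checksum value: {value}")
    return SecondaryBehavior(
        # The applier channel never validates checksums.
        validates_on_applier=False,
        # Recovery can only validate what the primary actually sends.
        validates_on_recovery=(primary == "CRC32"),
        # Writing checksums depends only on the secondary's own setting.
        writes_checksums_to_binlog=(secondary == "CRC32"),
    )
```

For example, secondary_behavior("CRC32", "NONE") corresponds to
Case 2: validation on the recovery channel, none on the applier
channel, and no checksums in the secondary's binary logs.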


OBSERVABILITY
=============
There are no repercussions.


DEPLOYMENT AND INSTALLATION
===========================
There are no repercussions.


PROTOCOL
========
There are no repercussions.


FAILURE MODEL SPECIFICATION
===========================
There are no repercussions.


SUMMARY OF CHANGES
==================

Server core changes
-------------------
- On group_replication_applier channel relay log initialization,
  force no checksum.
- Mark the ER_GRP_RPL_BINLOG_CHECKSUM_SET error code as obsolete.

Group Replication changes
-------------------------
- Remove the validation that --binlog-checksum must be NONE.