The MySQL Development Team is happy to announce a new 8.0 Maintenance Release of MySQL Shell AdminAPI – 8.0.23!
In addition to several bug fixes and minor changes, some significant enhancements regarding monitoring/troubleshooting and performance were included.
MySQL Shell AdminAPI
Cluster diagnostics
Checking how a Cluster is running and, whenever the cluster is not 100% healthy, performing troubleshooting tasks is certainly one of the main duties of a DBA. The AdminAPI makes this very easy by centralizing the monitoring information in:
<Cluster>.status([options])
In this release, we've extended the status() command to provide more information relevant to diagnosing errors.
Let’s dive into those additions with some examples. “A picture is worth a thousand words!”
A cluster member is expelled from the Cluster
Prior to 8.0.23, whenever a Cluster member was expelled from the Cluster it would simply be displayed as a (MISSING) member. However, there are numerous reasons that can cause a member to go (MISSING), such as Group Replication being stopped, the member having crashed, or a replication error.
Using the extended option of Cluster.status() could be valuable in the sense that it would at least provide the instance's member role as reported by Group Replication. However, it did not provide any extra information regarding the cause of the issue.
For those reasons, we have included in the default output of Cluster.status():
- The memberState, whenever the corresponding instance state is not ONLINE. This information comes directly from performance_schema.replication_group_members.
- A new instanceErrors field for each instance, displaying diagnostic information that can be detected for any instance that is not ONLINE.
The following example shows the output of the command for a scenario in which Group Replication was manually stopped on an instance:
Note: The information depends on whether the instance is reachable or not.
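As an illustration, here is a trimmed sketch of what the new fields could look like in that scenario (hostnames and exact message wording are illustrative, not literal Shell output):

mysql-js> cluster.status()
{
    "clusterName": "myCluster",
    "defaultReplicaSet": {
        "topology": {
            "t480:3320": {
                "address": "t480:3320",
                "instanceErrors": [
                    "NOTE: group_replication is stopped."
                ],
                "memberState": "OFFLINE",
                "mode": "n/a",
                "role": "HA",
                "status": "(MISSING)"
            }
        }
    }
}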
Split-brain
Split-brain can be detected whenever an instance is not part of the majority group but reports itself as ONLINE and is reachable:
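A hedged sketch of how this might surface in instanceErrors (the exact message wording may differ):

"t480:3330": {
    "instanceErrors": [
        "ERROR: split-brain! Instance is not part of the majority group, but has state ONLINE"
    ],
    "memberState": "ONLINE"
}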
Applier error
If a replication error happens, the member could be stuck in RECOVERING for quite a while until it eventually fails and goes (MISSING). The only way to diagnose what really happened was to check the error log.
This is certainly not user friendly, so we have also included checks to identify the cause of a member going into the ERROR state:
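An illustrative sketch of the diagnostic for this case (the actual entry carries the real applier error; the wording here is assumed):

"t480:3340": {
    "instanceErrors": [
        "ERROR: applier thread of Group Replication Applier channel stopped with an error: ..."
    ],
    "memberState": "ERROR",
    "status": "(MISSING)"
}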
Other Diagnostics
Some particular scenarios, such as restoring a Cluster member from a backup, may imply a change of the server_uuid even though the member is running on the same host:port and could therefore rejoin the cluster automatically. However, since the server_uuid is used as the unique identifier of an instance, the AdminAPI would not understand that the instance had rejoined and would mark it as (MISSING). This mismatch is now detected and reported.
Similarly, an instance that belongs to the Group Replication group but not to the metadata is now identified and reported in Cluster.status().
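A hedged sketch of how such an instance could be reported (message wording is illustrative):

"t480:3350": {
    "instanceErrors": [
        "WARNING: Instance is not managed by InnoDB Cluster. Use cluster.rescan() to repair."
    ],
    "memberState": "ONLINE",
    "status": "ONLINE"
}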
Summarizing the diagnostics
The following issues are detected and identified in the new instanceErrors field:
- SECONDARY member with super_read_only disabled
- Recovery channel error
- Applier channel error
- Member part of the Group Replication group but not of the metadata
- Reachable OFFLINE member (GR plugin stopped)
- Split-brain
- Member server_uuid does not match what is recorded in the metadata
Replication info
Similar to what's available in ReplicaSet.status(), we have included information about the replication recovery channel, whenever a member is performing incremental recovery, in a new recovery field.
Note: This information is only available if the extended option has a value > 0.
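For example, a trimmed, illustrative sketch of the new field (key names are assumed to mirror ReplicaSet.status() and may differ in detail):

mysql-js> cluster.status({extended: 1})
...
"t480:3330": {
    "memberState": "RECOVERING",
    "recovery": {
        "applierStatus": "APPLYING",
        "applierThreadState": "waiting for handler commit",
        "receiverStatus": "ON",
        "receiverThreadState": "Waiting for source to send event"
    },
    "status": "RECOVERING"
}
...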
Multithreaded replication appliers
MySQL InnoDB Cluster and InnoDB ReplicaSet rely on different replication mechanisms, namely, Group Replication and Asynchronous Replication. However, even though the two replication protocols are different in terms of data dissemination, both rely on an asynchronous mechanism to process and apply the binlog changes. The time interval from the moment a transaction is committed on a primary until it is committed on a secondary member is usually called the replication lag.
That said, either technology can suffer from replication lag, one of the most discussed and dreaded matters that MySQL DBAs have to face in production environments.
Fortunately, many improvements in this area have been made since MySQL 5.7. For example, with MySQL 8, a new mechanism to track independent transactions was introduced, based on the WRITESET of each transaction. By evaluating which transactions do *not* have interdependencies and can be executed in parallel on the binary log applier, this mechanism greatly improves the throughput of the applier.
Although those improvements have been available for some time, tweaking the settings is necessary to overcome the replication lag problem, and up to this release the replication applier was single-threaded by default.
Considering that many common workloads have a large number of small transactions happening concurrently and that most modern servers have high processing capabilities, parallelization in replication definitely makes a difference!
For those reasons, InnoDB Cluster and InnoDB ReplicaSet now support and enable by default parallel replication appliers.
Required settings
InnoDB Cluster and ReplicaSet require that the MySQL servers have some settings in place to be able to operate. To enable parallel replication appliers by default, we have extended the settings to include the following:
- transaction_write_set_extraction=XXHASH64 (already a requirement for InnoDB Cluster, but not for InnoDB ReplicaSet)
- slave_parallel_type=LOGICAL_CLOCK
- binlog_transaction_dependency_tracking=WRITESET
- slave_preserve_commit_order=ON
Nothing changes regarding how you check and configure your instances to be ready for InnoDB Cluster/ReplicaSet usage; the commands were simply extended to also check and automatically enable the settings listed above:
- Check the requirements with: dba.checkInstanceConfiguration()
- Configure the instance to be ready with: dba.configureInstance() / dba.configureReplicaSetInstance()
Below is an example of configuring an instance to be ready for InnoDB Cluster usage with dba.configureInstance():
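A sketch of what that interaction could look like (host, port, and exact wording are illustrative, not literal Shell output):

mysql-js> dba.configureInstance("clusteradmin@t480:3310")
Configuring MySQL instance at t480:3310 for use in an InnoDB cluster...

NOTE: Some configuration options need to be fixed:
+----------------------------------------+---------------+----------------+----------------------------+
| Variable                               | Current Value | Required Value | Note                       |
+----------------------------------------+---------------+----------------+----------------------------+
| binlog_transaction_dependency_tracking | COMMIT_ORDER  | WRITESET       | Update the server variable |
| slave_parallel_type                    | DATABASE      | LOGICAL_CLOCK  | Update the server variable |
| slave_preserve_commit_order            | OFF           | ON             | Update the server variable |
+----------------------------------------+---------------+----------------+----------------------------+

Do you want to perform the required configuration changes? [y/n]: y
Configuring instance...
The instance 't480:3310' was configured to be used in an InnoDB cluster.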
Applier Worker Threads
Multithreaded replication relies on several threads executing tasks. The number of threads can be configured and tweaked according to one's use case. We have decided that 4 is a reasonable number that suits typical deployments and workloads, so we made that the default.
It's possible to change this default number whenever you configure an instance for InnoDB Cluster/ReplicaSet usage. A new option named applierWorkerThreads was added to dba.configureInstance() / dba.configureReplicaSetInstance().
For example:
mysql-js> dba.configureInstance("clusteradmin@t480:3330", {applierWorkerThreads: 2, restart: true})
And, as with any other settings, you can consult the settings in place for your Cluster/ReplicaSet by using the .options() command.
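For instance, a trimmed, illustrative excerpt of where the value shows up (the mapping to slave_parallel_workers is an assumption for this sketch):

mysql-js> cluster.options()
...
"t480:3310": [
    {
        "option": "applierWorkerThreads",
        "value": "4",
        "variable": "slave_parallel_workers"
    }
]
...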
Tweaking the number of applier worker threads
Should you need to tweak the number of threads for your running Cluster/ReplicaSet, you can rely on dba.configureInstance() / dba.configureReplicaSetInstance() to do so. Just as when configuring the instances to be ready, you can change the current number of threads:
For example:
mysql-js> dba.configureInstance("clusteradmin@t480:3320", {applierWorkerThreads: 16})
Note: Be aware that even though you can change the settings for online members, the change does not take immediate effect; it requires a rejoin (stopping and starting Group Replication) of the instance.
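A hedged sketch of one way to make the new value effective on a running member (addresses illustrative, and assuming `cluster` was obtained earlier via dba.getCluster()):

mysql-js> dba.configureInstance("clusteradmin@t480:3320", {applierWorkerThreads: 16})
mysql-js> \connect clusteradmin@t480:3320
mysql-js> session.runSql("STOP GROUP_REPLICATION")
mysql-js> cluster.rejoinInstance("clusteradmin@t480:3320")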
Are Cluster/ReplicaSet upgrades affected?
When you upgrade a Cluster or a ReplicaSet that has been running a version of MySQL server and MySQL Shell earlier than 8.0.23, multithreaded replication may not be enabled on the instances since those settings weren’t requirements.
MySQL Shell detects this when running the .status() command and guides you through the changes needed to take advantage of this feature.
Notable Bugs fixed
BUG#26649039 – SHELL FAILS TO IDENTIFY REJOINING MEMBER WITH A NEW UUID
If a Cluster member is taken out of the Cluster and afterward restored from a backup (using, for example, MEB), then whenever the instance rejoined the cluster, either automatically or via Cluster.rejoinInstance(), it was marked as a (MISSING) member of the Cluster.
This happened because the AdminAPI uses the server_uuid as the unique identifier of an instance, and since the server_uuid may change after the backup is restored, the AdminAPI would not understand that the instance is the same one.
This has been fixed together with the new Cluster.status() diagnostics described above: a new check has been added when an instance is rejoined, so that when the instance is not found in the metadata through its UUID, it is searched for using its host and port; if found, the metadata is updated based on the options used for the rejoin operation.
BUG#31757737 – INNODB CLUSTER OPERATIONS SHOULD AUTO-CONNECT TO PRIMARY
InnoDB Cluster operations such as Cluster.addInstance() or Cluster.rejoinInstance() would fail if the Shell's active session was established to a secondary member of the Cluster. However, considering that the Shell is capable of knowing which member is the primary, and that all cluster members must have the same cluster-admin credentials, the commands should not fail and should automatically use the primary's connection.
That's exactly how this bug was solved. Now, regardless of which member of the cluster you were connected to when obtaining the cluster object, operations will be executed on the right member accordingly.
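An illustrative sketch (addresses assumed), where the Shell session points at a secondary yet the operation succeeds:

mysql-js> \connect clusteradmin@t480:3320
// t480:3320 is a SECONDARY member
mysql-js> cluster = dba.getCluster()
// The operation is transparently executed against the PRIMARY:
mysql-js> cluster.addInstance("clusteradmin@t480:3340")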
BUG#27882663 – CLUSTER.STATUS() DOES NOT SHOW ACTIVE GR MEMBERS NOT IN METADATA
The Cluster.status() operation didn't show information about members that are part of the cluster but don't belong to the metadata. That information could only be seen when using Cluster.rescan(). However, not showing all members in the Group Replication group, even those not present in the metadata, hides the unexpected/undesired participation of such instances in the cluster (not managed by InnoDB Cluster).
This has been fixed together with the improvements in Cluster.status(), by listing members that are participating in the cluster even though they are not registered in the metadata, and by indicating to the user the steps to include them in the metadata.
BUG#32112864 – REBOOTCLUSTERFROMCOMPLETEOUTAGE() DOES NOT EXCLUDE INSTANCES IF IN OPTION “REMOVEINSTANCES” LIST
dba.rebootClusterFromCompleteOutage() requires an active session on MySQL Shell to:
- Obtain the cluster members as registered in the metadata.
- Determine which of the cluster members has the GTID superset.
If the active session is not to the member with the GTID superset, the command aborts and indicates to the user which instance has the GTID superset.
However, the GTID superset check is done using all instances (registered in the Cluster's metadata) reachable by the Shell. If an instance has a diverging GTID set and the user wants to explicitly remove it from the cluster, the operation would be blocked, since the Shell cannot determine which instance has the GTID superset: depending on the point of view, different instances can be considered the most up-to-date.
Also, it should be possible for users to reboot a cluster by picking a particular instance even though it's not the most up-to-date one, as long as they indicate, via the command's options/prompts, that the other instances are not meant to be rejoined.
This patch fixed that by ensuring that if the user explicitly sets the removeInstances option or answers "NO" to the prompt regarding instance rejoin, those instances are excluded from the GTID superset validation.
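For example, a hedged sketch of explicitly leaving out an instance with a diverging GTID set (cluster name and address are illustrative):

mysql-js> dba.rebootClusterFromCompleteOutage("myCluster", {removeInstances: ["t480:3330"]})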
BUG#31428813 – DBA.UPGRADEMETADATA() FAILS WITH ERROR: UNKNOWN COLUMN ‘MYSQL.ROUTER’ IN ‘FIELD LIST’
Upgrading the metadata schema with dba.upgradeMetadata() failed if ANSI_QUOTES was used in the sql_mode.
This was caused by a specific query that inserts data into the routers table of the metadata schema using double quotes to quote a string. When the sql_mode is set to use ANSI_QUOTES, MySQL treats " as an identifier quote character and not as a string quote character, resulting in an error whilst running that query.
This patch fixed that by ensuring that the upgrade metadata command prepares the session to be used by the AdminAPI, which, amongst other sanity checks, ensures the sql_mode used by that session uses the default values to avoid user-set incompatible settings.
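A minimal sketch of the underlying SQL behavior, runnable from the Shell's JS mode (the string is illustrative; the error mirrors MySQL's standard 1054 error):

mysql-js> session.runSql("SET SESSION sql_mode = 'ANSI_QUOTES'")
mysql-js> session.runSql('SELECT "some string"')
ERROR: 1054: Unknown column 'some string' in 'field list'
// With ANSI_QUOTES, double quotes delimit identifiers; strings must use single quotes:
mysql-js> session.runSql("SELECT 'some string'")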
BUG#32152133 – REPLACE CHANGE MASTER/START SLAVE TERMINOLOGY WITH CHANGE SOURCE/START REPLICA
As in the MySQL Server, deprecated terminology in replication related features was updated, while keeping backward compatibility wherever necessary.
This one specifically handled the following deprecations in MySQL Replication commands:
- CHANGE MASTER TO
- START SLAVE UNTIL MASTER_LOG_POS, MASTER_LOG_FILE
In favor of the new:
- CHANGE REPLICATION SOURCE TO
- START REPLICA UNTIL SOURCE_LOG_POS, SOURCE_LOG_FILE
And their parameters:
- MASTER_HOST
- MASTER_PORT
- MASTER_*
In favor of:
- SOURCE_HOST
- SOURCE_PORT
- SOURCE_*
You can read about the general change at the MySQL Terminology Updates blog post.
Try it now and send us your feedback
MySQL Shell 8.0.23 GA is available for download from the following links:
- MySQL Community Downloads website: https://dev.mysql.com/downloads/shell/
- MySQL Shell is also available on GitHub: https://github.com/mysql/mysql-shell
And as always, we are eager to hear the community's feedback! So please let us know your thoughts here in the comments, via a bug report, or a support ticket.
You can also reach us at #shell and #mysql_innodb_cluster in Slack: https://mysqlcommunity.slack.com/
The documentation of MySQL Shell can be found at https://dev.mysql.com/doc/mysql-shell/8.0/en/ and the official documentation of InnoDB Cluster and InnoDB ReplicaSet can be found in the AdminAPI User Guide.
The full list of changes and bug fixes can be found in the 8.0.23 Shell Release Notes.
Enjoy, and Thank you for using MySQL!