MySQL NDB Cluster 8.0.28 is a new release of NDB 8.0, based on
MySQL Server 8.0 and including features in version 8.0 of the
NDB storage engine, as well as fixing
recently discovered bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 8.0. NDB Cluster 8.0 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 8.0, see What is New in MySQL NDB Cluster.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 8.0 through MySQL 8.0.28 (see Changes in MySQL 8.0.28 (2022-01-18, General Availability)).
index_stats table, which provides very basic information about NDB index statistics. It is intended primarily for use in our internal testing, but may be helpful in conjunction with ndb_index_stat and other tools. (Bug #32906654)
Previously, ndb_import always tried to import data into a table whose name was derived from the name of the CSV file being read. This release adds a --table option (short form: -t) for this program, which overrides this behavior and specifies the name of the target table directly. (Bug #30832382)
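The choice of target table described above can be pictured with a small sketch. It assumes that the default name is the base name of the CSV file without its extension; `target_table` and its parameters are hypothetical names used only for illustration and are not part of ndb_import.

```python
from pathlib import Path
from typing import Optional


def target_table(csv_path: str, table_option: Optional[str] = None) -> str:
    """Return the table name an import would target (illustrative only)."""
    if table_option:
        # An explicit table name overrides the name derived from the file.
        return table_option
    # Assumption: the derived name is the file's base name minus its extension.
    return Path(csv_path).stem
```

With this model, `target_table("/data/cars.csv")` yields the derived name, while passing a second argument stands in for supplying the new option.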
Important Change: The deprecated data node option --connect-delay has been removed. This option was a synonym for --connect-retry-delay, which was not honored in all cases; this issue has been fixed, and the option now works correctly. In addition, the short form -r for this option has been deprecated, and you should expect it to be removed in a future release. (Bug #31565810)
References: See also: Bug #33362935.
Microsoft Windows: On Windows, added missing debug and test suite binaries for MySQL Server (commercial) and MySQL NDB Cluster (commercial and community). (Bug #32713189)
NDB Replication: The mysqld option --slave-skip-errors can be used to allow the replication applier SQL thread to skip over certain numbered errors automatically. This is not recommended in production because it allows replicas to diverge since whole transactions in the binary log are not applied; for NDBCLUSTER with its epoch transactions, this results in entire epochs of changes not being applied, likely leading to inconsistent data.
NDB also checks the sequence of epochs applied, and stops the replica applier with an error if there is a sequence problem. Where --slave-skip-errors is in use, and an error is skipped, this results in a whole epoch transaction being skipped; this is detected on any subsequent attempt to apply an epoch transaction, which results in the replica applier SQL thread being stopped.
A new option --ndb-applier-allow-skip-epoch is added in this release to allow users to ignore wholly skipped epoch transactions, so that they can use the --slave-skip-errors option as with other MySQL storage engines. This is intended for use in testing, and not in a production setting. Use of these options is entirely at your own risk.
When mysqld is started with the new option (together with --slave-skip-errors), detection of a missing epoch generates a warning, but the replica applier SQL thread continues applying. (Bug #33398973)
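The difference between stopping on a missing epoch and merely warning about it can be sketched as follows. This is a toy model, not the applier's actual logic: it assumes epochs are consecutive integers (NDB does not number epochs this way), and `EpochApplier`, `EpochSequenceError`, and `allow_skip_epoch` are hypothetical names.

```python
class EpochSequenceError(Exception):
    """Raised when an epoch is missing and skipping is not allowed."""


class EpochApplier:
    def __init__(self, allow_skip_epoch: bool = False):
        self.allow_skip_epoch = allow_skip_epoch
        self.last_applied = None
        self.warnings = []

    def apply(self, epoch: int) -> None:
        # Assumption: the expected successor is simply last_applied + 1.
        if self.last_applied is not None and epoch != self.last_applied + 1:
            message = f"missing epoch between {self.last_applied} and {epoch}"
            if not self.allow_skip_epoch:
                # Default behavior in this model: stop the applier.
                raise EpochSequenceError(message)
            # With skipping allowed: record a warning and keep applying.
            self.warnings.append(message)
        self.last_applied = epoch
```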
NDB Replication: The log_name column of the ndb_apply_status table was created as VARBINARY, despite being defined as VARCHAR, using the latin1 character set, causing hex-decoded output when querying the table using some tools.
We fix this by detecting the faulty column type in ndb_apply_status and reinstalling the table definition into the data dictionary while connecting to NDB, when mysqld checks the layout of this table. (Bug #33380726)
NDB Cluster APIs: Several new basic example C++ NDB API programs have been added to the distribution, under storage/ndb/ndbapi-examples/ndbapi_basic/ in the source tree. These are shorter and should be easier to understand for newcomers to the NDB API than the existing API examples. They also follow recent C++ standards and practices. These examples have also been added to the NDB API documentation; see Basic NDB API Examples, for more information. (Bug #33378579, Bug #33517296)
NDB Cluster APIs: It is no longer possible to use the DIVERIFYREQ signal asynchronously. (Bug #33161562)
wait for scans log output during online reorganization was not performed correctly. As part of this fix, we change timing to generate one message every 10 seconds rather than scaling indefinitely, so as to supply regular updates. (Bug #35523977)
Added missing values checks in ndbd and ndbmtd. (Bug #33661024)
Online table reorganization increases the number of fragments of a table, and moves rows between them. This is done in the following steps:
1. Copy rows to new fragments
2. Update distribution information (hashmap count and total fragments)
3. Wait for scan activity using old distribution to stop
4. Delete rows which have moved out of existing partitions
5. Remove reference to old hashmap
6. Wait for scan activity started since step 2 to stop
Due to a counting error, it was possible for the reorganization to hang in step 6; the scan reference count was not decremented, and thus never reached zero as expected. (Bug #33523991)
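The hang described above is the typical failure mode of a reference count that is incremented but not always decremented. The following sketch models the wait on scan activity with a counter and a condition variable; `ScanTracker` and its methods are hypothetical names, and the code does not reflect how the data nodes implement this.

```python
import threading


class ScanTracker:
    """Counts active scans and lets a caller wait until none remain."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0

    def scan_started(self) -> None:
        with self._cond:
            self._active += 1

    def scan_finished(self) -> None:
        with self._cond:
            self._active -= 1
            if self._active == 0:
                # Wake anyone waiting for scan activity to stop.
                self._cond.notify_all()

    def wait_for_scans(self, timeout: float) -> bool:
        """Return True once no scans remain, or False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)
```

If a code path ends a scan without calling `scan_finished()`, the count never reaches zero and `wait_for_scans()` can only time out, which corresponds to the reorganization never completing its final wait.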
A UNIQUE index created with USING HASH does not support ordered or range access operations, but rather only those operations in which the full key is specified, returning at most a single row. Even so, for such an index on an NDB table, range access was still used on the index. (Bug #33466554, Bug #33474345)
The same pushed join on NDB tables returned an incorrect result when the batched_key_access optimizer switch was enabled.
This issue arose as follows: When the batched key access (BKA) algorithm is used to join two tables, a set of batched keys is first collected from one of the tables; a multirange read (MRR) operation is then constructed against the other. A set of bounds (ranges) is specified on the MRR, using the batched keys to construct each bound.
When result rows are returned it is necessary to identify which range each returned row comes from. This is used to identify the outer table row to perform the BKA join with. When the MRR operation in question was a root of a pushed join operation, SPJ was unable to retrieve this identifier (RANGE_NO). We fix this by implementing the missing SPJ API functionality for returning such a RANGE_NO from a pushed join query. (Bug #33416308)
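The role of the range number in a BKA join can be illustrated with a simplified in-memory sketch. It models each range as a single equality bound and uses plain lists of dictionaries as tables; `multi_range_read` and `bka_join` are hypothetical helpers and do not correspond to functions in the server or in SPJ.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def multi_range_read(inner_rows: List[Row],
                     ranges: List[Tuple[int, Any]],
                     key: str) -> List[Tuple[int, Row]]:
    """Return (range_no, inner_row) for each inner row matching a range."""
    hits = []
    for range_no, wanted in ranges:
        for row in inner_rows:
            if row[key] == wanted:
                # Tag the row with the range that produced it.
                hits.append((range_no, row))
    return hits


def bka_join(outer_rows: List[Row], inner_rows: List[Row],
             key: str) -> List[Tuple[Row, Row]]:
    # Collect one bound per outer row; range_no is that row's batch position.
    ranges = [(range_no, outer[key]) for range_no, outer in enumerate(outer_rows)]
    # Use the returned range_no to find the outer row each result joins with.
    return [(outer_rows[range_no], inner)
            for range_no, inner in multi_range_read(inner_rows, ranges, key)]
```

Without the range number, two outer rows sharing the same key value could not be told apart when matching returned rows back to the batch.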
Each query against the ndbinfo.index_stats table leaked an NdbRecord. We fix this by changing the context so that it owns the NdbRecord object which it creates and releases the NdbRecord when going out of scope, and by supporting the creation of one and only one record per context. (Bug #33408123)
A problem with concurrency occurred when updating cached table statistics with changed rows: when several threads were updating the same table, the threads competed for the NDB_SHARE mutex in order to update the cached row count.
We fix this by reimplementing the storage of changed rows using an atomic counter rather than trying to take the mutex and update the actual shared value, which reduces the need to serialize the threads. In addition, we now append the number of changed rows to the row count only when removing the statistics from the cache and provide a separate mutex protecting only the cached statistics. (Bug #33384978)
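The idea of accumulating changed rows separately and folding them into the row count only on removal from the cache can be sketched as below. Python has no built-in atomic integer, so a small dedicated lock stands in for the atomic add; `CachedTableStats` and its methods are hypothetical names and this is not the handler's actual code.

```python
import threading


class CachedTableStats:
    def __init__(self, row_count: int):
        self.row_count = row_count
        self._changed_rows = 0
        # Stand-in for an atomic counter; guards only the pending delta.
        self._changed_lock = threading.Lock()

    def add_changed_rows(self, n: int) -> None:
        """Called by writers; never touches the shared row count."""
        with self._changed_lock:
            self._changed_rows += n

    def remove_from_cache(self) -> int:
        """Fold the pending delta into the row count and return the total."""
        with self._changed_lock:
            changed, self._changed_rows = self._changed_rows, 0
        self.row_count += changed
        return self.row_count
```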
References: See also: Bug #32169848.
If the schema distribution client detected a timeout before freeing the schema object when the coordinator received the schema event, the coordinator processed the stale schema event instead of returning.
The coordinator did not know whether a schema distribution timeout was detected by the client, and started processing the schema event as soon as the schema object was valid. To fix this, we indicate the state of the schema object and change its state when the client detects the schema distribution timeout and when the schema event is received by the coordinator, so that both the coordinator and the client are aware of this, and remain synchronized. (Bug #33318201)
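A minimal sketch of keeping both parties synchronized through an explicit state on the schema object follows. The state names and the `SchemaObject` class are hypothetical; the point is only that whichever side acts first changes the state, and the other side can see that it lost the race.

```python
import enum
import threading


class SchemaState(enum.Enum):
    ACTIVE = "active"            # waiting for the schema event
    TIMED_OUT = "timed_out"      # client gave up first
    RECEIVED = "received"        # coordinator got the event first


class SchemaObject:
    def __init__(self):
        self._lock = threading.Lock()
        self.state = SchemaState.ACTIVE

    def client_timeout(self) -> bool:
        """Client side: return True if the timeout took effect."""
        with self._lock:
            if self.state is SchemaState.ACTIVE:
                self.state = SchemaState.TIMED_OUT
                return True
            return False

    def coordinator_receive(self) -> bool:
        """Coordinator side: return True if the event should be processed."""
        with self._lock:
            if self.state is SchemaState.ACTIVE:
                self.state = SchemaState.RECEIVED
                return True
            # Stale event: the client already timed out, so do not process it.
            return False
```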
The MySQL Optimizer uses two different methods, read_cost() and Cost_model::page_read_cost(), to estimate the cost for different access methods, but the cost values returned by these were not always comparable; in some cases this led to the wrong index being chosen and longer execution time for affected queries. To fix this for NDB, we override the optimizer's page_read_cost() method with one specific to NDBCLUSTER. It was also found while working on this issue that the NDB handler did not implement the read_time() method, used by read_cost(); this method is now implemented by ha_ndbcluster, and thus the optimizer can now properly take into account the cost difference for NDB when using a unique key as opposed to an ordered index (range scan). (Bug #33317872)
NDB tables for queries, the index statistics are retrieved to help the optimizer select the optimal query plan. Each client accessing the stats acquires the global index statistics mutex both before and after accessing the statistics. This causes mutex contention affecting query performance, whether the queries are operating on the same tables or on different ones.
We fix this by protecting the count of index statistics references with an atomic counter. The problem was clearly visible when benchmarking with more than 32 clients, when throughput did not increase with additional clients. With this fix, the throughput continues to scale with up to 64 clients. (Bug #33317320)
In certain cases, an event's category was not properly detected. (Bug #33304814)
It was not possible to add new data nodes running ndbmtd to an existing cluster with data nodes running ndbd. (Bug #33193393)
For a user granted the password_last_changed column in the mysql.user table was updated each time the SQL node was restarted. (Bug #33172887)
DBDICT did not always perform table name checks correctly. (Bug #33161548)
Added a number of missing ID and other values checks in ndbd and ndbmtd. (Bug #33161486, Bug #33162047)
Added a number of missing ID and other values checks in ndbd and ndbmtd. (Bug #33161259, Bug #33161362)
SET_LOGLEVELORD signals were not always handled correctly. (Bug #33161246)
DUMP 11001 did not always handle all of its arguments correctly. (Bug #33157513)
File names were not always verified correctly. (Bug #33157475)
Added a number of missing checks in the data nodes. (Bug #32983723, Bug #33157488, Bug #33161451, Bug #33161477, Bug #33162082)
Added a number of missing ID and other values checks in ndbd and ndbmtd. (Bug #32983700, Bug #32893708, Bug #32957478, Bug #32983256, Bug #32983339, Bug #32983489, Bug #32983517, Bug #33157527, Bug #33157531, Bug #33161271, Bug #33161298, Bug #33161314, Bug #33161331, Bug #33161372, Bug #33161462, Bug #33161511, Bug #33161519, Bug #33161537, Bug #33161570, Bug #33162059, Bug #33162065, Bug #33162074, Bug #33162082, Bug #33162092, Bug #33162098, Bug #33304819)
The management server did not always handle events of the wrong size correctly. (Bug #32957547)
When ndb_mgmd is started without the --config-file option, the user is expected to provide the connection string for another management server in the same cluster, so that the management server being started can obtain configuration information from the other. If the host address in the connection string could not be resolved, then the ndb_mgmd being started hung indefinitely while trying to establish a connection.
This issue occurred because a failure to connect was treated as a temporary error, which led to the ndb_mgmd retrying the connection, which subsequently failed, and so on, repeatedly. We fix this by treating a failure in host name resolution by ndb_mgmd as a permanent error, and immediately exiting. (Bug #32901321)
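The distinction between a temporary connection failure (worth retrying) and a failed host name lookup (treated as permanent) can be sketched as follows. This is a generic illustration using Python's standard socket module, not ndb_mgmd code; `connect_with_retry`, `PermanentConnectError`, and the injectable `resolve` and `connect` parameters are hypothetical.

```python
import socket
import time


class PermanentConnectError(Exception):
    """Raised when retrying cannot help, such as an unresolvable host name."""


def connect_with_retry(host, port, attempts=3, delay=0.0,
                       resolve=socket.getaddrinfo,
                       connect=socket.create_connection):
    last_error = None
    for _ in range(attempts):
        try:
            resolve(host, port)
        except socket.gaierror as exc:
            # Name resolution failed: give up at once instead of looping.
            raise PermanentConnectError(f"cannot resolve {host!r}") from exc
        try:
            return connect((host, port))
        except OSError as exc:
            # Refused or timed out: treat as temporary and try again.
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The lookup is checked separately from the connection attempt because `socket.gaierror` is itself a subclass of `OSError`; catching both in one handler would reproduce the endless retry.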
The order of parameters used in the argument to ndb_import --csvopt is now handled consistently, with the rightmost parameter always taking precedence. This also applies to duplicate instances of a parameter. (Bug #32822757)
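Rightmost-wins handling can be modeled by processing the parameter string from left to right and letting each parameter overwrite any earlier setting for the same property. The parameter letters and their meanings below are assumptions made for illustration (consult the ndb_import documentation for the actual set), and `parse_csvopt` is a hypothetical helper.

```python
# Assumed mapping from parameter letter to the (setting, value) it applies.
CSVOPT_EFFECTS = {
    "c": ("fields_terminated_by", ","),
    "q": ("fields_optionally_enclosed_by", '"'),
    "n": ("lines_terminated_by", "\n"),
    "r": ("lines_terminated_by", "\r"),
}


def parse_csvopt(opts: str) -> dict:
    """Apply parameters left to right so the rightmost one takes precedence."""
    settings = {}
    for letter in opts:
        if letter not in CSVOPT_EFFECTS:
            raise ValueError(f"unknown csvopt parameter: {letter!r}")
        name, value = CSVOPT_EFFECTS[letter]
        # A later parameter overwrites an earlier one for the same setting.
        settings[name] = value
    return settings
```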
In some cases, issues with the redo log while restoring a backup led to an unplanned shutdown of the data node. To fix this, when the redo log file is not available for writes, we now include the correct wait code and waiting log part in the CONTINUEB signal before sending it. (Bug #32733659)
References: See also: Bug #31585833.
The binary logging thread sometimes attempted to start before all data nodes were ready, which led to excess logging of unnecessary warnings and errors. (Bug #32019919)
Instituted a number of value checks in the internal Ndb_table_guard::getTable() method. This fixes a known issue in which an SQL node underwent an unplanned shutdown while executing ALTER TABLE on an NDB table, and potentially additional issues. (Bug #30232826)
Replaced a misleading error message and otherwise improved the behavior of ndb_mgmd when the HostName could not be resolved. (Bug #28960182)
A query used by MySQL Enterprise Monitor to monitor memory use in NDB Cluster became markedly less performant as the number of NDB tables increased. We fix this as follows:
Row counts for virtual ndbinfo tables have been made available to the MySQL optimizer
Size estimates are now provided for all
Primary keys have been added to most internal
Following these improvements, the performance of queries against ndbinfo tables should be comparable to queries against equivalent MyISAM tables. (Bug #28658625)
Following improvements in LDM performance made in NDB 8.0.23, an UPDATE_FRAG_DIST_KEY_ORD signal was never sent when needed to a data node using node ID 1. When running the cluster with 3 or 4 replicas and another node in the same node group restarted, this could result in SQL statements being rejected with error MySQL 1297 SHOW WARNINGS reporting error NDB error
Prior to upgrading to this release, you can work around the issue by restarting data node 1 whenever any other node in the same node group has been restarted.
(Bug #105098, Bug #33460188)
Following the rolling restart of a data node performed as part of an upgrade from NDB 7.6 to NDB 8.0, the data node underwent a forced shutdown. We fix this by allowing LQHKEYREQ signals to be sent to both the DBSPJ kernel blocks. (Bug #105010, Bug #33387443)
When the AutomaticThreadConfig parameter was enabled, NumCPUs was always shown as 0 in the data node log. In addition, when this parameter is in use, thread CPU bindings are now made correctly, and the data node log shows the actual CPU binding for each thread. (Bug #102503, Bug #32474961)
--help did not return the expected output. (Bug #98158, Bug #30733508)
NDB did not close any pending schema transactions when returning an error from internal system table creation and drop functions.