MySQL NDB Cluster 8.0.23 is a new release of NDB 8.0, based on
MySQL Server 8.0 and including features in version 8.0 of the
NDB storage engine, as well as fixing
recently discovered bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 8.0. NDB Cluster 8.0 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 8.0, see What is New in MySQL NDB Cluster 8.0.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 8.0 through MySQL 8.0.23 (see Changes in MySQL 8.0.23 (2021-01-18, General Availability)).
In addition, a number of status variables have been deprecated and are being replaced, as shown in the following table:
Table 1 Deprecated NDB status variables and their replacements
Also as part of this work, the
ADD_TABLE_SLAVE are deprecated, and replaced by, respectively,
--help output of some NDB utility programs such as ndb_restore has been updated. (Bug #31571031)
NDB Client Programs: Effective with this release, the MySQL NDB Cluster Auto-Installer (ndb_setup.py) has been removed from the NDB Cluster binary and source distributions, and is no longer supported. (Bug #32084831)
References: See also: Bug #31888835.
ndbmemcache, which was deprecated in the previous release of NDB Cluster, has now been removed from NDB Cluster, and is no longer supported. (Bug #32106576)
As part of work previously done in NDB 8.0, the metadata check performed as part of auto-synchronization between the representation of an NDB table in the NDB dictionary and its counterpart in the MySQL data dictionary has been extended to include, in addition to table-level properties, the properties of columns, indexes, and foreign keys. (This check is also made by a debug MySQL server when performing a CREATE TABLE statement, and when opening an
As part of this work, any mismatches found between an object's properties in the NDB dictionary and the MySQL data dictionary are now written to the MySQL error log. The error log message includes the name of the property, its value in the NDB dictionary, and its value in the MySQL data dictionary. If the object is a column, index, or foreign key, the object's type is also indicated in the message. (WL #13412)
The ThreadConfig parameter has been extended with two new thread types, query threads and recovery threads, intended for scale-out of LDM threads. The number of query threads must be a multiple of the number of LDM threads, up to a maximum of 3 times the number of LDM threads.
It is also now possible when setting ThreadConfig to combine the main and rep threads into a single thread by setting either or both of these arguments to 0.
When one of these arguments is set to 0 but the other remains set to 1, the resulting combined thread is named main_rep. When both are set to 0, they are combined with the recv thread (assuming that recv is set to 1), and this combined thread is named main_rep_recv. These thread names are those shown when checking the threads table in the ndbinfo information database.
In addition, the maximums for a number of existing thread types have been increased. The new maximums are: LDM threads: 332; TC threads: 128; receive threads: 64; send threads: 64; main threads: 2. (The maximums for query threads and recovery threads are 332 each.) Maximums for other thread types remain unchanged from previous NDB Cluster releases.
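The limits described above can be checked mechanically. The following is a minimal sketch, not part of NDB Cluster: the function name and the dictionary keys are labels chosen for this illustration, and only the LDM, query, recovery, TC, receive, and send maximums quoted in this note are encoded.

```python
# Maximum thread counts as quoted in this release note.
# The keys are illustrative labels for this sketch only.
MAX_THREADS = {
    "ldm": 332,
    "query": 332,
    "recover": 332,
    "tc": 128,
    "recv": 64,
    "send": 64,
}


def check_thread_counts(counts):
    """Return a list of problems found in a {thread_type: count} mapping."""
    problems = []

    # Per-type maximums.
    for kind, count in counts.items():
        limit = MAX_THREADS.get(kind)
        if limit is not None and count > limit:
            problems.append(f"{kind}: {count} exceeds maximum {limit}")

    # Query threads: a multiple of the LDM count, at most 3 times that count.
    ldm = counts.get("ldm", 0)
    query = counts.get("query", 0)
    if query:
        if ldm == 0 or query % ldm != 0:
            problems.append("query: count must be a multiple of the LDM count")
        elif query > 3 * ldm:
            problems.append("query: count must not exceed 3 times the LDM count")

    return problems
```

An empty list means that the proposed counts satisfy the rules stated here; the sketch does not validate anything else about a real ThreadConfig string.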
Another change related to this work causes NDB to employ mutexes for protecting job buffers when more than 32 block threads are in use. This may cause a slight decrease in performance (roughly 1 to 2 percent), but also results in a decrease in the amount of memory used by very large configurations. For example, a setup with 64 threads which used 2 GB of job buffer memory previously should now require only about 1 GB instead. In our testing this has resulted in an overall improvement (on the order of 5 percent) in the execution of very complex queries.
For more information, see the descriptions of the arguments to the ThreadConfig parameter discussed previously, and of the ndbinfo.threads table. (WL #12532, WL #13219, WL #13338)
This release adds the possibility of configuring the threads for multithreaded data nodes (ndbmtd) automatically by implementing a new data node configuration parameter AutomaticThreadConfig. When set to 1, NDB sets up the thread assignments automatically, based on the number of processors available to applications. If the system does not limit the number of processors, you can do this by setting NumCPUs to the desired number. Automatic thread configuration makes it unnecessary to set any values for
AutomaticThreadConfig is enabled, settings for either of these parameters are not used.
As part of this work, a set of tables providing information about hardware and CPU availability and usage by NDB data nodes has been added to the ndbinfo information database. These tables, along with a brief description of the information provided by each, are listed here:
cpudata: CPU usage during the past second
cpudata_1sec: CPU usage per second over the past 20 seconds
cpudata_20sec: CPU usage per 20-second interval over the past 400 seconds
cpudata_50ms: CPU usage per 50-millisecond interval during the past second
cpuinfo: The CPU on which the data node executes
hwinfo: The hardware on the host where the data node resides
Not all of the tables listed are available on all platforms supported by NDB Cluster:
cpudata_50ms tables are available only on Linux and Solaris operating systems.
The cpuinfo table is not available on FreeBSD or macOS.
Added statistical information in the DBLQH block which is employed to track the use of key lookups and scans, as well as tracking queries from DBSPJ. By detecting situations in which the load is high, but in which there is not actually any need to decrease the number of rows scanned per realtime break, rather than checking the size of job buffer queues to decide how many rows to scan, this makes it possible to scan more rows when there is no CPU congestion. This helps improve performance and realtime behaviour when handling high loads. (WL #14081)
A new method for handling table partitions and fragments is introduced, such that the number of local data managers (LDMs) for a given data node can be determined independently of the number of redo log parts, and that the number of LDMs can now be highly variable.
NDB employs this method when the ClassicFragmentation data node configuration parameter, implemented as part of this work, is set to false. When this is done, the number of LDMs is no longer used to determine how many partitions to create for a table per data node; instead, the PartitionsPerNode parameter, also introduced in this release, now determines this number, which is now used for calculating how many fragments a table should have.
If ClassicFragmentation has its default value true, then the traditional method of using the number of LDMs is used to determine how many fragments a table should have.
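The effect of the two settings on partition counts can be illustrated with a little arithmetic. This is a sketch only: the function and its arguments are hypothetical, and it assumes that the cluster-wide partition count of a table is simply the per-node count multiplied by the number of data nodes, which this note does not state and which ignores any per-table partitioning options.

```python
def partitions_per_table(data_nodes, ldms_per_node, partitions_per_node,
                         classic_fragmentation=True):
    """Estimate how many partitions a new table gets across the cluster.

    Assumption for this sketch: total = per-node count * number of data nodes.
    """
    if classic_fragmentation:
        # Traditional method: the per-node count follows the number of LDMs.
        per_node = ldms_per_node
    else:
        # New method: the per-node count comes from PartitionsPerNode.
        per_node = partitions_per_node
    return data_nodes * per_node
```

Under that assumption, a 4-node cluster with 8 LDMs per node and PartitionsPerNode set to 2 would go from 32 partitions per table with ClassicFragmentation enabled to 8 with it disabled, regardless of how many LDMs each node runs.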
For more information, see Multi-Threading Configuration Parameters (ndbmtd). (WL #13930, WL #14107)
macOS: Removed a number of compiler warnings which occurred when building NDB for Mac OS X. (Bug #31726693)
Microsoft Windows: Removed a compiler warning C4146: unary minus operator applied to unsigned type, result still unsigned from Visual Studio 2013 found in
storage\ndb\src\kernel\blocks\dbacc\dbaccmain.cpp. (Bug #23130016)
Solaris: Due to a source-level error, atomic_swap_32() was supposed to be specified but was not actually used for Solaris builds of NDB Cluster. (Bug #31765608)
NDB Cluster APIs: Removed redundant usage of strlen() in the implementation of NdbDictionary and related internal classes in the NDB API. (Bug #100936, Bug #31930362)
MySQL NDB ClusterJ: When a DomainTypeHandler was instantiated by a SessionFactory, it was stored locally in a static map, typeToHandlerMap. If multiple, distinct SessionFactories for separate connections to the data nodes were obtained by a ClusterJ application, the static typeToHandlerMap would be shared by all those factories. When one of the SessionFactories was closed, the connections it created were closed and any tables opened by the connections were cleared from the NDB API global cache. However, the typeToHandlerMap was not cleared, and through it the other SessionFactories kept accessing the DomainTypeHandlers of tables that had already been cleared. These held obsolete NdbTable references, and any ndbapi calls using those table references ended up with errors.
This patch fixes the issue by making the typeToHandlerMap and the related proxyInterfacesToDomainClassMap maps local to a SessionFactory, so that they are cleared when the SessionFactory is closed. (Bug #31710047)
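The difference between a map shared by all factories and a map owned by each factory can be modelled in a few lines. The sketch below is a simplified Python model, not ClusterJ code: the class names `Handler`, `SharedMapFactory`, and `LocalMapFactory` are hypothetical, and a handler counts as stale once the factory that created it has been closed.

```python
class Handler:
    """Stands in for a DomainTypeHandler holding a table reference."""

    def __init__(self, table, owner):
        self.table = table
        self.owner = owner  # factory whose connection opened the table

    def is_stale(self):
        # The table reference is obsolete once the owning factory is closed.
        return self.owner.closed


class SharedMapFactory:
    """Old behavior: one handler map shared by every factory instance."""

    _handlers = {}  # class attribute, comparable to a static map

    def __init__(self):
        self.closed = False

    def handler_for(self, table):
        if table not in SharedMapFactory._handlers:
            SharedMapFactory._handlers[table] = Handler(table, self)
        return SharedMapFactory._handlers[table]

    def close(self):
        self.closed = True  # the shared map keeps its entries


class LocalMapFactory:
    """Fixed behavior: each factory owns its map and clears it on close."""

    def __init__(self):
        self.closed = False
        self._handlers = {}

    def handler_for(self, table):
        if table not in self._handlers:
            self._handlers[table] = Handler(table, self)
        return self._handlers[table]

    def close(self):
        self.closed = True
        self._handlers.clear()
```

With `SharedMapFactory`, a second factory that asks for a table first opened by a closed factory receives the stale handler; with `LocalMapFactory`, it builds its own handler and is unaffected by the other factory closing.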
MySQL NDB ClusterJ: Setting com.mysql.clusterj.connection.pool.size=0 made connections to an NDB Cluster fail. With this fix, setting com.mysql.clusterj.connection.pool.size=0 disables connection pooling as expected, so that every request for a SessionFactory results in the creation of a new factory and separate connections to the cluster can be created using the same connection string. (Bug #21370745, Bug #31721416)
disk_page_abort_prealloc(), the callback from this internal function is ignored, and so removal of the operation record for the LQHKEYREQ signal proceeds without waiting. This left the table subject to removal before the callback had completed, leading to a failure in PGMAN when the page was retrieved from disk.
To avoid this, we add an extra usage count for the table especially for this page cache miss; this count is decremented as soon as the page cache miss returns. This means that we guarantee that the table is still present when returning from the disk read. (Bug #32146931)
When a table was created, it was possible for a fragment of the table to be checkpointed too early during the next local checkpoint. This meant that Prepare Phase LCP writes were still being performed when the LCP completed, which could lead to problems with subsequent ALTER TABLE statements on the table just created. Now we wait for any potential Prepare Phase LCP writes to finish before the LCP is considered complete. (Bug #32130918)
Using the maximum size of an index key supported by index statistics (3056 bytes) caused buffer issues in data nodes. (Bug #32094904)
References: See also: Bug #25038373.
CLOCK_MONOTONIC which on Linux is adjusted by frequency changes but is not updated during suspend. On macOS, CLOCK_UPTIME_RAW which is the same, except that it is not affected by any adjustments.
In addition, when initializing NdbCondition, the monotonic clock to use is taken directly from NdbTick, rather than re-executing the same preprocessor logic used by NdbTick. (Bug #32073826)
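The idea of selecting the clock once and letting other components reuse that choice, rather than repeating the selection logic, can be sketched as follows. This is an illustration in Python rather than the NDB source: `pick_monotonic_clock_id` and `now` are hypothetical names, and the sketch relies only on the standard `time` module, which exposes `CLOCK_UPTIME_RAW` on macOS and `CLOCK_MONOTONIC` on most Unix platforms.

```python
import time


def pick_monotonic_clock_id():
    """Return the preferred clock ID, or None if neither constant exists."""
    # Prefer CLOCK_UPTIME_RAW where the platform exposes it (macOS),
    # otherwise use CLOCK_MONOTONIC.
    for name in ("CLOCK_UPTIME_RAW", "CLOCK_MONOTONIC"):
        clock_id = getattr(time, name, None)
        if clock_id is not None:
            return clock_id
    return None


# Chosen once; every caller reuses this value instead of re-running
# the platform checks.
CLOCK_ID = pick_monotonic_clock_id()


def now():
    """Read the selected clock, in seconds."""
    if CLOCK_ID is None:
        # Fallback for platforms without clock_gettime() constants.
        return time.monotonic()
    return time.clock_gettime(CLOCK_ID)
```

Keeping the choice in one place means that timers and condition waits cannot end up measuring against two different clocks.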
When the data node receive thread found that the job buffer was too full to receive, nothing was done to ensure that, the next time it checked, it resumed receiving from the transporter at the same point at which it stopped previously. (Bug #32046097)
The metadata check failed during auto-synchronization of tables restored using the ndb_restore tool. This was a timing issue relating to indexes, and was found in the following two scenarios encountered when a table had been selected for auto-synchronization:
When the indexes had not yet been created in the NDB dictionary
When the indexes had been created, but were not yet usable
Optimized sending of packed signals by registering the kernel blocks affected and the sending functions which need to be called for each one in a data structure rather than looking up this information each time. (Bug #31936941)
When two data definition language statements—one on a database and another on a table in the same schema—were run in parallel, it was possible for a deadlock to occur. The DDL statement affecting the database acquired the global schema lock first, but before it could acquire a metadata lock on the database, the statement affecting the table acquired an intention-exclusive metadata lock on the schema. The table DDL statement was thus waiting for the global schema lock to upgrade its metadata lock on the table to an exclusive lock, while the database DDL statement waited for an exclusive metadata lock on the database, leading to a deadlock.
A similar type of deadlock involving tablespaces and tables was already known to occur; NDB already detected and resolved that issue. The current fix extends that logic to handle databases and tables as well, to resolve the problem. (Bug #31875229)
Clang 8 raised a warning due to an uninitialized variable. (Bug #31864792)
An empty page acquired for an insert did not receive a log sequence number. This is necessary in case the page was used previously and thus required undo log execution before being used again. (Bug #31859717)
No reason was provided when rejecting an attempt to perform an in-place ALTER TABLE ... ADD PARTITION statement on a fully replicated table. (Bug #31809290)
When the master node had recorded a more recent GCI than a node starting up which had performed an unsuccessful restart, subsequent restarts of the latter could not be performed because it could not restore the stated GCI. (Bug #31804713)
When using 3 or 4 fragment replicas, it is possible to add more than one node at a time, which means that DBDIH can have distribution keys based on numbers of fragment replicas that differ by up to 3 (that is, MAX_REPLICAS - 1), rather than by only 1. (Bug #31784934)
It was possible in
ABORT signal to arrive from DBTC before it received an LQHKEYREF signal from the next local query handler. Now in such cases, the out-of-order ABORT signal is ignored. (Bug #31782578)
NDB did not handle correctly the case when an ALTER TABLE ... COMMENT="..." statement did not specify ALGORITHM=COPY. (Bug #31776392)
It was possible in some cases to miss the end point of undo logging for a fragment. (Bug #31774459)
ndb_print_sys_file did not work correctly with version 2 of the sysfile format that was introduced in NDB 8.0.18. (Bug #31726653)
References: See also: Bug #31828452.
DBLQH could not handle the case in which identical operation records having the same transaction ID came from different transaction coordinators. This led to locked rows persisting after a node failure, which kept node recovery from completing. (Bug #31726568)
It is possible for DBDIH to receive a local checkpoint having a given ID to restore while a later LCP is actually used instead, but when performing a partial LCP in such cases, the DIH block was not fully synchronized with the ID of the LCP used. (Bug #31726514)
In most cases, when searching a hash index, the row is used to read the primary key, but when the row has not yet been committed the primary key may be read from the copy row. If the row has been deleted, it can no longer be used to read the primary key. Previously in such cases, the primary key was treated as a NULL, but this could lead to making a comparison using uninitialised data.
Now when this occurs, the comparison is made only if the row has not been deleted; otherwise the row is checked for among the operations in the serial queue. If no operation has the primary key, then any comparison can be reported as not equal, since no entry in the parallel queue can reinsert the row. This check is needed because, if an entry in the serial queue is an insert, the primary key from this operation must be identified as such to preclude inserting the same primary key twice. (Bug #31688797)
As with writing redo log records, when the file currently used for writing global checkpoint records becomes full, writing switches to the next file. This switch is not supposed to occur until the new file is actually ready to receive the records, but no check was made to ensure that this was the case. This could lead to an unplanned data node shutdown when restoring data from a backup using ndb_restore. (Bug #31585833)
Release of shared global memory when it is no longer required by the DBSPJ block now occurs more quickly than previously. (Bug #31321518)
References: See also: Bug #31231286.
Stopping 3 nodes out of 4 in a single node group using kill -9 caused an unplanned cluster shutdown. To keep this from happening under such conditions, NDB now ensures that any node group that has not had any node failures is viewed by arbitration checks as fully viable. (Bug #31245543)
Multi-threaded index builds could sometimes attempt to use an internal function disallowed to them. (Bug #30587462)
While adding new data nodes to the cluster, and while the management node was restarting with an updated configuration file, some data nodes terminated unexpectedly with the error virtual void TCP_Transporter::resetBuffers(): Assertion `!isConnected()' failed. (Bug #30088051)
Optimized the internal NdbReceiver::unpackNdbRecord() method, which is used to convert rows retrieved from the data nodes from packed wire format to the NDB API row format. Prior to the change, roughly 13% of CPU usage for executing a join occurred within this method; this was reduced to approximately 8%. (Bug #95007, Bug #29640755)