MySQL NDB Cluster 8.0.16 is a new development release of NDB 8.0,
based on MySQL Server 8.0 and including features in version 8.0 of
the NDB storage engine, as well as
fixing recently discovered bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 8.0. NDB Cluster 8.0 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 8.0, see What is New in NDB Cluster.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 8.0 through MySQL 8.0.16 (see Changes in MySQL 8.0.16 (2019-04-25, General Availability)).
Incompatible Change: Distribution of privileges amongst MySQL servers connected to NDB Cluster, as implemented in NDB 7.6 and earlier, does not function in NDB 8.0, and most code supporting it has now been removed. When a mysqld detects such tables in
NDB, it creates shadow tables local to itself using the
InnoDB storage engine; these shadow tables are created on each MySQL server connected to an NDB cluster. Privilege tables using the
NDB storage engine are not employed for access control; once all connected MySQL servers are upgraded, the privilege tables in
NDB can be removed safely using ndb_drop_table.
For compatibility reasons, ndb_restore
--restore-privilege-tables can still be used to restore distributed privilege tables present in a backup taken from a previous release of NDB Cluster to a cluster running NDB 8.0. These tables are handled as described in the preceding paragraph.
For additional information regarding upgrades from previous NDB Cluster release series to NDB 8.0, see Upgrading and Downgrading NDB Cluster.
Incompatible Change: For consistency with
NDB storage engine now uses a generated constraint name if the
CONSTRAINT clause is not specified, or the
CONSTRAINT keyword is specified without a
symbol. In previous
The change described above may introduce incompatibilities for applications that depend on the previous foreign key constraint naming behavior. (Bug #29173134)
Packaging: A Docker image for this release can be obtained from https://hub.docker.com/r/mysql/mysql-cluster/. (Bug #96084, Bug #30010921)
Allocation of resources in the transaction coordinator (see The DBTC Block) is now performed using dynamic memory pools. This means that resource allocation determined by data node configuration parameters such as those discussed in Transaction parameters and Transaction temporary storage is now limited so as not to exceed the total resources available to the transaction coordinator.
As part of this work, several new data node parameters controlling transactional resources in
DBTC, listed here, have also been added. For more information about these new parameters, see Transaction resource allocation parameters. (Bug #29164271, Bug #29194843)
References: See also: Bug #29131828.
NDB backups can now be performed in a parallel fashion on individual data nodes using multiple local data managers (LDMs). (Previously, backups were done in parallel across data nodes, but were always serial within data node processes.) No special syntax is required for the
START BACKUP command in the ndb_mgm client to enable this feature, but all data nodes must be using multiple LDMs. This means that data nodes must be running ndbmtd and they must be configured to use multiple LDMs prior to taking the backup (see Multi-Threading Configuration Parameters (ndbmtd)).
ndb_restore also now detects such a backup and automatically attempts to restore it in parallel. It is also possible to restore backups taken in parallel to a previous version of NDB Cluster by slightly modifying the usual restore procedure.
For more information about taking and restoring NDB Cluster backups that were created using parallelism on the data nodes, see Taking an NDB Backup with Parallel Data Nodes, and Restoring from a backup taken in parallel. (Bug #28563639, Bug #28993400)
The compile-cluster script included in the
NDB source distribution no longer supports in-source builds.
Building with CMake3 is now supported by the compile-cluster script included in the
As part of its automatic synchronization mechanism,
NDB now implements a metadata change monitor thread for detecting changes made to metadata for data objects such as tables, tablespaces, and log file groups with the MySQL data dictionary. This thread runs in the background, checking every 60 seconds for inconsistencies between the
NDB dictionary and the MySQL data dictionary.
The monitor polling interval can be adjusted by setting the value of the
ndb_metadata_check_interval system variable, and polling can be disabled altogether by setting
ndb_metadata_check to OFF. The number of times that inconsistencies have been detected since mysqld was last started is shown as the status variable,
Condition pushdown is no longer limited to predicate terms referring to column values from the same table to which the condition was being pushed; column values from tables earlier in the query plan can now also be referred to from pushed conditions. This lets the data nodes filter out more rows (in parallel), leaving less work to be performed by a single mysqld process, which is expected to provide significant improvements in query performance.
For more information, see Engine Condition Pushdown Optimization.
Important Change; NDB Disk Data: mysqldump terminated unexpectedly when attempting to dump
NDB disk data tables. The underlying reason for this was that mysqldump expected to find information relating to undo log buffers in the
EXTRA column of the
INFORMATION_SCHEMA.FILES table but this information had been removed in NDB 8.0.13. This information is now restored to the
EXTRA column. (Bug #28800252)
Important Change: When restoring to a cluster using data node IDs different from those in the original cluster, ndb_restore tried to open files corresponding to node ID 0. To keep this from happening, the
--backupid options (neither of which has a default value) are both now explicitly required when invoking ndb_restore. (Bug #28813708)
Important Change: Starting with this release, the default value of the
ndb_log_bin system variable is now
FALSE. (Bug #27135706)
Packaging; MySQL NDB ClusterJ:
libndbclient was missing from builds on some platforms. (Bug #28997603)
NDB Disk Data: When a log file group had more than 18 undo logs, it was not possible to restart the cluster. (Bug #251155785)
References: See also: Bug #28922609.
NDB Disk Data: Concurrent
CREATE TABLE statements using tablespaces caused deadlocks between metadata locks. This occurred when
Ndb_metadata_change_monitor acquired exclusive metadata locks on tablespaces and logfile groups after detecting metadata changes, due to the fact that each exclusive metadata lock in turn acquired a global schema lock. This fix attempts to solve that issue by downgrading the locks taken by
MDL_SHARED_READ. (Bug #29175268)
References: See also: Bug #29394407.
NDB Disk Data: The error message returned when validation of
MaxNoOfOpenFiles in relation to
InitialNoOfOpenFiles failed has been improved to make the nature of the problem clearer to users. (Bug #28943749)
NDB Disk Data: Schema distribution of
ALTER LOGFILE GROUP statements failed on a participant MySQL server if the referenced tablespace or log file group did not exist in its data dictionary. Now in such cases, the effects of the statement are distributed successfully regardless of any initial mismatch between MySQL servers. (Bug #28866336)
NDB Disk Data: Repeated execution of
ALTER TABLESPACE ... ADD DATAFILE against the same tablespace caused data nodes to hang and left them, after being killed manually, unable to restart. (Bug #22605467)
NDB Replication: A
DROP DATABASE operation involving certain very large tables could lead to an unplanned shutdown of the cluster. (Bug #28855062)
NDB Replication: When writes on the master (done in such a way that multiple changes affecting
BLOB column values belonging to the same primary key were part of the same epoch) were replicated to the slave, Error 1022 occurred due to constraint violations in the
NDB$BLOB_table. (Bug #28746560)
NDB Cluster APIs:
NDB now identifies short-lived transactions not needing the reduction of lock contention provided by
NdbBlob::close() and no longer invokes this method in cases (such as when autocommit is enabled) in which unlocking merely causes extra work and round trips to be performed prior to committing or aborting the transaction. (Bug #29305592)
References: See also: Bug #49190, Bug #11757181.
NDB Cluster APIs: When the most recently failed operation was released, the pointer to it held by
NdbTransaction became invalid and when accessed led to failure of the NDB API application. (Bug #29275244)
NDB Cluster APIs: When the
SUMA block sends a
TE_ALTER event, it does not keep track of when all fragments of the event are sent. When
NDB receives the event, it buffers the fragments, and processes the event when all fragments have arrived. An issue could possibly arise for very large table definitions, when the time between transmission and reception could span multiple epochs; during this time,
SUMA could send a
SUB_GCP_COMPLETE_REP signal to indicate that it has sent all data for an epoch, even though in this case that is not entirely true since there may be fragments of a
TE_ALTER event still waiting on the data node to be sent. Reception of the
SUB_GCP_COMPLETE_REP leads to closing the buffers for that epoch. Thus, when
TE_ALTER finally arrives, NDB assumes that it is a duplicate from an earlier epoch, and silently discards it.
We fix the problem by making sure that the
SUMA kernel block never sends a
SUB_GCP_COMPLETE_REP for any epoch in which there are unsent fragments for a
This issue could have an impact on NDB API applications making use of
TE_ALTER events. (SQL nodes do not make any use of
TE_ALTER events and so they and applications using them were not affected.) (Bug #28836474)
When a pushed join executing in the
DBSPJ block had to store correlation IDs during query execution, memory for these was allocated for the lifetime of the entire query execution, even though these specific correlation IDs are required only when producing the most recent batch in the result set. Subsequent batches require additional correlation IDs to be stored and allocated; thus, if the query took sufficiently long to complete, this led to exhaustion of query memory (error 20008). Now in such cases, memory is allocated only for the lifetime of the current result batch, and is freed and made available for re-use following completion of the batch. (Bug #29336777)
References: See also: Bug #26995027.
When comparing or hashing a fixed-length string that used a
NO_PAD collation, any trailing padding characters (typically spaces) were sent to the hashing and comparison functions such that they became significant, even though they were not supposed to be. Now any such trailing spaces are trimmed from a fixed-length string whenever a
NO_PAD collation is specified.
Note: Because NO_PAD collations were introduced as part of UCA-9.0 collations in MySQL 8.0, there should be no impact relating to this fix on upgrades to NDB 8.0 from previous GA releases of NDB Cluster.
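The effect of trimming pad characters before hashing can be illustrated with a minimal sketch. It does not use NDB's own hashing or collation code: the helper names, the space pad character, and the use of SHA-256 are assumptions chosen only to make the example self-contained.

```python
import hashlib

PAD = " "  # assumed pad character for a fixed-length column


def store_fixed(value: str, width: int) -> str:
    """Pad a value to the column width, as a fixed-length column of that width would."""
    if len(value) > width:
        raise ValueError("value longer than column width")
    return value.ljust(width, PAD)


def hash_key(stored: str, trim_padding: bool) -> str:
    """Hash a stored value, optionally trimming trailing pad characters first."""
    data = stored.rstrip(PAD) if trim_padding else stored
    # SHA-256 is a stand-in here; it is not the hash function used by NDB.
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


a = store_fixed("abc", 5)  # same logical value stored in a 5-character column
b = store_fixed("abc", 8)  # and in an 8-character column

# Without trimming, the padding becomes significant and the hashes differ.
untrimmed_equal = hash_key(a, trim_padding=False) == hash_key(b, trim_padding=False)
# With trimming, both stored forms hash identically.
trimmed_equal = hash_key(a, trim_padding=True) == hash_key(b, trim_padding=True)
```

In this sketch, `untrimmed_equal` is false and `trimmed_equal` is true, which mirrors the difference between the old and the corrected behavior described above.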
NOT BETWEEN predicate was evaluated as a pushed condition,
NULL values were not eliminated by the condition as specified in the SQL standard. (Bug #29232744)
References: See also: Bug #28672214.
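The expected handling of NULL by a NOT BETWEEN predicate follows from SQL three-valued logic, and can be sketched as follows. This is an illustration of the standard semantics only, not NDB code; the function names are hypothetical, `None` stands in for SQL NULL, and the bounds are assumed to be non-NULL.

```python
from typing import List, Optional


def sql_not_between(x: Optional[int], low: int, high: int) -> Optional[bool]:
    """Evaluate `x NOT BETWEEN low AND high` with three-valued logic.

    None represents SQL NULL, so a NULL operand yields UNKNOWN (None).
    """
    if x is None:
        return None
    return not (low <= x <= high)


def filter_rows(rows: List[Optional[int]], low: int, high: int) -> List[int]:
    """Keep only rows for which the predicate is TRUE; UNKNOWN rows are dropped."""
    return [x for x in rows if sql_not_between(x, low, high) is True]


rows = [1, 5, None, 12]
kept = filter_rows(rows, 3, 10)  # the NULL row must not survive the filter
```

Here `kept` contains 1 and 12 only: the value 5 fails the predicate, and the NULL row is eliminated because the predicate evaluates to UNKNOWN rather than TRUE.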
NULL as less than any other value, and predicates of the form
are checked for possible nulls. Predicates of the form
were not checked, which could lead to errors. Now in such cases, these predicates are rewritten so that the column comes first, so that they are also checked for the presence of
NULL. (Bug #29231709)
References: See also: Bug #92407, Bug #28643463.
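Rewriting a comparison so that the column comes first requires swapping the operands and mirroring the operator. The sketch below shows that transformation under stated assumptions: the `Column` type, the operator table, and the function name are hypothetical and are not taken from the NDB or MySQL source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical marker type for a column reference."""
    name: str


# Operator to use after the two operands of a comparison are swapped.
SWAPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<=", "=": "=", "<>": "<>"}


def column_first(left, op, right):
    """Return an equivalent (column, op, value) triple with the column on the left."""
    if isinstance(left, Column):
        return (left, op, right)
    if isinstance(right, Column):
        return (right, SWAPPED[op], left)
    raise ValueError("predicate does not reference a column")


# `10 > a` is rewritten as `a < 10`, so one column-first code path
# (including its NULL check) handles both spellings.
rewritten = column_first(10, ">", Column("a"))
```

Normalizing in this way means that any check written for the column-first form applies to both orderings of the same comparison.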
When a join condition made a comparison between a column of a temporal data type such as
DATETIME and a constant of the same type, the predicate was pushed if the condition was expressed in the form
, but not when in inverted order (as
). (Bug #29058732)
When processing a pushed condition,
NDB did not detect errors or warnings thrown when a literal value being compared was outside the range of the data type it was being compared with, and thus truncated. This could lead to excess or missing rows in the result. (Bug #29054626)
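A small sketch shows how a silently truncated literal can produce both missing and excess rows. It assumes a signed 8-bit column range and hypothetical helper names; declining to push an out-of-range literal is one possible handling chosen for illustration, not a description of how NDB implements the fix.

```python
TINYINT_MIN, TINYINT_MAX = -128, 127  # assumed range of a signed 8-bit column


def truncate(value: int) -> int:
    """Clamp a literal to the column range, as a silent truncation would."""
    return max(TINYINT_MIN, min(TINYINT_MAX, value))


def can_push(literal: int) -> bool:
    """Hypothetical check: only push a comparison whose literal fits the column type."""
    return TINYINT_MIN <= literal <= TINYINT_MAX


rows = [5, 126, 127]
literal = 300  # outside the column range

# Correct results, comparing against the real literal.
expected_lt = [r for r in rows if r < literal]
expected_eq = [r for r in rows if r == literal]

# Results when the literal is truncated to 127 before comparing.
truncated_lt = [r for r in rows if r < truncate(literal)]   # loses the row 127
truncated_eq = [r for r in rows if r == truncate(literal)]  # gains the row 127
```

With these values, the truncated less-than comparison drops a row that should match, and the truncated equality comparison returns a row that should not.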
REF key in the child of a pushed join referred to any columns of a table not a member of the pushed join, this table was not an
NDB table (because its format was of nonnative endianness), and the data type of the column being joined on was stored in an endian-sensitive format, then the key was generated incorrectly, likely resulting in the return of an (invalid) empty join result.
Since only big endian platforms may store tables in nonnative (little endian) formats, this issue was expected only on such platforms, most notably SPARC, and not on x86 platforms. (Bug #29010641)
API and data nodes running NDB 7.6 and later could not use an existing parsed configuration from an earlier release series due to being overly strict with regard to having values defined for configuration parameters new to the later release, which placed a restriction on possible upgrade paths. Now NDB 7.6 and later are less strict about having all new parameters specified explicitly in the configuration served to them, and use hard-coded default values in such cases. (Bug #28993400)
NDB 7.6 SQL nodes hung when trying to connect to an NDB 8.0 cluster. (Bug #28985685)
The schema distribution data maintained in the
NDB binary logging thread keeping track of the number of subscribers to the
NDB schema table always allocated some memory structures for 256 data nodes regardless of the actual number of nodes. Now
NDB allocates only as many of these structures as are actually needed. (Bug #28949523)
NdbfsDumpRequests) to provide
NDB file system information to global checkpoint and local checkpoint stall reports in the node logs. (Bug #28922609)
When a joined table was eliminated early as not pushable, it could not be referred to in any subsequent join conditions from other tables without eliminating those conditions from consideration even if those conditions were otherwise pushable. (Bug #28898811)
When starting or restarting an SQL node and connecting to a cluster where
NDB was already started,
NDB reported Error 4009 Cluster Failure because it could not acquire a global schema lock. This was because the MySQL Server as part of initialization acquires exclusive metadata locks in order to modify internal data structures, and the
ndbcluster plugin acquires the global schema lock. If the connection to
NDB was not yet properly set up during mysqld initialization, mysqld received a warning from
ndbcluster when the latter failed to acquire the global schema lock, and printed it to the log file, causing an unexpected error in the log. This is fixed by not pushing any warnings to background threads when failure to acquire a global schema lock occurs and pushing the
NDB error as a warning instead. (Bug #28898544)
A race condition between the
DBLQH kernel blocks occurred when different operations in a transaction on the same row were concurrently being prepared and aborted. This could result in
DBTUP attempting to prepare an operation when a preceding operation had been aborted, which was unexpected and could thus lead to undefined behavior including potential data node failures. To solve this issue,
DBLQH now check that all dependencies are still valid before attempting to prepare an operation.
Note: This fix also supersedes a previous one made for a related issue which was originally reported as Bug #28500861.
Where a data node was restarted after a configuration change whose result was a decrease in the sum of
MaxNoOfUniqueHashIndexes, it sometimes failed with a misleading error message which suggested both a temporary error and a bug, neither of which was the case.
The failure itself is expected, being due to the fact that there is at least one table object with an ID greater than the (new) sum of the parameters just mentioned, and that this table cannot be restored since the maximum value for the ID allowed is limited by that sum. The error message has been changed to reflect this, and now indicates that this is a permanent error due to a problem configuration. (Bug #28884880)
ndbinfo.cpustat table reported inaccurate information regarding send threads. (Bug #28884157)
Execution of an LCP_COMPLETE_REP signal from the master while the LCP status was IDLE led to an assertion. (Bug #28871889)
NDB now provides on-the-fly
.frm file translation during discovery of tables created in versions of the software that did not support the MySQL Data Dictionary. Previously, such translation of tables that had old-style metadata was supported only during schema synchronization during MySQL server startup, but not subsequently, which led to errors when
NDB tables having old-style metadata, created by ndb_restore and other such tools after mysqld had been started, were accessed using
SHOW CREATE TABLE or
SELECT; these tables were usable only after restarting mysqld. With this fix, the restart is no longer required. (Bug #28841009)
An in-place upgrade to an NDB 8.0 release from an earlier release did not remove
.ndb files, even though these are no longer used in NDB 8.0. (Bug #28832816)
storage/ndb/demos and the demonstration scripts and support files it contained from the source tree. These were obsolete and unmaintained, and did not function with any current version of NDB Cluster.
storage/ndb/include/newtonapi, which included files relating to an obsolete and unmaintained API not supported in any release of NDB Cluster, as well as references elsewhere to these files. (Bug #28808766)
There was no version compatibility table for NDB 8.x; this meant that API nodes running NDB 8.0.13 or 7.6.x could not connect to data nodes running NDB 8.0.14. This issue manifested itself for NDB API users as a failure in
wait_until_ready(). (Bug #28776365)
References: See also: Bug #18886034, Bug #18874849.
A fix for a previous issue disabled the usage of pushed conditions for lookup type (
eq_ref) operations in pushed joins. It was thought at the time that not pushing a lookup condition would not have any measurable impact on performance, since only a single row could be eliminated if the condition failed. The solution implemented at that time did not take into account the possibility that, in a pushed join, a lookup operation could be a parent operation for other lookups, and even scan operations, which meant that eliminating a single row could actually result in an entire branch being eliminated in error. (Bug #28728603)
References: This issue is a regression of: Bug #27397802.
When a local checkpoint (LCP) was complete on all data nodes except one, and this node failed,
NDB did not continue with the steps required to finish the LCP. This led to the following issues:
No new LCPs could be started.
Redo and Undo logs were not trimmed and so grew excessively large, causing an increase in times for recovery from disk. This led to write service failure, which eventually led to cluster shutdown when the head of the redo log met the tail. This placed a limit on cluster uptime.
Node restarts were no longer possible, due to the fact that a data node restart requires that the node's state be made durable on disk before it can provide redundancy when joining the cluster. For a cluster with two data nodes and two fragment replicas, this meant that a restart of the entire cluster (system restart) was required to fix the issue (this was not necessary for a cluster with two fragment replicas and four or more data nodes). (Bug #28728485, Bug #28698831)
References: See also: Bug #11757421.
The pushability of a condition to
NDB was limited in that all predicates joined by a logical
AND within a given condition had to be pushable to
NDB in order for the entire condition to be pushed. In some cases this severely restricted the pushability of conditions. This fix breaks up the condition into its components, and evaluates the pushability of each predicate; if some of the predicates cannot be pushed, they are returned as a remainder condition which can be evaluated by the MySQL server. (Bug #28728007)
ANALYZE TABLE on an
NDB table with an index whose length exceeded the supported maximum caused data nodes to fail. (Bug #28714864)
It was possible in certain cases for nodes to hang during an initial restart. (Bug #28698831)
References: See also: Bug #27622643.
When a condition was pushed to a storage engine, it was re-evaluated by the server, in spite of the fact that only rows matching the pushed condition should ever be returned to the server in such cases. (Bug #28672214)
In some cases, one and sometimes more data nodes underwent an unplanned shutdown while running ndb_restore. This occurred most often, but was not always restricted to, when restoring to a cluster having a different number of data nodes from the cluster on which the original backup had been taken.
The root cause of this issue was exhaustion of the pool of
SafeCounter objects, used by the
DBDICT kernel block as part of executing schema transactions, and taken from a per-block-instance pool shared with protocols used for
NDB event setup and subscription processing. The concurrency of event setup and subscription processing is such that the
SafeCounter pool can be exhausted; event and subscription processing can handle pool exhaustion, but schema transaction processing could not, which could result in the node shutdown experienced during restoration.
This problem is solved by giving
DBDICT schema transactions an isolated pool of reserved
SafeCounters which cannot be exhausted by concurrent
NDB event activity. (Bug #28595915)
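The idea of isolating a reserved pool from a shared one can be shown with a minimal sketch. The class and method names and the pool sizes are hypothetical and do not correspond to the actual SafeCounter implementation in the data node kernel.

```python
class CounterPool:
    """Counter pool with a shared part and a part reserved for schema transactions."""

    def __init__(self, shared: int, reserved: int) -> None:
        self.shared_free = shared
        self.reserved_free = reserved

    def seize_for_event(self) -> bool:
        """Event and subscription processing draws only from the shared part."""
        if self.shared_free == 0:
            return False  # callers of this path are able to handle exhaustion
        self.shared_free -= 1
        return True

    def seize_for_schema_trans(self) -> bool:
        """Schema transactions draw only from the reserved part."""
        if self.reserved_free == 0:
            return False
        self.reserved_free -= 1
        return True


pool = CounterPool(shared=2, reserved=1)

# Concurrent event activity exhausts the shared part of the pool...
event_results = [pool.seize_for_event() for _ in range(3)]
# ...but a schema transaction can still obtain a counter.
schema_ok = pool.seize_for_schema_trans()
```

In this sketch the third event request fails while the schema transaction request succeeds, since event activity can never consume the reserved counters.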
When a backup aborted due to buffer exhaustion, synchronization of the signal queues prior to the expected drop of triggers for insert, update, and delete operations resulted in abort signals being processed before the
STOP_BACKUP phase could continue. The abort changed the backup status to
ABORT_BACKUP_ORD, which led to an unplanned shutdown of the data node since resuming
STOP_BACKUP requires that the state be
STOP_BACKUP_REQ. Now the backup status is not set to
STOP_BACKUP_REQ (requesting the backup to continue) until after signal queue synchronization is complete. (Bug #28563639)
The output of ndb_config
--query-all now shows that configuration changes for the
MaxNoOfExecutionThreads data node parameters require system initial restarts (
restart="system" initial="true"). (Bug #28494286)
After a commit failed due to an error, mysqld shut down unexpectedly while trying to get the name of the table involved. This was due to an issue in the internal function
ndbcluster_print_error(). (Bug #28435082)
API nodes should observe that a node is moving through
SL_STOPPING phases (graceful stop) and stop using the node for new transactions, which minimizes potential disruption in the later phases of the node shutdown process. API nodes were only informed of node state changes via periodic heartbeat signals, and so might not be able to avoid interacting with the node shutting down. This generated unnecessary failures when the heartbeat interval was long. Now when a data node is being gracefully stopped, all API nodes are notified directly, allowing them to experience minimal disruption. (Bug #28380808)
ndb_restore did not restore autoincrement values correctly when one or more staging tables were in use. As part of this fix, we also in such cases block applying of the
SYSTAB_0 backup log, whose content continued to be applied directly based on the table ID, which could overwrite the autoincrement values stored in
SYSTAB_0 for unrelated tables. (Bug #27917769, Bug #27831990)
References: See also: Bug #27832033.
ndb_restore employed a mechanism for restoring autoincrement values which was not atomic, and thus could yield incorrect autoincrement values being restored when multiple instances of ndb_restore were used in parallel. (Bug #27832033)
References: See also: Bug #27917769, Bug #27831990.
When tables with
BLOB columns were dropped and then re-created with a different number of
BLOB columns, the event definitions for monitoring table changes could become inconsistent in certain error situations involving communication errors when the expected cleanup of the corresponding events was not performed. In particular, when the new versions of the tables had more
BLOB columns than the original tables, some events could be missing. (Bug #27072756)
When query memory was exhausted in the
DBSPJ kernel block while storing correlation IDs for deferred operations, the query was aborted with error status 20000 Query aborted due to out of query memory. (Bug #26995027)
References: See also: Bug #86537.
When running a cluster with 4 or more data nodes under very high loads, data nodes could sometimes fail with Error 899 Rowid already allocated. (Bug #25960230)
mysqld shut down unexpectedly when a purge of the binary log was requested before the server had completely started, and it was thus not yet ready to delete rows from the
ndb_binlog_index table. Now when this occurs, requests for any needed purges of the
ndb_binlog_index table are saved in a queue and held for execution when the server has completely started. (Bug #25817834)
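Deferring purge requests until startup has finished can be sketched with a simple queue. The class name, method names, and binary log file names below are hypothetical; the sketch records the order in which purges would run and does not touch any real table.

```python
from collections import deque


class BinlogIndexPurger:
    """Queue purge requests that arrive before the server has fully started."""

    def __init__(self) -> None:
        self.started = False
        self.pending = deque()  # requests held until startup completes
        self.purged = []        # requests that have been executed, in order

    def request_purge(self, binlog_file: str) -> None:
        """Execute the purge immediately if possible, otherwise queue it."""
        if self.started:
            self.purged.append(binlog_file)
        else:
            self.pending.append(binlog_file)

    def server_started(self) -> None:
        """Mark startup as complete and run all queued purges in arrival order."""
        self.started = True
        while self.pending:
            self.purged.append(self.pending.popleft())


purger = BinlogIndexPurger()
purger.request_purge("binlog.000001")       # arrives too early, so it is queued
queued_before_start = list(purger.pending)
purger.server_started()                     # the queued request runs now
purger.request_purge("binlog.000002")       # runs immediately
```

A request made before startup is therefore held rather than executed against a table that is not yet ready, and is carried out once the server signals that it has started.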
MaxBufferedEpochs is used on data nodes to avoid excessive buffering of row changes due to lagging
NDB event API subscribers; when epoch acknowledgements from one or more subscribers lag by this number of epochs, an asynchronous disconnection is triggered, allowing the data node to release the buffer space used for subscriptions. Since this disconnection is asynchronous, it may be the case that it has not completed before additional new epochs are completed on the data node, resulting in new epochs not being able to seize GCP completion records, generating warnings such as those shown here:
[ndbd] ERROR -- c_gcp_list.seize() failed...
...
[ndbd] WARNING -- ACK wo/ gcp record...
And leading to the following warning:
Disconnecting node %u because it has exceeded MaxBufferedEpochs (100 > 100), epoch ....
This fix performs the following modifications:
Modifies the size of the GCP completion record pool to ensure that there is always some extra headroom to account for the asynchronous nature of the disconnect processing previously described, thus avoiding
Modifies the wording of the
MaxBufferedEpochs warning to avoid the contradictory phrase “100 > 100”.
Asynchronous disconnection of mysqld from the cluster caused any subsequent attempt to start an NDB API transaction to fail. If this occurred during a bulk delete operation, the SQL layer called
HA::end_bulk_delete(), whose implementation by
ha_ndbcluster assumed that a transaction had been started, and could fail if this was not the case. This problem is fixed by checking that the transaction pointer used by this method is set before referencing it. (Bug #20116393)
Removed warnings raised when compiling
NDB with Clang 6. (Bug #93634, Bug #29112560)
When executing the redo log in debug mode it was possible for a data node to fail when deallocating a row. (Bug #93273, Bug #28955797)
As part of this fix,
ON DELETE CASCADE is no longer supported for foreign keys on
NDB tables when the child table contains a column that uses any of the
TEXT types. (Bug #89511, Bug #27484882)