MySQL NDB Cluster 8.0.16 is a new development release of NDB 8.0, based on MySQL Server 8.0 and including features in version 8.0 of the NDB storage engine, as well as fixing recently discovered bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 8.0. NDB Cluster 8.0 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 8.0, see What is New in MySQL NDB Cluster 8.0.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 8.0 through MySQL 8.0.16 (see Changes in MySQL 8.0.16 (2019-04-25, General Availability)).
-
Incompatible Change: Distribution of privileges amongst MySQL servers connected to NDB Cluster, as implemented in NDB 7.6 and earlier, does not function in NDB 8.0, and most of the code supporting it has now been removed. When a mysqld detects distributed privilege tables in NDB, it creates shadow tables local to itself using the InnoDB storage engine; these shadow tables are created on each MySQL server connected to an NDB cluster. Privilege tables using the NDB storage engine are not employed for access control; once all connected MySQL servers are upgraded, the privilege tables in NDB can be removed safely using ndb_drop_table.
For compatibility reasons, ndb_restore --restore-privilege-tables can still be used to restore distributed privilege tables present in a backup taken from a previous release of NDB Cluster to a cluster running NDB 8.0. These tables are handled as described in the preceding paragraph.
For additional information regarding upgrades from previous NDB Cluster release series to NDB 8.0, see Upgrading and Downgrading NDB Cluster. (WL #12507, WL #12511)
-
Incompatible Change: For consistency with InnoDB, the NDB storage engine now uses a generated constraint name if the CONSTRAINT symbol clause is not specified, or the CONSTRAINT keyword is specified without a symbol. In previous NDB releases, NDB used the FOREIGN KEY index_name value.
The change described above may introduce incompatibilities for applications that depend on the previous foreign key constraint naming behavior. (Bug #29173134)
Packaging: A Docker image for this release can be obtained from https://hub.docker.com/r/mysql/mysql-cluster/. (Bug #96084, Bug #30010921)
-
Allocation of resources in the transaction coordinator (see The DBTC Block) is now performed using dynamic memory pools. This means that resource allocation determined by data node configuration parameters such as those discussed in Transaction parameters and Transaction temporary storage is now limited so as not to exceed the total resources available to the transaction coordinator.
As part of this work, several new data node parameters controlling transactional resources in DBTC have also been added. For more information about these new parameters, see Transaction resource allocation parameters. (Bug #29164271, Bug #29194843, WL #9756, WL #12523)
References: See also: Bug #29131828.
-
NDB backups can now be performed in a parallel fashion on individual data nodes using multiple local data managers (LDMs). (Previously, backups were done in parallel across data nodes, but were always serial within data node processes.) No special syntax is required for the START BACKUP command in the ndb_mgm client to enable this feature, but all data nodes must be using multiple LDMs. This means that data nodes must be running ndbmtd and they must be configured to use multiple LDMs prior to taking the backup (see Multi-Threading Configuration Parameters (ndbmtd)).
The EnableMultithreadedBackup data node parameter introduced in this release is enabled (set to 1) by default. You can disable multi-threaded backups and force the creation of single-threaded backups by setting this parameter to 0 on all data nodes or in the [ndbd default] section of the cluster's global configuration file (config.ini).
ndb_restore also now detects a multi-threaded backup and automatically attempts to restore it in parallel. It is also possible to restore backups taken in parallel to a previous version of NDB Cluster by slightly modifying the usual restore procedure.
For more information about taking and restoring NDB Cluster backups that were created using parallelism on the data nodes, see Taking an NDB Backup with Parallel Data Nodes, and Restoring from a backup taken in parallel. (Bug #28563639, Bug #28993400, WL #8517)
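As a minimal sketch of the setting described above, the following Python snippet parses a hypothetical config.ini fragment in which multi-threaded backups are disabled for all data nodes. Only the [ndbd default] section name and the EnableMultithreadedBackup parameter come from the text; the helper function and the rest of the fragment are assumptions made for illustration, and a real cluster configuration file contains many other sections.

```python
import configparser

# Hypothetical fragment of a cluster global configuration file (config.ini).
# Only the section name and the parameter name are taken from the release note.
CONFIG_INI = """
[ndbd default]
EnableMultithreadedBackup=0
"""

def multithreaded_backup_enabled(text: str) -> bool:
    """Return True unless the fragment explicitly sets the parameter to 0."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # The parameter is enabled (1) by default when it is not set at all.
    value = parser.get("ndbd default", "EnableMultithreadedBackup", fallback="1")
    return value.strip() != "0"

print(multithreaded_backup_enabled(CONFIG_INI))  # False
```

This helper is not part of NDB Cluster; it only illustrates where the parameter is placed and what its default means.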
The compile-cluster script included in the NDB source distribution no longer supports in-source builds. (WL #12303)
Building with CMake3 is now supported by the compile-cluster script included in the NDB source distribution. (WL #12303)
-
As part of its automatic synchronization mechanism, NDB now implements a metadata change monitor thread for detecting changes made to metadata for data objects such as tables, tablespaces, and log file groups with the MySQL data dictionary. This thread runs in the background, checking every 60 seconds for inconsistencies between the NDB dictionary and the MySQL data dictionary.
The monitor polling interval can be adjusted by setting the value of the ndb_metadata_check_interval system variable, and the monitor can be disabled altogether by setting ndb_metadata_check to OFF. The number of times that inconsistencies have been detected since mysqld was last started is shown as the status variable Ndb_metadata_detected_count. (WL #11913)
-
Condition pushdown is no longer limited to predicate terms referring to column values from the same table to which the condition was being pushed; column values from tables earlier in the query plan can now also be referred to from pushed conditions. This lets the data nodes filter out more rows (in parallel), leaving less work to be performed by a single mysqld process, which is expected to provide significant improvements in query performance.
For more information, see Engine Condition Pushdown Optimization. (WL #12686)
Important Change; NDB Disk Data: mysqldump terminated unexpectedly when attempting to dump NDB disk data tables. The underlying reason for this was that mysqldump expected to find information relating to undo log buffers in the EXTRA column of the INFORMATION_SCHEMA.FILES table but this information had been removed in NDB 8.0.13. This information is now restored to the EXTRA column. (Bug #28800252)
Important Change: When restoring to a cluster using data node IDs different from those in the original cluster, ndb_restore tried to open files corresponding to node ID 0. To keep this from happening, the --nodeid and --backupid options (neither of which has a default value) are both now explicitly required when invoking ndb_restore. (Bug #28813708)
Important Change: Starting with this release, the default value of the ndb_log_bin system variable is FALSE. (Bug #27135706)
Packaging; MySQL NDB ClusterJ: libndbclient was missing from builds on some platforms. (Bug #28997603)
-
NDB Disk Data: When a log file group had more than 18 undo logs, it was not possible to restart the cluster. (Bug #251155785)
References: See also: Bug #28922609.
-
NDB Disk Data: Concurrent CREATE TABLE statements using tablespaces caused deadlocks between metadata locks. This occurred when Ndb_metadata_change_monitor acquired exclusive metadata locks on tablespaces and log file groups after detecting metadata changes, due to the fact that each exclusive metadata lock in turn acquired a global schema lock. This fix attempts to solve that issue by downgrading the locks taken by Ndb_metadata_change_monitor to MDL_SHARED_READ. (Bug #29175268)
References: See also: Bug #29394407.
NDB Disk Data: The error message returned when validation of MaxNoOfOpenFiles in relation to InitialNoOfOpenFiles failed has been improved to make the nature of the problem clearer to users. (Bug #28943749)
NDB Disk Data: Schema distribution of ALTER TABLESPACE and ALTER LOGFILE GROUP statements failed on a participant MySQL server if the referenced tablespace or log file group did not exist in its data dictionary. Now in such cases, the effects of the statement are distributed successfully regardless of any initial mismatch between MySQL servers. (Bug #28866336)
NDB Disk Data: Repeated execution of ALTER TABLESPACE ... ADD DATAFILE against the same tablespace caused data nodes to hang and left them, after being killed manually, unable to restart. (Bug #22605467)
NDB Replication: A DROP DATABASE operation involving certain very large tables could lead to an unplanned shutdown of the cluster. (Bug #28855062)
NDB Replication: When writes on the master (done in such a way that multiple changes affecting BLOB column values belonging to the same primary key were part of the same epoch) were replicated to the slave, Error 1022 occurred due to constraint violations in the NDB$BLOB_id_part table. (Bug #28746560)
-
NDB Cluster APIs: NDB now identifies short-lived transactions not needing the reduction of lock contention provided by NdbBlob::close() and no longer invokes this method in cases (such as when autocommit is enabled) in which unlocking merely causes extra work and round trips to be performed prior to committing or aborting the transaction. (Bug #29305592)
References: See also: Bug #49190, Bug #11757181.
NDB Cluster APIs: When the most recently failed operation was released, the pointer to it held by NdbTransaction became invalid and when accessed led to failure of the NDB API application. (Bug #29275244)
-
NDB Cluster APIs: When the NDB kernel's SUMA block sends a TE_ALTER event, it does not keep track of when all fragments of the event are sent. When NDB receives the event, it buffers the fragments, and processes the event when all fragments have arrived. An issue could possibly arise for very large table definitions, when the time between transmission and reception could span multiple epochs; during this time, SUMA could send a SUB_GCP_COMPLETE_REP signal to indicate that it has sent all data for an epoch, even though in this case that is not entirely true since there may be fragments of a TE_ALTER event still waiting on the data node to be sent. Reception of the SUB_GCP_COMPLETE_REP leads to closing the buffers for that epoch. Thus, when TE_ALTER finally arrives, NDB assumes that it is a duplicate from an earlier epoch, and silently discards it.
We fix the problem by making sure that the SUMA kernel block never sends a SUB_GCP_COMPLETE_REP for any epoch in which there are unsent fragments for a SUB_TABLE_DATA signal.
This issue could have an impact on NDB API applications making use of TE_ALTER events. (SQL nodes do not make any use of TE_ALTER events and so they and applications using them were not affected.) (Bug #28836474)
-
When a pushed join executing in the DBSPJ block had to store correlation IDs during query execution, memory for these was allocated for the lifetime of the entire query execution, even though these specific correlation IDs are required only when producing the most recent batch in the result set. Subsequent batches require additional correlation IDs to be stored and allocated; thus, if the query took sufficiently long to complete, this led to exhaustion of query memory (error 20008). Now in such cases, memory is allocated only for the lifetime of the current result batch, and is freed and made available for re-use following completion of the batch. (Bug #29336777)
References: See also: Bug #26995027.
-
When comparing or hashing a fixed-length string that used a NO_PAD collation, any trailing padding characters (typically spaces) were sent to the hashing and comparison functions such that they became significant, even though they were not supposed to be. Now any such trailing spaces are trimmed from a fixed-length string whenever a NO_PAD collation is specified.
Note: Since NO_PAD collations were introduced as part of UCA-9.0 collations in MySQL 8.0, there should be no impact relating to this fix on upgrades to NDB 8.0 from previous GA releases of NDB Cluster.
(Bug #29322313)
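The effect of the NO_PAD fix can be modelled in a few lines of Python. This is a simplified sketch, not NDB's implementation: the pad_fixed and no_pad_key helpers are hypothetical, and the model assumes that a fixed-length column stores values right-padded with spaces and that a NO_PAD comparison treats every character it is given as significant.

```python
def pad_fixed(value: str, width: int) -> str:
    """Model fixed-length storage: the value is right-padded with spaces."""
    return value.ljust(width)

def no_pad_key(stored: str) -> str:
    """Key passed to hashing and comparison after the fix: padding trimmed."""
    return stored.rstrip(" ")

a = pad_fixed("abc", 8)
b = pad_fixed("abc", 12)  # the same value held in a wider fixed-length column

# If the padding reaches the comparison, the two values look different.
print(a == b)  # False

# With the padding trimmed first, they compare and hash as the same value.
print(no_pad_key(a) == no_pad_key(b))  # True
print(hash(no_pad_key(a)) == hash(no_pad_key(b)))  # True
```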
-
When a NOT IN or NOT BETWEEN predicate was evaluated as a pushed condition, NULL values were not eliminated by the condition as specified in the SQL standard. (Bug #29232744)
References: See also: Bug #28672214.
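The standard behavior referred to here follows from SQL three-valued logic: a comparison involving NULL is UNKNOWN, and a filter keeps only rows for which the predicate is TRUE. The following Python sketch models this with None standing for NULL; the function names are hypothetical, and NULLs inside the IN list itself are not modelled.

```python
from typing import Optional, Sequence

def sql_not_in(value: Optional[int], items: Sequence[int]) -> Optional[bool]:
    """Evaluate `value NOT IN (items)`; None stands for NULL / UNKNOWN."""
    if value is None:
        return None  # NULL NOT IN (...) is UNKNOWN, not TRUE
    return value not in items

def sql_not_between(value: Optional[int], low: int, high: int) -> Optional[bool]:
    """Evaluate `value NOT BETWEEN low AND high`; None stands for NULL."""
    if value is None:
        return None
    return not (low <= value <= high)

rows = [1, 2, None, 5]

# A filter keeps a row only when the predicate is TRUE, so the NULL row
# is eliminated together with the rows for which the predicate is FALSE.
kept = [r for r in rows if sql_not_in(r, [2, 3]) is True]
print(kept)  # [1, 5]
```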
-
Internally, NDB treats NULL as less than any other value, and predicates of the form column < value or column <= value are checked for possible nulls. Predicates of the form value > column or value >= column were not checked, which could lead to errors. Now in such cases, these predicates are rewritten so that the column comes first, so that they are also checked for the presence of NULL. (Bug #29231709)
References: See also: Bug #92407, Bug #28643463.
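The rewrite described above amounts to swapping the operands and mirroring the comparison operator. A minimal Python sketch of that normalization follows; the tuple representation of a predicate and the column_first helper are assumptions made for illustration and do not reflect NDB's internal data structures.

```python
# Mirror image of each comparison operator when its operands are swapped.
SWAPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<=", "=": "="}

def column_first(left, op, right, columns):
    """Return (column, op, value), swapping operands if the column is on the right."""
    if left in columns:
        return (left, op, right)
    if right in columns:
        return (right, SWAPPED[op], left)
    raise ValueError("predicate does not reference a known column")

# `10 > c1` and `10 >= c1` become `c1 < 10` and `c1 <= 10`, which are the
# forms that are checked for NULL.
print(column_first(10, ">", "c1", {"c1"}))   # ('c1', '<', 10)
print(column_first(10, ">=", "c1", {"c1"}))  # ('c1', '<=', 10)
```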
After folding of constants was implemented in the MySQL Optimizer, a condition containing a DATE or DATETIME literal could no longer be pushed down by NDB. (Bug #29161281)
When a join condition made a comparison between a column of a temporal data type such as DATE or DATETIME and a constant of the same type, the predicate was pushed if the condition was expressed in the form column operator constant, but not when in inverted order (as constant inverse_operator column). (Bug #29058732)
When processing a pushed condition, NDB did not detect errors or warnings thrown when a literal value being compared was outside the range of the data type it was being compared with, and thus truncated. This could lead to excess or missing rows in the result. (Bug #29054626)
-
If an EQ_REF or REF key in the child of a pushed join referred to any columns of a table not a member of the pushed join, this table was not an NDB table (because its format was of nonnative endianness), and the data type of the column being joined on was stored in an endian-sensitive format, then the key was generated incorrectly, likely resulting in the return of an (invalid) empty join result.
Since only big endian platforms may store tables in nonnative (little endian) formats, this issue was expected only on such platforms, most notably SPARC, and not on x86 platforms. (Bug #29010641)
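Why a byte-order mismatch tends to produce an empty result rather than an error can be seen from a small Python sketch. It is illustrative only: the dictionary stands in for a key lookup and is not how NDB stores or resolves keys.

```python
import struct

def key_bytes(value: int, byte_order: str) -> bytes:
    """Pack a 32-bit unsigned integer; '<' is little endian, '>' is big endian."""
    return struct.pack(byte_order + "I", value)

little = key_bytes(1, "<")
big = key_bytes(1, ">")
print(little.hex(), big.hex())  # 01000000 00000001

# Hypothetical lookup structure keyed on little endian bytes. A key built
# in the other byte order matches nothing, so the join finds no rows.
index = {key_bytes(1, "<"): "row for id=1"}
print(index.get(big))     # None
print(index.get(little))  # row for id=1
```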
API and data nodes running NDB 7.6 and later could not use an existing parsed configuration from an earlier release series due to being overly strict with regard to having values defined for configuration parameters new to the later release, which placed a restriction on possible upgrade paths. Now NDB 7.6 and later are less strict about having all new parameters specified explicitly in the configuration which they are served, and use hard-coded default values in such cases. (Bug #28993400)
NDB 7.6 SQL nodes hung when trying to connect to an NDB 8.0 cluster. (Bug #28985685)
The schema distribution data maintained in the NDB binary logging thread keeping track of the number of subscribers to the NDB schema table always allocated some memory structures for 256 data nodes regardless of the actual number of nodes. Now NDB allocates only as many of these structures as are actually needed. (Bug #28949523)
Added DUMP 406 (NdbfsDumpRequests) to provide NDB file system information to global checkpoint and local checkpoint stall reports in the node logs. (Bug #28922609)
When a joined table was eliminated early as not pushable, it could not be referred to in any subsequent join conditions from other tables without eliminating those conditions from consideration even if those conditions were otherwise pushable. (Bug #28898811)
When starting or restarting an SQL node and connecting to a cluster where NDB was already started, NDB reported Error 4009 Cluster Failure because it could not acquire a global schema lock. This was because the MySQL Server as part of initialization acquires exclusive metadata locks in order to modify internal data structures, and the ndbcluster plugin acquires the global schema lock. If the connection to NDB was not yet properly set up during mysqld initialization, mysqld received a warning from ndbcluster when the latter failed to acquire the global schema lock, and printed it to the log file, causing an unexpected error in the log. This is fixed by not pushing any warnings to background threads when failure to acquire a global schema lock occurs and pushing the NDB error as a warning instead. (Bug #28898544)
-
A race condition between the DBACC and DBLQH kernel blocks occurred when different operations in a transaction on the same row were concurrently being prepared and aborted. This could result in DBTUP attempting to prepare an operation when a preceding operation had been aborted, which was unexpected and could thus lead to undefined behavior including potential data node failures. To solve this issue, DBACC and DBLQH now check that all dependencies are still valid before attempting to prepare an operation.
Note: This fix also supersedes a previous one made for a related issue which was originally reported as Bug #28500861.
(Bug #28893633)
-
Where a data node was restarted after a configuration change whose result was a decrease in the sum of MaxNoOfTables, MaxNoOfOrderedIndexes, and MaxNoOfUniqueHashIndexes, it sometimes failed with a misleading error message which suggested both a temporary error and a bug, neither of which was the case.
The failure itself is expected, being due to the fact that there is at least one table object with an ID greater than the (new) sum of the parameters just mentioned, and that this table cannot be restored since the maximum value for the ID allowed is limited by that sum. The error message has been changed to reflect this, and now indicates that this is a permanent error due to a problem configuration. (Bug #28884880)
The ndbinfo.cpustat table reported inaccurate information regarding send threads. (Bug #28884157)
Execution of an LCP_COMPLETE_REP signal from the master while the LCP status was IDLE led to an assertion. (Bug #28871889)
NDB now provides on-the-fly .frm file translation during discovery of tables created in versions of the software that did not support the MySQL Data Dictionary. Previously, such translation of tables that had old-style metadata was supported only during schema synchronization during MySQL server startup, but not subsequently, which led to errors when NDB tables having old-style metadata, created by ndb_restore and other such tools after mysqld had been started, were accessed using SHOW CREATE TABLE or SELECT; these tables were usable only after restarting mysqld. With this fix, the restart is no longer required. (Bug #28841009)
An in-place upgrade to an NDB 8.0 release from an earlier release did not remove .ndb files, even though these are no longer used in NDB 8.0. (Bug #28832816)
-
Removed storage/ndb/demos and the demonstration scripts and support files it contained from the source tree. These were obsolete and unmaintained, and did not function with any current version of NDB Cluster.
Also removed storage/ndb/include/newtonapi, which included files relating to an obsolete and unmaintained API not supported in any release of NDB Cluster, as well as references elsewhere to these files. (Bug #28808766)
-
There was no version compatibility table for NDB 8.x; this meant that API nodes running NDB 8.0.13 or 7.6.x could not connect to data nodes running NDB 8.0.14. This issue manifested itself for NDB API users as a failure in wait_until_ready(). (Bug #28776365)
References: See also: Bug #18886034, Bug #18874849.
Issuing a STOP command in the ndb_mgm client caused ndbmtd processes which had recently been added to the cluster to hang in Phase 4 during shutdown. (Bug #28772867)
-
A fix for a previous issue disabled the usage of pushed conditions for lookup type (eq_ref) operations in pushed joins. It was thought at the time that not pushing a lookup condition would not have any measurable impact on performance, since only a single row could be eliminated if the condition failed. The solution implemented at that time did not take into account the possibility that, in a pushed join, a lookup operation could be a parent operation for other lookups, and even scan operations, which meant that eliminating a single row could actually result in an entire branch being eliminated in error. (Bug #28728603)
References: This issue is a regression of: Bug #27397802.
-
When a local checkpoint (LCP) was complete on all data nodes except one, and this node failed, NDB did not continue with the steps required to finish the LCP. This led to the following issues:
No new LCPs could be started.
Redo and Undo logs were not trimmed and so grew excessively large, causing an increase in times for recovery from disk. This led to write service failure, which eventually led to cluster shutdown when the head of the redo log met the tail. This placed a limit on cluster uptime.
Node restarts were no longer possible, due to the fact that a data node restart requires that the node's state be made durable on disk before it can provide redundancy when joining the cluster. For a cluster with two data nodes and two fragment replicas, this meant that a restart of the entire cluster (system restart) was required to fix the issue (this was not necessary for a cluster with two fragment replicas and four or more data nodes). (Bug #28728485, Bug #28698831)
References: See also: Bug #11757421.
The pushability of a condition to NDB was limited in that all predicates joined by a logical AND within a given condition had to be pushable to NDB in order for the entire condition to be pushed. In some cases this severely restricted the pushability of conditions. This fix breaks up the condition into its components, and evaluates the pushability of each predicate; if some of the predicates cannot be pushed, they are returned as a remainder condition which can be evaluated by the MySQL server. (Bug #28728007)
Running ANALYZE TABLE on an NDB table with an index having a length longer than the supported maximum caused data nodes to fail. (Bug #28714864)
-
It was possible in certain cases for nodes to hang during an initial restart. (Bug #28698831)
References: See also: Bug #27622643.
When a condition was pushed to a storage engine, it was re-evaluated by the server, in spite of the fact that only rows matching the pushed condition should ever be returned to the server in such cases. (Bug #28672214)
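The splitting of AND conditions described earlier for Bug #28728007 can be sketched as a simple partition of the individual predicates. In this Python sketch, both the string representation of predicates and the pushability rule are hypothetical stand-ins; the actual rules NDB applies to decide whether a predicate can be pushed are not modelled here.

```python
from typing import Callable, List, Tuple

def split_condition(
    predicates: List[str], is_pushable: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Partition AND-ed predicates into a pushed part and a remainder."""
    pushed: List[str] = []
    remainder: List[str] = []
    for pred in predicates:
        (pushed if is_pushable(pred) else remainder).append(pred)
    return pushed, remainder

def has_no_function_call(pred: str) -> bool:
    """Hypothetical rule for this sketch: function calls are not pushable."""
    return "(" not in pred

predicates = ["a = 1", "b > 5", "LENGTH(c) = 3"]
pushed, remainder = split_condition(predicates, has_no_function_call)
print(pushed)     # ['a = 1', 'b > 5']
print(remainder)  # ['LENGTH(c) = 3']
```

The pushed part corresponds to what the data nodes can filter on, while the remainder corresponds to what is left for the MySQL server to evaluate.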
-
In some cases, one and sometimes more data nodes underwent an unplanned shutdown while running ndb_restore. This occurred most often, but not exclusively, when restoring to a cluster having a different number of data nodes from the cluster on which the original backup had been taken.
The root cause of this issue was exhaustion of the pool of SafeCounter objects, used by the DBDICT kernel block as part of executing schema transactions, and taken from a per-block-instance pool shared with protocols used for NDB event setup and subscription processing. The concurrency of event setup and subscription processing is such that the SafeCounter pool can be exhausted; event and subscription processing can handle pool exhaustion, but schema transaction processing could not, which could result in the node shutdown experienced during restoration.
This problem is solved by giving DBDICT schema transactions an isolated pool of reserved SafeCounters which cannot be exhausted by concurrent NDB event activity. (Bug #28595915)
When a backup aborted due to buffer exhaustion, synchronization of the signal queues prior to the expected drop of triggers for insert, update, and delete operations resulted in abort signals being processed before the STOP_BACKUP phase could continue. The abort changed the backup status to ABORT_BACKUP_ORD, which led to an unplanned shutdown of the data node since resuming STOP_BACKUP requires that the state be STOP_BACKUP_REQ. Now the backup status is not set to STOP_BACKUP_REQ (requesting the backup to continue) until after signal queue synchronization is complete. (Bug #28563639)
The output of ndb_config --configinfo --xml --query-all now shows that configuration changes for the ThreadConfig and MaxNoOfExecutionThreads data node parameters require system initial restarts (restart="system" initial="true"). (Bug #28494286)
After a commit failed due to an error, mysqld shut down unexpectedly while trying to get the name of the table involved. This was due to an issue in the internal function ndbcluster_print_error(). (Bug #28435082)
API nodes should observe that a node is moving through SL_STOPPING phases (graceful stop) and stop using the node for new transactions, which minimizes potential disruption in the later phases of the node shutdown process. API nodes were only informed of node state changes via periodic heartbeat signals, and so might not be able to avoid interacting with the node shutting down. This generated unnecessary failures when the heartbeat interval was long. Now when a data node is being gracefully stopped, all API nodes are notified directly, allowing them to experience minimal disruption. (Bug #28380808)
ndb_config --diff-default failed when trying to read a parameter whose default value was the empty string (""). (Bug #27972537)
-
ndb_restore did not restore autoincrement values correctly when one or more staging tables were in use. As part of this fix, we also in such cases block applying of the SYSTAB_0 backup log, whose content continued to be applied directly based on the table ID, which could overwrite the autoincrement values stored in SYSTAB_0 for unrelated tables. (Bug #27917769, Bug #27831990)
References: See also: Bug #27832033.
-
ndb_restore employed a mechanism for restoring autoincrement values which was not atomic, and thus could yield incorrect autoincrement values being restored when multiple instances of ndb_restore were used in parallel. (Bug #27832033)
References: See also: Bug #27917769, Bug #27831990.
Executing SELECT * FROM INFORMATION_SCHEMA.TABLES caused SQL nodes to restart in some cases. (Bug #27613173)
When tables with BLOB columns were dropped and then re-created with a different number of BLOB columns, the event definitions for monitoring table changes could become inconsistent in certain error situations involving communication errors when the expected cleanup of the corresponding events was not performed. In particular, when the new versions of the tables had more BLOB columns than the original tables, some events could be missing. (Bug #27072756)
-
When query memory was exhausted in the DBSPJ kernel block while storing correlation IDs for deferred operations, the query was aborted with error status 20000 Query aborted due to out of query memory. (Bug #26995027)
References: See also: Bug #86537.
When running a cluster with 4 or more data nodes under very high loads, data nodes could sometimes fail with Error 899 Rowid already allocated. (Bug #25960230)
mysqld shut down unexpectedly when a purge of the binary log was requested before the server had completely started, and it was thus not yet ready to delete rows from the ndb_binlog_index table. Now when this occurs, requests for any needed purges of the ndb_binlog_index table are saved in a queue and held for execution when the server has completely started. (Bug #25817834)
-
MaxBufferedEpochs is used on data nodes to avoid excessive buffering of row changes due to lagging NDB event API subscribers; when epoch acknowledgements from one or more subscribers lag by this number of epochs, an asynchronous disconnection is triggered, allowing the data node to release the buffer space used for subscriptions. Since this disconnection is asynchronous, it may be the case that it has not completed before additional new epochs are completed on the data node, resulting in new epochs not being able to seize GCP completion records, generating warnings such as those shown here:
[ndbd] ERROR -- c_gcp_list.seize() failed...
...
[ndbd] WARNING -- ACK wo/ gcp record...
And leading to the following warning:
Disconnecting node %u because it has exceeded MaxBufferedEpochs (100 > 100), epoch ....
This fix performs the following modifications:
Modifies the size of the GCP completion record pool to ensure that there is always some extra headroom to account for the asynchronous nature of the disconnect processing previously described, thus avoiding c_gcp_list seize failures.
Modifies the wording of the MaxBufferedEpochs warning to avoid the contradictory phrase “100 > 100”.
(Bug #20344149)
Asynchronous disconnection of mysqld from the cluster caused any subsequent attempt to start an NDB API transaction to fail. If this occurred during a bulk delete operation, the SQL layer called HA::end_bulk_delete(), whose implementation by ha_ndbcluster assumed that a transaction had been started, and could fail if this was not the case. This problem is fixed by checking that the transaction pointer used by this method is set before referencing it. (Bug #20116393)
Removed warnings raised when compiling NDB with Clang 6. (Bug #93634, Bug #29112560)
When executing the redo log in debug mode it was possible for a data node to fail when deallocating a row. (Bug #93273, Bug #28955797)
-
An NDB table having both a foreign key on another NDB table using ON DELETE CASCADE and one or more TEXT or BLOB columns leaked memory.
As part of this fix, ON DELETE CASCADE is no longer supported for foreign keys on NDB tables when the child table contains a column that uses any of the BLOB or TEXT types. (Bug #89511, Bug #27484882)