MySQL NDB Cluster 7.6.8 is a new release of NDB 7.6, based on MySQL Server 5.7 and including features in version 7.6 of the NDB storage engine, as well as fixing recently discovered bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 7.6. NDB Cluster 7.6 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 7.6, see What is New in NDB Cluster 7.6.
This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 5.7 through MySQL 5.7.24 (see Changes in MySQL 5.7.24 (2018-10-22, General Availability)).
Performance: This release introduces a number of significant improvements in the performance of scans; these are listed here:
Row checksums help detect hardware issues, but do so at the expense of performance. NDB now offers the possibility of disabling these by setting the new ndb_row_checksum server system variable to 0; doing this means that row checksums are not used for new or altered tables. This can have a significant impact (5 to 10 percent, in some cases) on performance for all types of queries. This variable is set to 1 by default, to provide compatibility with the previous behavior.
A query consisting of a scan can execute for a longer time in the LDM threads when the queue is not busy.
Previously, columns were read before checking a pushed condition; now checking of a pushed condition is done before reading any columns.
Performance of pushed joins should see significant improvement when using range scans as part of join execution.
Packaging: Expected NDB header files were in the devel RPM package instead of libndbclient-devel. (Bug #84580, Bug #26448330)
NDB Disk Data: While restoring a local checkpoint, it is possible to insert a row that already exists in the database; this is expected behavior which is handled by deleting the existing row first, then inserting the new copy of that row. In some cases involving data on disk, NDB failed to delete the existing row. (Bug #91627, Bug #28341843)
NDB Client Programs: Removed a memory leak in NdbImportUtil::RangeList that was revealed in ASAN builds. (Bug #91479, Bug #28264144)
MySQL NDB ClusterJ: When a table containing a TEXT field was being queried with ClusterJ for a record that did not exist, an exception (“The method is not valid in current blob state”) was thrown. (Bug #28536926)
MySQL NDB ClusterJ: A NullPointerException was thrown when a full table scan was performed with ClusterJ on tables containing either a BLOB or a TEXT field. This occurred because the proper object initializations were omitted; this fix adds them. (Bug #28199372, Bug #91242)
When copying deleted rows from a live node to a node just starting, it is possible for one or more of these rows to have a global checkpoint index equal to zero. If this happened at the same time that a full local checkpoint was started due to the undo log getting full, the LCP_SKIP bit was set for a row having GCI = 0, leading to an unplanned shutdown of the data node. (Bug #28372628)
ndbmtd sometimes experienced a hang when exiting due to log thread shutdown. (Bug #28027150)
When the SUMA kernel block receives a SUB_STOP_REQ signal, it executes the signal then replies with SUB_STOP_CONF. (After this response is relayed back to the API, the API is open to send more SUB_STOP_REQ signals.) After sending the SUB_STOP_CONF, SUMA drops the subscription if no subscribers are present, which involves sending multiple DROP_TRIG_IMPL_REQ signals to DBTUP. LocalProxy can handle up to 21 of these requests in parallel; any more than this are queued in the Short Time Queue. When execution of a DROP_TRIG_IMPL_REQ was delayed, there was a chance for the queue to become overloaded, leading to a data node shutdown with Error in short time queue.
This issue is fixed by delaying the execution of the SUB_STOP_REQ signal if DBTUP is already handling DROP_TRIG_IMPL_REQ signals at full capacity, rather than queueing up the DROP_TRIG_IMPL_REQ signals. (Bug #26574003)
Having a large number of deferred triggers could sometimes lead to job buffer exhaustion. This could occur due to the fact that a single trigger can execute many operations—for example, a foreign key parent trigger may perform operations on multiple matching child table rows—and that a row operation on a base table can execute multiple triggers. In such cases, row operations are executed in batches. When execution of many triggers was deferred—meaning that all deferred triggers are executed at pre-commit—the resulting concurrent execution of a great many trigger operations could cause the data node job buffer or send buffer to be exhausted, leading to failure of the node.
This issue is fixed by limiting the number of concurrent trigger operations as well as the number of trigger fire requests outstanding per transaction.
For immediate triggers, limiting of concurrent trigger operations may increase the number of triggers waiting to be executed, exhausting the trigger record pool and resulting in the error Too many concurrently fired triggers (increase MaxNoOfFiredTriggers). This can be avoided by increasing MaxNoOfFiredTriggers, reducing the user transaction batch size, or both. (Bug #22529864)
References: See also: Bug #18229003, Bug #27310330.
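The effect of capping concurrent trigger operations can be pictured with a small Python sketch. This is not NDB code: the function name `run_with_limit`, the `max_in_flight` parameter, and the modelling of trigger operations as plain callables are all assumptions made for illustration. The sketch only shows the general technique of starting new work while fewer than a fixed number of operations are outstanding, so that the number of operations in flight stays bounded regardless of how many are waiting.

```python
from collections import deque


def run_with_limit(operations, max_in_flight):
    """Run callables while keeping at most max_in_flight outstanding.

    Hypothetical model: an operation is "outstanding" from the moment it
    is started until it completes. Returns (results, peak), where peak is
    the largest number of operations outstanding at any one time.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    pending = deque(operations)   # operations not yet started
    in_flight = deque()           # operations started but not completed
    results = []
    peak = 0

    while pending or in_flight:
        # Start new operations only while below the cap.
        while pending and len(in_flight) < max_in_flight:
            in_flight.append(pending.popleft())
        peak = max(peak, len(in_flight))

        # Simulate completion of the oldest outstanding operation.
        op = in_flight.popleft()
        results.append(op())

    return results, peak
```

With a cap in place, operations that cannot start immediately wait in `pending`. This mirrors the trade-off noted above: bounding concurrency protects shared buffers but can increase the number of triggers waiting to be executed.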
ndbout and ndberr became invalid after exiting from mgmd_run(), and redirecting to them before the next call to mgmd_run() caused a segmentation fault during an ndb_mgmd service restart. This fix ensures that ndbout and ndberr remain valid at all times. (Bug #17732772, Bug #28536919)
Running out of undo log buffer memory was reported using error 921 Out of transaction memory ... (increase SharedGlobalMemory).
This problem is fixed by introducing a new error code 923 Out of undo buffer memory (increase UNDO_BUFFER_SIZE). (Bug #92125, Bug #28537319)
When moving an OperationRec from the serial to the parallel queue, Dbacc::startNext() failed to update the Operationrec::OP_ACC_LOCK_MODE flag which is required to reflect the accumulated OP_LOCK_MODE of all previous operations in the parallel queue. This inconsistency in the ACC lock queues caused the scan lock takeover mechanism to fail, as it incorrectly concluded that a lock to take over was not held. The same failure caused an assert when aborting an operation that was a member of such an inconsistent parallel lock queue. (Bug #92100, Bug #28530928)
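As a rough illustration of what an "accumulated" lock mode means, the following Python sketch keeps, for each entry in a parallel queue, the strongest lock mode seen so far. It is a deliberately simplified model and not the DBACC implementation: the `Op` and `ParallelQueue` classes, the two-valued `SHARED`/`EXCLUSIVE` encoding, and the choice to fold an operation's own mode into its accumulated value are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List

SHARED, EXCLUSIVE = 0, 1  # assumed encoding; higher value = stronger lock


@dataclass
class Op:
    lock_mode: int                # this operation's own mode (cf. OP_LOCK_MODE)
    acc_lock_mode: int = SHARED   # accumulated mode (cf. OP_ACC_LOCK_MODE)


@dataclass
class ParallelQueue:
    ops: List[Op] = field(default_factory=list)

    def append_from_serial(self, op: Op) -> None:
        """Move an operation into the parallel queue.

        The accumulated mode is updated here; skipping this step leaves
        the queue inconsistent, which is the kind of omission the fix
        above addresses.
        """
        previous = self.ops[-1].acc_lock_mode if self.ops else SHARED
        op.acc_lock_mode = max(previous, op.lock_mode)
        self.ops.append(op)

    def holds_exclusive(self) -> bool:
        # The last entry summarizes every operation in the queue.
        return bool(self.ops) and self.ops[-1].acc_lock_mode == EXCLUSIVE
```

In this model, a check such as `holds_exclusive()` only gives the right answer if every move into the queue refreshes the accumulated value.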
A data node failed during startup due to the arrival of a SCAN_FRAGREQ signal during the restore phase. This signal originated from a scan begun before the node had previously failed and which should have been aborted due to the involvement of the failed node in it. (Bug #92059, Bug #28518448)
DBTUP sent the error Tuple corruption detected when a read operation attempted to read the value of a tuple inserted within the same transaction. (Bug #92009, Bug #28500861)
References: See also: Bug #28893633.
False constraint violation errors could occur when executing updates on self-referential foreign keys. (Bug #91965, Bug #28486390)
References: See also: Bug #90644, Bug #27930382.
An NDB internal trigger definition could be dropped while pending instances of the trigger remained to be executed; executing those instances then involved attempting to look up the definition for a trigger which had already been released. This caused unpredictable and thus unsafe behavior possibly leading to data node failure. The root cause of the issue lay in an invalid assumption in the code relating to determining whether a given trigger had been released; the issue is fixed by ensuring that the behavior of NDB, when a trigger definition is determined to have been released, is consistent, and that it meets expectations. (Bug #91894, Bug #28451957)
In some cases, a workload that included a high number of concurrent inserts caused data node failures when using debug builds. (Bug #91764, Bug #28387450, Bug #29055038)
During an initial node restart with disk data tables present and TwoPassInitialNodeRestartCopy enabled, DBTUP used an unsafe scan in disk order. Such scans are no longer employed in this case. (Bug #91724, Bug #28378227)
Checking for old LCP files tested the table version, but this was not always dependable. Now, instead of relying on the table version, the check regards as invalid any LCP file having a maxGCI smaller than its createGci. (Bug #91637, Bug #28346565)
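The new validity rule amounts to a single comparison, sketched below in Python. The `LcpFileInfo` class and the `is_valid_lcp_file` function are hypothetical names introduced for this example; only the two field names and the rule itself (a file is invalid when its maxGCI is smaller than its createGci) come from the description above. How NDB actually stores or reads these values is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LcpFileInfo:
    """Hypothetical stand-in for the metadata kept for one LCP file."""
    create_gci: int  # corresponds to createGci in the text
    max_gci: int     # corresponds to maxGCI in the text


def is_valid_lcp_file(info: LcpFileInfo) -> bool:
    # Rule from the text: a file whose maxGCI is smaller than its
    # createGci is regarded as invalid. Equal values are treated as
    # valid here, since the text only rules out "smaller than".
    return info.max_gci >= info.create_gci
```

For example, `is_valid_lcp_file(LcpFileInfo(create_gci=10, max_gci=9))` evaluates to `False`, so such a file would be treated as an old file to discard.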
In certain cases, a cascade update trigger was fired repeatedly on the same record, which eventually consumed all available concurrent operations, leading to Error 233 Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations). If MaxNoOfConcurrentOperations was set to a value sufficiently high to avoid this, the issue manifested as data nodes consuming very large amounts of CPU, very likely eventually leading to a timeout. (Bug #91472, Bug #28262259)
Inserting a row into an NDB table having a self-referencing foreign key that referenced a unique index on the table other than the primary key failed with ER_NO_REFERENCED_ROW_2. This was due to the fact that NDB checked foreign key constraints before the unique index was updated, so that the constraint check was unable to use the index for locating the row. Now, in such cases, NDB waits until all unique index values have been updated before checking foreign key constraints on the inserted row. (Bug #90644, Bug #27930382)
References: See also: Bug #91965, Bug #28486390.
A connection string beginning with a slash (/) character is now rejected by ndb_mgmd.
Our thanks to Daniël van Eeden for contributing this fix. (Bug #90582, Bug #27912892)