Performance: This release introduces a number of significant improvements in the performance of scans; these are listed here:
Row checksums help detect hardware issues, but do so at the expense of performance. NDB now offers the possibility of disabling these by setting the new ndb_row_checksum server system variable to 0; doing this means that row checksums are not used for new or altered tables. This can have a significant impact (5 to 10 percent, in some cases) on performance for all types of queries. This variable is set to 1 by default, to provide compatibility with the previous behavior.
A query consisting of a scan can execute for a longer time in the LDM threads when the queue is not busy.
Previously, columns were read before checking a pushed condition; now checking of a pushed condition is done before reading any columns.
Performance of pushed joins should see significant improvement when using range scans as part of join execution.
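The effect of checking a pushed condition before reading columns can be illustrated with a toy model. The following Python sketch is not NDB code: the Row class, its read counter, and both scan functions are hypothetical names, used only to show why evaluating the condition first avoids column reads for rows that do not qualify.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """Hypothetical stand-in for a stored row; counts every column read."""
    values: dict
    reads: int = field(default=0)

    def read(self, column):
        self.reads += 1
        return self.values[column]


def scan_read_then_filter(rows, columns, cond_column, predicate):
    """Old order: read the requested columns, then check the condition."""
    out = []
    for row in rows:
        projected = {c: row.read(c) for c in columns}
        if predicate(row.read(cond_column)):
            out.append(projected)
    return out


def scan_filter_then_read(rows, columns, cond_column, predicate):
    """New order: check the condition first, read columns only on a match."""
    out = []
    for row in rows:
        if predicate(row.read(cond_column)):
            out.append({c: row.read(c) for c in columns})
    return out


def make_rows():
    return [Row({"id": i, "a": i * 10, "b": i * 100}) for i in range(10)]


def wanted(value):
    # Only the rows with id 8 and 9 qualify.
    return value >= 8


old_rows, new_rows = make_rows(), make_rows()
old = scan_read_then_filter(old_rows, ["a", "b"], "id", wanted)
new = scan_filter_then_read(new_rows, ["a", "b"], "id", wanted)
old_reads = sum(r.reads for r in old_rows)
new_reads = sum(r.reads for r in new_rows)
```

Both functions return the same rows, but in this model the second performs fewer column reads because non-qualifying rows are rejected after a single read. The actual saving in NDB depends on the selectivity of the condition and the number of columns requested.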
NDB Disk Data: NDB now implements schema distribution of disk data objects including tablespaces and log file groups by SQL nodes when they connect to a cluster, just as it does for NDB databases and in-memory tables. This eliminates a possible mismatch between the MySQL data dictionary and the NDB dictionary following a native backup and restore that could arise when disk data tablespaces and undo log file groups were restored to the NDB dictionary, but not to the MySQL Server's data dictionary.
NDB Disk Data: NDB now makes use of the MySQL data dictionary to ensure correct distribution of tablespaces and log file groups across all cluster SQL nodes when connecting to the cluster.
The extra metadata property for NDB tables is now used to store information from the MySQL data dictionary. Because this information is significantly larger than the binary representation previously stored here (a .frm file, no longer used), the hard-coded size limit for this extra metadata has been increased.
This change can have an impact on downgrades: Trying to read NDB tables created in NDB 8.0.14 and later may cause data nodes running NDB 8.0.13 or earlier to fail on startup with NDB error code 2355 Failure to restore schema: Permanent error, external action needed: Resource configuration error. This can happen if the table's metadata exceeds 6K in size, which was the old limit. Tables created in NDB 8.0.13 and earlier can be read by later versions without any issues.
Packaging: Expected NDB header files were in the devel RPM package instead of libndbclient-devel. (Bug #84580, Bug #26448330)
The version_comment system variable was not correctly configured in mysqld binaries and returned a generic pattern instead of the proper value. This affected all NDB Cluster binary releases with the exception of .deb packages. (Bug #29054235)
Trying to build from source using -Werror failed with GCC 8. (Bug #28707282)
When copying deleted rows from a live node to a node just starting, it is possible for one or more of these rows to have a global checkpoint index (GCI) equal to zero. If this happened at the same time that a full local checkpoint was started due to the undo log getting full, the LCP_SKIP bit was set for a row having GCI = 0, leading to an unplanned shutdown of the data node. (Bug #28372628)
ndbmtd sometimes experienced a hang when exiting due to log thread shutdown. (Bug #28027150)
NDB has an upper limit of 128 characters for a fully qualified table name. Because mysqld names NDB tables using the format database_name/catalog_name/table_name, where catalog_name is always def, it is possible for statements such as CREATE TABLE to fail even though neither the table name nor the database name exceeds the 63-character limit imposed by NDB. The error raised in such cases was misleading and has been replaced. (Bug #27769521)
References: See also: Bug #27769801.
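The arithmetic behind this limit can be checked with a short sketch. It assumes that the fully qualified name is built as database/def/table with single-character separators, and that lengths are counted in characters; the function names are hypothetical and this is not how mysqld or NDB performs the check internally.

```python
NDB_MAX_FQ_NAME = 128  # limit on the fully qualified table name (from the text)
NDB_MAX_IDENT = 63     # limit on a database or table name (from the text)
CATALOG = "def"        # assumed catalog component of the qualified name


def fq_name(database, table):
    """Build the assumed fully qualified form: database/def/table."""
    return f"{database}/{CATALOG}/{table}"


def check(database, table):
    """Report whether each identifier and the combined name fit the limits."""
    name = fq_name(database, table)
    return {
        "identifiers_ok": (len(database) <= NDB_MAX_IDENT
                           and len(table) <= NDB_MAX_IDENT),
        "fq_length": len(name),
        "fq_ok": len(name) <= NDB_MAX_FQ_NAME,
    }


# Both names sit exactly at the 63-character identifier limit.
result = check("d" * 63, "t" * 63)
```

Under these assumptions, two names of 63 characters each produce a qualified name of 63 + 1 + 3 + 1 + 63 = 131 characters, which exceeds 128 even though each identifier is individually valid.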
When the SUMA kernel block receives a SUB_STOP_REQ signal, it executes the signal then replies with SUB_STOP_CONF. (After this response is relayed back to the API, the API is open to send more SUB_STOP_REQ signals.) After sending the SUB_STOP_CONF, SUMA drops the subscription if no subscribers are present, which involves sending multiple DROP_TRIG_IMPL_REQ signals to DBTUP. LocalProxy can handle up to 21 of these requests in parallel; any more than this are queued in the Short Time Queue. When execution of a DROP_TRIG_IMPL_REQ was delayed, there was a chance for the queue to become overloaded, leading to a data node shutdown with Error in short time queue.
This issue is fixed by delaying the execution of the SUB_STOP_REQ signal if DBTUP is already handling DROP_TRIG_IMPL_REQ signals at full capacity, rather than queueing up the DROP_TRIG_IMPL_REQ signals. (Bug #26574003)
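The general idea of deferring work at the source instead of letting requests pile up in a queue can be sketched as follows. This is a simplified model, not the NDB kernel implementation: the class and method names are hypothetical, the choice of SUB_STOP_REQ as the deferred signal is an assumption, and the sketch assumes that a single request never needs more than the parallel capacity of 21 mentioned above.

```python
from collections import deque

MAX_PARALLEL_DROP_TRIG = 21  # parallel capacity stated in the text


class DropTrigGate:
    """Hypothetical model: defer a stop request while capacity is exhausted."""

    def __init__(self, capacity=MAX_PARALLEL_DROP_TRIG):
        self.capacity = capacity
        self.in_flight = 0       # drop-trigger requests currently being handled
        self.deferred = deque()  # stop requests waiting for free capacity

    def on_sub_stop_req(self, sub_id, triggers_to_drop):
        # Defer the whole request instead of queueing its drop signals.
        if self.in_flight + triggers_to_drop > self.capacity:
            self.deferred.append((sub_id, triggers_to_drop))
            return "deferred"
        self.in_flight += triggers_to_drop
        return "executed"

    def on_drop_trig_conf(self, count=1):
        # Capacity was released; retry deferred requests in arrival order.
        self.in_flight -= count
        started = []
        while (self.deferred
               and self.in_flight + self.deferred[0][1] <= self.capacity):
            sub_id, needed = self.deferred.popleft()
            self.in_flight += needed
            started.append(sub_id)
        return started


gate = DropTrigGate()
first = gate.on_sub_stop_req("sub-a", 20)
second = gate.on_sub_stop_req("sub-b", 3)
resumed = gate.on_drop_trig_conf(2)
```

In this model the number of outstanding drop requests never exceeds the capacity, so nothing accumulates in a secondary queue; the cost is that a stop request may wait before it is executed.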
ndb_restore returned -1 instead of the expected exit code in the event of an index rebuild failure. (Bug #25112726)
When starting, a data node copies metadata, while a local checkpoint updates metadata. To avoid any conflict, any ongoing LCP activity is paused while metadata is being copied. An issue arose when a local checkpoint was paused on a given node, and another node that was also restarting checked for a complete LCP on this node; the check actually caused the LCP to be completed before copying of metadata was complete and so ended the pause prematurely. Now in such cases, the LCP completion check waits to complete a paused LCP until copying of metadata is finished and the pause ends as expected, within the LCP in which it began. (Bug #24827685)
ndbout and ndberr became invalid after exiting from mgmd_run(), and redirecting to them before the next call to mgmd_run() caused a segmentation fault during an ndb_mgmd service restart. This fix ensures that ndbout and ndberr remain valid at all times. (Bug #17732772, Bug #28536919)
NdbScanFilter did not always handle NULL according to the SQL standard, which could result in sending non-qualifying rows to be filtered (otherwise not necessary) by the MySQL server. (Bug #92407, Bug #28643463)
References: See also: Bug #93977, Bug #29231709.
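For reference, the SQL standard treats a comparison involving NULL as UNKNOWN, and a row qualifies only when the condition evaluates to TRUE. The following Python sketch models that three-valued logic using None for both NULL and UNKNOWN; the helper names are hypothetical and it does not use the NdbScanFilter API.

```python
def sql_equals(a, b):
    """SQL '=': True or False, or None (UNKNOWN) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_not(value):
    """SQL NOT: UNKNOWN stays UNKNOWN."""
    return None if value is None else (not value)


def qualifies(value):
    """A row passes a filter only when the condition is TRUE."""
    return value is True


rows = [1, None, 2]
equal_to_one = [r for r in rows if qualifies(sql_equals(r, 1))]
not_equal_to_one = [r for r in rows if qualifies(sql_not(sql_equals(r, 1)))]
```

The NULL row appears in neither result, which is why a filter that treats NULL as an ordinary value can pass rows that the SQL layer must then discard.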
The internal function ndb_my_error() was used in prepare_inplace_alter_table() to report errors when the function failed to interact with NDB. The function was expected to push the NDB error as a warning on the stack, then set an error by translating the NDB error to a MySQL error, and finally call my_error() with the translated error. When called, my_error() extracts a format string that may contain placeholders and uses the format string in a function similar to sprintf(). Because my_error() was called without any arguments, this could read arbitrary memory, leading to a segmentation fault.
The fix is always to push the NDB error as a warning and then set an error with a provided message. A new helper function has been added to Thd_ndb to be used in place of ndb_my_error(). (Bug #92244, Bug #28575934)
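The underlying hazard, formatting a string that contains placeholders without supplying matching arguments, can be shown by analogy in Python. This is only an analogy: Python raises an exception where a C variadic function has undefined behavior, and the function names and the message text below are hypothetical.

```python
def set_error_unsafe(translated_format):
    """Use the looked-up text as a format string with no arguments."""
    return translated_format % ()


def set_error_safe(message):
    """Use a fixed format and pass the text as data instead."""
    return "%s" % (message,)


fmt = "Table '%s' doesn't exist"  # hypothetical message with a placeholder

try:
    set_error_unsafe(fmt)
    unsafe_failed = False
except TypeError:
    # Python detects the missing argument; C would read past its arguments.
    unsafe_failed = True

safe = set_error_safe(fmt)
```

Passing the message as an argument to a fixed format means any placeholder characters it contains are treated as plain text.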
Running out of undo log buffer memory was reported using error 921 Out of transaction memory ... (increase SharedGlobalMemory).
This problem is fixed by introducing a new error code 923 Out of undo buffer memory (increase UNDO_BUFFER_SIZE). (Bug #92125, Bug #28537319)
When moving an OperationRec from the serial to the parallel queue, Dbacc::startNext() failed to update the Operationrec::OP_ACC_LOCK_MODE flag, which is required to reflect the accumulated OP_LOCK_MODE of all previous operations in the parallel queue. This inconsistency in the ACC lock queues caused the scan lock takeover mechanism to fail, as it incorrectly concluded that a lock to take over was not held. The same failure caused an assert when aborting an operation that was a member of such an inconsistent parallel lock queue. (Bug #92100, Bug #28530928)
ndb_restore did not free all memory used after being called to restore a table that already existed. (Bug #92085, Bug #28525898)
A data node failed during startup due to the arrival of a SCAN_FRAGREQ signal during the restore phase. This signal originated from a scan begun before the node had previously failed and which should have been aborted due to the involvement of the failed node in it. (Bug #92059, Bug #28518448)
DBTUP sent the error Tuple corruption detected when a read operation attempted to read the value of a tuple inserted within the same transaction. (Bug #92009, Bug #28500861)
References: See also: Bug #28893633.
False constraint violation errors could occur when executing updates on self-referential foreign keys. (Bug #91965, Bug #28486390)
References: See also: Bug #90644, Bug #27930382.
An NDB internal trigger definition could be dropped while pending instances of the trigger remained to be executed, which then attempted to look up the definition for a trigger that had already been released. This caused unpredictable and thus unsafe behavior, possibly leading to data node failure. The root cause of the issue lay in an invalid assumption in the code relating to determining whether a given trigger had been released; the issue is fixed by ensuring that the behavior of NDB, when a trigger definition is determined to have been released, is consistent, and that it meets expectations. (Bug #91894, Bug #28451957)
In some cases, a workload that included a high number of concurrent inserts caused data node failures when using debug builds. (Bug #91764, Bug #28387450, Bug #29055038)
During an initial node restart with disk data tables present and TwoPassInitialNodeRestartCopy enabled, DBTUP used an unsafe scan in disk order. Such scans are no longer employed in this case. (Bug #91724, Bug #28378227)
Checking for old LCP files tested the table version, but this was not always dependable. Now, instead of relying on the table version, the check regards as invalid any LCP file having a maxGCI smaller than its createGci. (Bug #91637, Bug #28346565)
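The new rule reduces to a single comparison. The following Python sketch shows that comparison only; the LcpFile record and its field names are hypothetical, and any other checks NDB applies to LCP files are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LcpFile:
    """Hypothetical record holding the two values the check compares."""
    name: str
    max_gci: int
    create_gci: int


def is_stale(lcp_file):
    """An LCP file is invalid if its maxGCI is smaller than its createGci."""
    return lcp_file.max_gci < lcp_file.create_gci


files = [
    LcpFile("current", max_gci=120, create_gci=100),
    LcpFile("leftover", max_gci=90, create_gci=100),
    LcpFile("boundary", max_gci=100, create_gci=100),
]
usable = [f.name for f in files if not is_stale(f)]
```

A file whose maxGCI equals its createGci is kept, since the rule rejects only files whose maxGCI is strictly smaller.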
In certain cases, a cascade update trigger was fired repeatedly on the same record, which eventually consumed all available concurrent operations, leading to Error 233 Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations). If MaxNoOfConcurrentOperations was set to a value sufficiently high to avoid this, the issue manifested as data nodes consuming very large amounts of CPU, very likely eventually leading to a timeout. (Bug #91472, Bug #28262259)