- Performance: This release introduces a number of significant improvements in the performance of scans; these are listed here:

  - Row checksums help detect hardware issues, but do so at the expense of performance. NDB now offers the possibility of disabling these by setting the new ndb_row_checksum server system variable to 0; doing this means that row checksums are not used for new or altered tables. This can have a significant impact (5 to 10 percent, in some cases) on performance for all types of queries. This variable is set to 1 by default, to provide compatibility with the previous behavior. (See the sketch following this list.)

  - A query consisting of a scan can execute for a longer time in the LDM threads when the queue is not busy.

  - Previously, columns were read before checking a pushed condition; now checking of a pushed condition is done before reading any columns.

  - Performance of pushed joins should see significant improvement when using range scans as part of join execution.

  (WL #11722)
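Returning to ndb_row_checksum: on an SQL node, row checksums can be turned off for subsequently created or altered NDB tables as in this minimal sketch (the table name t1 is hypothetical):

    -- Check the current setting; the default is 1 (row checksums in use)
    SHOW VARIABLES LIKE 'ndb_row_checksum';

    -- Disable row checksums for tables created or altered from now on
    SET GLOBAL ndb_row_checksum = 0;

    -- This table is created without row checksums
    CREATE TABLE t1 (id INT PRIMARY KEY, val VARCHAR(32)) ENGINE=NDB;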
- NDB Disk Data: NDB now implements schema distribution of disk data objects, including tablespaces and log file groups, by SQL nodes when they connect to a cluster, just as it does for NDB databases and in-memory tables. This eliminates a possible mismatch between the MySQL data dictionary and the NDB dictionary following a native backup and restore, which could arise when disk data tablespaces and undo log file groups were restored to the NDB dictionary but not to the MySQL Server's data dictionary. (WL #12172)

- NDB Disk Data: NDB now makes use of the MySQL data dictionary to ensure correct distribution of tablespaces and log file groups across all cluster SQL nodes when connecting to the cluster. (WL #12333)
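For reference, the disk data objects whose definitions are now distributed in this way are created with statements such as the following sketch (object names, file names, and sizes are illustrative only):

    CREATE LOGFILE GROUP lg_1
        ADD UNDOFILE 'undo_1.log'
        INITIAL_SIZE 128M
        ENGINE NDBCLUSTER;

    CREATE TABLESPACE ts_1
        ADD DATAFILE 'data_1.dat'
        USE LOGFILE GROUP lg_1
        INITIAL_SIZE 256M
        ENGINE NDBCLUSTER;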
- The extra metadata property for NDB tables is now used to store information from the MySQL data dictionary. Because this information is significantly larger than the binary representation previously stored here (a .frm file, no longer used), the hard-coded size limit for this extra metadata has been increased.

This change can have an impact on downgrades: Trying to read NDB tables created in NDB 8.0.14 and later may cause data nodes running NDB 8.0.13 or earlier to fail on startup with NDB error code 2355 Failure to restore schema: Permanent error, external action needed: Resource configuration error. This can happen if the table's metadata exceeds 6K in size, which was the old limit. Tables created in NDB 8.0.13 and earlier can be read by later versions without any issues.

For more information, see Changes in NDB table extra metadata; see also MySQL Data Dictionary. (Bug #27230681, WL #10665)
- Packaging: Expected NDB header files were in the devel RPM package instead of libndbclient-devel. (Bug #84580, Bug #26448330)
- ndbmemcache: libndbclient.so was not able to find and load libssl.so, which could cause issues with ndbmemcache and Java-based programs using NDB. (Bug #26824659)

References: See also: Bug #27882088, Bug #28410275.
- MySQL NDB ClusterJ: The ndb.clusterj test for NDB 8.0.13 failed when being run more than once. This was due to a new, stricter rule in NDB 8.0.13 that did not allow temporary files to be left behind in the variable folder of mysql-test-run (mtr). With this fix, the temporary files are deleted before the test is executed. (Bug #28279038)

- MySQL NDB ClusterJ: A NullPointerException was thrown when a full table scan was performed with ClusterJ on tables containing either a BLOB or a TEXT field. This was because the proper object initializations were omitted; they have now been added by this fix. (Bug #28199372, Bug #91242)

- The version_comment system variable was not correctly configured in mysqld binaries and returned a generic pattern instead of the proper value. This affected all NDB Cluster binary releases with the exception of .deb packages. (Bug #29054235)

- Trying to build from source using -DWITH_NDBCLUSTER and -Werror failed with GCC 8. (Bug #28707282)

- When copying deleted rows from a live node to a node just starting, it is possible for one or more of these rows to have a global checkpoint index equal to zero. If this happened at the same time that a full local checkpoint was started due to the undo log getting full, the LCP_SKIP bit was set for a row having GCI = 0, leading to an unplanned shutdown of the data node. (Bug #28372628)

- ndbmtd sometimes experienced a hang when exiting due to log thread shutdown. (Bug #28027150)
- NDB has an upper limit of 128 characters for a fully qualified table name. Because mysqld names NDB tables using the format database_name/catalog_name/table_name, where catalog_name is always def, it is possible for statements such as CREATE TABLE to fail in spite of the fact that neither the table name nor the database name exceeds the 63-character limit imposed by NDB. The error raised in such cases was misleading and has been replaced. (Bug #27769521)

References: See also: Bug #27769801.
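As a hypothetical illustration, a 63-character database name and a 63-character table name are each within the per-identifier limit, but the fully qualified internal name works out to 63 + 1 + 3 + 1 + 63 = 131 characters (counting the def catalog name and the two / separators), which exceeds the 128-character limit:

    -- Both identifiers are 63 characters long
    CREATE DATABASE d12345678901234567890123456789012345678901234567890123456789012;
    USE d12345678901234567890123456789012345678901234567890123456789012;

    -- Fails: the internal name database_name/def/table_name is 131 characters long
    CREATE TABLE t12345678901234567890123456789012345678901234567890123456789012 (
        id INT PRIMARY KEY
    ) ENGINE=NDB;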
- When the SUMA kernel block receives a SUB_STOP_REQ signal, it executes the signal and then replies with SUB_STOP_CONF. (After this response is relayed back to the API, the API is open to send more SUB_STOP_REQ signals.) After sending the SUB_STOP_CONF, SUMA drops the subscription if no subscribers are present, which involves sending multiple DROP_TRIG_IMPL_REQ messages to DBTUP. LocalProxy can handle up to 21 of these requests in parallel; any more than this are queued in the Short Time Queue. When execution of a DROP_TRIG_IMPL_REQ was delayed, there was a chance for the queue to become overloaded, leading to a data node shutdown with Error in short time queue.

This issue is fixed by delaying the execution of the SUB_STOP_REQ signal if DBTUP is already handling DROP_TRIG_IMPL_REQ signals at full capacity, rather than queueing up the DROP_TRIG_IMPL_REQ signals. (Bug #26574003)

- ndb_restore returned -1 instead of the expected exit code in the event of an index rebuild failure. (Bug #25112726)
- When starting, a data node copies metadata, while a local checkpoint updates metadata. To avoid any conflict, any ongoing LCP activity is paused while metadata is being copied. An issue arose when a local checkpoint was paused on a given node, and another node that was also restarting checked for a complete LCP on this node; the check actually caused the LCP to be completed before copying of metadata was complete, and so ended the pause prematurely. Now in such cases, the LCP completion check waits to complete a paused LCP until copying of metadata is finished, and the pause ends as expected, within the LCP in which it began. (Bug #24827685)
- ndbout and ndberr became invalid after exiting from mgmd_run(), and redirecting to them before the next call to mgmd_run() caused a segmentation fault during an ndb_mgmd service restart. This fix ensures that ndbout and ndberr remain valid at all times. (Bug #17732772, Bug #28536919)
- NdbScanFilter did not always handle NULL according to the SQL standard, which could result in non-qualifying rows being sent to the MySQL server for filtering that would not otherwise have been necessary. (Bug #92407, Bug #28643463)

References: See also: Bug #93977, Bug #29231709.
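Under the SQL standard's three-valued logic, a comparison with NULL is neither true nor false, so a row with a NULL value should not qualify for such a condition. A hypothetical sketch of the kind of pushed condition involved:

    CREATE TABLE t1 (id INT PRIMARY KEY, b INT) ENGINE=NDB;
    INSERT INTO t1 VALUES (1, 5), (2, NULL);

    -- The row (2, NULL) does not satisfy b < 10; with the fix, such
    -- non-qualifying rows are filtered out by NDB instead of being
    -- passed back to the MySQL server to be filtered again.
    SELECT * FROM t1 WHERE b < 10;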
- The internal function ndb_my_error() was used in ndbcluster_get_tablespace_statistics() and prepare_inplace_alter_table() to report errors when the function failed to interact with NDB. The function was expected to push the NDB error as a warning on the stack, set an error by translating the NDB error to a MySQL error, and finally call my_error() with the translated error. When calling my_error(), the function extracts a format string that may contain placeholders and uses it in a function similar to sprintf(); because my_error() was called without any arguments, this could read arbitrary memory, leading to a segmentation fault.

The fix is always to push the NDB error as a warning and then set an error with a provided message. A new helper function has been added to Thd_ndb to be used in place of ndb_my_error(). (Bug #92244, Bug #28575934)
- Running out of undo log buffer memory was reported using error 921 Out of transaction memory ... (increase SharedGlobalMemory). This problem is fixed by introducing a new error code, 923 Out of undo buffer memory (increase UNDO_BUFFER_SIZE). (Bug #92125, Bug #28537319)
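The undo buffer named in the new error code is the one sized by the UNDO_BUFFER_SIZE option when the log file group is created, as in this illustrative sketch (names and sizes are hypothetical):

    CREATE LOGFILE GROUP lg_2
        ADD UNDOFILE 'undo_2.log'
        INITIAL_SIZE 128M
        UNDO_BUFFER_SIZE 64M    -- increase this if error 923 is reported
        ENGINE NDBCLUSTER;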
- When moving an OperationRec from the serial to the parallel queue, Dbacc::startNext() failed to update the Operationrec::OP_ACC_LOCK_MODE flag, which is required to reflect the accumulated OP_LOCK_MODE of all previous operations in the parallel queue. This inconsistency in the ACC lock queues caused the scan lock takeover mechanism to fail, as it incorrectly concluded that a lock to take over was not held. The same failure caused an assert when aborting an operation that was a member of such an inconsistent parallel lock queue. (Bug #92100, Bug #28530928)

- ndb_restore did not free all memory used after being called to restore a table that already existed. (Bug #92085, Bug #28525898)
- A data node failed during startup due to the arrival of a SCAN_FRAGREQ signal during the restore phase. This signal originated from a scan begun before the node had previously failed, and which should have been aborted due to the involvement of the failed node in it. (Bug #92059, Bug #28518448)

- DBTUP sent the error Tuple corruption detected when a read operation attempted to read the value of a tuple inserted within the same transaction. (Bug #92009, Bug #28500861)

References: See also: Bug #28893633.
- False constraint violation errors could occur when executing updates on self-referential foreign keys. (Bug #91965, Bug #28486390)

References: See also: Bug #90644, Bug #27930382.
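A self-referential foreign key is one whose parent and child are the same table, as in this hypothetical sketch; updates against such a table could previously raise spurious constraint violations:

    CREATE TABLE emp (
        id     INT NOT NULL PRIMARY KEY,
        mgr_id INT,
        FOREIGN KEY (mgr_id) REFERENCES emp (id)
    ) ENGINE=NDB;

    INSERT INTO emp VALUES (1, NULL), (2, 1);

    -- Previously, an update of this kind could fail with a false
    -- constraint violation even though the constraint still holds.
    UPDATE emp SET id = 3 WHERE id = 2;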
- An NDB internal trigger definition could be dropped while pending instances of the trigger remained to be executed, by attempting to look up the definition for a trigger which had already been released. This caused unpredictable, and thus unsafe, behavior, possibly leading to data node failure. The root cause of the issue lay in an invalid assumption in the code relating to determining whether a given trigger had been released; the issue is fixed by ensuring that the behavior of NDB, when a trigger definition is determined to have been released, is consistent and meets expectations. (Bug #91894, Bug #28451957)

- In some cases, a workload that included a high number of concurrent inserts caused data node failures when using debug builds. (Bug #91764, Bug #28387450, Bug #29055038)
- During an initial node restart with disk data tables present and TwoPassInitialNodeRestartCopy enabled, DBTUP used an unsafe scan in disk order. Such scans are no longer employed in this case. (Bug #91724, Bug #28378227)

- Checking for old LCP files tested the table version, but this was not always dependable. Now, instead of relying on the table version, the check regards as invalid any LCP file having a maxGCI smaller than its createGci. (Bug #91637, Bug #28346565)

- In certain cases, a cascade update trigger was fired repeatedly on the same record, which eventually consumed all available concurrent operations, leading to Error 233 Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations). If MaxNoOfConcurrentOperations was set to a value sufficiently high to avoid this, the issue manifested as data nodes consuming very large amounts of CPU, very likely eventually leading to a timeout. (Bug #91472, Bug #28262259)
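A cascading update of the kind described arises from a foreign key declared with ON UPDATE CASCADE, as in this hypothetical sketch; each cascaded change consumes operation records in the transaction coordinator:

    CREATE TABLE parent (
        pk INT NOT NULL PRIMARY KEY,
        uk INT NOT NULL,
        UNIQUE KEY (uk)
    ) ENGINE=NDB;

    CREATE TABLE child (
        id        INT NOT NULL PRIMARY KEY,
        parent_uk INT,
        FOREIGN KEY (parent_uk) REFERENCES parent (uk)
            ON UPDATE CASCADE
    ) ENGINE=NDB;

    -- Updating the referenced unique key cascades to every matching child
    -- row; each cascaded operation counts against
    -- MaxNoOfConcurrentOperations in the transaction coordinator.
    UPDATE parent SET uk = 100 WHERE uk = 1;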