- Performance: This release introduces a number of significant improvements in the performance of scans; these are listed here:

  - Row checksums help detect hardware issues, but do so at the expense of performance. NDB now offers the possibility of disabling these by setting the new ndb_row_checksum server system variable to 0; doing this means that row checksums are not used for new or altered tables. This can have a significant impact (5 to 10 percent, in some cases) on performance for all types of queries. This variable is set to 1 by default, to provide compatibility with the previous behavior. (See the example following this list.)

  - A query consisting of a scan can execute for a longer time in the LDM threads when the queue is not busy.

  - Previously, columns were read before a pushed condition was checked; now the pushed condition is checked before any columns are read.

  - Performance of pushed joins should see significant improvement when range scans are used as part of join execution.

  (WL #11722)
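  For illustration, row checksums could be disabled for subsequently created or altered tables with statements such as the following; this is a minimal sketch that assumes ndb_row_checksum can be set dynamically as a global system variable, and the table name t1 is purely an example:

  ```sql
  -- Disable row checksums for tables created or altered from this point on
  -- (assumes the variable is dynamic and settable with SET GLOBAL)
  SET GLOBAL ndb_row_checksum = 0;

  -- A table created (or altered) after the change does not use row checksums
  CREATE TABLE t1 (
      id  INT NOT NULL PRIMARY KEY,
      val VARCHAR(32)
  ) ENGINE=NDB;
  ```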
- Packaging: Expected NDB header files were in the devel RPM package instead of libndbclient-devel. (Bug #84580, Bug #26448330)

- NDB Disk Data: While restoring a local checkpoint, it is possible to insert a row that already exists in the database; this is expected behavior, which is handled by deleting the existing row first, then inserting the new copy of that row. In some cases involving data on disk, NDB failed to delete the existing row. (Bug #91627, Bug #28341843)

- NDB Client Programs: Removed a memory leak in NdbImportUtil::RangeList that was revealed in ASAN builds. (Bug #91479, Bug #28264144)

- MySQL NDB ClusterJ: When a table containing a BLOB or a TEXT field was queried with ClusterJ for a record that did not exist, an exception (“The method is not valid in current blob state”) was thrown. (Bug #28536926)

- MySQL NDB ClusterJ: A NullPointerException was thrown when a full table scan was performed with ClusterJ on tables containing either a BLOB or a TEXT field. This was because the necessary object initializations were omitted; they have now been added by this fix. (Bug #28199372, Bug #91242)

- When copying deleted rows from a live node to a node just starting, it is possible for one or more of these rows to have a global checkpoint index equal to zero. If this happened at the same time that a full local checkpoint was started due to the undo log getting full, the LCP_SKIP bit was set for a row having GCI = 0, leading to an unplanned shutdown of the data node. (Bug #28372628)

- ndbmtd sometimes experienced a hang when exiting, due to log thread shutdown. (Bug #28027150)
- When the SUMA kernel block receives a SUB_STOP_REQ signal, it executes the signal and then replies with SUB_STOP_CONF. (After this response is relayed back to the API, the API is free to send more SUB_STOP_REQ signals.) After sending the SUB_STOP_CONF, SUMA drops the subscription if no subscribers are present, which involves sending multiple DROP_TRIG_IMPL_REQ messages to DBTUP. LocalProxy can handle up to 21 of these requests in parallel; any more than this are queued in the Short Time Queue. When execution of a DROP_TRIG_IMPL_REQ was delayed, there was a chance for the queue to become overloaded, leading to a data node shutdown with Error in short time queue.

  This issue is fixed by delaying execution of the SUB_STOP_REQ signal if DBTUP is already handling DROP_TRIG_IMPL_REQ signals at full capacity, rather than queueing up the DROP_TRIG_IMPL_REQ signals. (Bug #26574003)
- Having a large number of deferred triggers could sometimes lead to job buffer exhaustion. This could occur because a single trigger can execute many operations (for example, a foreign key parent trigger may perform operations on multiple matching child table rows), and because a row operation on a base table can execute multiple triggers. In such cases, row operations are executed in batches. When execution of many triggers was deferred, meaning that all deferred triggers are executed at pre-commit, the resulting concurrent execution of a great many trigger operations could exhaust the data node job buffer or send buffer, leading to failure of the node.

  This issue is fixed by limiting the number of concurrent trigger operations as well as the number of trigger fire requests outstanding per transaction.

  For immediate triggers, limiting the number of concurrent trigger operations may increase the number of triggers waiting to be executed, exhausting the trigger record pool and resulting in the error Too many concurrently fired triggers (increase MaxNoOfFiredTriggers). This can be avoided by increasing MaxNoOfFiredTriggers, reducing the user transaction batch size, or both. (Bug #22529864)

  References: See also: Bug #18229003, Bug #27310330.
- ndbout and ndberr became invalid after exiting from mgmd_run(); redirecting to them before the next call to mgmd_run(), during an ndb_mgmd service restart, caused a segmentation fault. This fix ensures that ndbout and ndberr remain valid at all times. (Bug #17732772, Bug #28536919)
- Running out of undo log buffer memory was reported using error 921 Out of transaction memory ... (increase SharedGlobalMemory).

  This problem is fixed by introducing a new error code 923 Out of undo buffer memory (increase UNDO_BUFFER_SIZE). (Bug #92125, Bug #28537319)
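  The undo buffer named in the new error message is the one sized when the logfile group for Disk Data tables is created. The following sketch is illustrative only (the logfile group name, file name, and sizes are assumptions) and shows where UNDO_BUFFER_SIZE is specified:

  ```sql
  -- Illustrative only: error 923 refers to the undo buffer created here
  CREATE LOGFILE GROUP lg_1
      ADD UNDOFILE 'undo_1.log'
      INITIAL_SIZE 512M
      UNDO_BUFFER_SIZE 64M    -- increase this value if error 923 is reported
      ENGINE NDBCLUSTER;
  ```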
- When moving an OperationRec from the serial to the parallel queue, Dbacc::startNext() failed to update the Operationrec::OP_ACC_LOCK_MODE flag, which is required to reflect the accumulated OP_LOCK_MODE of all previous operations in the parallel queue. This inconsistency in the ACC lock queues caused the scan lock takeover mechanism to fail, since it incorrectly concluded that a lock to take over was not held. The same failure caused an assert when aborting an operation that was a member of such an inconsistent parallel lock queue. (Bug #92100, Bug #28530928)

- A data node failed during startup due to the arrival of a SCAN_FRAGREQ signal during the restore phase. This signal originated from a scan that had begun before the node's previous failure, and that should have been aborted because the failed node was involved in it. (Bug #92059, Bug #28518448)
- DBTUP sent the error Tuple corruption detected when a read operation attempted to read the value of a tuple inserted within the same transaction. (Bug #92009, Bug #28500861)

  References: See also: Bug #28893633.
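  A minimal sketch of the affected pattern (the table and column names here are hypothetical): the failing read was of a row inserted earlier within the same, still open, transaction.

  ```sql
  -- Hypothetical table t1; the problematic pattern was reading back a row
  -- inserted earlier in the same (not yet committed) transaction.
  BEGIN;
  INSERT INTO t1 (id, val) VALUES (10, 'x');
  SELECT val FROM t1 WHERE id = 10;  -- formerly could fail with: Tuple corruption detected
  COMMIT;
  ```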
- False constraint violation errors could occur when executing updates on self-referential foreign keys. (Bug #91965, Bug #28486390)

  References: See also: Bug #90644, Bug #27930382.
- An NDB internal trigger definition could be dropped while pending instances of the trigger remained to be executed; attempting to look up the definition of a trigger that had already been released in this fashion caused unpredictable, and thus unsafe, behavior that could lead to data node failure. The root cause of the issue lay in an invalid assumption in the code relating to determining whether a given trigger had been released; the issue is fixed by ensuring that NDB behaves consistently when a trigger definition is determined to have been released, and that this behavior meets expectations. (Bug #91894, Bug #28451957)

- In some cases, a workload that included a high number of concurrent inserts caused data node failures when using debug builds. (Bug #91764, Bug #28387450, Bug #29055038)

- During an initial node restart with disk data tables present and TwoPassInitialNodeRestartCopy enabled, DBTUP used an unsafe scan in disk order. Such scans are no longer employed in this case. (Bug #91724, Bug #28378227)

- Checking for old LCP files tested the table version, but this was not always dependable. Now, instead of relying on the table version, the check regards as invalid any LCP file having a maxGCI smaller than its createGci. (Bug #91637, Bug #28346565)

- In certain cases, a cascade update trigger was fired repeatedly on the same record, which eventually consumed all available concurrent operations, leading to Error 233 Out of operation records in transaction coordinator (increase MaxNoOfConcurrentOperations). If MaxNoOfConcurrentOperations was set to a value high enough to avoid this, the issue manifested itself as data nodes consuming very large amounts of CPU, very likely leading eventually to a timeout. (Bug #91472, Bug #28262259)
- Inserting a row into an NDB table having a self-referencing foreign key that referenced a unique index on the table other than the primary key failed with ER_NO_REFERENCED_ROW_2. This was because NDB checked foreign key constraints before the unique index was updated, so that the constraint check was unable to use the index to locate the row. Now, in such cases, NDB waits until all unique index values have been updated before checking foreign key constraints on the inserted row. (Bug #90644, Bug #27930382)

  References: See also: Bug #91965, Bug #28486390.
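  A minimal sketch of the kind of schema affected (table and column names are hypothetical): a foreign key on the table itself that references a unique key other than the primary key, followed by an insert in which the referenced value is supplied by the inserted row itself.

  ```sql
  CREATE TABLE t1 (
      id        INT NOT NULL PRIMARY KEY,
      uk        INT NOT NULL,
      parent_uk INT,
      UNIQUE KEY ux_uk (uk),
      FOREIGN KEY (parent_uk) REFERENCES t1 (uk)
  ) ENGINE=NDB;

  -- The new row satisfies its own foreign key through the unique key ux_uk;
  -- prior to this fix, such an insert failed with ER_NO_REFERENCED_ROW_2.
  INSERT INTO t1 (id, uk, parent_uk) VALUES (1, 1, 1);
  ```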
- A connection string beginning with a slash (/) character is now rejected by ndb_mgmd.

  Our thanks to Daniël van Eeden for contributing this fix. (Bug #90582, Bug #27912892)