Incompatible Change: The minimum value for the RedoOverCommitCounter data node configuration parameter has been increased from 0 to 1. The minimum value for the RedoOverCommitLimit data node configuration parameter has also been increased from 0 to 1.
You should check the cluster global configuration file and make any necessary adjustments to values set for these parameters before upgrading. (Bug #29752703)
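One way to carry out this check is to scan the configuration file for the two parameters before upgrading. The following is a minimal sketch, not a supported tool: the function name find_too_small is hypothetical, and the script assumes simple name=value lines, "#" comments, and case-insensitive parameter names in the global configuration file.

```python
import re

# Parameters whose minimum value changed from 0 to 1 (per the note above).
PARAMS = ("RedoOverCommitCounter", "RedoOverCommitLimit")


def find_too_small(config_text, minimum=1):
    """Return (line_number, name, value) for each setting below `minimum`.

    Hypothetical helper: assumes plain `name=value` or `name: value` lines
    and `#` comments; anything else is ignored.
    """
    wanted = {p.lower() for p in PARAMS}
    problems = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = re.match(r"^(\w+)\s*[=:]\s*(\d+)$", line)
        if not match:
            continue  # section header, blank line, or unrelated syntax
        name, value = match.group(1), int(match.group(2))
        # Assumption: parameter names are matched case-insensitively.
        if name.lower() in wanted and value < minimum:
            problems.append((lineno, name, value))
    return problems


if __name__ == "__main__":
    sample = (
        "[ndbd default]\n"
        "RedoOverCommitCounter=0\n"
        "RedoOverCommitLimit = 20\n"
    )
    for lineno, name, value in find_too_small(sample):
        print(f"line {lineno}: {name}={value} is below the new minimum of 1")
```

Any setting the script reports should be raised to at least 1 before the upgrade.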
If a transaction was aborted while getting a page from the disk page buffer and the disk system was overloaded, the transaction hung indefinitely. This could also cause restarts to hang and node failure handling to fail. (Bug #30397083, Bug #30360681)
References: See also: Bug #30152258.
Data node failures with the error "Another node failed during system restart..." occurred during a partial restart. (Bug #30368622)
The wrong number of bytes was reported in the cluster log for a completed local checkpoint. (Bug #30274618)
References: See also: Bug #29942998.
The number of data bytes for the summary event written in the cluster log when a backup completed was truncated to 32 bits, so that there was a significant mismatch between the number of log records and the number of data records printed in the log for this event. (Bug #29942998)
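The effect of such a truncation can be illustrated with plain arithmetic. This sketch is not the NDB code; it only shows what happens when a byte count larger than 4 GiB is reduced to its low 32 bits, using a hypothetical helper name and an example size chosen for illustration.

```python
def truncate_to_32_bits(value):
    """Keep only the low 32 bits, as an unsigned 32-bit field would."""
    return value & 0xFFFFFFFF


# Illustrative value only: a backup containing 5 GiB of data.
data_bytes = 5 * 1024**3
reported = truncate_to_32_bits(data_bytes)

print(data_bytes)  # 5368709120
print(reported)    # 1073741824, that is, 1 GiB instead of 5 GiB
```

Any count at or above 2^32 bytes wraps in this way, which is why the reported data size could disagree so badly with the other figures in the same log event.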
Using 2 LDM threads on a 2-node cluster with 10 threads per node could result in a partition imbalance, such that one of the LDM threads on each node was the primary for zero fragments. Trying to restore a multi-threaded backup from this cluster failed because the datafile for one LDM contained only the 12-byte data file header, which ndb_restore was unable to read. The same problem could occur in other cases, such as when taking a backup immediately after adding an empty node online.
It was found that this occurred when ODirect was enabled for an EOF backup data file write whose size was less than 512 bytes and the backup was in the STOPPING state. This normally occurs only for an aborted backup, but could also happen for a successful backup for which an LDM had no fragments. We fix the issue by introducing an additional check to ensure that writes are skipped only if the backup actually contains an error which should cause it to abort. (Bug #29892660)
References: See also: Bug #30371389.
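The change in the skip condition for Bug #29892660 can be sketched as a pair of predicates. This is a simplified model under stated assumptions, not the actual backup code: the function names, parameters, and the string used for the backup state are all hypothetical.

```python
def skip_eof_write_before_fix(odirect, size_bytes, state):
    """Old behavior (modeled): skip small EOF writes while stopping."""
    return odirect and size_bytes < 512 and state == "STOPPING"


def skip_eof_write_after_fix(odirect, size_bytes, state, has_error):
    """New behavior (modeled): also require that the backup has an error."""
    return skip_eof_write_before_fix(odirect, size_bytes, state) and has_error


# A successful backup where one LDM has no fragments: only the
# 12-byte data file header remains to be written at EOF.
print(skip_eof_write_before_fix(True, 12, "STOPPING"))        # True: write lost
print(skip_eof_write_after_fix(True, 12, "STOPPING", False))  # False: write kept
```

In this model an aborted backup (has_error set) still skips the write, while a successful backup always writes the header.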
In some cases the SignalSender class, used as part of the implementation of ndb_mgmd and ndbinfo, buffered excessive numbers of unneeded API_REGCONF signals, leading to unnecessary consumption of memory. (Bug #29520353)
References: See also: Bug #20075747, Bug #29474136.
The maximum global checkpoint (GCP) commit lag and GCP save timeout are recalculated whenever a node shuts down, to take into account the change in number of data nodes. This could lead to the unintentional shutdown of a viable node when the threshold decreased below the previous value. (Bug #27664092)
References: See also: Bug #26364729.
A transaction which inserts a child row may run concurrently with a transaction which deletes the parent row for that child. One of the transactions should be aborted in this case, lest an orphaned child row result.
Before committing an insert on a child row, a read of the parent row is triggered to confirm that the parent exists. Similarly, before committing a delete on a parent row, a read or scan is performed to confirm that no child rows exist. When insert and delete transactions were run concurrently, their prepare and commit operations could interact in such a way that both transactions committed. This occurred because the triggered reads were performed using LM_CommittedRead locks (see NdbOperation::LockMode), which are not strong enough to prevent such error scenarios.
This problem is fixed by using the stronger LM_SimpleRead lock mode for both triggered reads. The use of LM_SimpleRead rather than LM_CommittedRead locks ensures that at least one transaction aborts in every possible scenario involving transactions which concurrently insert into child rows and delete from parent rows. (Bug #22180583)