NDB Disk Data: ndbmtd sometimes terminated unexpectedly when it could not complete a lookup for a log file group during a restore operation. (Bug #31284086)
NDB Disk Data: An uninitialized variable led to issues when performing Disk Data DDL operations following a restart of the cluster. (Bug #30592528)
ndb_restore --remap-column did not handle columns containing NULL values correctly. Now any offset specified by the mapping function used with this option is not applied to NULL, so that NULL is preserved as expected. (Bug #31966676)
MaxDiskWriteSpeedOwnRestart was not honored as an upper bound for local checkpoint writes during a node restart. (Bug #31337487) References: See also: Bug #29943227.
-
During a node restart, the SUMA block of the node that is starting must get a copy of the subscriptions (events with subscribers) and subscribers (NdbEventOperation instances which are executing) from a node already running. Before the copy is complete, nodes which are still starting ignore any user-level SUB_START or SUB_STOP requests; after the copy is done, they can participate in such requests. While the copy operation is in progress, user-level SUB_START and SUB_STOP requests are blocked using a DICT lock.
An issue was found whereby a starting node could participate in SUB_START and SUB_STOP requests after the lock was requested, but before it was granted, which resulted in unsuccessful SUB_START and SUB_STOP requests. This fix ensures that the nodes cannot participate in these requests until after the DICT lock has actually been granted. (Bug #31302657)
DUMP 1001 (DumpPageMemoryOnFail) now prints out information about the internal state of the data node page memory manager when allocation of pages fails due to resource constraints. (Bug #31231286)
Statistics generated by NDB for use in tracking internal objects allocated and deciding when to release them were not calculated correctly, with the result that the threshold for resource usage was 50% higher than intended. This fix corrects the issue, and should allow for reduced memory usage. (Bug #31127237)
The Dojo toolkit included with NDB Cluster and used by the Auto-Installer was upgraded to version 1.15.3. (Bug #31029110)
A packed version 1 configuration file returned by ndb_mgmd could contain duplicate entries following an upgrade to NDB 8.0, which made the file incompatible with clients using version 1. This occurred because the code for handling backwards compatibility assumed that the entries in each section were already sorted when merging the section with the default section. To fix this, we now make sure that this sort is performed prior to merging. (Bug #31020183)
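The following sketch shows, under simplifying assumptions, why a merge that expects sorted input can emit duplicate entries, and how sorting first avoids this. It is not the ndb_mgmd code: entries are modeled as hypothetical `(key, value)` tuples with integer keys, and the function names are invented for illustration.

```python
def merge_sorted(section, defaults):
    """Merge two lists of (key, value) entries, ASSUMING both are sorted by key.

    Entries from `section` override entries with the same key in `defaults`.
    """
    merged = []
    i = j = 0
    while i < len(section) and j < len(defaults):
        s_key, d_key = section[i][0], defaults[j][0]
        if s_key == d_key:
            merged.append(section[i])  # section overrides the default
            i += 1
            j += 1
        elif s_key < d_key:
            merged.append(section[i])
            i += 1
        else:
            merged.append(defaults[j])
            j += 1
    merged.extend(section[i:])
    merged.extend(defaults[j:])
    return merged


def merge_with_defaults(section, defaults):
    """Sort both inputs by key before merging, so the assumption always holds."""
    by_key = lambda entry: entry[0]
    return merge_sorted(sorted(section, key=by_key), sorted(defaults, key=by_key))


defaults = [(1, "a"), (2, "b"), (3, "c")]
unsorted_section = [(3, "z"), (1, "y")]

# Without sorting, key 1 appears twice in the result.
broken_keys = [k for k, _ in merge_sorted(unsorted_section, defaults)]
# With sorting, each key appears exactly once.
fixed_keys = [k for k, _ in merge_with_defaults(unsorted_section, defaults)]
```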
-
When executing any of the SHUTDOWN, ALL STOP, or ALL RESTART management commands, it is possible for different nodes to attempt to stop on different global checkpoint index (GCI) boundaries. If they succeed in doing so, then a subsequent system restart is slower than normal because any nodes having an earlier stop GCI must undergo takeover as part of the process. When nodes failing on the first GCI boundary cause surviving nodes to be nonviable, surviving nodes suffer an arbitration failure; this has the positive effect of causing such nodes to halt at the correct GCI, but can give rise to spurious errors or similar.
To avoid such issues, extra synchronization is now performed during a planned shutdown to reduce the likelihood that different data nodes attempt to shut down at different GCIs as well as the use of unnecessary node takeovers during system restarts. (Bug #31008713)
When responding to a SCANTABREQ, an API node can provide a distribution key if it knows that the scan should work on only one fragment, in which case the distribution key should be the fragment ID, but in some cases a hash of the partition key was used instead, leading to failures in DBTC. (Bug #30774226)
Several memory leaks found in ndb_import have been removed. (Bug #30756434, Bug #30727956)
-
The master node in a backup shut down unexpectedly on receiving duplicate replies to a DEFINE_BACKUP_REQ signal. These occurred when a data node other than the master errored out during the backup, and the backup master handled the situation by sending itself a DEFINE_BACKUP_REF signal on behalf of the missing node, which resulted in two replies being received from the same node (a CONF signal from the problem node prior to shutting down and the REF signal from the master on behalf of this node), even though the master expected only one reply per node. This scenario was also encountered for START_BACKUP_REQ and STOP_BACKUP_REQ signals.
This is fixed in such cases by allowing duplicate replies when the error is the result of an unplanned node shutdown. (Bug #30589827)
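The reply-counting rule in this fix can be sketched as follows. This is a simplified model, not the NDB backup block: the class `ReplyTracker`, its methods, and the node IDs are hypothetical, and a CONF or REF signal is reduced to a plain reply event.

```python
class ReplyTracker:
    """Track one expected reply per participant for a backup request.

    Hypothetical model: a duplicate reply is tolerated only if the node
    is known to have failed; otherwise it is treated as a protocol error.
    """

    def __init__(self, participants):
        self.pending = set(participants)
        self.replied = set()
        self.failed = set()

    def node_failed(self, node_id):
        """Record an unplanned shutdown of a participant."""
        self.failed.add(node_id)

    def on_reply(self, node_id):
        """Return True if the reply was counted, False if it was an ignored duplicate."""
        if node_id in self.replied:
            if node_id in self.failed:
                return False  # duplicate caused by node failure: ignore it
            raise RuntimeError(f"unexpected duplicate reply from node {node_id}")
        self.replied.add(node_id)
        self.pending.discard(node_id)
        return True

    def complete(self):
        return not self.pending


tracker = ReplyTracker({1, 2, 3})
first = tracker.on_reply(2)      # CONF sent by node 2 before it shut down
tracker.node_failed(2)           # node 2 fails during the backup
duplicate = tracker.on_reply(2)  # REF generated by the master on node 2's behalf
tracker.on_reply(1)
tracker.on_reply(3)
```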
-
When updating NDB_TABLE comment options using ALTER TABLE, other options which had been set to non-default values when the table was created but which were not specified in the ALTER TABLE statement could be reset to their defaults.
See Setting NDB Comment Options for more information. (Bug #30428829)
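The intended behavior, in which options not named in the ALTER TABLE statement keep their existing values, can be sketched as a dictionary merge. This is an illustration only and not the server's parser: the helper names are hypothetical, and the comment is assumed to consist solely of an `NDB_TABLE=` option list.

```python
PREFIX = "NDB_TABLE="


def parse_ndb_table(comment: str) -> dict:
    """Parse 'NDB_TABLE=OPT1=V1,OPT2=V2' into a dict (simplified format)."""
    if not comment.startswith(PREFIX):
        return {}
    body = comment[len(PREFIX):]
    return dict(item.split("=", 1) for item in body.split(",") if item)


def merge_ndb_table(existing: str, altered: str) -> str:
    """Apply options from `altered` on top of `existing`.

    Options absent from `altered` keep the values they already had.
    """
    merged = parse_ndb_table(existing)
    merged.update(parse_ndb_table(altered))
    return PREFIX + ",".join(f"{name}={value}" for name, value in merged.items())


created_with = "NDB_TABLE=READ_BACKUP=1,PARTITION_BALANCE=FOR_RA_BY_NODE"
alter_sets = "NDB_TABLE=READ_BACKUP=0"
result = merge_ndb_table(created_with, alter_sets)
```

Here only READ_BACKUP changes; PARTITION_BALANCE retains the non-default value chosen when the table was created.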
When, during a restart, a data node received a GCP_SAVEREQ signal prior to beginning start phase 9, and thus needed to perform a global checkpoint index write to a local data manager's local checkpoint control file, it did not record information from the DIH block originating with the node that sent the signal as part of the data written. This meant that, later in start phase 9, when attempting to send a GCP_SAVECONF signal in response to the GCP_SAVEREQ, this information was not available, which meant the response could not be sent, resulting in an unplanned shutdown of the data node. (Bug #30187949)
Setting EnableRedoControl to false did not fully disable MaxDiskWriteSpeed, MaxDiskWriteSpeedOtherNodeRestart, and MaxDiskWriteSpeedOwnRestart as expected. (Bug #29943227) References: See also: Bug #31337487.
Removed a memory leak found in the ndb_import utility. (Bug #29820879)
-
A BLOB value is stored by NDB in multiple parts; when reading such a value, one read operation is executed per part. If a part is not found, the read fails with a row not found error, which indicates a corrupted BLOB, since a BLOB should never have any missing parts. A problem can arise because this error is reported as the overall result of the read operation, which means that mysqld sees no error and reports zero rows returned.
This issue is fixed by adding a check specifically for the case in which a blob part is not found. Now, when this occurs, the row not found error is overwritten with corrupted blob, which causes the originating SELECT statement to fail as expected. Users of the NDB API should be aware that, despite this change, the NdbBlob::getValue() method continues to report the error as row not found in such cases. (Bug #28590428)
Incorrect handling of operations on fragment replicas during node restarts could result in a forced shutdown, or in content diverging between fragment replicas, when primary keys with nonbinary (case-sensitive) equality conditions were used. (Bug #98526, Bug #30884622)
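As an illustration of the BLOB part check described for Bug #28590428, the sketch below converts a missing part into a corruption error instead of letting it surface as an ordinary row not found result. It models the idea only: the exception classes, the `read_part` and `read_blob` helpers, and the dictionary used as a part store are all hypothetical and do not correspond to NDB API calls.

```python
class RowNotFound(Exception):
    """Raised when a single row (here, a blob part) does not exist."""


class BlobCorrupted(Exception):
    """Raised when a blob is missing one of its parts."""


def read_part(store, key, part_no):
    """Read one blob part from a dict keyed by (key, part_no)."""
    try:
        return store[(key, part_no)]
    except KeyError:
        raise RowNotFound(f"part {part_no} of {key!r} not found") from None


def read_blob(store, key, part_count):
    """Read all parts of a blob, one read per part.

    A missing part is reported as corruption, not as 'row not found',
    so the caller cannot mistake it for an empty result.
    """
    chunks = []
    for part_no in range(part_count):
        try:
            chunks.append(read_part(store, key, part_no))
        except RowNotFound as exc:
            raise BlobCorrupted(f"blob {key!r} is missing part {part_no}") from exc
    return b"".join(chunks)


intact = {("doc", 0): b"he", ("doc", 1): b"llo"}
damaged = {("doc", 0): b"he"}  # part 1 is missing
```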