ndb_restore now reports the specific
NDBerror number and message when it is unable to load a table descriptor from a backup
.ctlfile. This can happen when attempting to restore a backup taken from a later version of the NDB Cluster software to a cluster running an earlier version—for example, when the backup includes a table using a character set which is unknown to the version of ndb_restore being used to restore it. (Bug #30184265)
When executing a global schema lock (GSL),
NDBused a single
Ndb_table_guardobject for successive retires when attempting to obtain a table object reference; it was not possible for this to succeed after failing on the first attempt, since
Ndb_table_guardassumes that the underlying object pointer is determined once only—at initialisation—with the previously retrieved pointer being returned from a cached reference thereafter.
This resulted in infinite waits to obtain the GSL, causing the binlog injector thread to hang so that mysqld considered all
NDBtables to be read-only. To avoid this problem,
NDBnow uses a fresh instance of
Ndb_table_guardfor each such retry. (Bug #30120858)
References: This issue is a regression of: Bug #30086352.
Restoring tables for which
MAX_ROWSwas used to alter partitioning from a backup made from NDB 7.4 to a cluster running NDB 7.6 did not work correctly. This is fixed by ensuring that the upgrade code handling
PartitionBalancesupplies a valid table specification to the
NDBdictionary. (Bug #29955656)
NDBindex statistics are calculated based on the topology of one fragment of an ordered index; the fragment chosen in any particular index is decided at index creation time, both when the index is originally created, and when a node or system restart has recreated the index locally. This calculation is based in part on the number of fragments in the index, which can change when a table is reorganized. This means that, the next time that the node is restarted, this node may choose a different fragment, so that no fragments, one fragment, or two fragments are used to generate index statistics, resulting in errors from
This issue is solved by modifying the online table reorganization to recalculate the chosen fragment immediately, so that all nodes are aligned before and after any subsequent restart. (Bug #29534647)
During a restart when the data nodes had started but not yet elected a president, the management server received a node ID already in use error, which resulted in excessive retries and logging. This is fixed by introducing a new error 1705 Not ready for connection allocation yet for this case.
During a restart when the data nodes had not yet completed node failure handling, a spurious Failed to allocate nodeID error was returned. This is fixed by adding a check to detect an incomplete node start and to return error 1703 Node failure handling not completed instead.
As part of this fix, the frequency of retries has been reduced for not ready to alloc nodeID errors, an error insert has been added to simulate a slow restart for testing purposes, and log messages have been reworded to indicate that the relevant node ID allocation errors are minor and only temporary. (Bug #27484514)