In some contexts, a data node process may be sent SIGCHLD by other processes. Previously, the data node process bound a signal handler treating this signal as an error, which could cause the process to shut down unexpectedly when run in the foreground in a Kubernetes environment (and possibly under other conditions as well). This occurred despite the fact that a data node process never starts child processes itself, and thus there is no need to take action in such cases.
To fix this, the handler has been modified to use SIG_IGN, which should result in cleanup of any child processes.
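The effect of setting the SIGCHLD disposition to SIG_IGN can be sketched with plain POSIX calls. This is only an illustration, not the data node's actual code: the helper name ignore_sigchld is hypothetical, and the use of sigaction() rather than signal() is an assumption.

```cpp
#include <signal.h>

// Hypothetical helper (not taken from the NDB sources): set the SIGCHLD
// disposition to SIG_IGN so that a SIGCHLD delivered to this process is
// discarded instead of being treated as an error.
// On POSIX.1-2001 systems, ignoring SIGCHLD explicitly also means that
// terminated children are reaped automatically rather than left as zombies.
bool ignore_sigchld() {
    struct sigaction sa {};
    sa.sa_handler = SIG_IGN;   // discard the signal
    sigemptyset(&sa.sa_mask);  // block no additional signals
    sa.sa_flags = 0;
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}
```

After a call to this helper succeeds, a SIGCHLD sent by another process (or raised with raise(SIGCHLD)) has no effect on the process.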
The running node from a node group scans each fragment (CopyFrag) and sends the rows to the starting peer in order to synchronize it. If a row from the fragment is locked exclusively by a user transaction, it blocks the scan from reading the fragment, causing the CopyFrag to stall.
If the starting node fails during the CopyFrag phase, then normal node failure handling takes place. The coordinator node's transaction coordinator (TC) performs TC takeover of the user transactions from the TCs on the failed node. Since the scan that aids copying the fragment data over to the starting node is considered internal only, it is not a candidate for takeover; thus the takeover TC marks the CopyFrag scan as closed at the next opportunity, and waits until it is closed.
The current issue arose when the CopyFrag scan was in the waiting for row lock state, and the closing of the marked scan was not performed. This led to TC takeover stalling while waiting for the close, causing unfinished node failure handling, and eventually a GCP stall potentially affecting redo logging, local checkpoints, and NDB Replication.
We fix this by closing the marked CopyFrag scan whenever a node failure occurs while the CopyFrag is waiting for a row lock. (Bug #34823988)
References: See also: Bug #35037327.
In certain cases, invalid signal data was not handled correctly. (Bug #34787608)
Following execution of DROP NODEGROUP in the management client, attempting to create or alter an NDB table specifying an explicit number of partitions or using MAX_ROWS was rejected with Got error 771 'Given NODEGROUP doesn't exist in this cluster' from NDB. (Bug #34649576)
In a cluster with multiple management nodes, when one management node connected and later disconnected, any remaining management nodes were not aware of this node and were eventually forced to shut down when stopped nodes reconnected; this happened whenever the cluster still had live data nodes.
On investigation it was found that node disconnection handling was done in the ConfigManager but the expected NF_COMPLETEREP signal never actually arrived. We solve this by handling disconnecting management nodes when the NODE_FAILREP signal arrives, rather than waiting for NF_COMPLETEREP. (Bug #34582919)
When reorganizing a table with ALTER TABLE ... REORGANIZE PARTITION following addition of new data nodes to the cluster, unique hash indexes were not redistributed properly. (Bug #30049013)
During a rolling restart of a cluster with two data nodes, one of them refused to start, reporting that the redo log fragment file size did not match the configured one and that an initial start of the node was required. Fixed by addressing a previously unhandled error returned by fsync(), and retrying the write. (Bug #28674694)
For a partial local checkpoint, each fragment LCP must be able to determine the precise state of the fragment at the start of the LCP and the precise difference in the fragment between the start of the current LCP and the start of the previous one. This is tracked using row header information and page header information; in cases where physical pages are removed this is also tracked in logical page map information.
The issue arose when a page included in the current LCP was released, before the LCP scan reached it, by the commit or rollback of an operation on the fragment that also released the last used storage on the page.
Since the released page could not be found by the scan, the release itself set the LCP_SCANNED_BIT of the page map entry it was mapped into, in order to indicate that the page was already handled from the point of view of the current LCP, causing subsequent allocation and release of the pages mapped to the entry during the LCP to be ignored. The state of the entry at the start of the LCP was also set as allocated in the page map entry.
These settings are cleared only when the next LCP is prepared. Any page release associated with the page map entry before the clearance would violate the requirement that the bit is not set; we resolve this issue by removing the (incorrect) requirement. (Bug #23539857)
A data node could hit an overly strict assertion when the thread liveness watchdog triggered while the node was already shutting down. We fix the issue by relaxing this assertion in such cases. (Bug #22159697)
Removed a leak of long message buffer memory that occurred each time an index was scanned for updating index statistics. (Bug #108043, Bug #34568135)
Fixed an uninitialized variable in Suma.cpp. (Bug #106081, Bug #33764143)