Incompatible Change: When the data nodes are only partially connected to the API nodes, a node used for a pushdown join may get its request from a transaction coordinator on a different node, without (yet) being connected to the API node itself. In such cases, the NodeInfo object for the requesting API node contained no valid information about the software version of the API node, which caused the DBSPJ block to assume (incorrectly), when aborting, that the API node used NDB version 7.2.4 or earlier. This required the use of a backward compatibility mode during query abort, which sent a node failure error instead of the real error causing the abort.
Now, whenever this situation occurs, it is assumed that, if the NDB software version is not yet available, the API node version is greater than 7.2.4. (Bug #23049170)
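The version fallback just described can be illustrated with a minimal sketch. It is not the DBSPJ code: the packed version encoding, the use of 0 as the "version not yet available" marker, and the names makeVersion, kVersionUnknown, and useLegacyAbort are all assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical version encoding (major, minor, build packed into one integer).
// The encoding actually used by NDB is not stated in this entry.
constexpr std::uint32_t makeVersion(std::uint32_t major,
                                    std::uint32_t minor,
                                    std::uint32_t build) {
    return (major << 16) | (minor << 8) | build;
}

// Assumed marker for "software version of the API node not yet available".
constexpr std::uint32_t kVersionUnknown = 0;

// Last version assumed to need the backward compatibility mode on abort.
constexpr std::uint32_t kLastLegacyAbortVersion = makeVersion(7, 2, 4);

// Returns true when the abort should use the backward compatibility mode,
// that is, report a node failure error instead of the real error.
constexpr bool useLegacyAbort(std::uint32_t apiVersion) {
    if (apiVersion == kVersionUnknown) {
        // Version not yet known: assume the API node is newer than 7.2.4,
        // so the real error is reported.
        return false;
    }
    return apiVersion <= kLastLegacyAbortVersion;
}
```

With this sketch, useLegacyAbort(kVersionUnknown) is false, so an API node whose version has not yet been seen receives the real abort error.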
NDB Cluster APIs: Deletion of Ndb objects used a disproportionately high amount of CPU. (Bug #22986823)
The reserved send buffer for the loopback transporter, introduced in MySQL NDB Cluster 7.4.8 and used by API and management nodes for administrative signals, was calculated incorrectly. (Bug #23093656, Bug #22016081)
References: This issue is a regression of: Bug #21664515.
During a node restart, re-creation of internal triggers used for verifying the referential integrity of foreign keys was not reliable, because it was possible that not all distributed TC and LDM instances agreed on all trigger identities. To fix this problem, an extra step is added to the node restart sequence, during which the trigger identities are determined by querying the current master node. (Bug #23068914)
References: See also: Bug #23221573.
Following the forced shutdown of one of the 2 data nodes in a cluster where
NoOfReplicas=2, the other data node shut down as well, due to arbitration failure. (Bug #23006431)
ClusterMgr is an internal component of NDB API and ndb_mgmd processes, part of TransporterFacade (which in turn is a wrapper around the transporter registry), and shared with data nodes. This component is responsible for a number of tasks including connection setup requests; sending and monitoring of heartbeats; provision of node state information; handling of cluster disconnects and reconnects; and forwarding of cluster state indicators.
ClusterMgr maintains a count of live nodes which is incremented on receiving a report of a node having connected (reportConnected() method call), and decremented on receiving a report that a node has disconnected (reportDisconnected() method call) from TransporterRegistry. This count is checked within reportDisconnected() to verify that it is greater than zero.
The issue addressed here arose when node connections were very brief due to send buffer exhaustion (among other potential causes) and the check just described failed. This occurred because, when a node did not fully connect, it was still possible for the connection attempt to trigger a reportDisconnected() call even though the connection had not yet been reported to ClusterMgr. The pairing of reportConnected() and reportDisconnected() calls was thus not guaranteed, which could cause the count of connected nodes to be set to zero even though some nodes were in fact still connected. This caused node crashes with debug builds of MySQL NDB Cluster, and potential errors or other adverse effects with release builds.
To fix this issue, ClusterMgr::reportDisconnected() now verifies that a disconnected node had actually finished connecting before checking and decrementing the number of connected nodes. (Bug #21683144, Bug #22016081)
References: See also: Bug #21664515, Bug #21651400.
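The bookkeeping behind this fix can be sketched as follows. This is a simplified stand-in, not the actual ClusterMgr class: the class name LiveNodeCounter, the per-node flag vector, and the liveNodes() accessor are hypothetical, and only the method names reportConnected() and reportDisconnected() come from the entry above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the live-node bookkeeping described above.
class LiveNodeCounter {
public:
    explicit LiveNodeCounter(std::size_t maxNodes)
        : connected_(maxNodes, false) {}

    // Called once a node has finished connecting.
    void reportConnected(std::size_t nodeId) {
        if (connected_.at(nodeId)) {
            return;  // already counted
        }
        connected_.at(nodeId) = true;
        ++liveNodes_;
    }

    // Called when a node disconnects, possibly before it ever finished
    // connecting (for example, a very brief connection attempt).
    void reportDisconnected(std::size_t nodeId) {
        if (!connected_.at(nodeId)) {
            // The node never completed its connection, so there is no
            // matching reportConnected() to undo; leave the count alone.
            return;
        }
        assert(liveNodes_ > 0);  // the check that fired in debug builds
        connected_.at(nodeId) = false;
        --liveNodes_;
    }

    std::size_t liveNodes() const { return liveNodes_; }

private:
    std::vector<bool> connected_;  // true once a node has fully connected
    std::size_t liveNodes_ = 0;    // count of fully connected nodes
};
```

In this sketch, a reportDisconnected() call for a node that never completed its connection returns early, so the count of nodes that are still connected is left unchanged.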
To reduce the possibility that a node's loopback transporter becomes disconnected from the transporter registry by reportError() due to send buffer exhaustion (implemented by the fix for Bug #21651400), a portion of the send buffer is now reserved for the use of this transporter. (Bug #21664515, Bug #22016081)
References: See also: Bug #21651400, Bug #21683144.
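The idea of reserving part of the send buffer can be sketched with a simple byte-counting pool. This is only an illustration under stated assumptions: the class SendBufferPool, its allocate() and release() methods, and the byte-level accounting are hypothetical and do not reflect how NDB sizes or manages its send buffers.

```cpp
#include <cstddef>

// Hypothetical send buffer pool in which the last `reservedForLoopback`
// bytes can be used only by the loopback transporter.
class SendBufferPool {
public:
    SendBufferPool(std::size_t totalBytes, std::size_t reservedForLoopback)
        : total_(totalBytes),
          reserved_(reservedForLoopback < totalBytes ? reservedForLoopback
                                                     : totalBytes) {}

    // Tries to allocate `bytes`; returns false on exhaustion.
    bool allocate(std::size_t bytes, bool isLoopback) {
        // Ordinary transporters may not use the reserved portion.
        const std::size_t limit = isLoopback ? total_ : total_ - reserved_;
        if (used_ > limit || bytes > limit - used_) {
            return false;
        }
        used_ += bytes;
        return true;
    }

    // Returns `bytes` to the pool (clamped so the counter cannot underflow).
    void release(std::size_t bytes) {
        used_ -= (bytes < used_ ? bytes : used_);
    }

    std::size_t used() const { return used_; }

private:
    std::size_t total_;     // total send buffer size in bytes
    std::size_t reserved_;  // portion usable only by the loopback transporter
    std::size_t used_ = 0;  // bytes currently allocated
};
```

In this sketch, once ordinary transporters have used everything outside the reserved portion, their allocations fail while loopback allocations can still succeed until the reserved bytes are used up.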
The loopback transporter is similar to the TCP transporter, but is used by a node to send signals to itself as part of many internal operations. Like the TCP transporter, it could be disconnected due to certain conditions including send buffer exhaustion, but this could result in blocking of TransporterFacade and so cause multiple issues within an ndb_mgmd or API node process. To prevent this, a node whose loopback transporter becomes disconnected is now simply shut down, rather than allowing the node process to hang. (Bug #21651400, Bug #22016081)
References: See also: Bug #21683144, Bug #21664515.