MySQL NDB Cluster 7.4 Release Notes

Changes in MySQL NDB Cluster 7.4.2 (5.6.21-ndb-7.4.2) (2014-11-05, Development Milestone)

MySQL NDB Cluster 7.4.2 is a new release of NDB Cluster, based on MySQL Server 5.6 and including features under development for version 7.4 of the NDB storage engine, as well as fixing a number of recently discovered bugs in previous NDB Cluster releases.

Obtaining MySQL NDB Cluster 7.4.  MySQL NDB Cluster 7.4 source code and binaries can be obtained from https://dev.mysql.com/downloads/cluster/.

For an overview of changes made in MySQL NDB Cluster 7.4, see What is New in NDB Cluster 7.4.

This release also incorporates all bug fixes and changes made in previous NDB Cluster releases, as well as all bug fixes and feature changes which were added in mainline MySQL 5.6 through MySQL 5.6.21 (see Changes in MySQL 5.6.21 (2014-09-23, General Availability)).

Functionality Added or Changed

  • Added the restart_info table to the ndbinfo information database to provide current status and timing information relating to node and system restarts. By querying this table, you can observe the progress of restarts in real time. (Bug #19795152)
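
    For example, a query such as the following, issued in the mysql client while a node restart is under way, returns the restart status and timing information currently reported for the data nodes (a minimal illustration; no columns beyond those provided by the table itself are assumed here):

        mysql> SELECT * FROM ndbinfo.restart_info\G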

  • After adding new data nodes to the configuration file of a MySQL NDB Cluster having many API nodes, but prior to starting any of the data node processes, API nodes tried to connect to these missing data nodes several times per second, placing extra load on management nodes and the network. To reduce unnecessary traffic caused in this way, it is now possible to control the amount of time that an API node waits between attempts to connect to data nodes which fail to respond; this is implemented in two new API node configuration parameters, StartConnectBackoffMaxTime and ConnectBackoffMaxTime.

    Time elapsed during node connection attempts is not taken into account when applying these parameters, both of which are given in milliseconds with approximately 100 ms resolution. As long as the API node is not connected to any data nodes as described previously, the value of the StartConnectBackoffMaxTime parameter is applied; otherwise, ConnectBackoffMaxTime is used.

    In a MySQL NDB Cluster with many unstarted data nodes, the values of these parameters can be raised to reduce the frequency of connection attempts to data nodes which have not yet begun to function in the cluster, as well as to moderate high traffic to management nodes.

    For more information about the behavior of these parameters, see Defining SQL and Other API Nodes in an NDB Cluster. (Bug #17257842)
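
    Like other API node parameters, these are set in the [api] or [mysqld] sections of the cluster global configuration file (config.ini), as in the following minimal sketch; the values shown are illustrative only and are not defaults or recommendations:

        [api]
        # Maximum backoff (milliseconds) between connection attempts while this API node is not yet connected to any data nodes
        StartConnectBackoffMaxTime=3000
        # Maximum backoff (milliseconds) between connection attempts once connected to at least one data node
        ConnectBackoffMaxTime=1500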

Bugs Fixed

  • NDB Replication: The fix for Bug #18770469 in the MySQL Server made changes in the transactional behavior of the temporary conversion tables used when replicating between tables with different schemas. These changes as implemented are not compatible with NDB, and thus the fix for this bug has been reverted in MySQL NDB Cluster. (Bug #19692387)

    References: See also: Bug #19704825. Reverted patches: Bug #18770469.

  • When performing a batched update in which one or more successful write operations from the start of the batch were followed by write operations which failed without being aborted (due to the AbortOption being set to AO_IgnoreError), the transaction coordinator's handling of these failed operations leaked CommitAckMarker resources. (Bug #19875710)

    References: This issue is a regression of: Bug #19451060, Bug #73339.

  • Online downgrades to MySQL NDB Cluster 7.3 failed when a MySQL NDB Cluster 7.4 master attempted to request a local checkpoint with 32 fragments from a data node already running NDB 7.3, which supports only 2 fragments for LCPs. Now in such cases, the NDB 7.4 master determines how many fragments the data node can handle before making the request. (Bug #19600834)

  • The fix for a previous issue with the handling of multiple node failures required determining the number of TC instances the failed node was running, then taking them over. The mechanism used to determine this number sometimes provided an invalid result, which caused the number of TC instances in the failed node to be set to an excessively high value. This in turn caused redundant takeover attempts, which wasted time and had a negative impact on the processing of other node failures and of global checkpoints. (Bug #19193927)

    References: This issue is a regression of: Bug #18069334.

  • The server side of an NDB transporter disconnected an incoming client connection very quickly during the handshake phase if the node at the server end was not yet ready to receive connections from the other node. This led to problems when the client immediately attempted once again to connect to the server socket, only to be disconnected again, and so on in a repeating loop, until it succeeded. Since each client connection attempt left behind a socket in TIME_WAIT, the number of sockets in TIME_WAIT increased rapidly, leading in turn to problems with the node on the server side of the transporter.

    Further analysis of the problem and code showed that the root of the problem lay in the handshake portion of the transporter connection protocol. To keep the issue described previously from occurring, the node at the server end now sends back a WAIT message instead of disconnecting the socket when the node is not yet ready to accept a handshake. This means that the client end should no longer need to create a new socket for the next retry, but can instead begin immediately with a new handshake hello message. (Bug #17257842)

  • Corrupted messages to data nodes sometimes went undetected, causing a bad signal to be delivered to a block which aborted the data node. This failure in combination with disconnecting nodes could in turn cause the entire cluster to shut down.

    To keep this from happening, additional checks are now made when unpacking signals received over TCP, including checks for byte order, compression flag (which must not be used), and the length of the next message in the receive buffer (if there is one).

    Whenever two consecutive unpacked messages fail the checks just described, the current message is assumed to be corrupted. In this case, the transporter is marked as having bad data and no more unpacking of messages occurs until the transporter is reconnected. In addition, an entry is written to the cluster log containing the error as well as a hex dump of the corrupted message. (Bug #73843, Bug #19582925)

  • During restore operations, an attribute's maximum length was used when reading variable-length attributes from the receive buffer instead of the attribute's actual length. (Bug #73312, Bug #19236945)