Documentation Home
MySQL NDB Cluster 7.4 Release Notes
Download these Release Notes
PDF (US Ltr) - 0.8Mb
PDF (A4) - 0.8Mb


MySQL NDB Cluster 7.4 Release Notes  /  Release Series Changelogs: MySQL NDB Cluster 7.4  /  Changes in MySQL NDB Cluster 7.4.13 (5.6.34-ndb-7.4.13) (2016-10-18, General Availability)

Changes in MySQL NDB Cluster 7.4.13 (5.6.34-ndb-7.4.13) (2016-10-18, General Availability)

Bugs Fixed

  • NDB Cluster APIs: Reuse of transaction IDs could occur when Ndb objects were created and deleted concurrently. As part of this fix, the NDB API methods lock_ndb_objects() and unlock_ndb_objects are now declared as const. (Bug #23709232)

  • NDB Cluster APIs: When the management server was restarted while running an MGM API application that continuously monitored events, subsequent events were not reported to the application, with timeouts being returned indefinitely instead of an error.

    This occurred because sockets for event listeners were not closed when restarting mgmd. This is fixed by ensuring that event listener sockets are closed when the management server shuts down, causing applications using functions such as ndb_logevent_get_next() to receive a read error following the restart. (Bug #19474782)

  • Passing a nonexistent node ID to CREATE NODEGROUP led to random data node failures. (Bug #23748958)

  • DROP TABLE followed by a node shutdown and subesequent master takeover—and with the containing local checkpoint not yet complete prior to the takeover—caused the LCP to be ignored, and in some cases, the data node to fail. (Bug #23735996)

    References: See also: Bug #23288252.

  • Removed an invalid assertion to the effect that all cascading child scans are closed at the time API connection records are released following an abort of the main transaction. The assertion was invalid because closing of scans in such cases is by design asynchronous with respect to the main transaction, which means that subscans may well take some time to close after the main transaction is closed. (Bug #23709284)

  • A number of potential buffer overflow issues were found and fixed in the NDB codebase. (Bug #23152979)

  • A SIGNAL_DROPPED_REP handler invoked in response to long message buffer exhaustion was defined in the SPJ kernel block, but not actually used. This meant that the default handler from SimulatedBlock was used instead in such cases, which shut down the data node. (Bug #23048816)

    References: See also: Bug #23251145, Bug #23251423.

  • When a data node has insufficient redo buffer during a system restart, it does not participate in the restart until after the other nodes have started. After this, it performs a takeover of its fragments from the nodes in its node group that have already started; during this time, the cluster is already running and user activity is possible, including DML and DDL operations.

    During a system restart, table creation is handled differently in the DIH kernel block than normally, as this creation actually consists of reloading table definition data from disk on the master node. Thus, DIH assumed that any table creation that occurred before all nodes had restarted must be related to the restart and thus always on the master node. However, during the takeover, table creation can occur on non-master nodes due to user activity; when this happened, the cluster underwent a forced shutdown.

    Now an extra check is made during system restarts to detect in such cases whether the executing node is the master node, and use that information to determine whether the table creation is part of the restart proper, or is taking place during a subsequent takeover. (Bug #23028418)

  • ndb_restore set the MAX_ROWS attribute for a table for which it had not been set prior to taking the backup. (Bug #22904640)

  • Whenever data nodes are added to or dropped from the cluster, the NDB kernel's Event API is notified of this using a SUB_GCP_COMPLETE_REP signal with either the ADD (add) flag or SUB (drop) flag set, as well as the number of nodes to add or drop; this allows NDB to maintain a correct count of SUB_GCP_COMPLETE_REP signals pending for every incomplete bucket. In addition to handling the bucket for the epoch associated with the addition or removal, it must also compensate for any later incomplete buckets associated with later epochs. Although it was possible to complete such buckets out of order, there was no handling of these, leading a stall in to event reception.

    This fix adds detection and handling of such out of order bucket completion. (Bug #20402364)

    References: See also: Bug #82424, Bug #24399450.

  • When restoring a backup taken from a database containing tables that had foreign keys, ndb_restore disabled the foreign keys for data, but not for the logs. (Bug #83155, Bug #24736950)

  • The count displayed by the c_exec column in the ndbinfo.threadstat table was incomplete. (Bug #82635, Bug #24482218)

  • The internal function ndbcluster_binlog_wait(), which provides a way to make sure that all events originating from a given thread arrive in the binary log, is used by SHOW BINLOG EVENTS as well as when resetting the binary log. This function waits on an injector condition while the latest global epoch handled by NDB is more recent than the epoch last committed in this session, which implies that this condition must be signalled whenever the binary log thread completes and updates a new latest global epoch. Inspection of the code revealed that this condition signalling was missing, and that, instead of being awakened whenever a new latest global epoch completes (~100ms), client threads waited for the maximum timeout (1 second).

    This fix adds the missing injector condition signalling, while also changing it to a condition broadcast to make sure that all client threads are alerted. (Bug #82630, Bug #24481551)

  • During a node restart, a fragment can be restored using information obtained from local checkpoints (LCPs); up to 2 restorable LCPs are retained at any given time. When an LCP is reported to the DIH kernel block as completed, but the node fails before the last global checkpoint index written into this LCP has actually completed, the latest LCP is not restorable. Although it should be possible to use the older LCP, it was instead assumed that no LCP existed for the fragment, which slowed the restart process. Now in such cases, the older, restorable LCP is used, which should help decrease long node restart times. (Bug #81894, Bug #23602217)

  • While a mysqld was waiting to connect to the management server during initialization of the NDB handler, it was not possible to shut down the mysqld. If the mysqld was not able to make the connection, it could become stuck at this point. This was due to an internal wait condition in the utility and index statistics threads that could go unmet indefinitely. This condition has been augmented with a maximum timeout of 1 second, which makes it more likely that these threads terminate themselves properly in such cases.

    In addition, the connection thread waiting for the management server connection performed 2 sleeps in the case just described, instead of 1 sleep, as intended. (Bug #81585, Bug #23343673)

  • The list of deferred tree node lookup requests created when preparing to abort a DBSPJ request were not cleared when this was complete, which could lead to deferred operations being started even after the DBSPJ request aborted. (Bug #81355, Bug #23251423)

    References: See also: Bug #23048816.

  • Error and abort handling in Dbspj::execTRANSID_AI() was implemented such that its abort() method was called before processing of the incoming signal was complete. Since this method sends signals to the LDM, this partly overwrote the contents of the signal which was later required by execTRANSID_AI(). This could result in aborted DBSPJ requests cleaning up their allocated resources too early, or not at all. (Bug #81353, Bug #23251145)

    References: See also: Bug #23048816.

  • Several object constructors and similar functions in the NDB codebase did not always perform sanity checks when creating new instances. These checks are now performed under such circumstances. (Bug #77408, Bug #21286722)

  • An internal call to malloc() was not checked for NULL. The function call was replaced with a direct write. (Bug #77375, Bug #21271194)