Changes in MySQL Cluster NDB 7.3.5 (5.6.17-ndb-7.3.5) (2014-04-07)

MySQL Cluster NDB 7.3.5 is a new release of MySQL Cluster, based on MySQL Server 5.6 and including features from version 7.3 of the NDB storage engine, as well as fixing a number of recently discovered bugs in previous MySQL Cluster releases.

Obtaining MySQL Cluster NDB 7.3. MySQL Cluster NDB 7.3 source code and binaries can be obtained from

For an overview of changes made in MySQL Cluster NDB 7.3, see MySQL Cluster Development in MySQL Cluster NDB 7.3.

This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.6 through MySQL 5.6.17 (see Changes in MySQL 5.6.17 (2014-03-27)).

Functionality Added or Changed

  • Handling of LongMessageBuffer shortages and statistics has been improved as follows:

    • The default value of LongMessageBuffer has been increased from 4 MB to 64 MB.

    • When this resource is exhausted, a suitable informative message is now printed in the data node log describing possible causes of the problem and suggesting possible solutions.

    • LongMessageBuffer usage information is now shown in the ndbinfo.memoryusage table. See the description of this table for an example and additional information.

Bugs Fixed

  • Important Change: The server system variables ndb_index_cache_entries and ndb_index_stat_freq, which had been deprecated in a previous MySQL Cluster release series, have now been removed. (Bug #11746486, Bug #26673)

  • When an ALTER TABLE statement changed table schemas without causing a change in the table's partitioning, the new table definition did not copy the hash map from the old definition, but used the current default hash map instead. However, the table data was not reorganized according to the new hash map, which made some rows inaccessible using a primary key lookup if the two hash maps had incompatible definitions.

    To keep this situation from occurring, any ALTER TABLE that entails a hash map change now triggers a reorganization of the table. In addition, when copying a table definition in such cases, the hash map is now also copied. (Bug #18436558)
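
    The effect of switching hash maps without moving the data can be illustrated with a toy model. This is only a sketch: the hash function (CRC32), the map sizes, the fragment counts, and all function names are assumptions made for illustration and do not reflect NDB's actual distribution code.

```python
import zlib

def fragment_for(key, hash_map):
    """Map a primary key to a fragment via a hash map (bucket -> fragment)."""
    bucket = zlib.crc32(str(key).encode()) % len(hash_map)
    return hash_map[bucket]

def store(keys, hash_map):
    """Place each row in the fragment chosen by hash_map."""
    fragments = {}
    for key in keys:
        fragments.setdefault(fragment_for(key, hash_map), set()).add(key)
    return fragments

def lookup(key, fragments, hash_map):
    """Primary key lookup: only the fragment named by hash_map is searched."""
    return key in fragments.get(fragment_for(key, hash_map), set())

old_map = [0, 1, 2, 3] * 2     # hypothetical 8-bucket map over 4 fragments
new_default_map = [0, 1] * 6   # hypothetical, incompatible 12-bucket default

keys = range(100)
fragments = store(keys, old_map)

# Old behaviour: the definition switches maps, but the rows stay in place,
# so some lookups search the wrong fragment.
lost = [k for k in keys if not lookup(k, fragments, new_default_map)]

# Fixed behaviour: the old map is copied, so lookups still match placement.
found_all = all(lookup(k, fragments, old_map) for k in keys)
```

    In this model, every row stored in fragment 2 or 3 becomes unreachable under the replacement map, while copying the old map keeps all rows reachable.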

  • When certain queries generated signals having more than 18 data words prior to a node failure, such signals were not written correctly in the trace file. (Bug #18419554)

  • Checking of timeouts is handled by the signal TIME_SIGNAL. Previously, this signal was generated by the QMGR NDB kernel block in the main thread, and sent to the QMGR, DBLQH, and DBTC blocks (see NDB Kernel Blocks) as needed to check (respectively) heartbeats, disk writes, and transaction timeouts. In ndbmtd (as opposed to ndbd), these blocks all execute in different threads. This meant that if, for example, QMGR was actively working and some other thread was put to sleep, the previously sleeping thread received a large number of TIME_SIGNAL messages simultaneously when it was woken up again, with the effect that effective times moved very quickly in DBLQH as well as in DBTC. In DBLQH, this had no noticeable adverse effects, but this was not the case in DBTC; the latter block could not work on transactions even though time was still advancing, leading to a situation in which many operations appeared to time out because the transaction coordinator (TC) thread was comparatively slow in answering requests.

    In addition, when the TC thread slept for longer than 1500 milliseconds, the data node crashed due to detecting that the timeout handling loop had not yet stopped. To rectify this problem, the generation of the TIME_SIGNAL has been moved into the local threads instead of QMGR; this provides for better control over how quickly TIME_SIGNAL messages are allowed to arrive. (Bug #18417623)
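
    The difference between the two approaches can be sketched as follows. The tick interval, the per-round limit, and the class and function names are all hypothetical; the release note states only that generation moved into the local threads, so the capping policy shown here is an assumption about how arrival rate might be controlled.

```python
TICK_MS = 10  # hypothetical interval between TIME_SIGNAL messages

def central_ticks(sleep_ms):
    """Old model: another thread queues one tick per interval while the
    receiver sleeps, so all of them are waiting when it wakes up."""
    return sleep_ms // TICK_MS

class LocalTicker:
    """New model (sketch): the thread generates its own ticks and limits
    how many it handles between chunks of real work."""

    def __init__(self, max_ticks_per_round=2):
        self.max_ticks_per_round = max_ticks_per_round  # assumed limit
        self.owed_ms = 0

    def wake(self, slept_ms):
        """Record time that passed while the thread was not running."""
        self.owed_ms += slept_ms

    def next_round(self):
        """Return how many ticks to deliver before doing more work."""
        due = min(self.owed_ms // TICK_MS, self.max_ticks_per_round)
        self.owed_ms -= due * TICK_MS
        return due

burst = central_ticks(2000)   # ticks delivered back to back on wake-up

ticker = LocalTicker()
ticker.wake(2000)
rounds = []
while ticker.owed_ms >= TICK_MS:
    rounds.append(ticker.next_round())
```

    With central generation the thread faces the whole backlog at once; with local generation the same backlog is spread over many rounds, leaving room for transaction work in between.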

  • The ServerPort and TcpBind_INADDR_ANY configuration parameters were not included in the output of ndb_mgmd --print-full-config. (Bug #18366909)

  • After dropping an NDB table, neither the cluster log nor the output of the REPORT MemoryUsage command showed that the IndexMemory used by that table had been freed, even though the memory had in fact been deallocated. This issue was introduced in MySQL Cluster NDB 7.3.2. (Bug #18296810)

  • ndb_show_tables sometimes failed with the error message Unable to connect to management server and immediately terminated, without providing the underlying reason for the failure. To provide more useful information in such cases, this program now also prints the most recent error from the Ndb_cluster_connection object used to instantiate the connection. (Bug #18276327)

  • -DWITH_NDBMTD=0 did not function correctly, which could cause the build to fail on platforms such as ARM and Raspberry Pi which do not define the memory barrier functions required to compile ndbmtd. (Bug #18267919)

    References: See also Bug #16620938.

  • The block threads managed by the multi-threading scheduler communicate by placing signals in an out queue or job buffer which is set up between all block threads. This queue has a fixed maximum size, such that when it is filled up, the worker thread must wait for the consumer to drain the queue. In a highly loaded system, multiple threads could end up in a circular wait lock due to full out buffers, such that they were preventing each other from performing any useful work. This condition eventually led to the data node being declared dead and killed by the watchdog timer.

    To fix this problem, we detect situations in which a circular wait lock is about to begin, and cause buffers which are otherwise held in reserve to become available for signal processing by queues which are highly loaded. (Bug #18229003)
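
    A minimal sketch of the idea, assuming a simple wait-for graph in which each blocked thread waits on exactly one other thread. The thread names, the load figures, and both functions are hypothetical and are not taken from the scheduler's source.

```python
def find_cycle(waits_for):
    """Return the threads forming a wait cycle, or [] if there is none.

    waits_for maps a blocked thread to the thread whose full queue it is
    waiting on (at most one entry per thread)."""
    for start in waits_for:
        path, node = [], start
        while node in waits_for and node not in path:
            path.append(node)
            node = waits_for[node]
        if node in path:
            return path[path.index(node):]
    return []

def break_cycle(waits_for, queue_load, reserve):
    """Grant reserved buffers to the most heavily loaded queue in a cycle.

    Returns (thread, buffers_granted), or None when no cycle exists or
    the reserve is empty."""
    cycle = find_cycle(waits_for)
    if not cycle or reserve <= 0:
        return None
    busiest = max(cycle, key=lambda t: queue_load[t])
    del waits_for[busiest]   # that thread can make progress again
    return busiest, reserve

# Three threads blocked on each other, plus one bystander.
waits_for = {"ldm1": "tc", "tc": "ldm2", "ldm2": "ldm1", "recv": "tc"}
queue_load = {"ldm1": 90, "tc": 100, "ldm2": 95, "recv": 10}
granted = break_cycle(waits_for, queue_load, reserve=4)
```

    Once one member of the cycle is unblocked, the remaining threads drain in turn, which is why releasing reserved buffers to a single queue is enough in this model.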

  • An issue found when compiling the MySQL Cluster software for Solaris platforms could lead to problems when using ThreadConfig on such systems. (Bug #18181656)

  • The ndb_mgm client START BACKUP command (see Commands in the MySQL Cluster Management Client) could experience occasional random failures when a ping was received prior to an expected BackupCompleted event. Now the connection established by this command is not checked until it has been properly set up. (Bug #18165088)

  • When creating a table with a foreign key referencing an index in another table, it sometimes appeared possible to create the foreign key even if the order of the columns in the indexes did not match, because an appropriate error was not always returned internally. This fix improves the error used internally to work in most cases; however, it is still possible for this situation to occur in the event that the parent index is a unique index. (Bug #18094360)

  • Updating parent tables of foreign keys used excessive scan resources and so required unusually high settings for MaxNoOfLocalScans and MaxNoOfConcurrentScans. (Bug #18082045)

  • Dropping a nonexistent foreign key on an NDB table (using, for example, ALTER TABLE) appeared to succeed. Now in such cases, the statement fails with a relevant error message, as expected. (Bug #17232212)

  • Data nodes running ndbmtd could stall while performing an online upgrade of a MySQL Cluster containing a great many tables from a version prior to NDB 7.2.5 to version 7.2.5 or later. (Bug #16693068)

  • Replication: Log rotation events could cause group_relay_log_pos to be moved forward incorrectly within a group. This meant that, when the transaction was retried, or if the SQL thread was stopped in the middle of a transaction following one or more log rotations (such that the transaction or group spanned multiple relay log files), part or all of the group was silently skipped.

    This issue has been addressed by correcting the logic that is meant to avoid touching the coordinates of the SQL thread when the log position is updated as part of a relay log rotation. Previously, when a multi-threaded slave was not in use, this logic could update the SQL thread's coordinates even in the middle of a group. (Bug #18482854)
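
    The corrected rule can be sketched as follows. The class, its methods, the file names, and the positions are hypothetical stand-ins for the server's internal relay log state; only the name group_relay_log_pos comes from the description above.

```python
class SqlThreadCoordinates:
    """Hypothetical stand-in for the slave SQL thread's relay log state."""

    def __init__(self, log_name, pos):
        self.group_relay_log_name = log_name
        self.group_relay_log_pos = pos   # start of the current group
        self.in_group = False

    def begin_group(self):
        self.in_group = True

    def end_group(self, log_name, pos):
        # A finished group moves the group position past its last event.
        self.group_relay_log_name = log_name
        self.group_relay_log_pos = pos
        self.in_group = False

    def rotate(self, new_log_name, first_pos):
        # Corrected behaviour: leave the group position alone mid-group,
        # so that a retry restarts from the beginning of the group.
        if not self.in_group:
            self.group_relay_log_name = new_log_name
            self.group_relay_log_pos = first_pos

coords = SqlThreadCoordinates("relay-bin.000001", 120)
coords.begin_group()
coords.rotate("relay-bin.000002", 4)   # rotation inside a group
restart_point = (coords.group_relay_log_name, coords.group_relay_log_pos)
```

    Because the restart point still names the file and position where the group began, a retried transaction replays the whole group instead of skipping the part that preceded the rotation.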

  • Cluster Replication: A slave in MySQL Cluster Replication now monitors the progression of epoch numbers received from its immediate upstream master, which can both serve as a useful check on the low-level functioning of replication, and provide a warning in the event replication is restarted accidentally at an already-applied position.

    As a result of this enhancement, an epoch ID collision has the following results, depending on the state of the slave SQL thread:

    • Following a RESET SLAVE statement, no action is taken, in order to allow the execution of this statement without spurious warnings.

    • Following START SLAVE, a warning is produced that the slave is being positioned at an epoch that has already been applied.

    • In all other cases, the slave SQL thread is stopped against the possibility that a system malfunction has resulted in the re-application of an existing epoch.

    (Bug #17461576)

    References: See also Bug #17369118.
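
    The three outcomes listed above can be summarized as a small decision function. This is a sketch only: the enum and function names are hypothetical, and treating any epoch not greater than the highest applied epoch as a collision is an assumption.

```python
from enum import Enum

class SlaveState(Enum):
    AFTER_RESET_SLAVE = 1
    AFTER_START_SLAVE = 2
    RUNNING = 3

class Action(Enum):
    NONE = "no action"
    WARN = "warning"
    STOP_SQL_THREAD = "stop SQL thread"

def check_epoch(received_epoch, max_applied_epoch, state):
    """Decide what to do with an epoch received from the upstream master."""
    if received_epoch > max_applied_epoch:
        return Action.NONE   # normal forward progress, no collision
    if state is SlaveState.AFTER_RESET_SLAVE:
        return Action.NONE   # avoid spurious warnings after RESET SLAVE
    if state is SlaveState.AFTER_START_SLAVE:
        return Action.WARN   # positioned at an already-applied epoch
    return Action.STOP_SQL_THREAD   # possible re-application of an epoch
```

    For example, check_epoch(10, 12, SlaveState.RUNNING) selects the stop action, whereas the same epochs immediately after START SLAVE produce only a warning.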

  • Cluster API: When an NDB API client application received a signal with an invalid block or signal number, NDB provided only a very brief error message that did not accurately convey the nature of the problem. Now in such cases, appropriate printouts are provided when a bad signal or message is detected. In addition, the message length is now checked to make certain that it matches the size of the embedded signal. (Bug #18426180)

  • Cluster API: Refactoring that was performed in MySQL Cluster NDB 7.3.4 inadvertently introduced a dependency in Ndb.hpp on a file that is not included in the distribution, which caused NDB API applications to fail to compile. The dependency has been removed. (Bug #18293112, Bug #71803)

    References: This bug was introduced by Bug #17647637.

  • Cluster API: An NDB API application sends a scan query to a data node; the scan is processed by the transaction coordinator (TC). The TC forwards a LQHKEYREQ request to the appropriate LDM, and aborts the transaction if it does not receive a LQHKEYCONF response within the specified time limit. After the transaction is successfully aborted, the TC sends a TCROLLBACKREP to the NDB API client, and the NDB API client processes this message by cleaning up any Ndb objects associated with the transaction.

    The client receives the data which it has requested in the form of TRANSID_AI signals, which are buffered for sending at the data node and may be delivered after a delay. On receiving such a signal, NDB checks the transaction state and ID: if these are as expected, it processes the signal using the Ndb objects associated with that transaction.

    The current bug occurs when all the following conditions are fulfilled:

    • The transaction coordinator aborts a transaction due to delays and sends a TCROLLBACKREP signal to the client, while at the same time a TRANSID_AI which has been buffered for delivery at an LDM is delivered to the same client.

    • The NDB API client considers the transaction complete on receipt of a TCROLLBACKREP signal, and immediately closes the transaction.

    • The client has a separate receiver thread running concurrently with the thread that is engaged in closing the transaction.

    • The arrival of the late TRANSID_AI interleaves with the closing of the user thread's transaction such that TRANSID_AI processing passes normal checks before closeTransaction() resets the transaction state and invalidates the receiver.

    When these conditions are all met, the receiver thread continues working on the TRANSID_AI signal using the invalidated receiver. Since the receiver is already invalidated, its usage results in a node failure.

    Now the Ndb object cleanup done for TCROLLBACKREP includes invalidation of the transaction ID, so that, for a given transaction, any signal which is received after the TCROLLBACKREP arrives does not pass the transaction ID check and is silently dropped. This fix is also implemented for the TC_COMMITREF, TCROLLBACKREF, TCKEY_FAILCONF, and TCKEY_FAILREF signals.

    See also Operations and Signals, for additional information about NDB messaging. (Bug #18196562)
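
    The guard added by this fix can be sketched in a single-threaded model. The class and method names are hypothetical and do not correspond to NDB API classes; the sketch shows only the order of checks, not the actual threading.

```python
class Transaction:
    """Hypothetical client-side transaction with a result receiver."""

    def __init__(self, trans_id):
        self.trans_id = trans_id
        self.receiver_valid = True
        self.rows = []

    def on_tcrollbackrep(self):
        # The fix: cleanup for TCROLLBACKREP also invalidates the ID.
        self.trans_id = None

    def close(self):
        # closeTransaction() in the description invalidates the receiver.
        self.receiver_valid = False

    def on_transid_ai(self, signal_trans_id, row):
        """Handle a TRANSID_AI; return True if the row was accepted."""
        if self.trans_id is None or signal_trans_id != self.trans_id:
            return False   # late signal: silently dropped
        if not self.receiver_valid:
            raise RuntimeError("TRANSID_AI reached an invalidated receiver")
        self.rows.append(row)
        return True

txn = Transaction(42)
accepted_early = txn.on_transid_ai(42, "row-1")
txn.on_tcrollbackrep()   # abort reported by the TC
txn.close()              # user thread closes the transaction
accepted_late = txn.on_transid_ai(42, "row-2")
```

    Because the ID check fails first, the late signal never reaches the invalidated receiver in this model, regardless of how its arrival interleaves with close().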

  • Cluster API: The example ndbapi-examples/ndbapi_blob_ndbrecord/main.cpp included an internal header file (ndb_global.h) not found in the MySQL Cluster binary distribution. The example now uses stdlib.h and string.h instead of this file. (Bug #18096866, Bug #71409)

  • Cluster API: When Dictionary::dropTable() attempted (as a normal part of its internal operations) to drop an index used by a foreign key constraint, the drop failed. Now in such cases, invoking dropTable() causes all foreign keys on the table to be dropped, whether this table acts as a parent table, child table, or both.

    This issue did not affect dropping of indexes using SQL statements. (Bug #18069680)

    References: See also Bug #17591531.

  • Cluster API: ndb_restore could sometimes report Error 701 System busy with other schema operation unnecessarily when restoring in parallel. (Bug #17916243)
