MySQL Cluster NDB 7.1.31 is a new release of MySQL Cluster, incorporating new features in the NDB storage engine and fixing recently discovered bugs in previous MySQL Cluster NDB 7.1 releases.
Obtaining MySQL Cluster NDB 7.1. The latest MySQL Cluster NDB 7.1 binaries for supported platforms can be obtained from http://dev.mysql.com/downloads/cluster/. Source code for the latest MySQL Cluster NDB 7.1 release can be obtained from the same location. You can also access the MySQL Cluster NDB 7.1 development source tree at https://code.launchpad.net/~mysql/mysql-server/mysql-cluster-7.1.
This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.1 through MySQL 5.1.73 (see Changes in MySQL 5.1.73 (2013-12-03)).
Handling of LongMessageBuffer shortages and statistics has been improved as follows:
The default value of LongMessageBuffer has been increased from 4 MB to 64 MB.
When this resource is exhausted, a suitable informative message is now printed in the data node log describing possible causes of the problem and suggesting possible solutions.
LongMessageBuffer usage information is now shown in the ndbinfo.memoryusage table. See the description of this table for an example and additional information.
Important Change: The server system variables
ndb_index_stat_freq, which had been deprecated in a previous MySQL Cluster release series, have now been removed. (Bug #11746486, Bug #26673)
When an ALTER TABLE statement changed table schemas without causing a change in the table's partitioning, the new table definition did not copy the hash map from the old definition, but used the current default hash map instead. However, the table data was not reorganized according to the new hash map, which made some rows inaccessible using a primary key lookup if the two hash maps had incompatible definitions.
To keep this situation from occurring, any ALTER TABLE that entails a hash map change now triggers a reorganization of the table. In addition, when copying a table definition in such cases, the hash map is now also copied. (Bug #18436558)
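The following is a minimal sketch of why incompatible hash maps make rows unreachable. It is a toy model, not NDB's implementation: the bucket arrays `old_map` and `new_map`, the helper names, and the use of small integers as key hashes are all assumptions made for illustration.

```python
def partition_for(key_hash, hash_map):
    """Map a key's hash to a partition through a bucket array (the hash map)."""
    return hash_map[key_hash % len(hash_map)]


# Hypothetical hash maps: the same two partitions, but different bucket layouts.
old_map = [0, 1, 0, 1]
new_map = [0, 0, 1, 1]

# Rows were distributed using the old map and never reorganized.
partitions = {0: set(), 1: set()}
for key_hash in range(8):
    partitions[partition_for(key_hash, old_map)].add(key_hash)


def lookup(key_hash, hash_map):
    """Primary key lookup: only the partition chosen by the hash map is checked."""
    return key_hash in partitions[partition_for(key_hash, hash_map)]


# Keys that can no longer be found once the table definition uses new_map.
lost = [k for k in range(8) if not lookup(k, new_map)]
print(lost)
```

In this model, every key is still found when the old map is used for lookups, which corresponds to copying the hash map along with the table definition. Reorganizing the rows according to the new map would be the other way to make lookups succeed again.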
When certain queries generated signals having more than 18 data words prior to a node failure, such signals were not written correctly in the trace file. (Bug #18419554)
After dropping an NDB table, neither the cluster log nor the output of the REPORT MemoryUsage command showed that the IndexMemory used by that table had been freed, even though the memory had in fact been deallocated. This issue was introduced in MySQL Cluster NDB 7.1.28. (Bug #18296810)
ndb_show_tables sometimes failed with the error message Unable to connect to management server and immediately terminated, without providing the underlying reason for the failure. To provide more useful information in such cases, this program now also prints the most recent error from the Ndb_cluster_connection object used to instantiate the connection. (Bug #18276327)
The block threads managed by the multi-threading scheduler communicate by placing signals in an out queue or job buffer which is set up between all block threads. This queue has a fixed maximum size, such that when it is filled up, the worker thread must wait for the consumer to drain the queue. In a highly loaded system, multiple threads could end up in a circular wait lock due to full out buffers, such that they were preventing each other from performing any useful work. This condition eventually led to the data node being declared dead and killed by the watchdog timer.
To fix this problem, we detect situations in which a circular wait lock is about to begin, and cause buffers which are otherwise held in reserve to become available for signal processing by queues which are highly loaded. (Bug #18229003)
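The idea of detecting an impending circular wait can be sketched as a check on a wait-for graph. This is only an illustration under simplifying assumptions: each blocked thread waits on exactly one other thread, and the function names, thread labels, and return values are hypothetical rather than taken from the NDB scheduler.

```python
def creates_cycle(wait_for, waiter, target):
    """Return True if `waiter` blocking on `target` would close a wait cycle.

    `wait_for` maps each blocked thread to the thread whose queue it waits on.
    """
    seen = set()
    current = target
    while current is not None and current not in seen:
        if current == waiter:
            return True
        seen.add(current)
        current = wait_for.get(current)
    return False


def try_enqueue(wait_for, waiter, target, queue_full):
    """Decide what a producer thread does when sending a signal to `target`."""
    if not queue_full:
        return "enqueue"
    if creates_cycle(wait_for, waiter, target):
        # Waiting would deadlock, so draw on buffers held in reserve instead.
        return "use_reserve"
    wait_for[waiter] = target
    return "wait"


wait_for = {}
steps = [
    try_enqueue(wait_for, "A", "B", queue_full=True),
    try_enqueue(wait_for, "B", "C", queue_full=True),
    try_enqueue(wait_for, "C", "A", queue_full=True),
]
print(steps)
```

Here thread C would complete the cycle A, B, C, so instead of blocking it is given reserved buffer space and can keep making progress.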
The ndb_mgm client START BACKUP command (see Commands in the MySQL Cluster Management Client) could experience occasional random failures when a ping was received prior to an expected BackupCompleted event. Now the connection established by this command is not checked until it has been properly set up. (Bug #18165088)
When performing a copying ALTER TABLE operation, mysqld creates a new copy of the table to be altered. This intermediate table, which is given a name bearing the prefix #sql-, has an updated schema but contains no data. mysqld then copies the data from the original table to this intermediate table, drops the original table, and finally renames the intermediate table with the name of the original table.
mysqld regards such a table as a temporary table and does not include it in the output from SHOW TABLES; mysqldump also ignores an intermediate table. However, NDB sees no difference between such an intermediate table and any other table. This difference in how intermediate tables are viewed by mysqld (and MySQL client programs) and by the NDB storage engine can give rise to problems when performing a backup and restore if an intermediate table existed in NDB, possibly left over from a failed ALTER TABLE that used copying. If a schema backup is performed using mysqldump and the mysql client, this table is not included. However, in the case where a data backup was done using the ndb_mgm client's BACKUP command, the intermediate table was included, and was also included by ndb_restore, which then failed due to attempting to load data into a table which was not defined in the backed up schema.
To prevent such failures from occurring, ndb_restore now by default ignores intermediate tables created during ALTER TABLE operations (that is, tables whose names begin with the prefix #sql-). A new option --exclude-intermediate-sql-tables is added that makes it possible to override the new behavior. The option's default value is TRUE; to cause ndb_restore to revert to the old behavior and to attempt to restore intermediate tables, set this option to FALSE. (Bug #17882305)
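The selection rule described above can be modeled in a few lines. This is a sketch of the behavior only, not ndb_restore's code: the function name, its parameter, and the sample table names are hypothetical, and only the #sql- prefix and the TRUE default come from the text.

```python
INTERMEDIATE_PREFIX = "#sql-"


def tables_to_restore(backup_tables, exclude_intermediate_sql_tables=True):
    """Return the tables a restore would attempt, given the option's value."""
    if not exclude_intermediate_sql_tables:
        # Old behavior: every table found in the backup is restored.
        return list(backup_tables)
    # New default: skip leftovers from copying ALTER TABLE operations.
    return [t for t in backup_tables if not t.startswith(INTERMEDIATE_PREFIX)]


# Hypothetical table names found in a data backup.
backup = ["t1", "#sql-1a2b_3", "orders"]
print(tables_to_restore(backup))
print(tables_to_restore(backup, exclude_intermediate_sql_tables=False))
```

With the default setting, the leftover intermediate table is skipped, so the restore no longer tries to load data into a table that is absent from the schema backup.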
Data nodes running ndbmtd could stall while performing an online upgrade of a MySQL Cluster containing a great many tables from a version prior to NDB 7.1.20 to version 7.1.20 or later. (Bug #16693068)
Cluster Replication: A slave in MySQL Cluster Replication now monitors the progression of epoch numbers received from its immediate upstream master, which can both serve as a useful check on the low-level functioning of replication, and provide a warning in the event replication is restarted accidentally at an already-applied position.
As a result of this enhancement, an epoch ID collision has the following results, depending on the state of the slave SQL thread:
Following a RESET SLAVE statement, no action is taken, in order to allow the execution of this statement without spurious warnings.
Following START SLAVE, a warning is produced that the slave is being positioned at an epoch that has already been applied.
In all other cases, the slave SQL thread is stopped against the possibility that a system malfunction has resulted in the re-application of an existing epoch.
References: See also Bug #17369118.
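The three outcomes listed above can be summarized as a small decision function. This is a hedged sketch: the class, the string labels for slave state and actions, and the assumption that a collision means receiving an epoch no greater than the highest one already applied are illustrative and not taken from the server source.

```python
def on_epoch_collision(last_statement):
    """Choose the reaction to an epoch that has already been applied."""
    if last_statement == "RESET SLAVE":
        return "ignore"           # no action, to avoid spurious warnings
    if last_statement == "START SLAVE":
        return "warn"             # positioned at an already-applied epoch
    return "stop_sql_thread"      # possible malfunction; stop to be safe


class EpochMonitor:
    """Tracks the highest epoch applied from the immediate upstream master."""

    def __init__(self):
        self.max_applied = None

    def receive(self, epoch, last_statement=None):
        if self.max_applied is not None and epoch <= self.max_applied:
            return on_epoch_collision(last_statement)
        self.max_applied = epoch
        return "apply"


monitor = EpochMonitor()
results = [
    monitor.receive(10),
    monitor.receive(11),
    monitor.receive(11, last_statement="START SLAVE"),
    monitor.receive(10, last_statement="RESET SLAVE"),
    monitor.receive(11),
]
print(results)
```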
Cluster API: When an NDB API client application received a signal with an invalid block or signal number, NDB provided only a very brief error message that did not accurately convey the nature of the problem. Now in such cases, appropriate printouts are provided when a bad signal or message is detected. In addition, the message length is now checked to make certain that it matches the size of the embedded signal. (Bug #18426180)
Cluster API: Refactoring that was performed in MySQL Cluster NDB 7.1.30 inadvertently introduced a dependency in Ndb.hpp on a file that is not included in the distribution, which caused NDB API applications to fail to compile. The dependency has been removed. (Bug #18293112, Bug #71803)
References: This bug was introduced by Bug #17647637.
Cluster API: An NDB API application sends a scan query to a data node; the scan is processed by the transaction coordinator (TC). The TC forwards an LQHKEYREQ request to the appropriate LDM, and aborts the transaction if it does not receive an LQHKEYCONF response within the specified time limit. After the transaction is successfully aborted, the TC sends a TCROLLBACKREP to the NDB API client, and the NDB API client processes this message by cleaning up any Ndb objects associated with the transaction.
The client receives the data which it has requested in the form of TRANSID_AI signals, which are buffered for sending at the data node and may be delivered after a delay. On receiving such a signal, NDB checks the transaction state and ID: if these are as expected, it processes the signal using the Ndb objects associated with that transaction.
The current bug occurs when all the following conditions are fulfilled:
The transaction coordinator aborts a transaction due to delays and sends a TCROLLBACKREP signal to the client, while at the same time a TRANSID_AI which has been buffered for delivery at an LDM is delivered to the same client.
The NDB API client considers the transaction complete on receipt of a TCROLLBACKREP signal, and immediately closes the transaction.
The client has a separate receiver thread running concurrently with the thread that is engaged in closing the transaction.
The arrival of the late TRANSID_AI interleaves with the closing of the user thread's transaction such that TRANSID_AI processing passes normal checks before closeTransaction() resets the transaction state and invalidates the receiver.
When these conditions are all met, the receiver thread proceeds to continue working on the TRANSID_AI signal using the invalidated receiver. Since the receiver is already invalidated, its usage results in a node failure.
Now, the Ndb object cleanup done for TCROLLBACKREP includes invalidation of the transaction ID, so that, for a given transaction, any signal which is received after the TCROLLBACKREP arrives does not pass the transaction ID check and is silently dropped. This fix is also implemented for TCKEY_FAILREF signals.
See also Operations and Signals, for additional information about NDB messaging. (Bug #18196562)
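The effect of invalidating the transaction ID can be sketched with a single-threaded toy model. It does not reproduce the concurrency of the real receiver thread or any NDB API class: the `Transaction` class, its method names, and the sample values are hypothetical, and only the signal names come from the text.

```python
class Transaction:
    """Toy stand-in for the client-side state of one NDB API transaction."""

    def __init__(self, trans_id):
        self.trans_id = trans_id
        self.rows = []

    def on_tcrollbackrep(self):
        # Cleanup now also invalidates the ID, so later signals fail the check.
        self.trans_id = None

    def on_transid_ai(self, signal_trans_id, row):
        """Accept a data signal only if it matches a still-valid transaction ID."""
        if self.trans_id is None or signal_trans_id != self.trans_id:
            return False  # late or foreign signal: silently dropped
        self.rows.append(row)
        return True


txn = Transaction(trans_id=42)
accepted_before = txn.on_transid_ai(42, "row-1")
txn.on_tcrollbackrep()
accepted_after = txn.on_transid_ai(42, "row-2")
print(accepted_before, accepted_after, txn.rows)
```

Because the late signal is rejected at the ID check, it never reaches the receiver that closing the transaction has invalidated.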
Cluster API: ndb_restore could sometimes report Error 701 System busy with other schema operation unnecessarily when restoring in parallel. (Bug #17916243)