MySQL Cluster NDB 7.1.26 is a new release of MySQL Cluster,
incorporating new features in the
NDB storage engine and fixing
recently discovered bugs in previous MySQL Cluster NDB 7.1
Obtaining MySQL Cluster NDB 7.1. The latest MySQL Cluster NDB 7.1 binaries for supported platforms can be obtained from http://dev.mysql.com/downloads/cluster/. Source code for the latest MySQL Cluster NDB 7.1 release can be obtained from the same location. You can also access the MySQL Cluster NDB 7.1 development source tree at https://code.launchpad.net/~mysql/mysql-server/mysql-cluster-7.1.
This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.1 through MySQL 5.1.67 (see Changes in MySQL 5.1.67 (2012-12-21)).
Added several new columns to the
transporterstable and counters for the
counterstable of the
ndbinfoinformation database. The information provided may help in troublehsooting of transport overloads and problems with send buffer memory allocation. For more information, see the descriptions of these tables. (Bug #15935206)
To provide information which can help in assessing the current state of arbitration in a MySQL Cluster as well as in diagnosing and correcting arbitration problems, 3 new tables—
arbitrator_validity_summary—have been added to the
ndbinfoinformation database. (Bug #13336549)
NDBtable grew to contain approximately one million rows or more per partition, it became possible to insert rows having duplicate primary or unique keys into it. In addition, primary key lookups began to fail, even when matching rows could be found in the table by other means.
This issue was introduced in MySQL Cluster NDB 7.0.36, MySQL Cluster NDB 7.1.26, and MySQL Cluster NDB 7.2.9. Signs that you may have been affected include the following:
Rows left over that should have been deleted
Rows unchanged that should have been updated
Rows with duplicate unique keys due to inserts or updates (which should have been rejected) that failed to find an existing row and thus (wrongly) inserted a new one
This issue does not affect simple scans, so you can see all rows in a given
SELECT * FROMand similar queries that do not depend on a primary or unique key.
Upgrading to or downgrading from an affected release can be troublesome if there are rows with duplicate primary or unique keys in the table; such rows should be merged, but the best means of doing so is application dependent.
In addition, since the key operations themselves are faulty, a merge can be difficult to achieve without taking the MySQL Cluster offline, and it may be necessary to dump, purge, process, and reload the data. Depending on the circumstances, you may want or need to process the dump with an external application, or merely to reload the dump while ignoring duplicates if the result is acceptable.
Another possibility is to copy the data into another table without the original table' unique key constraints or primary key (recall that
CREATE TABLE t2 SELECT * FROM t1does not by default copy
t1's primary or unique key definitions to
t2). Following this, you can remove the duplicates from the copy, then add back the unique constraints and primary key definitions. Once the copy is in the desired state, you can either drop the original table and rename the copy, or make a new dump (which can be loaded later) from the copy. (Bug #16023068, Bug #67928)
The management client command
ALL REPORT BackupStatusfailed with an error when used with data nodes having multiple LQH worker threads (ndbmtd data nodes). The issue did not effect the
form of this command. (Bug #15908907)
The multi-threaded job scheduler could be suspended prematurely when there were insufficient free job buffers to allow the threads to continue. The general rule in the job thread is that any queued messages should be sent before the thread is allowed to suspend itself, which guarantees that no other threads or API clients are kept waiting for operations which have already completed. However, the number of messages in the queue was specified incorrectly, leading to increased latency in delivering signals, sluggish response, or otherwise suboptimal performance. (Bug #15908684)
The setting for the
DefaultOperationRedoProblemActionAPI node configuration parameter was ignored, and the default value used instead. (Bug #15855588)
Node failure during the dropping of a table could lead to the node hanging when attempting to restart.
When this happened, the
NDBinternal dictionary (
DBDICT) lock taken by the drop table operation was held indefinitely, and the logical global schema lock taken by the SQL the drop table operation from which the drop operation originated was held until the
NDBinternal operation timed out. To aid in debugging such occurrences, a new dump code,
DUMP DictDumpLockQueue), which dumps the contents of the
DICTlock queue, has been added in the ndb_mgm client. (Bug #14787522)
Job buffers act as the internal queues for work requests (signals) between block threads in ndbmtd and could be exhausted if too many signals are sent to a block thread.
Performing pushed joins in the
DBSPJkernel block can execute multiple branches of the query tree in parallel, which means that the number of signals being sent can increase as more branches are executed. If
DBSPJexecution cannot be completed before the job buffers are filled, the data node can fail.
This problem could be identified by multiple instances of the message sleeploop 10!! in the cluster out log, possibly followed by job buffer full. If the job buffers overflowed more gradually, there could also be failures due to error 1205 (Lock wait timeout exceeded), shutdowns initiated by the watchdog timer, or other timeout related errors. These were due to the slowdown caused by the 'sleeploop'.
Normally up to a 1:4 fanout ratio between consumed and produced signals is permitted. However, since there can be a potentially unlimited number of rows returned from the scan (and multiple scans of this type executing in parallel), any ratio greater 1:1 in such cases makes it possible to overflow the job buffers.
The fix for this issue defers any lookup child which otherwise would have been executed in parallel with another is deferred, to resume when its parallel child completes one of its own requests. This restricts the fanout ratio for bushy scan-lookup joins to 1:1. (Bug #14709490)
References: See also: Bug #14648712.
During an online upgrade, certain SQL statements could cause the server to hang, resulting in the error Got error 4012 'Request ndbd time-out, maybe due to high load or communication problems' from NDBCLUSTER. (Bug #14702377)
The recently added LCP fragment scan watchdog occasionally reported problems with LCP fragment scans having very high table id, fragment id, and row count values.
This was due to the watchdog not accounting for the time spent draining the backup buffer used to buffer rows before writing to the fragment checkpoint file.
Now, in the final stage of an LCP fragment scan, the watchdog switches from monitoring rows scanned to monitoring the buffer size in bytes. The buffer size should decrease as data is written to the file, after which the file should be promptly closed. (Bug #14680057)
Under certain rare circumstances, MySQL Cluster data nodes could crash in conjunction with a configuration change on the data nodes from a single-threaded to a multi-threaded transaction coordinator (using the
ThreadConfigconfiguration parameter for ndbmtd). The problem occurred when a mysqld that had been started prior to the change was shut down following the rolling restart of the data nodes required to effect the configuration change. (Bug #14609774)