MySQL Cluster NDB 7.2.6 is a new release of MySQL Cluster, incorporating new features in the NDB storage engine, and fixing recently discovered bugs in previous MySQL Cluster NDB 7.2 releases.
Obtaining MySQL Cluster NDB 7.2. MySQL Cluster NDB 7.2 source code and binaries can be obtained from http://dev.mysql.com/downloads/cluster/.
This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.5 through MySQL 5.5.22 (see Changes in MySQL 5.5.22 (2012-03-21)).
Important Change: The ALTER ONLINE TABLE ... REORGANIZE PARTITION statement can be used to create new table partitions after new empty nodes have been added to a MySQL Cluster. Usually, the number of partitions to create is determined automatically, such that, if no new partitions are required, then none are created. This behavior can be overridden by creating the original table using the MAX_ROWS option, which indicates that extra partitions should be created to store a large number of rows. However, in this case ALTER ONLINE TABLE ... REORGANIZE PARTITION simply uses the MAX_ROWS value specified in the original CREATE TABLE statement to determine the number of partitions required; since this value remains constant, so does the number of partitions, and so no new ones are created. This means that the table is not rebalanced, and the new data nodes remain empty.
To solve this problem, support is added for ALTER ONLINE TABLE ... MAX_ROWS=newvalue, where newvalue is greater than the value used with MAX_ROWS in the original CREATE TABLE statement. This larger MAX_ROWS value implies that more partitions are required; these are allocated on the new data nodes, which restores the balanced distribution of the table data.
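The constraint described above (the new MAX_ROWS value must exceed the one given in the original CREATE TABLE statement) can be illustrated with a small helper that builds the statement text. This is only a sketch: the function name, its parameters, and the table name t1 are hypothetical, and the helper does not connect to a server or confirm that the table uses the NDB storage engine.

```python
import re


def build_max_rows_alter(table: str, original_max_rows: int, new_max_rows: int) -> str:
    """Return an ALTER ONLINE TABLE ... MAX_ROWS=newvalue statement.

    Hypothetical helper for illustration only. It enforces the rule from
    the release note: newvalue must be greater than the MAX_ROWS value
    used when the table was created, otherwise no new partitions result.
    """
    # Accept only simple identifiers so the name can be quoted safely.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsupported table name: {table!r}")
    if new_max_rows <= original_max_rows:
        raise ValueError(
            "new MAX_ROWS must be greater than the value used in CREATE TABLE"
        )
    return f"ALTER ONLINE TABLE `{table}` MAX_ROWS={new_max_rows}"


if __name__ == "__main__":
    # Hypothetical table created with MAX_ROWS=100000000.
    print(build_max_rows_alter("t1", 100_000_000, 200_000_000))
```

Rejecting an equal or smaller value up front mirrors the behavior described in the note: an unchanged MAX_ROWS implies an unchanged partition count, so the new data nodes would stay empty.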
An error handling routine in the local query handler (DBLQH) used the wrong code path, which could corrupt the transaction ID hash, causing the data node process to fail. In some cases, this could also lead to failures of other data nodes in the same node group when the failed node attempted to restart. (Bug #14083116)
When a fragment scan occurring as part of a local checkpoint (LCP) stopped progressing, this kept the entire LCP from completing, which could result in redo log exhaustion, write service outage, inability to recover nodes, and longer system recovery times. To help keep this from occurring, MySQL Cluster now implements an LCP watchdog mechanism, which monitors the fragment scans making up the LCP and takes action if the LCP is observed to be delinquent.
This is intended to guard against any scan-related system-level I/O errors or other issues causing problems with LCP and thus having a negative impact on write service and recovery times. Each node independently monitors the progress of local fragment scans occurring as part of an LCP. If no progress is made for 20 seconds, warning logs are generated every 10 seconds thereafter for up to 1 minute. At this point, if no progress has been made, the fragment scan is considered to have hung, and the node is restarted to enable the LCP to continue.
In addition, a new ndbd exit code NDBD_EXIT_LCP_SCAN_WATCHDOG_FAIL is added to identify when this occurs. See LQH Errors, for more information. (Bug #14075825)
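The watchdog timing described above can be sketched as a small state machine. This is an illustrative model, not the NDB implementation: the class and method names are hypothetical, and it assumes that the one-minute limit is measured from the last observed scan progress (the text could also be read as one minute after the first warning).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanWatchdog:
    """Toy model of an LCP fragment scan watchdog (names are hypothetical)."""

    warn_after: float = 20.0      # seconds without progress before warning
    warn_interval: float = 10.0   # seconds between repeated warnings
    fail_after: float = 60.0      # assumed: measured from last progress
    last_progress: float = 0.0
    last_warning: Optional[float] = None

    def progress(self, now: float) -> None:
        """Record that the fragment scan advanced at time `now`."""
        self.last_progress = now
        self.last_warning = None

    def poll(self, now: float) -> str:
        """Return 'ok', 'warn', or 'restart' for the current time."""
        stalled = now - self.last_progress
        if stalled >= self.fail_after:
            # The scan is considered hung; the node would be restarted.
            return "restart"
        due = (
            self.last_warning is None
            or now - self.last_warning >= self.warn_interval
        )
        if stalled >= self.warn_after and due:
            self.last_warning = now
            return "warn"
        return "ok"


if __name__ == "__main__":
    watchdog = ScanWatchdog()
    for t in range(0, 70, 10):
        print(t, watchdog.poll(t))
```

Because each node runs its own monitor, a model like this needs no shared state; any reported progress simply resets the stall timer.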
It could sometimes happen that a query pushed down to the data nodes could refer to buffered rows which had been released, and possibly overwritten by other rows. Such rows, if overwritten, could lead to incorrect results from a pushed query, and possibly even to failure of one or more data nodes or SQL nodes. (Bug #14010406)
DUMP 2303 in the ndb_mgm client now includes the status of the single fragment scan record reserved for a local checkpoint. (Bug #13986128)
Pushed joins performed as part of a stored procedure or trigger could cause spurious Out of memory errors on the SQL node where they were executed. (Bug #13945264)
References: See also: Bug #13944272.
References: See also: Bug #13945264.
When upgrading or downgrading between a MySQL Cluster version supporting distributed pushdown joins (MySQL Cluster NDB 7.2 and later) and one that did not, queries that the later MySQL Cluster version tried to push down could cause data nodes still running the earlier version to fail. Now the SQL nodes check the version of the software running on the data nodes, so that queries are not pushed down if there are any data nodes in the cluster that do not support pushdown joins. (Bug #13894817)
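The version check described above amounts to gating pushdown on the oldest data node in the cluster. The following is a minimal sketch of that idea, not the actual SQL node code: the function name and the representation of versions as (major, minor, patch) tuples are assumptions made for illustration, with 7.2.0 standing in for "MySQL Cluster NDB 7.2 and later".

```python
from typing import Iterable, Tuple

Version = Tuple[int, int, int]

# Assumed threshold: the note says pushdown joins are supported in
# MySQL Cluster NDB 7.2 and later.
MIN_PUSHDOWN_VERSION: Version = (7, 2, 0)


def can_push_down(data_node_versions: Iterable[Version]) -> bool:
    """Return True only if every known data node supports pushdown joins.

    Hypothetical helper: a single older data node disables pushdown for
    the whole cluster, and an empty node list is treated as unsupported.
    """
    versions = list(data_node_versions)
    return bool(versions) and all(v >= MIN_PUSHDOWN_VERSION for v in versions)


if __name__ == "__main__":
    # Mixed cluster during a rolling upgrade: one node still on 7.1.
    print(can_push_down([(7, 2, 6), (7, 1, 22)]))
    # All nodes upgraded.
    print(can_push_down([(7, 2, 6), (7, 2, 5)]))
```

Checking every node rather than a single one matters during rolling upgrades, when versions are mixed and a pushed query may reach a node that cannot handle it.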
ndbmemcached exited unexpectedly when more than 128 clients attempted to connect concurrently using prefixes. In addition, a NOT FOUND error was returned when the memcached engine encountered a temporary error from NDB; now the error No Ndb Instances in freelist is returned instead. (Bug #13890064, Bug #13891085)
The performance of ndbmemcache with a workload that consisted mostly of primary key reads became degraded. (Bug #13868787, Bug #64713)
The memcached server failed to build correctly on 64-bit Solaris/SPARC. (Bug #13854122)
ALTER ONLINE TABLE failed when a DEFAULT option was used. (Bug #13830980)
In some cases, restarting data nodes spent a very long time in Start Phase 101, during which API nodes must connect to the starting node (using NdbEventOperation), because the API nodes trying to connect failed in a live-lock scenario. This connection process uses a handshake during which a small number of messages are exchanged, with a timeout used to detect failures during the handshake.
Prior to this fix, this timeout was set such that, if one API node encountered the timeout, all other nodes connecting would do the same. The fix also decreases this timeout. This issue (and the effects of the fix) are most likely to be observed on relatively large configurations having 10 or more data nodes and 200 or more API nodes. (Bug #13825163)
ndbmtd failed to restart when the size of a table definition exceeded 32K. (The size of a table definition is dependent upon a number of factors, but in general the 32K limit is encountered when a table has 250 to 300 columns.) (Bug #13824773)
An initial start using ndbmtd could sometimes hang. This was due to a state which occurred when several threads tried to flush a socket buffer to a remote node. In such cases, to minimize flushing of socket buffers, only one thread actually performs the send, on behalf of all threads. However, it was possible in certain cases for there to be data in the socket buffer waiting to be sent with no thread ever being chosen to perform the send. (Bug #13809781)
When trying to use ndb_size.pl to connect to a MySQL server running on a nonstandard port, the port argument was ignored. (Bug #13364905, Bug #62635)
The transaction_allow_batching server system variable was inadvertently removed from the NDB 7.2 codebase prior to General Availability. This fix restores the variable. (Bug #64697, Bug #13891116, Bug #13947227)
Cluster Replication: Error handling in conflict detection and resolution has been improved to include errors generated for reasons other than operation execution errors, and to distinguish better between permanent errors and transient errors. Transactions failing due to transient problems are now retried rather than leading to SQL node shutdown as occurred in some cases. (Bug #13428909)
Cluster Replication: DDL statements could sometimes be missed during replication channel cutover, due to the fact that there may not be any epochs following the last applied epoch when the slave is up to date and no new epoch has been finalized on the master. Because epochs are not consecutively numbered, there may be a gap between the last applied epoch and the next epoch; thus it is not possible to determine the number assigned to the next epoch. This meant that, if the new master did not have all epochs, it was possible for those epochs containing only DDL statements to be skipped over.
The fix for this problem includes modifications to mysqld binary logging code so that the next position in the binary log following the COMMIT event at the end of an epoch transaction is recorded, as well as the addition of two new columns to the mysql.ndb_binlog_index table. In addition, a new replication channel cutover mechanism is defined that employs these new columns. To make use of the new cutover mechanism, it is necessary to modify the query used to obtain the start point; in addition, to simplify prevention of possible errors caused by duplication of DDL statements, a new shorthand value ddl_exist_errors is implemented for use with the mysqld option --slave-skip-errors. It is highly recommended that you use this option and value on the new replication slave when using the modified query.
For more information, see Implementing Failover with MySQL Cluster Replication.
Note that the existing replication channel cutover mechanism continues to function as before, including the same limitations described previously. (Bug #11762277, Bug #54854)
Cluster API: An assert could cause memcached to fail. (Bug #13874027)