MySQL NDB Cluster 7.5.0 is a new release of MySQL NDB Cluster 7.5,
based on MySQL Server 5.7 and including features in version 7.5 of
the NDB storage engine, as well as
fixing recently discovered bugs in previous NDB Cluster releases.
Obtaining MySQL NDB Cluster 7.5. MySQL NDB Cluster 7.5 source code and binaries can be obtained from http://dev.mysql.com/downloads/cluster/.
For an overview of changes made in MySQL NDB Cluster 7.5, see What is New in NDB Cluster 7.5.
This release also incorporates all bugfixes and changes made in previous NDB Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.7 through MySQL 5.7.10 (see Changes in MySQL 5.7.10 (2015-12-07, General Availability)).
Important Change: Previously, the NDB scheduler always optimized for speed against throughput in a predetermined manner (this was hard coded); this balance can now be set using the SchedulerResponsiveness data node configuration parameter. This parameter accepts an integer in the range of 0-10 inclusive, with 5 as the default. Higher values provide better response times relative to throughput. Lower values provide increased throughput, but impose longer response times. (Bug #78531, Bug #21889312)
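As a rough illustration of the range and default described above, the following sketch reads SchedulerResponsiveness from a config.ini-style fragment and checks it. The helper name read_scheduler_responsiveness, the sample fragment, and the value 7 are hypothetical; only the parameter name, the 0-10 range, and the default of 5 come from this note.

```python
import configparser

# Hypothetical config.ini fragment; the value 7 is purely illustrative.
CONFIG_TEXT = """
[ndbd default]
SchedulerResponsiveness = 7
"""


def read_scheduler_responsiveness(text, default=5):
    """Return SchedulerResponsiveness from the text, enforcing the 0-10 range."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("ndbd default", "SchedulerResponsiveness",
                     fallback=str(default))
    value = int(raw)
    if not 0 <= value <= 10:
        raise ValueError("SchedulerResponsiveness must be between 0 and 10")
    return value


if __name__ == "__main__":
    print(read_scheduler_responsiveness(CONFIG_TEXT))
```

A value above 5 in such a fragment would favor response time, and a value below 5 would favor throughput.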
Important Change: A number of MySQL NDB Cluster data node configuration parameters were deprecated in earlier versions of MySQL NDB Cluster, and have been removed with this release. These parameters include
Diskless instead), as well as DiskCheckpointSpeedInRestart. The archaic and unused ByteOrder computer configuration parameter has also been removed, as well as the unused MaxNoOfSavedEvents management node configuration parameter. These parameters are no longer supported; most of them already did not have (or no longer had) any effect. Trying to use any of these parameters in a MySQL NDB Cluster configuration file now results in an error.
For more information, see What is New in NDB Cluster 7.5. (Bug #77404, Bug #21280428)
Important Change: The ExecuteOnComputer configuration parameter for management, data, and API nodes is now deprecated, and is subject to removal in a future MySQL NDB Cluster release. For all types of MySQL NDB Cluster nodes, you should now use the HostName parameter exclusively for identifying hosts in the cluster configuration file.
Important Change: The ndbinfo database can now provide default and current information about MySQL NDB Cluster node configuration parameters as a result of the following changes:
The config_params table has been enhanced with additional columns providing information about each configuration parameter, including its type, default, and maximum and minimum values (where applicable).
The config_values table has been added. A row in this table shows the current value of a parameter on a given node.
You can obtain values of MySQL NDB Cluster configuration parameters by name using a join on these two tables such as the one shown here:
SELECT p.param_name AS Name, v.node_id AS Node, p.param_type AS Type,
       p.param_default AS 'Default', v.config_value AS Current
FROM config_params p JOIN config_values v
  ON p.param_number = v.config_param
WHERE p.param_name
  IN ('NodeId', 'HostName', 'DataMemory', 'IndexMemory');
NDB Replication: Normally, RESET SLAVE causes all entries to be deleted from the mysql.ndb_apply_status table. This release adds the ndb_clear_apply_status system variable, which makes it possible to override this behavior. This variable is ON by default; setting it to OFF prevents RESET SLAVE from purging the ndb_apply_status table. (Bug #12630403)
Deprecated MySQL NDB Cluster node configuration parameters are now indicated as such by ndb_config --xml. For each parameter currently deprecated, the corresponding <param/> tag in the XML output now includes the attribute deprecated="true". (Bug #21127135)
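One way to use the new attribute is to filter the XML for deprecated parameters. The sketch below does this against a heavily trimmed, hypothetical sample: the enclosing element names, the section name, and the other attributes are assumptions made for illustration, and real ndb_config --xml output contains considerably more. Only the <param/> tag and the deprecated="true" attribute are taken from this note.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for ndb_config --xml output.
SAMPLE_XML = """<configvariables>
  <section name="DB">
    <param name="DataMemory" type="unsigned"/>
    <param name="ExecuteOnComputer" type="string" deprecated="true"/>
  </section>
</configvariables>"""


def deprecated_params(xml_text):
    """Return the names of all <param/> elements marked deprecated="true"."""
    root = ET.fromstring(xml_text)
    return [param.get("name")
            for param in root.iter("param")
            if param.get("deprecated") == "true"]


if __name__ == "__main__":
    print(deprecated_params(SAMPLE_XML))
```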
Added the --ndb-cluster-connection-pool-nodeids option for mysqld, which can be used to specify a list of nodes by node ID for connection pooling. The number of node IDs in the list must equal the value set for --ndb-cluster-connection-pool. (Bug #19521789)
Added the PROMPT command in the ndb_mgm client. This command has the syntax PROMPT string, which sets the client's prompt to string. Issuing the command without an argument causes the prompt to be reset to the default (ndb_mgm>). See Commands in the NDB Cluster Management Client, for more information. (Bug #18421338)
When the --database option has not been specified for ndb_show_tables, and no tables are found in the TEST_DB database, an appropriate warning message is now issued. (Bug #50633, Bug #11758430)
The NDB storage engine now uses the improved records-per-key interface for index statistics introduced for the optimizer in MySQL 5.7. Some improvements due to this change are listed here:
The optimizer can now choose better execution plans for queries on NDB tables in many cases where a less optimal join index or table join order would previously have been chosen.
EXPLAIN now provides more accurate row estimates than previously.
Improved cardinality estimates can be obtained from
Incompatible Change; NDB Cluster APIs: The pollEvents2() method now returns -1, indicating an error, whenever a negative value is used for the time argument. (Bug #20762291)
Important Change; NDB Cluster APIs: Ndb::pollEvents() is now compatible with the TE_OUT_OF_MEMORY event types introduced in MySQL NDB Cluster 7.4.3. For detailed information about this change, see the description of this method in the MySQL NDB Cluster API Developer Guide. (Bug #20646496)
The behavior of Ndb::pollEvents() has also been modified such that it now returns NDB_FAILURE_GCI (equal to ~(Uint64) 0) when a cluster failure has been detected. (Bug #18753887)
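For readers less familiar with the C notation, ~(Uint64) 0 is the 64-bit value with every bit set. The short sketch below reproduces that arithmetic in Python; the helper indicates_cluster_failure is hypothetical and is not part of the NDB API.

```python
# All 64 bits set, which is what ~(Uint64) 0 evaluates to in C or C++.
UINT64_MAX = (1 << 64) - 1
NDB_FAILURE_GCI = ~0 & UINT64_MAX


def indicates_cluster_failure(gci):
    """Hypothetical check: True when a reported GCI equals NDB_FAILURE_GCI."""
    return gci == NDB_FAILURE_GCI


if __name__ == "__main__":
    print(hex(NDB_FAILURE_GCI))
```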
Important Change; NDB Cluster APIs: To release the memory used for dropped event operations, the event API formerly depended on
nextEvent() to consume all events possibly referring to the dropped events. This dependency between dropEventOperation() and the first two methods required the entire event buffer to be read before attempting to release event operation memory (that is, until successive calls to nextEvent() returned no more events).
A related cleanup issue arose following the reset of the event buffer (when all event operations had previously been dropped), and the event buffer was truncated by the first createEventOperation() call subsequent to the reset.
To fix these problems, the event buffer is now cleared when the last event operation is dropped, rather than waiting for a subsequent create operation which might or might not occur. Memory taken up by dropped event operations is also now released when the event queue has been cleared, which removes the hidden requirement for consuming all events to free up memory. In addition, event operation memory is now released as soon as all events referring to the operation have been consumed, rather than waiting for the entire event buffer to be consumed. (Bug #78145, Bug #21661297)
Important Change; NDB Cluster APIs: The MGM API error-handling functions
ndb_mgm_get_latest_error_desc() each failed when used with a NULL handle. You should note that, although these functions are now null-safe, values returned in this case are arbitrary and not meaningful. (Bug #78130, Bug #21651706)
Important Change; NDB Cluster APIs: The following NDB API methods were not actually implemented and have been removed from the sources:
Important Change: The options controlling behavior of NDB programs with regard to the number and timing of successive attempts to connect to a management server have changed as listed here:
The minimum value for the --connect-retry-delay option common to all NDB programs has been changed from 0 to 1; this means that all NDB programs now wait at least 1 second between successive connection attempts, and it is no longer possible to set a waiting time equal to 0.
The semantics for the --connect-retries option have changed slightly, such that the value of this option now sets the number of times an NDB program tries to connect to a management server. Setting this option to 0 now causes the program to attempt the connection indefinitely, until it either succeeds or is terminated by other means (such as kill).
In addition, the default for the --connect-retries option for the ndb_mgm client has been changed from 3 to 12, so that the minimum, maximum, and default values for this option when used with ndb_mgm are now exactly the same as for all other NDB programs.
The ndb_mgm --try-reconnect option, although deprecated in MySQL NDB Cluster 7.4, continues to be supported as a synonym for ndb_mgm --connect-retries to provide backwards compatibility. The default value for --try-reconnect has also been changed from 3 to 12, so that this option continues to behave in exactly the same way as --connect-retries.
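The retry semantics listed above can be sketched as a simple loop. This is an illustration only, not the code used by NDB programs: the function connect_with_retries and its try_connect and sleep arguments are hypothetical, and the delay defaults to the new minimum of 1 second because this note does not state the real default for --connect-retry-delay.

```python
import itertools
import time


def connect_with_retries(try_connect, connect_retries=12,
                         connect_retry_delay=1, sleep=time.sleep):
    """Return the number of the successful attempt, or None if all failed.

    connect_retries is the total number of attempts; 0 means retry forever.
    connect_retry_delay is the wait in seconds between attempts (minimum 1).
    """
    if connect_retry_delay < 1:
        raise ValueError("connect-retry-delay must be at least 1 second")
    if connect_retries == 0:
        attempts = itertools.count(1)           # unbounded
    else:
        attempts = range(1, connect_retries + 1)
    for attempt in attempts:
        if attempt > 1:
            sleep(connect_retry_delay)          # wait only between attempts
        if try_connect():
            return attempt
    return None


if __name__ == "__main__":
    outcomes = iter([False, False, True])
    print(connect_with_retries(lambda: next(outcomes), sleep=lambda s: None))
```

Passing the sleep function as an argument keeps the sketch easy to exercise without real waiting.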
Important Change: In previous versions of MySQL NDB Cluster, other DDL operations could not be part of ALTER ONLINE TABLE ... RENAME .... (This was disallowed by the fix for BUG#16021021.) MySQL NDB Cluster 7.5 makes the following changes:
Support for the ONLINE and OFFLINE keywords, which was deprecated in MySQL NDB Cluster 7.3, is now removed, and use of these now causes a syntax error; the NDB storage engine now accepts only ALGORITHM = DEFAULT, ALGORITHM = COPY, and ALGORITHM = INPLACE to specify whether the ALTER operation is copying or in-place, just as in the standard MySQL Server.
ALTER TABLE ... ALGORITHM=COPYING RENAME.
(Bug #20804269, Bug #76543, Bug #20479917, Bug #75797)
References: See also: Bug #16021021.
NDB Disk Data: A unique index on a column of an NDB table is implemented with an associated internal ordered index, used for scanning. While dropping an index, this ordered index was dropped first, followed by the drop of the unique index itself. This meant that, when the drop was rejected due to (for example) a constraint violation, the statement was rejected but the associated ordered index remained deleted, so that any subsequent operation using a scan on this table failed. We fix this problem by causing the unique index to be removed first, before removing the ordered index; removal of the related ordered index is no longer performed when removal of a unique index fails. (Bug #78306, Bug #21777589)
NDB Replication: While the binary log injector thread was handling failure events, it was possible for all NDB tables to be left indefinitely in read-only mode. This was due to a race condition between the binary log injector thread and the utility thread handling events on the ndb_schema table, and to the fact that, when handling failure events, the binary log injector thread places all NDB tables in read-only mode until all such events are handled and the thread restarts itself.
When the binary log injector thread receives a group of one or more failure events, it drops all other existing event operations and expects no more events from the utility thread until it has handled all of the failure events and then restarted itself. However, it was possible for the utility thread to continue attempting binary log setup while the injector thread was handling failures and thus attempting to create the schema distribution tables as well as event subscriptions on these tables. If the creation of these tables and event subscriptions occurred during this time, the binary log injector thread's expectation that there were no further event operations was never met; thus, the injector thread never restarted, and NDB tables remained in read-only mode as described previously.
To fix this problem, the Ndb object that handles schema events is now definitely dropped once the ndb_schema table drop event is handled, so that the utility thread cannot create any new events until after the injector thread has restarted, at which time a new Ndb object for handling schema events is created. (Bug #17674771, Bug #19537961, Bug #22204186, Bug #22361695)
NDB Cluster APIs: The binary log injector did not work correctly with TE_INCONSISTENT event type handling by Ndb::nextEvent(). (Bug #22135541)
References: See also: Bug #20646496.
NDB Cluster APIs: While executing dropEvent(), if the coordinator DBDICT failed after the subscription manager (SUMA block) had removed all subscriptions but before the coordinator had deleted the event from the system table, the dropped event remained in the table, causing any subsequent drop or create event with the same name to fail with NDB error 1419 Subscription already dropped or error 746 Event name already exists. This occurred even when calling dropEvent() with a nonzero force argument.
Now in such cases, error 1419 is ignored, and DBDICT deletes the event from the table. (Bug #21554676)
NDB Cluster APIs: Creation and destruction of Ndb_cluster_connection objects by multiple threads could make use of the same application lock, which in some cases led to failures in the global dictionary cache. To alleviate this problem, the creation and destruction of several internal NDB API objects have been serialized. (Bug #20636124)
NDB Cluster APIs: When an Ndb object created prior to a failure of the cluster was reused, the event queue of this object could still contain data node events originating from before the failure. These events could reference “old” epochs (from before the failure occurred), which in turn could violate the assumption made by the nextEvent() method that epoch numbers always increase. This issue is addressed by explicitly clearing the event queue in such cases. (Bug #18411034)
References: See also: Bug #20888668.
NDB Cluster APIs:
pollEvents2() were slow to receive events, being dependent on other client threads or blocks to perform polling of transporters on their behalf. This fix allows a client thread to perform its own transporter polling when it has to wait in either of these methods.
Introduction of transporter polling also revealed a problem with missing mutex protection in the ndbcluster_binlog handler, which has been added as part of this fix. (Bug #79311, Bug #20957068, Bug #22224571)
NDB Cluster APIs: After the initial restart of a node following a cluster failure, the cluster failure event added as part of the restart process was deleted when an event that existed prior to the restart was later deleted. This meant that, in such cases, an Event API client had no way of knowing that failure handling was needed. In addition, the GCI used for the final cleanup of deleted event operations, performed by
nextEvent() when these methods have consumed all available events, was lost. (Bug #78143, Bug #21660947)
A serious regression was inadvertently introduced in MySQL NDB Cluster 7.4.8 whereby local checkpoints and thus restarts often took much longer than expected. This occurred because the setting for MaxDiskWriteSpeedOwnRestart was ignored during restarts and the value of MaxDiskWriteSpeedOtherNodeRestart, which is much lower by default than the default for MaxDiskWriteSpeedOwnRestart, was used instead. This issue affected restart times and performance only and did not have any impact on normal operations. (Bug #22582233)
The epoch for the latest restorable checkpoint provided in the cluster log as part of its reporting for EventBufferStatus events (see NDB Cluster: Messages in the Cluster Log) was not well defined and thus unreliable; depending on various factors, the reported epoch could be the one currently being consumed, the one most recently consumed, or the next one queued for consumption.
This fix ensures that the latest restorable global checkpoint is always regarded as the one that was most recently completely consumed by the user, and thus that it was the latest restorable global checkpoint that existed at the time the report was generated. (Bug #22378288)
Added the --ndb-allow-copying-alter-table option for mysqld. Setting this option (or the equivalent system variable
ALTER TABLE statements from performing copying operations. The default value is ON. (Bug #22187649)
References: See also: Bug #17400320.
Attempting to create an NDB table having greater than the maximum supported combined width for all BIT columns (4096) caused data node failure when these columns were defined with COLUMN_FORMAT DYNAMIC. (Bug #21889267)
Creating a table with the maximum supported number of columns (512) all using COLUMN_FORMAT DYNAMIC led to data node failures. (Bug #21863798)
In a MySQL NDB Cluster with multiple LDM instances, all instances wrote to the node log, even inactive instances on other nodes. During restarts, this caused the log to be filled with messages from other nodes, such as the messages shown here:
2015-06-24 00:20:16 [ndbd] INFO -- We are adjusting Max Disk Write Speed, a restart is ongoing now
...
2015-06-24 01:08:02 [ndbd] INFO -- We are adjusting Max Disk Write Speed, no restarts ongoing anymore
Now this logging is performed only by the active LDM instance. (Bug #21362380)
Backup block states were reported incorrectly during backups. (Bug #21360188)
References: See also: Bug #20204854, Bug #21372136.
For a timeout in GET_TABINFOREQ while executing a CREATE INDEX statement, mysqld returned Error 4243 (Index not found) instead of the expected Error 4008 (Receive from NDB failed).
The fix for this bug also fixes similar timeout issues for a number of other signals that are sent to the DBDICT kernel block as part of DDL operations, including LIST_TABLES_REQ, as well as several internal functions used in handling NDB schema operations. (Bug #21277472)
References: See also: Bug #20617891, Bug #20368354, Bug #19821115.
Previously, multiple send threads could be invoked for handling sends to the same node; these threads then competed for the same send lock. While the send lock blocked the additional send threads, work threads could be passed to other nodes.
This issue is fixed by ensuring that new send threads are not activated while there is already an active send thread assigned to the same node. In addition, a node already having an active send thread assigned to it is no longer visible to other, already active, send threads; that is, such a node is no longer added to the node list when a send thread is currently assigned to it. (Bug #20954804, Bug #76821)
Queueing of pending operations when the redo log was overloaded (DefaultOperationRedoProblemAction API node configuration parameter) could lead to timeouts when data nodes ran out of redo log space (P_TAIL_PROBLEM errors). Now when the redo log is full, the node aborts requests instead of queuing them. (Bug #20782580)
References: See also: Bug #20481140.
The NDB event buffer can be used with an Ndb object to subscribe to table-level row change event streams. Users subscribe to an existing event; this causes the data nodes to start sending event data signals (SUB_TABLE_DATA) and epoch completion signals (SUB_GCP_COMPLETE) to the
SUB_GCP_COMPLETE_REP signals can arrive for execution in a concurrent receiver thread before completion of the internal method call used to start a subscription.
SUB_GCP_COMPLETE_REP signals depends on the total number of SUMA buckets (sub data streams), but this may not yet have been set, leading to the present issue, when the counter used for tracking the
TOTAL_BUCKETS_INIT) was found to be set to erroneous values. Now TOTAL_BUCKETS_INIT is tested to be sure it has been set correctly before it is used. (Bug #20575424, Bug #76255)
References: See also: Bug #20561446, Bug #21616263.
NDB statistics queries could be delayed by the error delay set for ndb_index_stat_option (default 60 seconds) when the index that was queried had been marked with internal error. The same underlying issue could also cause ANALYZE TABLE to hang when executed against an NDB table having multiple indexes where an internal error occurred on one or more but not all indexes.
Now in such cases, any existing statistics are returned immediately, without waiting for any additional statistics to be discovered. (Bug #20553313, Bug #20707694, Bug #76325)
Memory allocated when obtaining a list of tables or databases was not freed afterward. (Bug #20234681, Bug #74510)
References: See also: Bug #18592390, Bug #72322.
Added the BackupDiskWriteSpeedPct data node parameter. Setting this parameter causes the data node to reserve a percentage of its maximum write speed (as determined by the value of MaxDiskWriteSpeed) for use in local checkpoints while performing a backup. BackupDiskWriteSpeedPct is interpreted as a percentage which can be set between 0 and 90 inclusive, with a default value of 50. (Bug #20204854)
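The percentage arithmetic can be illustrated with a small sketch. The function split_write_speed and the figure of 20,000,000 bytes per second are hypothetical, and treating the remainder of the budget as the share left for the backup is an assumption that this note does not state; only the 0-90 range, the default of 50, and the reservation for local checkpoints come from the text above.

```python
def split_write_speed(max_disk_write_speed, backup_disk_write_speed_pct=50):
    """Split a write-speed budget (bytes per second) during a backup.

    Returns (reserved_for_lcp, remainder). The remainder is assumed here,
    for illustration only, to be what is left over for the backup itself.
    """
    if not 0 <= backup_disk_write_speed_pct <= 90:
        raise ValueError("BackupDiskWriteSpeedPct must be between 0 and 90")
    reserved_for_lcp = max_disk_write_speed * backup_disk_write_speed_pct // 100
    return reserved_for_lcp, max_disk_write_speed - reserved_for_lcp


if __name__ == "__main__":
    # Illustrative budget of 20,000,000 bytes per second with the default 50.
    print(split_write_speed(20_000_000))
```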
References: See also: Bug #21372136.
After restoring the database schema from backup using ndb_restore, auto-discovery of restored tables in transactions having multiple statements did not work correctly, resulting in Deadlock found when trying to get lock; try restarting transaction errors.
This issue was encountered both in the mysql client, as well as when such transactions were executed by application programs using Connector/J and possibly other MySQL APIs.
Prior to upgrading, this issue can be worked around by executing SELECT TABLE_NAME, TABLE_SCHEMA FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE = 'NDBCLUSTER' on all SQL nodes following the restore operation, before executing any other statements. (Bug #18075170)
When using STOP -f to force a node shutdown even when it triggered a complete shutdown of the cluster, it was possible to lose data when a sufficient number of nodes were shut down, triggering a cluster shutdown, and the timing was such that SUMA handovers had been made to nodes already in the process of shutting down. (Bug #17772138)
When using a sufficiently large value for TransactionDeadlockDetectionTimeout and the default value for
ORDER BY transidwith multiple concurrent conflicting or deadlocked transactions, each transaction having several pending operations, caused the SQL node where the query was run to fail. (Bug #16731538, Bug #67596)
The ndbinfo.config_params table is now read-only. (Bug #11762750, Bug #55383)
NDB failed during a node restart due to the status of the current local checkpoint being set but not as active, even though it could have other states under such conditions. (Bug #78780, Bug #21973758)
ndbmtd checked for signals being sent only after a full cycle in run_job_buffers, which is performed for all job buffer inputs. Now this is done as part of run_job_buffers itself, which avoids executing for extended periods of time without sending to other nodes or flushing signals to other threads. (Bug #78530, Bug #21889088)
When attempting to enable index statistics, creation of the required system tables, events and event subscriptions often fails when multiple mysqld processes using index statistics are started concurrently in conjunction with starting, restarting, or stopping the cluster, or with node failure handling. This is normally recoverable, since the affected mysqld process or processes can (and do) retry these operations shortly thereafter. For this reason, such failures are no longer logged as warnings, but merely as informational events. (Bug #77760, Bug #21462846)
It was possible to end up with a lock on the send buffer mutex when send buffers became a limiting resource, due either to insufficient send buffer resource configuration, problems with slow or failing communications such that all send buffers became exhausted, or slow receivers failing to consume what was sent. In this situation worker threads failed to allocate send buffer memory for signals, and attempted to force a send in order to free up space, while at the same time the send thread was busy trying to send to the same node or nodes. All of these threads competed for taking the send buffer mutex, which resulted in the lock already described, reported by the watchdog as Stuck in Send. This fix is made in two parts, listed here:
The send thread no longer holds the global send thread mutex while getting the send buffer mutex; it now releases the global mutex prior to locking the send buffer mutex. This keeps worker threads from getting stuck in send in such cases.
Locking of the send buffer mutex done by the send threads now uses a try-lock. If the try-lock fails, the node to make the send to is reinserted at the end of the list of send nodes in order to be retried later. This removes the Stuck in Send condition for the send threads.
(Bug #77081, Bug #21109605)
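The try-lock and requeue pattern described in the second part of this fix can be sketched in a few lines. This is a simplified, single-threaded illustration and not the data node implementation: the function drain_send_queue, its arguments, and the max_passes guard are all hypothetical.

```python
import threading
from collections import deque


def drain_send_queue(node_ids, buffer_locks, send, max_passes=1000):
    """Send to each node, requeueing any node whose buffer lock is busy.

    Returns (sent, still_pending). max_passes bounds the loop in this sketch
    so that a lock that is never released cannot make it spin forever.
    """
    pending = deque(node_ids)
    sent = []
    passes = 0
    while pending and passes < max_passes:
        passes += 1
        node = pending.popleft()
        lock = buffer_locks[node]
        if lock.acquire(blocking=False):    # try-lock: never wait for the mutex
            try:
                send(node)
                sent.append(node)
            finally:
                lock.release()
        else:
            pending.append(node)            # reinsert at the end, retry later
    return sent, list(pending)


if __name__ == "__main__":
    locks = {1: threading.Lock(), 2: threading.Lock()}
    print(drain_send_queue([1, 2], locks, lambda node: None))
```

Because the sender never blocks on a busy lock, it can make progress on other nodes while a worker thread holds the buffer mutex for one of them.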