Incompatible Change: The changes listed here follow up and build further on work done in MySQL NDB Cluster 7.4.7 to improve handling of local checkpoints (LCPs) under conditions of insert overload:
Changes have been made in the minimum values for a number of parameters applying to data buffers for backups and LCPs. These parameters, listed here, can no longer be set so as to make the system impossible to run:
In addition, the BackupMemory data node parameter is now deprecated and subject to removal in a future MySQL NDB Cluster version. Use BackupDataBufferSize and BackupLogBufferSize instead.
When a backup was unsuccessful due to insufficient resources, a subsequent retry worked only for those parts of the backup that were executed in the same thread, since delayed signals are supported only within the same thread. Delayed signals are no longer sent to other threads in such cases.
An instance of an internal list object used in searching for queued scans was not actually destroyed before calls to functions that could manipulate the base object used to create it.
ACC scans were queued in the same category as range scans, which could lead to starting an ACC scan when
DBACC had no free slots for scans. This is fixed by implementing a separate queue for ACC scans.
(Bug #76890, Bug #20981491, Bug #77597, Bug #21362758, Bug #77612, Bug #21370839)
References: See also: Bug #76742, Bug #20904721.
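The fix for the ACC scan starvation above amounts to giving each scan category its own queue, so that an ACC scan is only started when a free ACC slot actually exists. The sketch below illustrates the idea with invented names and a deliberately simplified structure (the real DBACC/DBLQH code tracks much more state):

```cpp
#include <cassert>
#include <queue>

// Illustrative sketch only: separate queues per scan type, so an ACC
// (hash-index) scan is never dequeued while DBACC has no free scan slot.
enum ScanType { RANGE_SCAN, ACC_SCAN };

struct ScanQueues {
    std::queue<int> range_q;  // queued range scans (scan ids)
    std::queue<int> acc_q;    // queued ACC scans -- the separate queue

    void enqueue(ScanType t, int scan_id) {
        (t == ACC_SCAN ? acc_q : range_q).push(scan_id);
    }

    // Start the next queued ACC scan only if a free ACC slot exists.
    bool start_next_acc(int free_acc_slots, int* scan_id) {
        if (free_acc_slots == 0 || acc_q.empty())
            return false;
        *scan_id = acc_q.front();
        acc_q.pop();
        return true;
    }
};
```

With a single shared queue, a queued ACC scan could be started purely because a range-scan slot freed up; with the split, slot availability is checked per category.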
When the --database option has not been specified for ndb_show_tables, and no tables are found in the
TEST_DB database, an appropriate warning message is now issued. (Bug #50633, Bug #11758430)
Important Change; NDB Cluster APIs: The MGM API error-handling functions
ndb_mgm_get_latest_error(), ndb_mgm_get_latest_error_msg(), and ndb_mgm_get_latest_error_desc() each failed when used with a
NULL handle. You should note that, although these functions are now null-safe, values returned in this case are arbitrary and not meaningful. (Bug #78130, Bug #21651706)
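The null-safety pattern described above can be sketched as follows. The handle type and accessor here are stand-ins invented for illustration, not the actual MGM API declarations; the point is simply that a NULL handle yields a defined (but arbitrary) result instead of a dereference:

```cpp
#include <cassert>

// Hypothetical stand-in for the MGM API handle; names are illustrative.
struct MgmHandle {
    int latest_error;
    const char* latest_error_msg;
};

// Null-safe accessor: returns a defined but meaningless value for a
// NULL handle, mirroring the documented behavior of the fix.
inline const char* get_latest_error_msg(const MgmHandle* h) {
    if (h == nullptr)
        return "";                // arbitrary, not meaningful
    return h->latest_error_msg;
}
```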
Important Change: When ndb_restore was run without
--disable-indexes or --rebuild-indexes on a table having a unique index, it was possible for rows to be restored in an order that resulted in duplicate values, causing it to fail with duplicate key errors. Running
ndb_restore on such a table now requires using at least one of these options; failing to do so now results in an error. (Bug #57782, Bug #11764893)
References: See also: Bug #22329365, Bug #22345748.
NDB Cluster APIs: While executing
dropEvent(), if the coordinator
DBDICT failed after the subscription manager (SUMA block) had removed all subscriptions, but before the coordinator had deleted the event from the system table, the dropped event remained in the table, causing any subsequent drop or create event with the same name to fail with
NDB error 1419 Subscription already dropped or error 746 Event name already exists. This occurred even when calling
dropEvent() with a nonzero force argument.
Now in such cases, error 1419 is ignored, and
DBDICT deletes the event from the table. (Bug #21554676)
NDB Cluster APIs: If the total amount of memory allocated for the event buffer exceeded approximately 40 MB, the calculation of memory usage percentages could overflow during computation. This was due to the fact that the associated routine used 32-bit arithmetic; this has now been changed to use
Uint64 values instead. (Bug #78454, Bug #21847552)
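The overflow described above is easy to reproduce: multiplying a byte count by 100 in 32-bit arithmetic wraps once the count exceeds UINT32_MAX / 100 (roughly 42 MB). A minimal sketch of the before/after computation (function names invented for illustration):

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t Uint64;  // NDB-style alias, shown for illustration

// 32-bit version: (used_bytes * 100) wraps around once used_bytes
// exceeds ~42 MB, yielding a bogus percentage.
inline uint32_t usage_pct_32(uint32_t used_bytes, uint32_t total_bytes) {
    return (used_bytes * 100) / total_bytes;   // can overflow
}

// 64-bit version, analogous to the fix: the intermediate product
// cannot overflow for realistic buffer sizes.
inline uint32_t usage_pct_64(Uint64 used_bytes, Uint64 total_bytes) {
    return (uint32_t)((used_bytes * 100) / total_bytes);
}
```

With a 100 MB buffer that is 50% full, the 32-bit version reports a wildly wrong percentage while the 64-bit version reports 50.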
NDB Cluster APIs: The
nextEvent2() method continued to return exceptional events such as
TE_OUT_OF_MEMORY for event operations that had already been dropped. (Bug #78167, Bug #21673318)
NDB Cluster APIs: After the initial restart of a node following a cluster failure, the cluster failure event added as part of the restart process was deleted when an event that existed prior to the restart was later deleted. This meant that, in such cases, an Event API client had no way of knowing that failure handling was needed. In addition, the GCI used for the final cleanup of deleted event operations, performed by
nextEvent() and nextEvent2() when these methods have consumed all available events, was lost. (Bug #78143, Bug #21660947)
NDB Cluster APIs: The internal value representing the latest global checkpoint was not always updated when a completed epoch of event buffers was inserted into the event queue. This caused subsequent calls to
pollEvents2() to fail when trying to obtain the correct GCI for the events available in the event buffers. This could also result in later calls to
nextEvent2() seeing events that had not yet been discovered. (Bug #78129, Bug #21651536)
mysql_upgrade failed when performing an upgrade from MySQL NDB Cluster 7.2 to MySQL NDB Cluster 7.4. The root cause of this issue was an accidental duplication of code that caused the
ndbinfo_offline mode to be turned off too early, which in turn led a subsequent
CREATE VIEW statement to fail. (Bug #21841821)
ClusterMgr is an internal component of the NDB API and ndb_mgmd processes, part of
TransporterFacade (which in turn is a wrapper around the transporter registry) and shared with data nodes. This component is responsible for a number of tasks including connection setup requests; sending and monitoring of heartbeats; provision of node state information; handling of cluster disconnects and reconnects; and forwarding of cluster state indicators.
ClusterMgr maintains a count of live nodes, which is incremented on receiving a report that a node has connected (a
reportConnected() method call) and decremented on receiving a report that a node has disconnected (a
reportDisconnected() method call); both types of report originate in the
TransporterRegistry. This count is checked within
reportDisconnected() to verify that it is greater than zero.
The issue addressed here arose when node connections were very brief due to send buffer exhaustion (among other potential causes) and the check just described failed. This occurred because, when a node did not fully connect, it was still possible for the connection attempt to trigger a
reportDisconnected() call even though the connection had not yet been reported to
ClusterMgr; thus, the pairing of
reportConnected() and reportDisconnected() calls was not guaranteed, which could cause the count of connected nodes to be set to zero even though there remained nodes that were in fact still connected. This caused node crashes with debug builds of MySQL NDB Cluster, and potential errors or other adverse effects with release builds.
To fix this issue,
ClusterMgr::reportDisconnected() now verifies that a disconnected node had actually finished connecting completely before checking and decrementing the number of connected nodes. (Bug #21683144, Bug #22016081)
References: See also: Bug #21664515, Bug #21651400.
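The guard added by this fix can be modeled in a few lines. The class below is an invented, heavily simplified stand-in for ClusterMgr's bookkeeping, not its real interface; it shows only the invariant the fix restores, namely that a disconnect report for a node whose connection never completed must not touch the live-node count:

```cpp
#include <cassert>

// Simplified model: decrement the connected-node count only for nodes
// whose connection previously completed, so an aborted connection
// attempt that still triggers reportDisconnected() is ignored.
struct ClusterMgrModel {
    int connected_count = 0;
    bool fully_connected[256] = {false};  // indexed by node id

    void reportConnected(int node) {
        fully_connected[node] = true;
        ++connected_count;
    }

    void reportDisconnected(int node) {
        if (!fully_connected[node])
            return;                 // connection never completed: no-op
        fully_connected[node] = false;
        assert(connected_count > 0);
        --connected_count;
    }
};
```

Without the early return, an unpaired disconnect report could drive the count to zero (or, in debug builds, fire the assertion) while connected nodes remained.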
To reduce the possibility that a node's loopback transporter becomes disconnected from the transporter registry by
reportError() due to send buffer exhaustion (implemented by the fix for Bug #21651400), a portion of the send buffer is now reserved for the use of this transporter. (Bug #21664515, Bug #22016081)
References: See also: Bug #21651400, Bug #21683144.
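The reservation scheme described above can be sketched as a shared pool with a slice that only the loopback transporter may consume. This is an invented model, not the actual send buffer allocator; sizes and names are illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: reserve part of the shared send buffer for the loopback
// transporter, so ordinary traffic exhausting the pool cannot starve
// a node's signals to itself.
struct SendBufferPool {
    uint64_t total;     // total send buffer bytes
    uint64_t reserved;  // bytes set aside for the loopback transporter
    uint64_t used = 0;

    bool alloc(uint64_t bytes, bool for_loopback) {
        // Non-loopback senders may not dip into the reserved slice.
        uint64_t limit = for_loopback ? total : total - reserved;
        if (used + bytes > limit)
            return false;
        used += bytes;
        return true;
    }
};
```

Once ordinary senders have filled the pool up to the unreserved limit, further non-loopback allocations fail while loopback allocations still succeed.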
The loopback transporter is similar to the TCP transporter, but is used by a node to send signals to itself as part of many internal operations. Like the TCP transporter, it could be disconnected due to certain conditions including send buffer exhaustion, but this could result in blocking of
TransporterFacade and so cause multiple issues within an ndb_mgmd or API node process. To prevent this, a node whose loopback transporter becomes disconnected is now simply shut down, rather than allowing the node process to hang. (Bug #21651400, Bug #22016081)
References: See also: Bug #21683144, Bug #21664515.
An NdbEventBuffer object's active subscriptions count (
m_active_op_count) could be decremented more than once when stopping a subscription failed (for example, due to a busy server) and was retried. Decrementing of this count could also fail when communication with the data node failed, such as when a timeout occurred. (Bug #21616263)
References: This issue is a regression of: Bug #20575424, Bug #20561446.
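A common way to make such bookkeeping safe against retried stop attempts is to record, per operation, whether it has already been accounted for. The model below is invented for illustration (NdbEventBuffer's real bookkeeping is more involved); it shows only the idempotent-decrement idea:

```cpp
#include <cassert>

// Simplified model of guarding m_active_op_count against a retried
// (previously failed) subscription stop decrementing it twice.
struct EventBufferModel {
    int m_active_op_count = 0;

    struct Op { bool counted = false; };

    void subscribe(Op& op) {
        op.counted = true;
        ++m_active_op_count;
    }

    // May be invoked more than once for the same operation when a stop
    // fails and is retried; only the first call decrements the count.
    void on_stopped(Op& op) {
        if (!op.counted)
            return;             // already accounted for: no-op
        op.counted = false;
        --m_active_op_count;
    }
};
```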
In some cases, the management server daemon failed on startup without reporting the reason. Now when ndb_mgmd fails to start due to an error, the error message is printed to
stderr. (Bug #21571055)
In a MySQL NDB Cluster with multiple LDM instances, all instances wrote to the node log, even inactive instances on other nodes. During restarts, this caused the log to be filled with messages from other nodes, such as the messages shown here:
2015-06-24 00:20:16 [ndbd] INFO -- We are adjusting Max Disk Write Speed, a restart is ongoing now
2015-06-24 01:08:02 [ndbd] INFO -- We are adjusting Max Disk Write Speed, no restarts ongoing anymore
Now this logging is performed only by the active LDM instance. (Bug #21362380)
Backup block states were reported incorrectly during backups. (Bug #21360188)
References: See also: Bug #20204854, Bug #21372136.
Added the BackupDiskWriteSpeedPct data node parameter. Setting this parameter causes the data node to reserve a percentage of its maximum write speed (as determined by the value of
MaxDiskWriteSpeed) for use in local checkpoints while performing a backup.
BackupDiskWriteSpeedPct is interpreted as a percentage which can be set between 0 and 90 inclusive, with a default value of 50. (Bug #20204854)
References: See also: Bug #21372136.
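The arithmetic behind this parameter is a straightforward percentage split of the configured maximum write speed. The helper below is an illustrative sketch (names invented), assuming the reserved share goes to LCP writes and the remainder to the backup, as the entry above describes:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative split of MaxDiskWriteSpeed during a backup:
// BackupDiskWriteSpeedPct percent is reserved for local checkpoints,
// the remainder is available to the backup itself.
struct BackupSplit {
    uint64_t lcp_bytes_per_sec;
    uint64_t backup_bytes_per_sec;
};

inline BackupSplit split_write_speed(uint64_t max_disk_write_speed,
                                     uint32_t backup_disk_write_speed_pct) {
    // Parameter range is 0..90 inclusive, default 50.
    uint64_t lcp = max_disk_write_speed * backup_disk_write_speed_pct / 100;
    return { lcp, max_disk_write_speed - lcp };
}
```

For example, with MaxDiskWriteSpeed of 20 MB/s and the default of 50, each side gets 10 MB/s.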
When a data node is known by other nodes in the cluster to have been alive at a given global checkpoint, but its
sysfile reports a lower GCI, the higher GCI is used to determine which global checkpoint the data node can recreate. This caused problems when the data node being started had a clean file system (GCI = 0), or when it was more than one global checkpoint behind the other nodes.
Now in such cases a higher GCI known by other nodes is used only when it is at most one GCI ahead. (Bug #19633824)
References: See also: Bug #20334650, Bug #21899993. This issue is a regression of: Bug #29167.
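The corrected rule stated above can be expressed as a small selection function. This is my reading of the entry, reduced to an invented helper for illustration; the real restart logic in NDB considers much more state:

```cpp
#include <cassert>
#include <cstdint>

// Pick the GCI to restart from: trust a higher GCI known by other nodes
// only when it is at most one ahead of the node's own sysfile value;
// otherwise (clean file system with GCI = 0, or a node lagging several
// GCIs) fall back to the node's own value.
inline uint32_t restartable_gci(uint32_t own_gci, uint32_t known_by_others) {
    if (known_by_others > own_gci && known_by_others - own_gci <= 1)
        return known_by_others;
    return own_gci;
}
```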
When restoring a specific database or databases with the
--exclude-databases option, ndb_restore attempted to apply foreign keys on tables in databases which were not among those being restored. (Bug #18560951)
After restoring the database schema from backup using ndb_restore, auto-discovery of restored tables in transactions having multiple statements did not work correctly, resulting in Deadlock found when trying to get lock; try restarting transaction errors.
This issue was encountered both in the mysql client, as well as when such transactions were executed by application programs using Connector/J and possibly other MySQL APIs.
Prior to upgrading, this issue can be worked around by executing
SELECT TABLE_NAME, TABLE_SCHEMA FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE = 'NDBCLUSTER' on all SQL nodes following the restore operation, before executing any other statements. (Bug #18075170)
The inet_ntoa() function used internally in several mgmd threads was not POSIX thread-safe, which meant that the result it returned could sometimes be undefined. To avoid this problem, a thread-safe and platform-independent wrapper for
inet_ntop() is used in place of this function. (Bug #17766129)
Operations relating to global checkpoints in the internal event data buffer could sometimes leak memory. (Bug #78205, Bug #21689380)
References: See also: Bug #76165, Bug #20651661.
Trying to create an
NDB table with a composite foreign key referencing a composite primary key of the parent table failed when one of the columns in the composite foreign key was the table's primary key and that column also had a unique key. (Bug #78150, Bug #21664899)
When attempting to enable index statistics, creation of the required system tables, events, and event subscriptions often failed when multiple mysqld processes using index statistics were started concurrently in conjunction with starting, restarting, or stopping the cluster, or with node failure handling. This is normally recoverable, since the affected mysqld process or processes can (and do) retry these operations shortly thereafter. For this reason, such failures are no longer logged as warnings, but merely as informational events. (Bug #77760, Bug #21462846)
Adding a unique key to an
NDB table failed when the table already had a foreign key. Prior to upgrading, you can work around this issue by creating the unique key first, then adding the foreign key afterwards, using a separate
ALTER TABLE statement. (Bug #77457, Bug #20309828)