-
Incompatible Change: The changes listed here follow up and build further on work done in MySQL NDB Cluster 7.4.7 to improve handling of local checkpoints (LCPs) under conditions of insert overload:
-
Changes have been made in the minimum values for a number of parameters applying to data buffers for backups and LCPs. These parameters, listed here, can no longer be set so as to make the system impossible to run:
BackupDataBufferSize: minimum increased from 0 to 2M.
BackupLogBufferSize: minimum increased from 0 to 2M.
BackupWriteSize: minimum increased from 2K to 32K.
BackupMaxWriteSize: minimum increased from 2K to 256K.
In addition, the BackupMemory data node parameter is now deprecated and subject to removal in a future MySQL NDB Cluster version. Use BackupDataBufferSize and BackupLogBufferSize instead.
When a backup was unsuccessful due to insufficient resources, a subsequent retry worked only for those parts of the backup that worked in the same thread, since delayed signals are only supported in the same thread. Delayed signals are no longer sent to other threads in such cases.
An instance of an internal list object used in searching for queued scans was not actually destroyed before calls to functions that could manipulate the base object used to create it.
ACC scans were queued in the category of range scans, which could lead to starting an ACC scan when DBACC had no free slots for scans. We fix this by implementing a separate queue for ACC scans.
(Bug #76890, Bug #20981491, Bug #77597, Bug #21362758, Bug #77612, Bug #21370839)
References: See also: Bug #76742, Bug #20904721.
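The raised minimums for the backup buffer parameters listed above can be checked against a planned configuration with a small script. The following is only a sketch: the helper name find_too_small() and the dictionary layout are hypothetical and are not part of MySQL NDB Cluster, and it assumes that the K and M suffixes denote multiples of 1024.

```python
# Minimum values are taken from the release note above. The helper itself is
# hypothetical and is not part of MySQL NDB Cluster.
KIB = 1024          # assumption: "K" means 1024 bytes
MIB = 1024 * KIB    # assumption: "M" means 1024 * 1024 bytes

NEW_MINIMUMS = {
    "BackupDataBufferSize": 2 * MIB,
    "BackupLogBufferSize": 2 * MIB,
    "BackupWriteSize": 32 * KIB,
    "BackupMaxWriteSize": 256 * KIB,
}


def find_too_small(settings):
    """Return {name: (configured, minimum)} for values below the new minimums.

    Parameters that are absent from `settings` are skipped, since they
    fall back to their defaults.
    """
    problems = {}
    for name, minimum in NEW_MINIMUMS.items():
        value = settings.get(name)
        if value is not None and value < minimum:
            problems[name] = (value, minimum)
    return problems
```

For example, find_too_small({"BackupWriteSize": 2 * KIB}) reports BackupWriteSize, because 2K was allowed before this change but is now below the 32K minimum.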
-
When the --database option has not been specified for ndb_show_tables, and no tables are found in the TEST_DB database, an appropriate warning message is now issued. (Bug #50633, Bug #11758430)
Important Change; NDB Cluster APIs: The MGM API error-handling functions ndb_mgm_get_latest_error(), ndb_mgm_get_latest_error_msg(), and ndb_mgm_get_latest_error_desc() each failed when used with a NULL handle. You should note that, although these functions are now null-safe, values returned in this case are arbitrary and not meaningful. (Bug #78130, Bug #21651706)
-
Important Change: When ndb_restore was run without --disable-indexes or --rebuild-indexes on a table having a unique index, it was possible for rows to be restored in an order that resulted in duplicate values, causing the restore to fail with duplicate key errors. Running ndb_restore on such a table now requires using at least one of these options; failing to do so now results in an error. (Bug #57782, Bug #11764893)
References: See also: Bug #22329365, Bug #22345748.
-
NDB Cluster APIs: While executing dropEvent(), if the coordinator DBDICT failed after the subscription manager (SUMA block) had removed all subscriptions but before the coordinator had deleted the event from the system table, the dropped event remained in the table, causing any subsequent drop or create event with the same name to fail with NDB error 1419 Subscription already dropped or error 746 Event name already exists. This occurred even when calling dropEvent() with a nonzero force argument.
Now in such cases, error 1419 is ignored, and DBDICT deletes the event from the table. (Bug #21554676)
NDB Cluster APIs: If the total amount of memory allocated for the event buffer exceeded approximately 40 MB, the calculation of memory usage percentages could overflow during computation. This occurred because the associated routine used 32-bit arithmetic; it has now been changed to use Uint64 values instead. (Bug #78454, Bug #21847552)
NDB Cluster APIs: The nextEvent2() method continued to return exceptional events such as TE_EMPTY, TE_INCONSISTENT, and TE_OUT_OF_MEMORY for event operations which already had been dropped. (Bug #78167, Bug #21673318)
NDB Cluster APIs: After the initial restart of a node following a cluster failure, the cluster failure event added as part of the restart process was deleted when an event that existed prior to the restart was later deleted. This meant that, in such cases, an Event API client had no way of knowing that failure handling was needed. In addition, the GCI used for the final cleanup of deleted event operations, performed by pollEvents() and nextEvent() when these methods have consumed all available events, was lost. (Bug #78143, Bug #21660947)
NDB Cluster APIs: The internal value representing the latest global checkpoint was not always updated when a completed epoch of event buffers was inserted into the event queue. This caused subsequent calls to Ndb::pollEvents() and pollEvents2() to fail when trying to obtain the correct GCI for the events available in the event buffers. This could also result in later calls to nextEvent() or nextEvent2() seeing events that had not yet been discovered. (Bug #78129, Bug #21651536)
mysql_upgrade failed when performing an upgrade from MySQL NDB Cluster 7.2 to MySQL NDB Cluster 7.4. The root cause of this issue was an accidental duplication of code in mysql_fix_privilege_tables.sql that caused ndbinfo_offline mode to be turned off too early, which in turn caused a subsequent CREATE VIEW statement to fail. (Bug #21841821)
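The overflow in the event buffer memory usage percentage calculation noted above (Bug #78454) can be illustrated with a short worked example. This is a sketch only: the formula used * 100 / total and both function names are assumptions made for illustration, not the actual NDB routine. It shows why a limit of roughly 40 MB appears: a product used * 100 no longer fits in 32 bits once used exceeds 2**32 / 100, which is about 42.9 million bytes.

```python
UINT32_MASK = 0xFFFFFFFF            # emulates unsigned 32-bit wraparound
UINT64_MASK = 0xFFFFFFFFFFFFFFFF    # emulates unsigned 64-bit wraparound


def used_pct_32bit(used_bytes, total_bytes):
    """Percentage as a 32-bit routine would compute it (assumed formula).

    The intermediate product used_bytes * 100 is truncated to 32 bits.
    """
    product = (used_bytes * 100) & UINT32_MASK
    return product // total_bytes


def used_pct_64bit(used_bytes, total_bytes):
    """Same assumed formula with a 64-bit intermediate value."""
    product = (used_bytes * 100) & UINT64_MASK
    return product // total_bytes


MIB = 1024 * 1024
# 48 MiB used out of 64 MiB: the true value is 75 percent.
wrong = used_pct_32bit(48 * MIB, 64 * MIB)   # product wraps past 2**32
right = used_pct_64bit(48 * MIB, 64 * MIB)
```

Here wrong evaluates to 11 and right to 75. Below the threshold (for example, 42,949,672 bytes in use) both functions agree.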
ClusterMgr is an internal component of NDB API and ndb_mgmd processes, part of TransporterFacade (which in turn is a wrapper around the transporter registry) and shared with data nodes. This component is responsible for a number of tasks including connection setup requests; sending and monitoring of heartbeats; provision of node state information; handling of cluster disconnects and reconnects; and forwarding of cluster state indicators. ClusterMgr maintains a count of live nodes which is incremented on receiving a report of a node having connected (reportConnected() method call), and decremented on receiving a report that a node has disconnected (reportDisconnected()) from TransporterRegistry. This count is checked within reportDisconnected() to verify that it is greater than zero.
The issue addressed here arose when node connections were very brief due to send buffer exhaustion (among other potential causes) and the check just described failed. This occurred because, when a node did not fully connect, it was still possible for the connection attempt to trigger a reportDisconnected() call in spite of the fact that the connection had not yet been reported to ClusterMgr; thus, the pairing of reportConnected() and reportDisconnected() calls was not guaranteed, which could cause the count of connected nodes to be set to zero even though there remained nodes that were still in fact connected, causing node crashes with debug builds of MySQL NDB Cluster, and potential errors or other adverse effects with release builds.
To fix this issue, ClusterMgr::reportDisconnected() now verifies that a disconnected node had actually finished connecting completely before checking and decrementing the number of connected nodes. (Bug #21683144, Bug #22016081)
References: See also: Bug #21664515, Bug #21651400.
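The pairing problem described above, and the guard added by the fix, can be modeled in a few lines. This is a simplified sketch under stated assumptions: the class LiveNodeCounter, its attributes, and its method names are hypothetical and do not reproduce the ClusterMgr source; only the idea of checking that a node had fully connected before decrementing the count comes from the note.

```python
class LiveNodeCounter:
    """Hypothetical model of a connected-node count (not the NDB source)."""

    def __init__(self):
        self.live_nodes = 0
        self._connected = {}  # node ID -> True once fully connected

    def report_connected(self, node_id):
        """Record a fully established connection and count the node once."""
        if not self._connected.get(node_id, False):
            self._connected[node_id] = True
            self.live_nodes += 1

    def report_disconnected(self, node_id):
        """Handle a disconnect report; return True if the count changed.

        A report for a node that never finished connecting is ignored,
        so an unpaired disconnect cannot drive the count down.
        """
        if not self._connected.get(node_id, False):
            return False
        self._connected[node_id] = False
        assert self.live_nodes > 0  # the check that failed before the fix
        self.live_nodes -= 1
        return True
```

With one node connected, a disconnect report for a second node whose connection attempt never completed leaves live_nodes at 1, instead of decrementing it to zero while a node is still connected.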
-
To reduce the possibility that a node's loopback transporter becomes disconnected from the transporter registry by reportError() due to send buffer exhaustion (implemented by the fix for Bug #21651400), a portion of the send buffer is now reserved for the use of this transporter. (Bug #21664515, Bug #22016081)
References: See also: Bug #21651400, Bug #21683144.
-
The loopback transporter is similar to the TCP transporter, but is used by a node to send signals to itself as part of many internal operations. Like the TCP transporter, it could be disconnected due to certain conditions including send buffer exhaustion, but this could result in blocking of TransporterFacade and so cause multiple issues within an ndb_mgmd or API node process. To prevent this, a node whose loopback transporter becomes disconnected is now simply shut down, rather than allowing the node process to hang. (Bug #21651400, Bug #22016081)
References: See also: Bug #21683144, Bug #21664515.
-
The internal NdbEventBuffer object's active subscriptions count (m_active_op_count) could be decremented more than once when stopping a subscription, if this action failed (for example, due to a busy server) and was retried. Decrementing of this count could also fail when communication with the data node failed, such as when a timeout occurred. (Bug #21616263)
References: This issue is a regression of: Bug #20575424, Bug #20561446.
In some cases, the management server daemon failed on startup without reporting the reason. Now when ndb_mgmd fails to start due to an error, the error message is printed to stderr. (Bug #21571055)
-
In a MySQL NDB Cluster with multiple LDM instances, all instances wrote to the node log, even inactive instances on other nodes. During restarts, this caused the log to be filled with messages from other nodes, such as the messages shown here:
2015-06-24 00:20:16 [ndbd] INFO -- We are adjusting Max Disk Write Speed, a restart is ongoing now
...
2015-06-24 01:08:02 [ndbd] INFO -- We are adjusting Max Disk Write Speed, no restarts ongoing anymore
Now this logging is performed only by the active LDM instance. (Bug #21362380)
-
Backup block states were reported incorrectly during backups. (Bug #21360188)
References: See also: Bug #20204854, Bug #21372136.
-
Added the BackupDiskWriteSpeedPct data node parameter. Setting this parameter causes the data node to reserve a percentage of its maximum write speed (as determined by the value of MaxDiskWriteSpeed) for use in local checkpoints while performing a backup. BackupDiskWriteSpeedPct is interpreted as a percentage which can be set between 0 and 90 inclusive, with a default value of 50. (Bug #20204854)
References: See also: Bug #21372136.
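The effect of BackupDiskWriteSpeedPct can be sketched as simple arithmetic. The function below is hypothetical and is not taken from the data node code; in particular, treating everything outside the reserved LCP share as available to the backup is an assumption made for illustration. Only the 0 to 90 range and the default of 50 come from the note above.

```python
def split_write_speed(max_disk_write_speed, backup_disk_write_speed_pct=50):
    """Return (lcp_share, backup_share) in the same unit as the input.

    lcp_share is the portion of the maximum write speed reserved for
    local checkpoints while a backup runs; backup_share is the remainder
    (assumed here to be what the backup may use).
    """
    if not 0 <= backup_disk_write_speed_pct <= 90:
        raise ValueError("BackupDiskWriteSpeedPct must be between 0 and 90")
    lcp_share = max_disk_write_speed * backup_disk_write_speed_pct // 100
    return lcp_share, max_disk_write_speed - lcp_share
```

Under these assumptions, a maximum write speed of 20 MiB per second with the default setting of 50 leaves 10 MiB per second reserved for LCPs during a backup.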
-
When a data node is known to have been alive by other nodes in the cluster at a given global checkpoint, but its sysfile reports a lower GCI, the higher GCI is used to determine which global checkpoint the data node can recreate. This caused problems when the data node being started had a clean file system (GCI = 0), or when it was more than one global checkpoint behind the other nodes.
Now in such cases a higher GCI known by other nodes is used only when it is at most one GCI ahead. (Bug #19633824)
References: See also: Bug #20334650, Bug #21899993. This issue is a regression of: Bug #29167.
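The revised rule for choosing the GCI of a starting data node can be written as a small decision function. This is a sketch of the rule as stated in the note only: the function and argument names are hypothetical, and the actual restart logic in the data node involves more state than is modeled here.

```python
def restorable_gci(sysfile_gci, cluster_known_gci):
    """Pick the GCI a starting data node is treated as able to recreate.

    sysfile_gci:       GCI recorded in the node's own sysfile (0 when the
                       file system is clean).
    cluster_known_gci: highest GCI at which other nodes know this node
                       to have been alive.
    """
    ahead_by = cluster_known_gci - sysfile_gci
    if 0 < ahead_by <= 1:
        # Other nodes are at most one GCI ahead: trust the higher value.
        return cluster_known_gci
    # Clean file system, or more than one GCI behind: use the sysfile value.
    return sysfile_gci
```

With this rule, a node started on a clean file system (sysfile GCI 0) is no longer assumed to be able to recreate a recent global checkpoint merely because other nodes remember it as having been alive then.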
When restoring a specific database or databases with the --include-databases or --exclude-databases option, ndb_restore attempted to apply foreign keys on tables in databases which were not among those being restored. (Bug #18560951)
-
After restoring the database schema from backup using ndb_restore, auto-discovery of restored tables in transactions having multiple statements did not work correctly, resulting in Deadlock found when trying to get lock; try restarting transaction errors.
This issue was encountered both in the mysql client, as well as when such transactions were executed by application programs using Connector/J and possibly other MySQL APIs.
Prior to upgrading, this issue can be worked around by executing SELECT TABLE_NAME, TABLE_SCHEMA FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE = 'NDBCLUSTER' on all SQL nodes following the restore operation, before executing any other statements. (Bug #18075170)
The inet_ntoa() function used internally in several mgmd threads was not POSIX thread-safe, which meant that the result it returned could sometimes be undefined. To avoid this problem, a thread-safe and platform-independent wrapper for inet_ntop() is used to take the place of this function. (Bug #17766129)
ndb_desc used with the --extra-partition-info and --blob-info options failed when run against a table containing one or more TINYBLOB columns. (Bug #14695968)
-
Operations relating to global checkpoints in the internal event data buffer could sometimes leak memory. (Bug #78205, Bug #21689380)
References: See also: Bug #76165, Bug #20651661.
Trying to create an NDB table with a composite foreign key referencing a composite primary key of the parent table failed when one of the columns in the composite foreign key was the table's primary key and in addition this column also had a unique key. (Bug #78150, Bug #21664899)
When attempting to enable index statistics, creation of the required system tables, events and event subscriptions often fails when multiple mysqld processes using index statistics are started concurrently in conjunction with starting, restarting, or stopping the cluster, or with node failure handling. This is normally recoverable, since the affected mysqld process or processes can (and do) retry these operations shortly thereafter. For this reason, such failures are no longer logged as warnings, but merely as informational events. (Bug #77760, Bug #21462846)
Adding a unique key to an NDB table failed when the table already had a foreign key. Prior to upgrading, you can work around this issue by creating the unique key first, then adding the foreign key afterwards, using a separate ALTER TABLE statement. (Bug #77457, Bug #20309828)