Performance: A number of performance and other improvements have been made with regard to node starts and restarts. The following list contains a brief description of each of these changes:
Before memory allocated on startup can be used, it must be touched, causing the operating system to allocate the actual physical memory needed. The process of touching each page of allocated memory is now multithreaded; with 16 threads, touch times are on the order of 3 times shorter than with a single thread.
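The idea of dividing page touching among several threads can be sketched as follows. This is a minimal illustration in Python, not the NDB implementation: the function names, the 256-page buffer, and the contiguous-range partitioning are assumptions made for the example, and Python threads do not reproduce the speedup reported above.

```python
import mmap
import threading

PAGE_SIZE = mmap.PAGESIZE  # page size reported by the operating system


def touch_pages(buf, first_page, last_page, counts, slot):
    """Write one byte into each page in [first_page, last_page)."""
    touched = 0
    for page in range(first_page, last_page):
        buf[page * PAGE_SIZE] = 0  # the write makes the OS back the page
        touched += 1
    counts[slot] = touched


def touch_in_parallel(buf, num_threads):
    """Split the buffer's pages into contiguous ranges, one per thread."""
    total_pages = len(buf) // PAGE_SIZE
    per_thread = -(-total_pages // num_threads)  # ceiling division
    counts = [0] * num_threads
    threads = []
    for slot in range(num_threads):
        first = min(slot * per_thread, total_pages)
        last = min(first + per_thread, total_pages)
        t = threading.Thread(
            target=touch_pages, args=(buf, first, last, counts, slot)
        )
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return sum(counts)


# Anonymous mapping of 256 pages (size chosen only for illustration).
buf = mmap.mmap(-1, 256 * PAGE_SIZE)
pages_touched = touch_in_parallel(buf, num_threads=16)
```

Each thread works on its own page range, so no locking is needed while touching.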
When performing a node or system restart, it is necessary to restore local checkpoints for the fragments. This process previously used delayed signals at a point which was found to be critical to performance; these have now been replaced with normal (undelayed) signals, which should significantly shorten the time required to back up a MySQL NDB Cluster or to restore it from backup.
Previously, there could be at most 2 LDM instances active with local checkpoints at any given time. Now, up to 16 LDMs can be used for performing this task, which increases utilization of available CPU power, and can speed up LCPs by a factor of 10, which in turn can greatly improve restart times.
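Capping the number of instances that checkpoint at the same time can be illustrated with a bounded worker pool. This is a rough Python sketch under stated assumptions: `checkpoint_instance` is a hypothetical stand-in for the real work, and only the limit of 16 comes from the text above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_LCP = 16  # limit taken from the text above


class ConcurrencyTracker:
    """Records the highest number of workers active at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.active -= 1
        return False


def checkpoint_instance(instance_id, tracker):
    """Hypothetical stand-in for writing one instance's checkpoint."""
    with tracker:
        return instance_id


def run_checkpoints(num_instances, max_parallel=MAX_PARALLEL_LCP):
    """Run all checkpoints, with at most max_parallel active at once."""
    tracker = ConcurrencyTracker()
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        done = list(
            pool.map(lambda i: checkpoint_instance(i, tracker),
                     range(num_instances))
        )
    return done, tracker.peak


done, peak = run_checkpoints(num_instances=32)
```

The pool size bounds the peak concurrency regardless of how many instances are queued.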
Better reporting of disk writes and increased control over these also make up a large part of this work. New ndbinfo tables, including disk_write_speed_aggregate_node, provide information about the speed of disk writes for each LDM thread that is in use. Configuration parameters including DiskCheckpointSpeedInRestart have been deprecated, and are subject to removal in a future MySQL NDB Cluster version. This release adds data node configuration parameters, including MaxDiskWriteSpeedOwnRestart, to control write speeds for LCPs and backups when the present node, another node, or no node is currently restarting.
For more information, see the descriptions of the ndbinfo tables and MySQL NDB Cluster configuration parameters named previously.
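The three restart situations described above (this node restarting, another node restarting, no node restarting) suggest a simple selection rule for the write speed limit in effect. The Python sketch below is illustrative only: the field names, the numeric values, and the rule that the node's own restart takes precedence are assumptions, not documented NDB behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSpeedLimits:
    """Hypothetical limits; names and values are illustrative only."""
    no_restart: int
    other_node_restart: int
    own_restart: int


def select_limit(limits, own_restarting, other_restarting):
    """Pick the limit that applies to the current restart situation."""
    # Assumption: the node's own restart takes precedence.
    if own_restarting:
        return limits.own_restart
    if other_restarting:
        return limits.other_node_restart
    return limits.no_restart


# Arbitrary example values in megabytes per second.
limits = WriteSpeedLimits(no_restart=20, other_node_restart=50, own_restart=200)
current = select_limit(limits, own_restarting=False, other_restarting=True)
```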
Reporting of MySQL NDB Cluster start phases has been improved, with more frequent printouts. New and better information about the start phases and their implementation has also been provided in the sources and documentation. See Summary of NDB Cluster Start Phases.
Performance: Several internal methods relating to the NDB receive thread have been optimized to make mysqld more efficient in processing SQL applications with the NDB storage engine. In particular, this work improves the performance of the NdbReceiver::execTRANSID_AI() method, which is commonly used to receive a record from the data nodes as part of a scan operation. (Since the receiver thread sometimes has to process millions of received records per second, it is critical that this method does not perform unnecessary work, or tie up resources that are not strictly needed.) Associated internal functions, including the handleReceivedSignal() method, have also been improved and made more efficient.
Information about memory usage by individual fragments can now be obtained from the memory_per_fragment view added in this release to the ndbinfo information database. This information includes pages having fixed and variable element size, rows, fixed element free slots, variable element free bytes, and hash index memory usage. For more information, see The ndbinfo memory_per_fragment Table.
NDB Cluster APIs: When an NDB API client application received a signal with an invalid block or signal number, NDB provided only a very brief error message that did not accurately convey the nature of the problem. Now in such cases, appropriate printouts are provided when a bad signal or message is detected. In addition, the message length is now checked to make certain that it matches the size of the embedded signal. (Bug #18426180)
In some cases, transporter receive buffers were reset by one thread while being read by another. This happened when a race condition occurred between a thread receiving data and another thread initiating disconnect of the transporter (disconnection clears this buffer). Concurrency logic has now been implemented to keep this race from taking place. (Bug #19552283, Bug #73790)
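The kind of race described above, and one common way to prevent it, can be sketched with a buffer whose read and reset operations share a lock. This Python sketch is a generic illustration; the class and its methods are hypothetical and do not reflect the actual transporter code or the specific concurrency logic that was implemented.

```python
import threading


class ReceiveBuffer:
    """Hypothetical receive buffer guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = bytearray()
        self._connected = True

    def append(self, chunk):
        """Store received bytes; refuse them once disconnected."""
        with self._lock:
            if not self._connected:
                return False
            self._data += chunk
            return True

    def read_all(self):
        """Return and consume the buffered bytes atomically."""
        with self._lock:
            data = bytes(self._data)
            self._data.clear()
            return data

    def disconnect(self):
        """Clear the buffer; cannot interleave with a read in progress."""
        with self._lock:
            self._connected = False
            self._data.clear()


buf = ReceiveBuffer()
buf.append(b"signal-1")
first_read = buf.read_all()
buf.append(b"signal-2")
buf.disconnect()
after_disconnect = buf.read_all()
```

Because both the reader and the disconnecting thread take the same lock, a reset can happen only before or after a read, never in the middle of one.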
When a new data node started, API nodes were allowed to attempt to register themselves with the data node for executing transactions before the data node was ready. This forced the API node to wait an extra heartbeat interval before trying again.
To address this issue, a number of HA_ERR_NO_CONNECTION errors (Error 4009) that could be issued during this time have been changed to Cluster temporarily unavailable errors (Error 4035), which should allow API nodes to use new data nodes more quickly than before. As part of this fix, some errors which were incorrectly categorised have been moved into the correct categories, and some errors which are no longer used have been removed. (Bug #19524096, Bug #73758)
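Why a temporary error lets a client recover sooner can be shown with a small retry loop. The sketch below is written in Python and is not the NDB API: `ClusterError`, its `temporary` flag, and `run_with_retry` are hypothetical names, and treating Error 4035 as retryable is an assumption based on its message text. A real client would normally also wait between attempts.

```python
class ClusterError(Exception):
    """Hypothetical error type carrying a code and a temporary flag."""

    def __init__(self, code, temporary):
        super().__init__(f"NDB error {code}")
        self.code = code
        self.temporary = temporary


def run_with_retry(operation, max_attempts=3):
    """Retry only errors flagged as temporary; re-raise anything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(), attempt
        except ClusterError as err:
            if not err.temporary or attempt == max_attempts:
                raise


calls = []


def flaky_operation():
    """Fails once with a temporary error, then succeeds."""
    calls.append(1)
    if len(calls) == 1:
        raise ClusterError(4035, temporary=True)
    return "ok"


result, attempts_used = run_with_retry(flaky_operation)
```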
ALTER TABLE ... REORGANIZE PARTITION after increasing the number of data nodes in the cluster from 4 to 16 led to a crash of the data nodes. This issue was shown to be a regression caused by a previous fix which added a new dump handler using a dump code that was already in use (7019), which caused the command to execute two different handlers with different semantics. The new handler was assigned a new DUMP code (7024). (Bug #18550318)
References: This issue is a regression of: Bug #14220269.
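A collision of this kind can be caught at registration time if each code may be bound to only one handler. The following Python sketch is a generic illustration with a hypothetical `DumpRegistry` class and placeholder handlers; only the codes 7019 and 7024 come from the text above.

```python
class DumpRegistry:
    """Hypothetical dispatch table mapping dump codes to handlers."""

    def __init__(self):
        self._handlers = {}

    def register(self, code, handler):
        """Bind a handler to a code, rejecting codes already in use."""
        if code in self._handlers:
            raise ValueError(f"dump code {code} is already in use")
        self._handlers[code] = handler

    def dispatch(self, code):
        """Run the single handler bound to this code."""
        return self._handlers[code]()


registry = DumpRegistry()
registry.register(7019, lambda: "existing handler")

try:
    # Reusing a code that is already taken is rejected.
    registry.register(7019, lambda: "new handler")
    collision_detected = False
except ValueError:
    collision_detected = True

# The new handler gets its own code instead.
registry.register(7024, lambda: "new handler")
```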
When certain queries generated signals having more than 18 data words prior to a node failure, such signals were not written correctly in the trace file. (Bug #18419554)
Failure of multiple nodes while using ndbmtd with multiple TC threads was not handled gracefully under a moderate amount of traffic, which could in some cases lead to an unplanned shutdown of the cluster. (Bug #18069334)
For multithreaded data nodes, some threads do not communicate often, with the result that very old signals can remain at the top of the signal buffers. When performing a thread trace, the signal dumper calculated the latest signal ID from what it found in the signal buffers, which meant that these old signals could be erroneously counted as the newest ones. Now the signal ID counter is kept as part of the thread state, and it is this value that is used when dumping signals for trace files. (Bug #73842, Bug #19582807)
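Keeping the counter in the thread state, rather than inferring it from buffer contents, can be sketched as follows. This Python example is a simplified model under stated assumptions: the `ThreadState` class, the fixed-size buffer, and the age calculation are hypothetical and are not taken from the NDB sources.

```python
from collections import deque


class ThreadState:
    """Hypothetical per-thread state holding the signal ID counter."""

    def __init__(self, buffer_size):
        self.signal_id = 0  # authoritative counter, kept with the thread
        self.buffer = deque(maxlen=buffer_size)  # recent signals only

    def execute_signal(self, name):
        """Count the signal and remember it in the bounded buffer."""
        self.signal_id += 1
        self.buffer.append((self.signal_id, name))

    def dump(self):
        """List buffered signals with their age relative to the counter."""
        return [
            (sig_id, name, self.signal_id - sig_id)
            for sig_id, name in self.buffer
        ]


state = ThreadState(buffer_size=3)
for n in range(5):
    state.execute_signal(f"SIGNAL_{n}")

latest = state.signal_id
trace = state.dump()
```

The dump reads the latest ID from the state itself, so the age of each buffered signal does not depend on what happens to be in the buffer.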