New features and other important changes in NDB Cluster 7.6 which are likely to be of interest are shown in the following list:
New Disk Data table file format. A new file format is used in NDB 7.6 for NDB Disk Data tables, which makes it possible for each Disk Data table to be uniquely identified without reusing any table IDs. This should help resolve issues with page and extent handling that were visible to the user as problems with rapid creation and dropping of Disk Data tables, and which the old format provided no ready means to fix.
The new format is now used whenever new undo log file groups and tablespace data files are created. Files relating to existing Disk Data tables continue to use the old format until their tablespaces and undo log file groups are re-created.
Important: The old and new formats are not compatible; different data files or undo log files that are used by the same Disk Data table or tablespace cannot use a mix of formats.
To avoid problems relating to the changes in format, you should re-create any existing tablespaces and undo log file groups when upgrading to NDB 7.6. You can do this by performing an initial restart of each data node (that is, using the --initial option) as part of the upgrade process. You can expect this step to be made mandatory as part of upgrading from NDB 7.5 or an earlier release series to NDB 7.6 or later.
If you are using Disk Data tables, a downgrade from any NDB 7.6 release, without regard to release status, to any NDB 7.5 or earlier release requires that you restart all data nodes with --initial as part of the downgrade process. This is because NDB 7.5 and earlier release series are not able to read the new Disk Data file format.
For more information, see Section 3.7, “Upgrading and Downgrading NDB Cluster”.
Data memory pooling and dynamic index memory. Memory required for indexes on NDB table columns is now allocated dynamically from that allocated for DataMemory. For this reason, the IndexMemory configuration parameter is now deprecated, and subject to removal in a future release series.
Important: In NDB 7.6, if IndexMemory is set in the config.ini file, the management server issues the warning “IndexMemory is deprecated, use Number bytes on each ndbd(DB) node allocated for storing indexes instead” on startup, and any memory assigned to this parameter is automatically added to DataMemory.
In addition, the default value for DataMemory has been increased to 98M; the default for IndexMemory has been decreased to 0.
The pooling together of index memory with data memory simplifies the configuration of NDB; a further benefit of these changes is that scaling up by increasing the number of LDM threads is no longer limited by having set an insufficiently large value for IndexMemory. This is because index memory is no longer a static quantity which is allocated only once (when the cluster starts), but can now be allocated and deallocated as required. Previously, it was sometimes the case that increasing the number of LDM threads could lead to index memory exhaustion while large amounts of DataMemory remained available.
As part of this work, a number of instances of DataMemory usage not directly related to storage of table data now use transaction memory instead.
For this reason, it may be necessary on some systems to increase SharedGlobalMemory to allow transaction memory to increase when needed, such as when using NDB Cluster Replication, which requires a great deal of buffering on the data nodes. On systems performing initial bulk loads of data, it may be necessary to break up very large transactions into smaller parts.
In addition, data nodes now generate MemoryUsage events (see Section 6.3.2, “NDB Cluster Log Events”) and write appropriate messages in the cluster log when resource usage reaches 99%, as well as when it reaches 80%, 90%, or 100%, as before.
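The reporting thresholds just described can be illustrated with a small Python sketch. This is not NDB code: the function name is hypothetical, and the assumption that a report is produced once for each threshold passed as usage rises is made here only for illustration.

```python
# Usage levels (percent) at which data nodes report MemoryUsage events,
# as stated in the text: 80, 90, and 100 as before, plus the new 99.
REPORT_THRESHOLDS = (80, 90, 99, 100)


def thresholds_crossed(previous_pct, current_pct):
    """Return the thresholds passed when usage rises from previous_pct to current_pct.

    Assumption (not from the NDB source): one report per threshold crossed
    while usage is increasing.
    """
    return [t for t in REPORT_THRESHOLDS if previous_pct < t <= current_pct]


if __name__ == "__main__":
    print(thresholds_crossed(75.0, 91.5))  # [80, 90]
    print(thresholds_crossed(98.0, 99.2))  # [99]
```

A monitoring script could use a check of this kind to decide when to raise its own alerts ahead of the 100% level.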
Other related changes are listed here:
REPORT MEMORYUSAGE and other commands which expose memory consumption now show index memory consumption using 32K pages (previously these were 8K pages).
The ndbinfo.resources table now shows the DISK_OPERATIONS resource as TRANSACTION_MEMORY, and the RESERVED resource has been removed.
ndbinfo processes and config_nodes tables. NDB 7.6 adds two tables to the ndbinfo information database to provide information about cluster nodes; these tables are listed here:
config_nodes: This table provides the node ID, process type, and host name for each node listed in an NDB cluster's configuration file.
processes: This table shows information about nodes currently connected to the cluster; this information includes the process name and system process ID; for each data node and SQL node, it also shows the process ID of the node's angel process. In addition, the table shows a service address for each connected node; this address can be set in NDB API applications using the Ndb_cluster_connection::set_service_uri() method, which is also added in NDB 7.6.
System name. The system name of an NDB cluster can be used to identify a specific cluster. In NDB 7.6, the MySQL Server shows this name as the value of the Ndb_system_name status variable; NDB API applications can use the Ndb_cluster_connection::get_system_name() method, which is added in the same release.
A system name based on the time the management server was started is generated automatically; you can override this value by adding a [system] section to the cluster's configuration file and setting the Name parameter to a value of your choice in this section, prior to starting the management server.
ndb_import CSV import tool. ndb_import, added in NDB Cluster 7.6, loads CSV-formatted data directly into an NDB table using the NDB API (a MySQL server is needed only to create the table and database in which it is located). ndb_import can be regarded as an analog of mysqlimport or the LOAD DATA SQL statement, and supports many of the same or similar options for formatting of the data.
Assuming that the database and target NDB table exist, ndb_import needs only a connection to the cluster's management server (ndb_mgmd) to perform the importation; for this reason, there must be an [api] slot available to the tool in the cluster's config.ini file.
See Section 5.14, “ndb_import — Import CSV Data Into NDB”, for more information.
ndb_top monitoring tool. Added the ndb_top utility, which shows CPU load and usage information for an NDB data node in real time. This information can be displayed in text format, as an ASCII graph, or both. The graph can be shown in color, or using grayscale.
ndb_top connects to an NDB Cluster SQL node (that is, a MySQL Server). For this reason, the program must be able to connect as a MySQL user having the SELECT privilege on tables in the ndbinfo database.
ndb_top is available for Linux, Solaris, and macOS platforms, but is not currently available for Windows platforms.
For more information, see Section 5.30, “ndb_top — View CPU usage information for NDB threads”.
Code cleanup. A significant number of debugging statements and printouts not necessary for normal operations have been moved into code used only when testing or debugging NDB, or dispensed with altogether. This removal of overhead should result in a noticeable improvement in the performance of LDM and TC threads, on the order of 10% in many cases.
LDM thread and LCP improvements. Previously, when a local data management thread experienced I/O lag, it wrote to local checkpoints more slowly. This could happen, for example, during a disk overload condition. Problems could occur because other LDM threads did not always observe this state, or do likewise. NDB now tracks I/O lag mode globally, so that this state is reported as soon as at least one thread is writing in I/O lag mode; it then makes sure that the reduced write speed for this LCP is enforced for all LDM threads for the duration of the slowdown condition. Because the reduction in write speed is now observed by other LDM instances, overall capacity is increased; this enables the disk overload (or other condition inducing I/O lag) to be overcome more quickly in such cases than it was previously.
NDB error identification. Error messages and information can be obtained using the mysql client in NDB 7.6 from a new error_messages table in the ndbinfo information database. In addition, NDB 7.6 introduces a new command-line client ndb_perror for obtaining information from NDB error codes; this replaces using perror with --ndb, which is now deprecated and subject to removal in a future release.
For more information, see Section 6.14.21, “The ndbinfo error_messages Table”, and Section 5.17, “ndb_perror — Obtain NDB Error Message Information”.
SPJ improvements. When executing a scan as a pushed join (that is, the root of the query is a scan), the DBTC block sends an SPJ request to a DBSPJ instance on the same node as the fragment to be scanned. Formerly, one such request was sent for each of the node's fragments. As the number of DBSPJ instances is normally set less than the number of LDM instances, this means that all SPJ instances were involved in the execution of a single query, and, in fact, some SPJ instances could (and did) receive multiple requests from the same query. NDB 7.6 makes it possible for a single SPJ request to handle a set of root fragments to be scanned, so that only a single SPJ request (SCAN_FRAGREQ) needs to be sent to any given SPJ instance (DBSPJ block) on each node.
Since DBSPJ consumes a relatively small amount of the total CPU used when evaluating a pushed join, unlike the LDM block (which is responsible for the majority of the CPU usage), introducing multiple SPJ blocks adds some parallelism, but the additional overhead also increases. By enabling a single SPJ request to handle a set of root fragments to be scanned, such that only a single SPJ request is sent to each DBSPJ instance on each node and batch sizes are allocated per fragment, the multi-fragment scan can obtain a larger total batch size, allowing for some scheduling optimizations to be done within the SPJ block, which can scan a single fragment at a time (giving it the total batch size allocation), scan all fragments in parallel using smaller sub-batches, or some combination of the two.
This work is expected to increase performance of pushed-down joins for the following reasons:
Since multiple root fragments can be scanned for each SPJ request, it is necessary to request fewer SPJ instances when executing a pushed join.
Increased available batch size allocation, in total and for each fragment, should also in most cases result in fewer requests being needed to complete a join.
Improved O_DIRECT handling for redo logs. NDB 7.6 provides a new data node configuration parameter ODirectSyncFlag which causes completed redo log writes using O_DIRECT to be handled as fsync calls. ODirectSyncFlag is disabled by default; to enable it, set it to true.
You should bear in mind that the setting for this parameter is ignored when at least one of the following conditions is true:
ODirect is not enabled.
InitFragmentLogFiles is set to SPARSE.
Locking of CPUs to offline index build threads. In NDB 7.6, offline index builds by default use all cores available to ndbmtd, instead of being limited to the single core reserved for the I/O thread. It also becomes possible to specify a desired set of cores to be used for I/O threads performing offline multithreaded builds of ordered indexes. This can improve restart and restore times and performance, as well as availability.
Note: “Offline” as used here refers to an ordered index build that takes place while a given table is not being written to. Such index builds occur during a node or system restart, or when restoring a cluster from backup using ndb_restore.
This improvement involves several related changes. The first of these is to change the default value for the BuildIndexThreads configuration parameter (from 0 to 128), which means that offline ordered index builds are now multithreaded by default. The default value for TwoPassInitialNodeRestartCopy is also changed (from false to true), so that an initial node restart first copies all data without any creation of indexes from a “live” node to the node which is being started, builds the ordered indexes offline after the data has been copied, then again synchronizes with the live node; this can significantly reduce the time required for building indexes. In addition, to facilitate explicit locking of offline index build threads to specific CPUs, a new thread type (idxbld) is defined for the ThreadConfig configuration parameter.
As part of this work, NDB can now distinguish between execution thread types and other types of threads, and between types of threads which are permanently assigned to specific tasks, and those whose assignments are merely temporary.
NDB 7.6 also introduces the nosend parameter for ThreadConfig. By setting this to 1, you can keep a tc thread from assisting the send threads. This parameter is 0 by default, and cannot be used with I/O threads, send threads, index build threads, or watchdog threads.
For additional information, see the descriptions of the parameters.
Variable batch sizes for DDL bulk data operations. As part of work ongoing to optimize bulk DDL performance by ndbmtd, it is now possible to obtain performance improvements by increasing the batch size for the bulk data parts of DDL operations processing data using scans. Batch sizes are now made configurable for unique index builds, foreign key builds, and online reorganization, by setting the respective data node configuration parameters listed here:
For each of the parameters just listed, the default value is 64, the minimum is 16, and the maximum is 512.
Increasing the appropriate batch size or sizes can help amortize inter-thread and inter-node latencies and make use of more parallel resources (local and remote) to help scale DDL performance. In each case there can be a tradeoff with ongoing traffic.
Partial LCPs. NDB 7.6 implements partial local checkpoints. Formerly, an LCP always made a copy of the entire database. When working with terabytes of data this process could require a great deal of time, with an adverse impact on node and cluster restarts especially, as well as more space for the redo logs. It is now no longer strictly necessary for LCPs to do this—instead, an LCP now by default saves only a number of records that is based on the quantity of data changed since the previous LCP. This can vary between a full checkpoint and a checkpoint that changes nothing at all. In the event that the checkpoint reflects any changes, the minimum is to write one part of the 2048 making up a local LCP.
As part of this change, two new data node configuration parameters are introduced in this release:
EnablePartialLcp (default true, or enabled) enables partial LCPs.
RecoveryWork controls the percentage of space given over to LCPs; it increases with the amount of work which must be performed on LCPs during restarts as opposed to that performed during normal operations. Raising this value causes LCPs during normal operations to require writing fewer records and so decreases the usual workload. Raising this value also means that restarts can take longer.
You must disable partial LCPs explicitly by setting EnablePartialLcp=false. This uses the least amount of disk, but also tends to maximize the write load for LCPs. To optimize for the lowest workload on LCPs during normal operation, use RecoveryWork=100. To use the least disk space for partial LCPs, but with bounded writes, use RecoveryWork=25, which is the minimum for RecoveryWork. The default is RecoveryWork=50, which means LCP files require approximately 1.5 times DataMemory; using CompressedLcp=1, this can be further reduced by half. Recovery times using the default settings should also be much faster than when EnablePartialLcp is set to false.
The default value for RecoveryWork was increased from 50 to 60.
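The relationship between RecoveryWork and LCP disk usage stated above (about 1.5 times the data size at RecoveryWork=50) can be sketched in Python as follows. This is a rough estimate only. The function name is hypothetical, the upper bound of 100 is taken from the parameter's documented range rather than from the text above, and the halving for CompressedLcp=1 simply follows the "reduced by half" statement.

```python
def estimated_lcp_disk_bytes(data_bytes, recovery_work=50, compressed=False):
    """Approximate LCP file space as data size times (1 + RecoveryWork/100).

    recovery_work: percentage; 25 is the minimum given in the text, and
                   100 is assumed here to be the maximum.
    compressed:    True models CompressedLcp=1, which the text says can
                   reduce the requirement by half.
    """
    if not 25 <= recovery_work <= 100:
        raise ValueError("RecoveryWork must be between 25 and 100")
    estimate = data_bytes * (100 + recovery_work) / 100
    return estimate / 2 if compressed else estimate


if __name__ == "__main__":
    gib = 2 ** 30
    # 64 GiB of data with RecoveryWork=50 needs roughly 96 GiB for LCP files.
    print(estimated_lcp_disk_bytes(64 * gib, 50) / gib)
```

The same arithmetic gives a factor of 1.6 for RecoveryWork=60.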
In addition, the data node configuration parameters BackupDataBufferSize, BackupWriteSize, and BackupMaxWriteSize are all now deprecated, and subject to removal in a future release of MySQL NDB Cluster.
As part of this enhancement, work has been done to correct several issues with node restarts wherein it was possible to run out of undo log in various situations, most often when restoring a node that had been down for a long time during a period of intensive write activity.
Additional work was done to improve data node survival of long periods of synchronization without timing out, by updating the LCP watchdog during this process, and keeping better track of the progress of disk data synchronization. Previously, there was the possibility of spurious warnings or even node failures if synchronization took longer than the LCP watchdog timeout.
Important: When upgrading an NDB Cluster that uses disk data tables to NDB 7.6 or downgrading it from NDB 7.6, it is necessary to restart all data nodes with --initial.
Parallel undo log record processing. Formerly, the data node LGMAN kernel block processed undo log records serially; now this is done in parallel. The rep thread, which hands off undo records to LDM threads, waited for an LDM to finish applying a record before fetching the next one; now the rep thread no longer waits, but proceeds immediately to the next record and LDM.
A count of the number of outstanding log records for each LDM in LGMAN is kept, and decremented whenever an LDM has completed the execution of a record. All the records belonging to a page are sent to the same LDM thread but are not guaranteed to be processed in order, so a hash map of pages that have outstanding records maintains a queue for each of these pages. When the page is available in the page cache, all records pending in the queue are applied in order.
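The bookkeeping described above (an outstanding-record count plus a hash map holding one queue per page) can be pictured with the following Python sketch. It is a simplified model for illustration only, not the LGMAN implementation; the class and method names are hypothetical, and a single counter stands in for the per-LDM counts.

```python
from collections import defaultdict, deque


class PageQueues:
    """Illustrative model of per-page queues of pending undo records."""

    def __init__(self):
        self.pending = defaultdict(deque)  # page_id -> records in arrival order
        self.outstanding = 0               # records handed over, not yet applied
        self.applied = []                  # (page_id, record) in application order

    def enqueue(self, page_id, record):
        """Queue a record for a page that is not yet in the page cache."""
        self.pending[page_id].append(record)
        self.outstanding += 1

    def page_available(self, page_id):
        """Apply all records queued for page_id, oldest first."""
        queue = self.pending.pop(page_id, deque())
        while queue:
            self.applied.append((page_id, queue.popleft()))
            self.outstanding -= 1


if __name__ == "__main__":
    q = PageQueues()
    q.enqueue(7, "rec-a")
    q.enqueue(9, "rec-x")
    q.enqueue(7, "rec-b")
    q.page_available(7)
    print(q.applied, q.outstanding)
```

Keeping a separate queue per page is what preserves the order of records for any one page while records for different pages proceed independently.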
A few types of records continue to be processed serially:
There are no user-visible changes in functionality directly associated with this performance enhancement; it is part of work done to improve undo log handling in support of partial local checkpoints in NDB Cluster 7.6.
Reading table and fragment IDs from extent for undo log applier. When applying an undo log, it is necessary to obtain the table ID and fragment ID from the page ID. This was done previously by reading the page from the PGMAN kernel block using an extra PGMAN worker thread, but when applying the undo log it was necessary to read the page again.
When using O_DIRECT this was very inefficient since the page was not cached in the OS kernel. To correct this issue, mapping from page ID to table ID and fragment ID is now done using information from the extent header, which holds the table IDs and fragment IDs for the pages used within a given extent. The extent pages are always present in the page cache, so no extra reads from disk are required for performing the mapping. In addition, the information can already be read, using existing TSMAN kernel block data structures.
See the description of the ODirect data node configuration parameter, for more information.
Shared memory transporter. User-defined shared memory (SHM) connections between a data node and an API node on the same host computer are fully supported in NDB 7.6, and are no longer considered experimental. You can enable an explicit shared memory connection by setting the UseShm configuration parameter to 1 for the relevant data node. When explicitly defining shared memory as the connection method, it is also necessary that both the data node and the API node are identified by HostName.
Performance of SHM connections can be enhanced through setting parameters such as ShmSize, ShmSpintime, and SendBufferMemory in an [shm] or [shm default] section of the cluster configuration file (config.ini). Configuration of SHM is otherwise similar to that of the TCP transporter.
The SigNum parameter is not used in the new SHM implementation, and any settings made for it are now ignored. Section 4.3.12, “NDB Cluster Shared Memory Connections”, provides more information about these parameters. In addition, as part of this work, NDB code relating to the old SCI transporter has been removed.
For more information, see Section 4.3.12, “NDB Cluster Shared Memory Connections”.
SPJ block inner join optimization. In NDB 7.6, the SPJ kernel block can take into account when it is evaluating a join request in which at least some of the tables are INNER-joined. This means that it can eliminate requests for rows, ranges, or both as soon as it becomes known that one or more of the preceding requests did not return any results for a parent row. This saves both the data nodes and the SPJ block from having to handle requests and result rows which never take part in an INNER-joined result row.
Consider this join query, where pk is the primary key on tables t2, t3, and t4, and columns x, y, and z are nonindexed columns:
SELECT * FROM t1 JOIN t2 ON t2.pk = t1.x JOIN t3 ON t3.pk = t1.y JOIN t4 ON t4.pk = t1.z;
Previously, this resulted in an SPJ request including a scan on table t1, and lookups on each of the tables t2, t3, and t4; these were evaluated for every row returned from t1. For these, SPJ created LQHKEYREQ requests for tables t2, t3, and t4. Now SPJ takes into consideration the requirement that, to produce any result rows, an inner join must find a match in all tables joined; as soon as no matches are found for one of the tables, any further requests to tables having the same parent table or tables are now skipped.
Note: This optimization cannot be applied until all of the data nodes and all of the API nodes in the cluster have been upgraded to NDB 7.6.
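The effect of skipping lookups once an inner join can no longer match can be illustrated with a short Python sketch. It models the example query with in-memory dictionaries and issues lookups one after another; it is not how the SPJ block is implemented, and all names in it are hypothetical.

```python
def inner_join_lookups(t1_rows, lookup_tables):
    """Return (result_rows, lookups_issued) for t1 INNER JOINed to each lookup table.

    t1_rows:       list of dicts, one per row scanned from the parent table.
    lookup_tables: list of (key_column, {pk: row}) pairs, one per joined table.
    """
    results, issued = [], 0
    for parent in t1_rows:
        matches = []
        for key_column, table in lookup_tables:
            issued += 1
            child = table.get(parent[key_column])
            if child is None:
                break  # no match: the inner join yields nothing, skip the rest
            matches.append(child)
        else:
            # Reached only when every lookup found a match.
            results.append((parent, *matches))
    return results, issued


if __name__ == "__main__":
    t1 = [{"x": 1, "y": 1, "z": 1}, {"x": 2, "y": 1, "z": 1}]
    t2, t3, t4 = {1: "t2-row"}, {1: "t3-row"}, {1: "t4-row"}
    rows, issued = inner_join_lookups(t1, [("x", t2), ("y", t3), ("z", t4)])
    print(len(rows), issued)
```

In this toy data set the second t1 row has no match in t2, so its lookups on t3 and t4 are never issued; without the early exit, six lookups would be made instead of four.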
NDB wakeup thread. NDB uses a poll receiver to read from sockets, to execute messages from the sockets, and to wake up other threads. When making only intermittent use of a receive thread, poll ownership is given up before starting to wake up other threads, which provides some degree of parallelism in the receive thread, but, when making constant use of the receive thread, the thread can be overburdened by tasks including wakeup of other threads.
NDB 7.6 supports offloading by the receiver thread of the task of waking up other threads to a new thread that wakes up other threads on request (and otherwise simply sleeps), making it possible to improve the capacity of a single cluster connection by roughly ten to twenty percent.
Adaptive LCP control. NDB 7.6.7 implements an adaptive LCP control mechanism which acts in response to changes in redo log space usage. By controlling LCP disk write speed, you can help protect against a number of resource-related issues, including the following:
Insufficient CPU resources for traffic applications
Insufficient redo log buffer
GCP Stop conditions
Insufficient redo log space
Insufficient undo log space
This work includes the following changes relating to NDB configuration parameters:
The default value of the RecoveryWork data node parameter is increased from 50 to 60; that is, NDB now uses 1.6 times the size of the data for storage of LCPs.
A new data node configuration parameter InsertRecoveryWork provides additional tuning capabilities through controlling the percentage of RecoveryWork that is reserved for insert operations. The default value is 40 (that is, 40% of the storage space already reserved by RecoveryWork); the minimum and maximum are 0 and 70, respectively. Increasing this value allows for more writes to be performed during an LCP, while limiting the total size of the LCP. Decreasing InsertRecoveryWork limits the number of writes used during an LCP, but results in more space being used for the LCP, which means that recovery takes longer.
This work implements control of LCP speed chiefly to minimize the risk of running out of redo log. This is done in adaptive fashion, based on the amount of redo log space used, using the alert levels shown here, together with the responses taken when these levels are attained:
Low: Redo log space usage is greater than 25%, or estimated usage shows insufficient redo log space at a very high transaction rate. In response, use of LCP data buffers is increased during LCP scans, priority of LCP scans is increased, and the amount of data that can be written per real-time break in an LCP scan is also increased.
High: Redo log space usage is greater than 40%, or estimated usage shows insufficient redo log space at a high transaction rate. When this level of usage is reached, MaxDiskWriteSpeed is increased to the value of MaxDiskWriteSpeedOtherNodeRestart. In addition, the minimum speed is doubled, and priority of LCP scans and what can be written per real-time break are both increased further.
Critical: Redo log space usage is greater than 60%, or estimated usage shows insufficient redo log space at a normal transaction rate. At this level, MaxDiskWriteSpeed is increased to the value of MaxDiskWriteSpeedOwnRestart; MinDiskWriteSpeed is also set to this value. Priority of LCP scans and the amount of data that can be written per real-time break are increased further, and the LCP data buffer is completely available during the LCP scan.
Raising the level also has the effect of increasing the calculated target checkpoint speed.
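The usage-based part of these alert levels can be expressed as a simple classification, sketched here in Python. The function name and the "none" label are hypothetical, and the sketch deliberately ignores the second trigger for each level (the estimate of whether redo log space will run out at a given transaction rate), which depends on internal calculations not described here.

```python
def redo_alert_level(redo_used_pct):
    """Map redo log space usage (percent) to the alert level named in the text.

    Only the fixed usage thresholds (25%, 40%, 60%) are modeled; the
    estimate-based conditions are not.
    """
    if redo_used_pct > 60:
        return "critical"
    if redo_used_pct > 40:
        return "high"
    if redo_used_pct > 25:
        return "low"
    return "none"


if __name__ == "__main__":
    for pct in (10, 30, 45, 75):
        print(pct, redo_alert_level(pct))
```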
LCP control has the following benefits for NDB installations:
Clusters should now survive very heavy loads using default configurations much better than previously.
It should now be possible for NDB to run reliably on systems where the available disk space is (at a rough minimum) 2.1 times the amount of memory allocated to it (DataMemory). You should note that this figure does not include any disk space used for Disk Data tables.
Restoring by slices. Beginning with NDB 7.6.13, it is possible to divide a backup into roughly equal portions (slices) and to restore these slices in parallel using two new options implemented for ndb_restore:
This makes it possible to employ multiple instances of ndb_restore to restore subsets of the backup in parallel, potentially reducing the amount of time required to perform the restore operation.
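As an illustration, the following Python sketch builds one ndb_restore command line per slice. It assumes that the two options referred to above are --num-slices and --slice-id (names taken from the ndb_restore documentation, not from the text above) and that slice IDs run from 0 to one less than the number of slices. The helper name, backup ID, node ID, and path are hypothetical placeholders.

```python
import shlex


def slice_commands(num_slices, backup_id, node_id, backup_path):
    """Return one ndb_restore command line for each slice of a backup.

    Assumes --num-slices / --slice-id, with slice IDs 0 .. num_slices - 1.
    """
    commands = []
    for slice_id in range(num_slices):
        argv = [
            "ndb_restore",
            f"--nodeid={node_id}",
            f"--backupid={backup_id}",
            f"--backup-path={backup_path}",
            "--restore-data",
            f"--num-slices={num_slices}",
            f"--slice-id={slice_id}",
        ]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # Placeholder values; substitute those of the backup being restored.
    for cmd in slice_commands(4, backup_id=1, node_id=3,
                              backup_path="/backups/BACKUP-1"):
        print(cmd)
```

Each generated command could then be run as a separate ndb_restore process, so that the slices are restored concurrently.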
ndb_restore: primary key schema changes. NDB 7.6.14 (and later) supports different primary key definitions for source and target tables when restoring an NDB native backup with ndb_restore when it is run with the --allow-pk-changes option. Both increasing and decreasing the number of columns making up the original primary key are supported.
When the primary key is extended with an additional column or columns, any columns added must be defined as NOT NULL, and no values in any such columns may be changed during the time that the backup is being taken. Because some applications set all column values in a row when updating it, whether or not all values are actually changed, this can cause a restore operation to fail even if no values in the column to be added to the primary key have changed. You can override this behavior using the --ignore-extended-pk-updates option also added in NDB 7.6.14; in this case, you must ensure that no such values are changed.
A column can be removed from the table's primary key whether or not this column remains part of the table.
ndb_blob_tool enhancements. Beginning with NDB 7.6.14, the ndb_blob_tool utility can detect missing blob parts for which inline parts exist and replace these with placeholder blob parts (consisting of space characters) of the correct length. To check whether there are missing blob parts, use the --check-missing option with this program. To replace any missing blob parts with placeholders, use the --add-missing option.
For more information, see Section 5.6, “ndb_blob_tool — Check and Repair BLOB and TEXT columns of NDB Cluster Tables”.
Merging backups with ndb_restore. In some cases, it may be desirable to consolidate data originally stored in different instances of NDB Cluster (all using the same schema) into a single target NDB Cluster. This is now supported when using backups created in the ndb_mgm client (see Section 6.8.2, “Using The NDB Cluster Management Client to Create a Backup”) and restoring them with ndb_restore, using the --remap-column option added in NDB 7.6.14 along with --restore-data (and possibly additional compatible options as needed or desired). --remap-column can be employed to handle cases in which primary and unique key values are overlapping between source clusters, and it is necessary that they do not overlap in the target cluster, as well as to preserve other relationships between tables such as foreign keys.
--remap-column takes as its argument a string having the format db.tbl.col:fn:args, where db, tbl, and col are, respectively, the names of the database, table, and column, fn is the name of a remapping function, and args is one or more arguments to fn. There is no default value. Only offset is supported as the function name, with args as the integer offset to be applied to the value of the column when inserting it into the target table from the backup. This column must be one of INT or BIGINT; the allowed range of the offset value is the same as the signed version of that type (this allows the offset to be negative if desired).
The new option can be used multiple times in the same invocation of ndb_restore, so that you can remap to new values multiple columns of the same table, different tables, or both. The offset value does not have to be the same for all instances of the option.
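To make the shape of the --remap-column argument concrete, the following Python sketch parses a specification and applies the offset to a sample value. It assumes the argument format db.tbl.col:fn:args as given in the ndb_restore documentation. The function names and the example database, table, and column are hypothetical, and the code only mimics the arithmetic; it does not reproduce how ndb_restore itself handles the option.

```python
def parse_remap(spec):
    """Split a 'db.tbl.col:fn:args' string into (db, tbl, col, offset).

    Only the 'offset' function is accepted, matching the text above.
    """
    target, fn, args = spec.split(":", 2)
    db, tbl, col = target.split(".")
    if fn != "offset":
        raise ValueError(f"unsupported remapping function: {fn}")
    return db, tbl, col, int(args)


def apply_offset(value, offset):
    """Return the column value as it would be inserted into the target table."""
    return value + offset


if __name__ == "__main__":
    # Hypothetical example: shift primary key values from a second source
    # cluster by 1000 so that they do not collide with those of the first.
    db, tbl, col, offset = parse_remap("hr.employee.id:offset:1000")
    print(db, tbl, col, apply_offset(5, offset))
```

Choosing a different offset for each source cluster keeps the remapped key ranges from overlapping in the target cluster.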
In addition, two new options are provided for ndb_desc, also beginning in NDB 7.6.14:
For more information and examples, see the description of the --remap-column option.
--ndb-log-fail-terminate option. Beginning with NDB 7.6.14, you can cause the SQL node to terminate whenever it is unable to log all row events fully. This can be done by starting mysqld with the --ndb-log-fail-terminate option.
NDB programs: NDBT dependency removal. The dependency of a number of NDB utility programs on the NDBT library has been removed. This library is used internally for development, and is not required for normal use; its inclusion in these programs could lead to unwanted issues when testing.
Affected programs are listed here, along with the NDB versions in which the dependency was removed:
The principal effect of this change for users is that these programs no longer print NDBT_ProgramExit - following completion of a run. Applications that depend upon such behavior should be updated to reflect the change when upgrading to the indicated versions.
Auto-Installer deprecation and removal. The MySQL NDB Cluster Auto-Installer web-based installation tool (ndb_setup.py) is deprecated in NDB 7.6.16, and is removed in NDB 7.6.17 and later. It is no longer supported.
ndbmemcache deprecation and removal. ndbmemcache is no longer supported. ndbmemcache was deprecated in NDB 7.6.16, and removed in NDB 7.6.17.
Node.js support removed. Beginning with the NDB Cluster 7.6.16 release, support for Node.js by NDB 7.6 has been removed.
Support for Node.js by NDB Cluster is maintained in NDB 8.0 only.
Conversion between NULL and NOT NULL during restore operations. Beginning with NDB 7.6.19, ndb_restore can support restoring of NULL columns as NOT NULL and the reverse, using the options listed here:
For more information, see the descriptions of the indicated ndb_restore options.