MySQL :: MySQL 9.7 Reference Manual :: 25.2.3 NDB Cluster Hardware, Software, and Networking Requirements

MySQL 9.7 Reference Manual / ... / NDB Cluster Hardware, Software, and Networking Requirements

25.2.3 NDB Cluster Hardware, Software, and Networking Requirements

One of the strengths of NDB Cluster is that it can be run on commodity hardware and has no unusual requirements in this regard, other than for large amounts of RAM, due to the fact that all live data storage is done in memory. (It is possible to reduce this requirement using Disk Data tables—see Section 25.6.11, “NDB Cluster Disk Data Tables”, for more information about these.) You can obtain information about memory usage by data nodes by viewing the ndbinfo.memoryusage table, or the output of the REPORT MemoryUsage command in the ndb_mgm client. For information about memory used by NDB tables, you can query the ndbinfo.memory_per_fragment table.

Increasing the number of CPUs, using faster CPUs, or both, on the computers hosting data nodes can generally be expected to enhance the performance of NDB Cluster. Memory requirements for cluster processes other than the data nodes are relatively small.

The software requirements for NDB Cluster are also modest. Host operating systems do not require any unusual modules, services, applications, or configuration to support NDB Cluster. For supported operating systems, a standard installation should be sufficient. The MySQL software requirements are simple: all that is needed is a production release of NDB Cluster. It is not strictly necessary to compile MySQL yourself merely to be able to use NDB Cluster. We assume that you are using the binaries appropriate to your platform, available from the NDB Cluster software downloads page at https://dev.mysql.com/downloads/cluster/.

For communication between nodes, NDB Cluster supports TCP/IP networking in any standard topology, and the minimum expected for each host is a standard 100 Mbps Ethernet card, plus a switch, hub, or router to provide network connectivity for the cluster as a whole.

We strongly recommend that an NDB Cluster be run on its own subnet which is not shared with machines not forming part of the cluster; using a private or protected network allows the cluster to make exclusive use of bandwidth between cluster hosts. Using a separate switch for your NDB Cluster installation not only helps protect against unauthorized access to data stored in the cluster, but also ensures that cluster nodes are shielded from interference caused by transmissions between other computers on the network. For enhanced reliability, you can use dual switches and dual cards to remove the network as a single point of failure; many device drivers support failover for such communication links.

NDB supports encrypted live and backup files and file systems, as discussed in Section 25.6.19.4, “File System Encryption for NDB Cluster”. Section 25.6.19.5, “TLS Link Encryption for NDB Cluster”, provides information about enabling support for encrypted connections between nodes. Encrypted backups can be read by many NDB command-line programs including ndb_restore, ndbxfrm, ndb_print_backup_file, and ndb_mgm. See Section 25.6.8.2, “Using The NDB Cluster Management Client to Create a Backup”, for more information about creating encypted backups.

NDB Cluster also supports encrypted network connections between nodes; see Section 25.6.19.4, “File System Encryption for NDB Cluster”, for details.

Network communication and latency. NDB Cluster requires communication between data nodes and API nodes (including SQL nodes), as well as between data nodes and other data nodes, to execute queries and updates. Communication latency between these processes can directly affect the observed performance and latency of user queries. In addition, to maintain consistency and service despite the silent failure of nodes, NDB Cluster uses heartbeating and timeout mechanisms which treat an extended loss of communication from a node as node failure. This can lead to reduced redundancy. Recall that, to maintain data consistency, an NDB Cluster shuts down when the last node in a node group fails. Thus, to avoid increasing the risk of a forced shutdown, breaks in communication between nodes should be avoided wherever possible.

The failure of a data or API node results in the abort of all uncommitted transactions involving the failed node. Data node recovery requires synchronization of the failed node's data from a surviving data node, and re-establishment of disk-based redo and checkpoint logs, before the data node returns to service. This recovery can take some time, during which the Cluster operates with reduced redundancy.

Heartbeating relies on timely generation of heartbeat signals by all nodes. This may not be possible if the node is overloaded, has insufficient machine CPU due to sharing with other programs, or is experiencing delays due to swapping. If heartbeat generation is sufficiently delayed, other nodes treat the node that is slow to respond as failed.

This treatment of a slow node as a failed one may or may not be desirable in some circumstances, depending on the impact of the node's slowed operation on the rest of the cluster. When setting timeout values such as HeartbeatIntervalDbDb and HeartbeatIntervalDbApi for NDB Cluster, care must be taken care to achieve quick detection, failover, and return to service, while avoiding potentially expensive false positives.

Where communication latencies between data nodes are expected to be higher than would be expected in a LAN environment (on the order of 100 µs), timeout parameters must be increased to ensure that any allowed periods of latency periods are well within configured timeouts. Increasing timeouts in this way has a corresponding effect on the worst-case time to detect failure and therefore time to service recovery.

LAN environments can typically be configured with stable low latency, and such that they can provide redundancy with fast failover. Individual link failures can be recovered from with minimal and controlled latency visible at the TCP level (where NDB Cluster normally operates). WAN environments may offer a range of latencies, as well as redundancy with slower failover times. Individual link failures may require route changes to propagate before end-to-end connectivity is restored. At the TCP level this can appear as large latencies on individual channels. The worst-case observed TCP latency in these scenarios is related to the worst-case time for the IP layer to reroute around the failures.

PREV HOME UP NEXT

PDF (US Ltr) - 41.8Mb
PDF (A4) - 41.9Mb
Man Pages (TGZ) - 272.4Kb
Man Pages (Zip) - 378.4Kb
Info (Gzip) - 4.2Mb
Info (Zip) - 4.2Mb