NDBCLUSTER storage engine can be
configured with a range of failover and load-balancing options,
but it is easiest to start with the storage engine at the cluster
level. NDB Cluster's
engine contains a complete set of data, dependent only on other
data within the cluster itself.
The “Cluster” portion of NDB Cluster is configured independently of the MySQL servers. In an NDB Cluster, each part of the cluster is considered to be a node.
In many contexts, the term “node” is used to indicate a computer, but when discussing NDB Cluster it means a process. It is possible to run multiple nodes on a single computer; for a computer on which one or more cluster nodes are being run we use the term cluster host.
There are three types of cluster nodes, and in a minimal NDB Cluster configuration, there must be at least three nodes, one of each of these types:
Management node: The role of this type of node is to manage the other nodes within the NDB Cluster, performing such functions as providing configuration data, starting and stopping nodes, and running backups. Because this node type manages the configuration of the other nodes, a node of this type should be started first, before any other node. A management node is started with the command ndb_mgmd.
Data node: This type of node stores cluster data. There are as many data nodes as there are fragment replicas, times the number of fragments (see Section 21.2.2, “NDB Cluster Nodes, Node Groups, Fragment Replicas, and Partitions”). For example, with two fragment replicas, each having two fragments, you need four data nodes. One fragment replica is sufficient for data storage, but provides no redundancy; therefore, it is recommended to have two (or more) fragment replicas to provide redundancy, and thus high availability. A data node is started with the command ndbd (see Section 21.5.1, “ndbd — The NDB Cluster Data Node Daemon”) or ndbmtd (see Section 21.5.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”).
NDB Cluster tables are normally stored completely in memory rather than on disk (this is why we refer to NDB Cluster as an in-memory database). However, some NDB Cluster data can be stored on disk; see Section 21.6.11, “NDB Cluster Disk Data Tables”, for more information.
SQL node: This is a node that accesses the cluster data. In the case of NDB Cluster, an SQL node is a traditional MySQL server that uses the
NDBCLUSTERstorage engine. An SQL node is a mysqld process started with the
--ndb-connectstringoptions, which are explained elsewhere in this chapter, possibly with additional MySQL server options as well.
An SQL node is actually just a specialized type of API node, which designates any application which accesses NDB Cluster data. Another example of an API node is the ndb_restore utility that is used to restore a cluster backup. It is possible to write such applications using the NDB API. For basic information about the NDB API, see Getting Started with the NDB API.
It is not realistic to expect to employ a three-node setup in a production environment. Such a configuration provides no redundancy; to benefit from NDB Cluster's high-availability features, you must use multiple data and SQL nodes. The use of multiple management nodes is also highly recommended.
For a brief introduction to the relationships between nodes, node groups, fragment replicas, and partitions in NDB Cluster, see Section 21.2.2, “NDB Cluster Nodes, Node Groups, Fragment Replicas, and Partitions”.
Configuration of a cluster involves configuring each individual node in the cluster and setting up individual communication links between nodes. NDB Cluster is currently designed with the intention that data nodes are homogeneous in terms of processor power, memory space, and bandwidth. In addition, to provide a single point of configuration, all configuration data for the cluster as a whole is located in one configuration file.
The management server manages the cluster configuration file and the cluster log. Each node in the cluster retrieves the configuration data from the management server, and so requires a way to determine where the management server resides. When interesting events occur in the data nodes, the nodes transfer information about these events to the management server, which then writes the information to the cluster log.
In addition, there can be any number of cluster client processes
or applications. These include standard MySQL clients,
NDB-specific API programs, and management
clients. These are described in the next few paragraphs.
Standard MySQL clients. NDB Cluster can be used with existing MySQL applications written in PHP, Perl, C, C++, Java, Python, and so on. Such client applications send SQL statements to and receive responses from MySQL servers acting as NDB Cluster SQL nodes in much the same way that they interact with standalone MySQL servers.
MySQL clients using an NDB Cluster as a data source can be
modified to take advantage of the ability to connect with multiple
MySQL servers to achieve load balancing and failover. For example,
Java clients using Connector/J 5.0.6 and later can use
jdbc:mysql:loadbalance:// URLs (improved in
Connector/J 5.1.7) to achieve load balancing transparently; for
more information about using Connector/J with NDB Cluster, see
Using Connector/J with NDB Cluster.
NDB client programs.
Client programs can be written that access NDB Cluster data
directly from the
NDBCLUSTER storage engine,
bypassing any MySQL Servers that may be connected to the
cluster, using the NDB
API, a high-level C++ API. Such applications may be
useful for specialized purposes where an SQL interface to the
data is not needed. For more information, see
The NDB API.
NDB-specific Java applications can also be
written for NDB Cluster using the NDB
Cluster Connector for Java. This NDB Cluster Connector
includes ClusterJ, a
high-level database API similar to object-relational mapping
persistence frameworks such as Hibernate and JPA that connect
NDBCLUSTER, and so does not require
access to a MySQL Server. See
Java and NDB Cluster, and
The ClusterJ API and Data Object Model, for more
Management clients. These clients connect to the management server and provide commands for starting and stopping nodes gracefully, starting and stopping message tracing (debug versions only), showing node versions and status, starting and stopping backups, and so on. An example of this type of program is the ndb_mgm management client supplied with NDB Cluster (see Section 21.5.5, “ndb_mgm — The NDB Cluster Management Client”). Such applications can be written using the MGM API, a C-language API that communicates directly with one or more NDB Cluster management servers. For more information, see The MGM API.
Oracle also makes available MySQL Cluster Manager, which provides an advanced command-line interface simplifying many complex NDB Cluster management tasks, such restarting an NDB Cluster with a large number of nodes. The MySQL Cluster Manager client also supports commands for getting and setting the values of most node configuration parameters as well as mysqld server options and variables relating to NDB Cluster. See MySQL Cluster Manager 1.4.8 User Manual, for more information.
Event logs. NDB Cluster logs events by category (startup, shutdown, errors, checkpoints, and so on), priority, and severity. A complete listing of all reportable events may be found in Section 21.6.3, “Event Reports Generated in NDB Cluster”. Event logs are of the two types listed here:
Cluster log: Keeps a record of all desired reportable events for the cluster as a whole.
Node log: A separate log which is also kept for each individual node.
Under normal circumstances, it is necessary and sufficient to keep and examine only the cluster log. The node logs need be consulted only for application development and debugging purposes.
Generally speaking, when data is saved to disk, it is said that
a checkpoint has been
reached. More specific to NDB Cluster, a checkpoint is a point
in time where all committed transactions are stored on disk.
With regard to the
engine, there are two types of checkpoints which work together
to ensure that a consistent view of the cluster's data is
maintained. These are shown in the following list:
Local Checkpoint (LCP): This is a checkpoint that is specific to a single node; however, LCPs take place for all nodes in the cluster more or less concurrently. An LCP usually occurs every few minutes; the precise interval varies, and depends upon the amount of data stored by the node, the level of cluster activity, and other factors.
Previously, an LCP involved saving all of a node's data to disk. NDB 7.6 introduces support for partial LCPs, which can significantly improve recovery time under some conditions. See Section 220.127.116.11, “What is New in NDB Cluster 7.6”, for more information, as well as the descriptions of the
RecoveryWorkconfiguration parameters which enable partial LCPs and control the amount of storage they use.
Global Checkpoint (GCP): A GCP occurs every few seconds, when transactions for all nodes are synchronized and the redo-log is flushed to disk.
For more information about the files and directories created by local checkpoints and global checkpoints, see NDB Cluster Data Node File System Directory.
Transporter. We use the term transporter for the data transport mechanism employed between data nodes. MySQL NDB Cluster 7.5 and 7.6 support three of these, which are listed here:
TCP/IP over Ethernet. See Section 18.104.22.168, “NDB Cluster TCP/IP Connections”.
Direct TCP/IP. Uses machine-to-machine connections. See Section 22.214.171.124, “NDB Cluster TCP/IP Connections Using Direct Connections”.
Although this transporter uses the same TCP/IP protocol as mentioned in the previous item, it requires setting up the hardware differently and is configured differently as well. For this reason, it is considered a separate transport mechanism for NDB Cluster.
Shared memory (SHM). See Section 126.96.36.199, “NDB Cluster Shared Memory Connections”.
Because it is ubiquitous, most users employ TCP/IP over Ethernet for NDB Cluster.
Regardless of the transporter used,
attempts to make sure that communication between data node
processes is performed using chunks that are as large as possible
since this benefits all types of data transmission.