In the following section, we answer questions that are frequently
asked about NDB Cluster and the NDB
storage engine.
Questions
A.1: Which versions of the MySQL software support NDB Cluster? Do I have to compile from source?
A.2: What do “NDB” and “NDBCLUSTER” mean?
A.3: What is the difference between using NDB Cluster versus using MySQL Replication?
A.4: Do I need any special networking to run NDB Cluster? How do computers in a cluster communicate?
A.5: How many computers do I need to run an NDB Cluster, and why?
A.6: What do the different computers do in an NDB Cluster?
A.7: When I run the SHOW command in the NDB Cluster management client, I see a line of output that looks like this: id=2 @10.100.10.32 (Version: 8.0.41-ndb-8.0.41 Nodegroup: 0, *) What does the * mean? How is this node different from the others?
A.8: With which operating systems can I use NDB Cluster?
A.9: What are the hardware requirements for running NDB Cluster?
A.10: How much RAM do I need to use NDB Cluster? Is it possible to use disk memory at all?
A.11: What file systems can I use with NDB Cluster? What about network file systems or network shares?
A.12: Can I run NDB Cluster nodes inside virtual machines (such as those created by VMWare, VirtualBox, Parallels, or Xen)?
A.13: I am trying to populate an NDB Cluster database. The loading process terminates prematurely and I get an error message like this one: ERROR 1114: The table 'my_cluster_table' is full Why is this happening?
A.14: NDB Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
A.15: Do I have to learn a new programming or query language to use NDB Cluster?
A.16: What programming languages and APIs are supported by NDB Cluster?
A.17: Does NDB Cluster include any management tools?
A.18: How do I find out what an error or warning message means when using NDB Cluster?
A.19: Is NDB Cluster transaction-safe? What isolation levels are supported?
A.20: What storage engines are supported by NDB Cluster?
A.21: In the event of a catastrophic failure (for example, the whole city loses power and my UPS fails), would I lose all my data?
A.22: Is it possible to use FULLTEXT indexes with NDB Cluster?
A.23: Can I run multiple nodes on a single computer?
A.24: Can I add data nodes to an NDB Cluster without restarting it?
A.25: Are there any limitations that I should be aware of when using NDB Cluster?
A.26: Does NDB Cluster support foreign keys?
A.27: How do I import an existing MySQL database into an NDB Cluster?
A.28: How do NDB Cluster nodes communicate with one another?
A.29: What is an arbitrator?
A.30: What data types are supported by NDB Cluster?
A.31: How do I start and stop NDB Cluster?
A.32: What happens to NDB Cluster data when the cluster is shut down?
A.33: Is it a good idea to have more than one management node for an NDB Cluster?
A.34: Can I mix different kinds of hardware and operating systems in one NDB Cluster?
A.35: Can I run two data nodes on a single host? Two SQL nodes?
A.36: Can I use host names with NDB Cluster?
A.37: Does NDB Cluster support IPv6?
A.38: How do I handle MySQL users in an NDB Cluster having multiple MySQL servers?
A.39: How do I continue to send queries in the event that one of the SQL nodes fails?
A.40: How do I back up and restore an NDB Cluster?
A.41: What is an “angel process”?
Questions and Answers
A.1: Which versions of the MySQL software support NDB Cluster? Do I have to compile from source?
NDB Cluster is not supported in standard MySQL Server releases. Instead, MySQL NDB Cluster is provided as a separate product. Available NDB Cluster release series include the following:
NDB Cluster 7.3 / NDB Cluster 7.4. These two series are no longer maintained or supported for new deployments. Users of NDB Cluster 7.3 or 7.4 should upgrade to NDB 7.5 or newer as soon as possible. We recommend that new deployments use the latest NDB Cluster 8.0 release.
NDB Cluster 7.5. This series is a previous General Availability (GA) version of NDB Cluster, still available for production use, although we recommend that new deployments use the latest NDB Cluster 8.0 release. The latest NDB Cluster 7.5 releases can be obtained from https://dev.mysql.com/downloads/cluster/.
NDB Cluster 7.6. This series is a previous General Availability (GA) version of NDB Cluster, still available for production use, although we recommend that new deployments use the latest NDB Cluster 8.0 release. The latest NDB Cluster 7.6 releases can be obtained from https://dev.mysql.com/downloads/cluster/.
NDB Cluster 8.0. This series is the most recent General Availability (GA) version of NDB Cluster, based on version 8.0 of the NDB storage engine and MySQL Server 8.0. NDB Cluster 8.0 is available for production use; new deployments intended for production should use the latest GA release in this series, which is currently NDB Cluster 8.0.41. You can obtain the most recent NDB Cluster 8.0 release from https://dev.mysql.com/downloads/cluster/. For information about new features and other important changes in this series, see What is New in MySQL NDB Cluster 8.0.
You can obtain and compile NDB Cluster from source (see Section 3.1.4, “Building NDB Cluster from Source on Linux”, and Section 3.2.2, “Compiling and Installing NDB Cluster from Source on Windows”), but for all but the most specialized cases, we recommend using one of the following installers provided by Oracle that is appropriate to your operating platform and circumstances:
Linux binary release (tar.gz file)
Linux RPM package
Linux .deb file
Windows binary “no-install” release
Windows MSI Installer
Installation packages may also be available from your platform's package management system.
You can determine whether your MySQL Server has NDB support using one of the statements SHOW VARIABLES LIKE 'have_%', SHOW ENGINES, or SHOW PLUGINS.
A.2: What do “NDB” and “NDBCLUSTER” mean?
“NDB” stands for “Network Database”. NDB and NDBCLUSTER are both names for the storage engine that enables clustering support with MySQL. NDB is preferred, but either name is correct.
A.3: What is the difference between using NDB Cluster versus using MySQL Replication?
In traditional MySQL replication, a source MySQL server updates
one or more replicas. Transactions are committed sequentially,
and a slow transaction can cause the replica to lag behind the
source. This means that if the source fails, it is possible that
the replica might not have recorded the last few transactions.
If a transaction-safe engine such as InnoDB is being used, a transaction
is either completed on the replica or not applied at all, but
replication does not guarantee that all data on the source and
the replica remains consistent at all times. In NDB Cluster, all
data nodes are kept in synchrony, and a transaction committed by
any one data node is committed for all data nodes. In the event
of a data node failure, all remaining data nodes remain in a
consistent state.
In short, whereas standard MySQL replication is asynchronous, NDB Cluster is synchronous.
Asynchronous replication is also available in NDB Cluster. NDB Cluster Replication (also sometimes known as “geo-replication”) includes the capability to replicate both between two NDB Clusters, and from an NDB Cluster to a non-Cluster MySQL server. See Chapter 7, NDB Cluster Replication.
A.4: Do I need any special networking to run NDB Cluster? How do computers in a cluster communicate?
NDB Cluster is intended to be used in a high-bandwidth environment, with computers connecting using TCP/IP. Its performance depends directly upon the connection speed between the cluster's computers. The minimum connectivity requirements for NDB Cluster include a typical 100-megabit Ethernet network or the equivalent. We recommend you use gigabit Ethernet whenever available.
A.5: How many computers do I need to run an NDB Cluster, and why?
A minimum of three computers is required to run a viable cluster. However, the minimum recommended number of computers in an NDB Cluster is four: one each to run the management and SQL nodes, and two computers to serve as data nodes. The purpose of the two data nodes is to provide redundancy; the management node must run on a separate machine to guarantee continued arbitration services in the event that one of the data nodes fails.
To provide increased throughput and high availability, you should use multiple SQL nodes (MySQL Servers connected to the cluster). It is also possible (although not strictly necessary) to run multiple management servers.
A.6: What do the different computers do in an NDB Cluster?
An NDB Cluster has both a physical and logical organization, with computers being the physical elements. The logical or functional elements of a cluster are referred to as nodes, and a computer housing a cluster node is sometimes referred to as a cluster host. There are three types of nodes, each corresponding to a specific role within the cluster. These are:
Management node. This node provides management services for the cluster as a whole, including startup, shutdown, backups, and configuration data for the other nodes. The management node server is implemented as the application ndb_mgmd; the management client used to control NDB Cluster is ndb_mgm. See Section 5.4, “ndb_mgmd — The NDB Cluster Management Server Daemon”, and Section 5.5, “ndb_mgm — The NDB Cluster Management Client”, for information about these programs.
Data node. This type of node stores and replicates data. Data node functionality is handled by instances of the NDB data node process ndbd. For more information, see Section 5.1, “ndbd — The NDB Cluster Data Node Daemon”.
SQL node. This is simply an instance of MySQL Server (mysqld) that is built with support for the NDBCLUSTER storage engine and started with the --ndbcluster option to enable the engine and the --ndb-connectstring option to enable it to connect to an NDB Cluster management server. For more about these options, see Section 4.3.9.1, “MySQL Server Options for NDB Cluster”.
Note: An API node is any application that makes direct use of Cluster data nodes for data storage and retrieval. An SQL node can thus be considered a type of API node that uses a MySQL Server to provide an SQL interface to the Cluster. You can write such applications (that do not depend on a MySQL Server) using the NDB API, which supplies a direct, object-oriented transaction and scanning interface to NDB Cluster data; see NDB Cluster API Overview: The NDB API, for more information.
A.7: When I run the SHOW command in the NDB Cluster management client, I see a line of output that looks like this:
id=2 @10.100.10.32 (Version: 8.0.41-ndb-8.0.41 Nodegroup: 0, *)
What does the * mean? How is this node different from the others?
The simplest answer is, “It's not something you can control, and it's nothing that you need to worry about in any case, unless you're a software engineer writing or analyzing the NDB Cluster source code”.
If you don't find that answer satisfactory, here's a longer and more technical version:
A number of mechanisms in NDB Cluster require distributed coordination among the data nodes. These distributed algorithms and protocols include global checkpointing, DDL (schema) changes, and node restart handling. To make this coordination simpler, the data nodes “elect” one of their number to act as leader. There is no user-facing mechanism for influencing this selection, which is completely automatic; the fact that it is automatic is a key part of NDB Cluster's internal architecture.
When a node acts as the “leader” for any of these mechanisms, it is usually the point of coordination for the activity, and the other nodes act as “followers”, carrying out their parts of the activity as directed by the leader. If the node acting as leader fails, then the remaining nodes elect a new leader. Tasks in progress that were being coordinated by the old leader may either fail or be continued by the new leader, depending on the actual mechanism involved.
It is possible for some of these different mechanisms and protocols to have different leader nodes, but in general the same leader is chosen for all of them. The node indicated as the leader in the output of SHOW in the management client is known internally as the DICT manager, responsible for coordinating DDL and metadata activity.
NDB Cluster is designed in such a way that the choice of leader has no discernible effect outside the cluster itself. For example, the current leader does not have significantly higher CPU or resource usage than the other data nodes, and failure of the leader should not have a significantly different impact on the cluster than the failure of any other data node.
A.8: With which operating systems can I use NDB Cluster?
NDB Cluster is supported on most Unix-like operating systems. NDB Cluster is also supported in production settings on Microsoft Windows operating systems.
For more detailed information concerning the level of support which is offered for NDB Cluster on various operating system versions, operating system distributions, and hardware platforms, please refer to https://www.mysql.com/support/supportedplatforms/cluster.html.
A.9: What are the hardware requirements for running NDB Cluster?
NDB Cluster should run on any platform for which NDB-enabled binaries are available.
For data nodes and API nodes, faster CPUs and more memory are
likely to improve performance, and 64-bit CPUs are likely to be
more effective than 32-bit processors. There must be sufficient
memory on machines used for data nodes to hold each node's share
of the database (see How much RAM do I
Need? for more information). For a computer which is
used only for running the NDB Cluster management server, the
requirements are minimal; a common desktop PC (or the
equivalent) is generally sufficient for this task. Nodes can
communicate through the standard TCP/IP network and hardware.
They can also use the high-speed SCI protocol; however, special
networking hardware and software are required to use SCI (see
Section 4.4, “Using High-Speed Interconnects with NDB Cluster”).
A.10: How much RAM do I need to use NDB Cluster? Is it possible to use disk memory at all?
NDB Cluster was originally implemented as in-memory only, but all versions currently available also provide the ability to store NDB Cluster on disk. See Section 6.11, “NDB Cluster Disk Data Tables”, for more information.
For in-memory NDB
tables, you can use the
following formula for obtaining a rough estimate of how much RAM
is needed for each data node in the cluster:
(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes
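The formula above can be sketched as a small helper. This is only an illustration of the arithmetic; the function name, the parameter names, and the example figures (a 10 GB database, 2 replicas, 4 data nodes) are assumptions made for this sketch and do not come from NDB Cluster itself.

```python
def estimate_ram_per_data_node(database_size_gb, number_of_replicas,
                               number_of_data_nodes, overhead_factor=1.1):
    """Rough RAM estimate (in GB) for each data node.

    Implements (SizeofDatabase x NumberOfReplicas x 1.1) / NumberOfDataNodes.
    The 1.1 factor is the one given in the formula; units are whatever
    unit the database size is expressed in (GB here).
    """
    if number_of_data_nodes <= 0:
        raise ValueError("number_of_data_nodes must be positive")
    total = database_size_gb * number_of_replicas * overhead_factor
    return total / number_of_data_nodes


# Hypothetical example: 10 GB of data, 2 replicas, 4 data nodes.
estimate = estimate_ram_per_data_node(10, 2, 4)
print(f"Roughly {estimate:.1f} GB of RAM per data node")
```

The result is a starting point only; the per-table calculation described next gives a more exact figure.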
To calculate the memory requirements more exactly requires determining, for each table in the cluster database, the storage space required per row (see Data Type Storage Requirements, for details), and multiplying this by the number of rows. You must also remember to account for any column indexes as follows:
Each primary key or hash index created for an NDBCLUSTER table requires 21 to 25 bytes per record. These indexes use IndexMemory.
Each ordered index requires 10 bytes of storage per record, using DataMemory.
Creating a primary key or unique index also creates an ordered index, unless this index is created with USING HASH. In other words:
A primary key or unique index on a Cluster table normally takes up 31 to 35 bytes per record.
However, if the primary key or unique index is created with USING HASH, then it requires only 21 to 25 bytes per record.
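As a worked illustration of the per-record figures above, the following sketch computes the low and high bounds of index overhead for one primary key or unique index. The byte counts are taken from the text; the function name and the one-million-row example are hypothetical.

```python
HASH_INDEX_BYTES = (21, 25)   # per record, primary key or hash index
ORDERED_INDEX_BYTES = 10      # per record, ordered index


def key_index_overhead(rows, using_hash=False):
    """Return (low, high) bytes needed by one primary key or unique index.

    Without USING HASH, an ordered index is created as well, so its
    10 bytes per record are added to the hash index cost.
    """
    low, high = HASH_INDEX_BYTES
    if not using_hash:
        low += ORDERED_INDEX_BYTES
        high += ORDERED_INDEX_BYTES
    return rows * low, rows * high


# Hypothetical table with one million rows.
default_cost = key_index_overhead(1_000_000)
hash_only_cost = key_index_overhead(1_000_000, using_hash=True)
print("default:", default_cost)
print("USING HASH:", hash_only_cost)
```

For a million rows, the difference between the two cases is the 10 MB or so used by the ordered index.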
Creating NDB Cluster tables with USING HASH for all primary keys and unique indexes generally causes table updates to run more quickly, in some cases by as much as 20 to 30 percent faster than updates on tables where USING HASH was not used in creating primary and unique keys.
This is due to the fact that less memory is required (because no
ordered indexes are created), and that less CPU must be utilized
(because fewer indexes must be read and possibly updated).
However, it also means that queries that could otherwise use
range scans must be satisfied by other means, which can result
in slower selects.
When calculating Cluster memory requirements, you may find
useful the ndb_size.pl utility which is
available in recent MySQL 5.7 releases. This Perl
script connects to a current (non-Cluster) MySQL database and
creates a report on how much space that database would require
if it used the NDBCLUSTER
storage
engine. For more information, see
Section 5.28, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator”.
It is especially important to keep in mind that every NDB Cluster table must have a primary key. The NDB storage engine creates a primary key automatically if none is defined; this primary key is created without USING HASH.
You can determine how much memory is being used for storage of NDB Cluster data and indexes at any given time using the REPORT MEMORYUSAGE command in the ndb_mgm client; see Section 6.1, “Commands in the NDB Cluster Management Client”, for more information. In addition, warnings are written to the cluster log when 80% of available DataMemory or (prior to NDB 7.6) IndexMemory is in use, and again when usage reaches 90%, 99%, and 100%.
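To make the warning levels concrete, here is a minimal sketch that reports which of the thresholds named above a given usage figure has reached. It only mirrors the percentages stated in the text; it is not how the data nodes compute or log usage, and the function name and sample numbers are assumptions.

```python
WARNING_THRESHOLDS = (80, 90, 99, 100)  # percentages named in the text


def thresholds_reached(used_bytes, total_bytes):
    """Return the warning thresholds (in percent) that usage has reached."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent_used = 100 * used_bytes / total_bytes
    return [t for t in WARNING_THRESHOLDS if percent_used >= t]


# Hypothetical figures: 92 of 100 memory units in use.
print(thresholds_reached(92, 100))
```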
A.11: What file systems can I use with NDB Cluster? What about network file systems or network shares?
Generally, any file system that is native to the host operating system should work well with NDB Cluster. If you find that a given file system works particularly well (or particularly poorly) with NDB Cluster, we invite you to discuss your findings in the NDB Cluster Forums.
For Windows, we recommend that you use NTFS file systems for NDB Cluster, just as we do for standard MySQL. We do not test NDB Cluster with FAT or VFAT file systems. Because of this, we do not recommend their use with MySQL or NDB Cluster.
NDB Cluster is implemented as a shared-nothing solution; the idea behind this is that the failure of a single piece of hardware should not cause the failure of multiple cluster nodes, or possibly even the failure of the cluster as a whole. For this reason, the use of network shares or network file systems is not supported for NDB Cluster. This also applies to shared storage devices such as SANs.
A.12: Can I run NDB Cluster nodes inside virtual machines (such as those created by VMWare, VirtualBox, Parallels, or Xen)?
NDB Cluster is supported for use in virtual machines. We currently support and test using Oracle VM.
Some NDB Cluster users have successfully deployed NDB Cluster using other virtualization products; in such cases, Oracle can provide NDB Cluster support, but issues specific to the virtual environment must be referred to that product's vendor.
A.13: I am trying to populate an NDB Cluster database. The loading process terminates prematurely and I get an error message like this one:
ERROR 1114: The table 'my_cluster_table' is full
Why is this happening?
The cause is very likely to be that your setup does not provide
sufficient RAM for all table data and all indexes,
including the primary key required by the
NDB
storage engine and
automatically created in the event that the table definition
does not include the definition of a primary key.
It is also worth noting that all data nodes should have the same amount of RAM, since no data node in a cluster can use more memory than the least amount available to any individual data node. For example, if there are four computers hosting Cluster data nodes, and three of these have 3GB of RAM available to store Cluster data while the remaining data node has only 1GB RAM, then each data node can devote at most 1GB to NDB Cluster data and indexes.
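The four-node example above can be worked through in a few lines. This sketch simply applies the "least amount available" rule from the text; the node names and the helper function are hypothetical.

```python
def usable_memory_per_node(ram_gb_by_node):
    """Each data node can devote at most the smallest amount of RAM
    available on any individual data node."""
    if not ram_gb_by_node:
        raise ValueError("at least one data node is required")
    return min(ram_gb_by_node.values())


# The example from the text: three nodes with 3GB, one node with 1GB.
nodes = {"node_a": 3, "node_b": 3, "node_c": 3, "node_d": 1}
per_node = usable_memory_per_node(nodes)
total_usable = per_node * len(nodes)
unused = sum(nodes.values()) - total_usable
print(f"{per_node}GB per node, {total_usable}GB usable, {unused}GB unused")
```

In this case 6GB of the 10GB installed cannot be used for Cluster data, which is why equal RAM on all data nodes is recommended.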
In some cases it is possible to get Table is full errors in MySQL client applications even when ndb_mgm -e "ALL REPORT MEMORYUSAGE" shows significant free DataMemory. You can force NDB to create extra partitions for NDB Cluster tables and thus have more memory available for hash indexes by using the MAX_ROWS option for CREATE TABLE. In general, setting MAX_ROWS to twice the number of rows that you expect to store in the table should be sufficient.
For similar reasons, you can also sometimes encounter problems
with data node restarts on nodes that are heavily loaded with
data. The MinFreePct
parameter can help with this issue by reserving a portion (5% by
default) of DataMemory
and (prior to NDB 7.6)
IndexMemory
for use in
restarts. This reserved memory is not available for storing
NDB
tables or data.
A.14: NDB Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
It is very unlikely that a cluster would perform reliably under such conditions, as NDB Cluster was designed and implemented with the assumption that it would be run under conditions guaranteeing dedicated high-speed connectivity such as that found in a LAN setting using 100 Mbps or gigabit Ethernet—preferably the latter. We neither test nor warrant its performance using anything slower than this.
Also, it is extremely important to keep in mind that communications between the nodes in an NDB Cluster are not secure; they are neither encrypted nor safeguarded by any other protective mechanism. The most secure configuration for a cluster is in a private network behind a firewall, with no direct access to any Cluster data or management nodes from outside. (For SQL nodes, you should take the same precautions as you would with any other instance of the MySQL server.) For more information, see Section 6.18, “NDB Cluster Security Issues”.
A.15: Do I have to learn a new programming or query language to use NDB Cluster?
No. Although some specialized commands are used to manage and configure the cluster itself, only standard (My)SQL statements are required for the following operations:
Creating, altering, and dropping tables
Inserting, updating, and deleting table data
Creating, changing, and dropping primary and unique indexes
Some specialized configuration parameters and files are required to set up an NDB Cluster—see Section 4.3, “NDB Cluster Configuration Files”, for information about these.
A few simple commands are used in the NDB Cluster management client (ndb_mgm) for tasks such as starting and stopping cluster nodes. See Section 6.1, “Commands in the NDB Cluster Management Client”.
A.16: What programming languages and APIs are supported by NDB Cluster?
NDB Cluster supports the same programming APIs and languages as the standard MySQL Server, including ODBC, .Net, the MySQL C API, and numerous drivers for popular scripting languages such as PHP, Perl, and Python. NDB Cluster applications written using these APIs behave similarly to other MySQL applications; they transmit SQL statements to a MySQL Server (in the case of NDB Cluster, an SQL node), and receive responses containing rows of data. For more information about these APIs, see Connectors and APIs.
NDB Cluster also supports application programming using the NDB
API, which provides a low-level C++ interface to NDB Cluster
data without needing to go through a MySQL Server. See
The NDB API. In addition, many
NDBCLUSTER
management functions are
exposed by the C-language MGM API; see
The MGM API, for more information.
NDB Cluster also supports Java application programming using ClusterJ, which supports a domain object model of data using sessions and transactions. See Java and NDB Cluster, for more information.
NDB Cluster 8.0 also includes adapters supporting NoSQL applications written against Node.js, with NDB Cluster as the data store. See MySQL NoSQL Connector for JavaScript, for more information.
A.17: Does NDB Cluster include any management tools?
NDB Cluster includes a command line client for performing basic management functions. See Section 5.5, “ndb_mgm — The NDB Cluster Management Client”, and Section 6.1, “Commands in the NDB Cluster Management Client”.
NDB Cluster is also supported by MySQL Cluster Manager, a separate product providing an advanced command line interface that can automate many NDB Cluster management tasks such as rolling restarts and configuration changes. For more information about MySQL Cluster Manager, see MySQL Cluster Manager 1.4.8 User Manual.
A.18: How do I find out what an error or warning message means when using NDB Cluster?
There are two ways in which this can be done:
From within the mysql client, use SHOW ERRORS or SHOW WARNINGS immediately upon being notified of the error or warning condition.
From a system shell prompt, use perror --ndb error_code.
A.19: Is NDB Cluster transaction-safe? What isolation levels are supported?
Yes. For tables created with the
NDB
storage engine, transactions
are supported. Currently, NDB Cluster supports only the
READ COMMITTED
transaction
isolation level.
A.20: What storage engines are supported by NDB Cluster?
NDB Cluster requires the NDB storage engine. That is, in order for a table to be shared between nodes in an NDB Cluster, the table must be created using ENGINE=NDB (or the equivalent option ENGINE=NDBCLUSTER).
It is possible to create tables using other storage engines (such as InnoDB or MyISAM) on a MySQL server being used with NDB Cluster, but since these tables do not use NDB, they do not participate in clustering; each such table is strictly local to the individual MySQL server instance on which it is created.
NDB Cluster is quite different from
InnoDB
clustering with regard to
architecture, requirements, and implementation; despite any
similarity in their names, the two are not compatible. For more
information about InnoDB
clustering, see
MySQL AdminAPI. See also
Section 2.6, “MySQL Server Using InnoDB Compared with NDB Cluster”, for information about
the differences between the NDB
and
InnoDB
storage engines.
A.21: In the event of a catastrophic failure (for example, the whole city loses power and my UPS fails), would I lose all my data?
All committed transactions are logged. Therefore, although it is possible that some data could be lost in the event of a catastrophe, this should be quite limited. Data loss can be further reduced by minimizing the number of operations per transaction. (It is not a good idea to perform large numbers of operations per transaction in any case.)
A.22: Is it possible to use FULLTEXT indexes with NDB Cluster?
FULLTEXT
indexing is currently supported only
by the InnoDB
and
MyISAM
storage engines. See
Full-Text Search Functions, for more information.
A.23: Can I run multiple nodes on a single computer?
It is possible but not always advisable. One of the chief reasons to run a cluster is to provide redundancy. To obtain the full benefits of this redundancy, each node should reside on a separate machine. If you place multiple nodes on a single machine and that machine fails, you lose all of those nodes. For this reason, if you do run multiple data nodes on a single machine, it is extremely important that they be set up in such a way that the failure of this machine does not cause the loss of all the data nodes in a given node group.
Given that NDB Cluster can be run on commodity hardware loaded with a low-cost (or even no-cost) operating system, the expense of an extra machine or two is well worth it to safeguard mission-critical data. It is also worth noting that the requirements for a cluster host running a management node are minimal. This task can be accomplished with a 300 MHz Pentium or equivalent CPU and sufficient RAM for the operating system, plus a small amount of overhead for the ndb_mgmd and ndb_mgm processes.
It is acceptable to run multiple cluster data nodes on a single host that has multiple CPUs, cores, or both. The NDB Cluster distribution also provides a multithreaded version of the data node binary intended for use on such systems. For more information, see Section 5.3, “ndbmtd — The NDB Cluster Data Node Daemon (Multi-Threaded)”.
It is also possible in some cases to run data nodes and SQL nodes concurrently on the same machine; how well such an arrangement performs is dependent on a number of factors such as number of cores and CPUs as well as the amount of disk and memory available to the data node and SQL node processes, and you must take these factors into account when planning such a configuration.
A.24: Can I add data nodes to an NDB Cluster without restarting it?
It is possible to add new data nodes to a running NDB Cluster without taking the cluster offline. For more information, see Section 6.7, “Adding NDB Cluster Data Nodes Online”.
For other types of NDB Cluster nodes, a rolling restart is all that is required (see Section 6.5, “Performing a Rolling Restart of an NDB Cluster”).
A.25: Are there any limitations that I should be aware of when using NDB Cluster?
Limitations on NDB
tables in MySQL
NDB Cluster include the following:
Temporary tables are not supported; a CREATE TEMPORARY TABLE statement using ENGINE=NDB or ENGINE=NDBCLUSTER fails with an error.
The only types of user-defined partitioning supported for NDBCLUSTER tables are KEY and LINEAR KEY. Trying to create an NDB table using any other partitioning type fails with an error.
FULLTEXT indexes are not supported.
Index prefixes are not supported. Only complete columns may be indexed.
Spatial indexes are not supported (although spatial columns can be used). See Spatial Data Types.
Support for partial transactions and partial rollbacks is comparable to that of other transactional storage engines such as InnoDB that can roll back individual statements.
The maximum number of attributes allowed per table is 512. Attribute names cannot be any longer than 31 characters. For each table, the maximum combined length of the table and database names is 122 characters.
Prior to NDB 8.0, the maximum size for a table row is 14 kilobytes, not counting BLOB values. In NDB 8.0, this maximum is increased to 30000 bytes. See Section 2.7.5, “Limits Associated with Database Objects in NDB Cluster”, for more information.
There is no set limit for the number of rows per NDB table. Limits on table size depend on a number of factors, in particular on the amount of RAM available to each data node.
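The naming and attribute-count limits listed above lend themselves to a quick pre-flight check before converting a schema. The following is a minimal sketch under stated assumptions: the function and constant names are hypothetical, lengths are measured with Python's len() (that is, in characters), and only the three limits quoted above are checked.

```python
MAX_ATTRIBUTES = 512
MAX_ATTRIBUTE_NAME_LENGTH = 31
MAX_DB_PLUS_TABLE_NAME_LENGTH = 122


def check_ndb_limits(database, table, columns):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    if len(columns) > MAX_ATTRIBUTES:
        problems.append(
            f"{len(columns)} attributes exceeds the limit of {MAX_ATTRIBUTES}")
    for name in columns:
        if len(name) > MAX_ATTRIBUTE_NAME_LENGTH:
            problems.append(
                f"attribute name '{name}' is longer than "
                f"{MAX_ATTRIBUTE_NAME_LENGTH} characters")
    if len(database) + len(table) > MAX_DB_PLUS_TABLE_NAME_LENGTH:
        problems.append(
            "combined database and table name length exceeds "
            f"{MAX_DB_PLUS_TABLE_NAME_LENGTH} characters")
    return problems


# Hypothetical schema used only for illustration.
ok = check_ndb_limits("shop", "orders", ["id", "customer_id", "total"])
bad = check_ndb_limits("shop", "orders", ["x" * 32])
print(ok)
print(bad)
```

A check like this does not replace the full list of limitations referenced below; it covers only the limits that can be verified from names and counts.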
For a complete listing of limitations in NDB Cluster, see Section 2.7, “Known Limitations of NDB Cluster”. See also Previous NDB Cluster Issues Resolved in NDB Cluster 8.0.
A.26: Does NDB Cluster support foreign keys?
NDB Cluster provides support for foreign key constraints which is comparable to that found in the InnoDB storage engine; see FOREIGN KEY Constraints, for more detailed information. Applications requiring foreign key support should use NDB Cluster 7.3, 7.4, 7.5, or later.
A.27: How do I import an existing MySQL database into an NDB Cluster?
You can import databases into NDB Cluster much as you would with any other version of MySQL. Other than the limitations mentioned elsewhere in this FAQ, the only other special requirement is that any tables to be included in the cluster must use the NDB storage engine. This means that the tables must be created with ENGINE=NDB or ENGINE=NDBCLUSTER.
It is also possible to convert existing tables that use other storage engines to NDBCLUSTER using one or more ALTER TABLE statements. However, the definition of the table must be compatible with the NDBCLUSTER storage engine prior to making the conversion. In MySQL 5.7, an additional workaround is also required; see Section 2.7, “Known Limitations of NDB Cluster”, for details.
A.28: How do NDB Cluster nodes communicate with one another?
Cluster nodes can communicate through any of three different transport mechanisms: TCP/IP, SHM (shared memory), and SCI (Scalable Coherent Interface). Where available, SHM is used by default between nodes residing on the same cluster host; however, this is considered experimental. SCI is a high-speed (1 gigabit per second and higher), high-availability protocol used in building scalable multi-processor systems; it requires special hardware and drivers. See Section 4.4, “Using High-Speed Interconnects with NDB Cluster”, for more about using SCI as a transport mechanism for NDB Cluster.
A.29: What is an arbitrator?
If one or more data nodes in a cluster fail, it is possible that not all cluster data nodes are able to “see” one another. In fact, it is possible that two sets of data nodes might become isolated from one another in a network partitioning, also known as a “split-brain” scenario. This type of situation is undesirable because each set of data nodes tries to behave as though it is the entire cluster. An arbitrator is required to decide between the competing sets of data nodes.
When all data nodes in at least one node group are alive,
network partitioning is not an issue, because no single subset
of the cluster can form a functional cluster on its own. The
real problem arises when no single node group has all its nodes
alive, in which case network partitioning (the
“split-brain” scenario) becomes possible. Then an
arbitrator is required. All cluster nodes recognize the same
node as the arbitrator, which is normally the management server;
however, it is possible to configure any of the MySQL Servers in
the cluster to act as the arbitrator instead. The arbitrator
accepts the first set of cluster nodes to contact it, and tells
the remaining set to shut down. Arbitrator selection is
controlled by the ArbitrationRank
configuration parameter, which can be set for both MySQL Server
and management server nodes. For more information about this
parameter, see
Section 4.3.5, “Defining an NDB Cluster Management Server”.
The role of arbitrator does not in and of itself impose any heavy demands upon the host so designated, and thus the arbitrator host does not need to be particularly fast or to have extra memory especially for this purpose.
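To make the arbitrator selection described above more concrete, the following sketch prints a config.ini fragment in which the management server is preferred as arbitrator and one SQL node serves as a lower-priority fallback. The node IDs and host names are placeholders, and the function name is hypothetical. ArbitrationRank accepts 0 (never used as arbitrator), 1 (high priority), and 2 (low priority).

```shell
# Print a config.ini fragment (placeholder node IDs and host names) showing
# ArbitrationRank: 0 = never used as arbitrator, 1 = high priority,
# 2 = low priority.
print_arbitration_fragment() {
    cat <<'EOF'
[ndb_mgmd]
NodeId=1
HostName=mgm.example.com
ArbitrationRank=1

[mysqld]
NodeId=10
HostName=sql1.example.com
ArbitrationRank=2
EOF
}

print_arbitration_fragment
```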
A.30: What data types are supported by NDB Cluster?
NDB Cluster supports all of the usual MySQL data types,
including those associated with MySQL's spatial extensions;
however, the NDB
storage engine
does not support spatial indexes. (Spatial indexes are supported
by InnoDB and MyISAM; see
Spatial Data Types, for more information.) In
addition, there are some differences with regard to indexes when
used with NDB
tables.
NDB Cluster Disk Data tables (that is, tables created with
TABLESPACE ... STORAGE DISK ENGINE=NDB
or
TABLESPACE ... STORAGE DISK
ENGINE=NDBCLUSTER
) have only fixed-width rows. This
means that (for example) each Disk Data table record
containing a
VARCHAR(255)
column requires space for 255 characters (as required for the
character set and collation being used for the table),
regardless of the actual number of characters stored therein.
See Section 2.7, “Known Limitations of NDB Cluster”, for more information about these issues.
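The effect of fixed-width rows on Disk Data tables can be estimated with simple arithmetic. The sketch below multiplies the declared column length by the maximum number of bytes per character for the character set (1 for latin1, 3 for utf8mb3, 4 for utf8mb4). The function name is hypothetical, and the estimate ignores any per-column overhead.

```shell
# Rough per-column space estimate for a VARCHAR column in a Disk Data table,
# where the full declared width is always reserved.
# Arguments: declared length in characters, maximum bytes per character
# for the column's character set (latin1 = 1, utf8mb3 = 3, utf8mb4 = 4).
# Per-column overhead such as length bytes is ignored here.
disk_data_varchar_bytes() {
    echo $(( $1 * $2 ))
}

disk_data_varchar_bytes 255 1   # latin1
disk_data_varchar_bytes 255 4   # utf8mb4
```

Under these assumptions, a VARCHAR(255) column reserves 255 bytes per row with latin1 and 1020 bytes per row with utf8mb4, whether the stored value is one character or 255.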
A.31: How do I start and stop NDB Cluster?
It is necessary to start each node in the cluster separately, in the following order:
1. Start the management node, using the ndb_mgmd command.
When starting the cluster for the first time, you must include the -f or --config-file option to tell the management node where its configuration file can be found.
2. Start each data node with the ndbd command.
Each data node must be started with the -c or --ndb-connectstring option so that the data node knows how to connect to the management server.
3. Start each MySQL Server (SQL node) using your preferred startup script, such as mysqld_safe.
Each MySQL Server must be started with the --ndbcluster and --ndb-connectstring options. These options cause mysqld to enable NDBCLUSTER storage engine support and tell it how to connect to the management server.
Each of these commands must be run from a system shell on the
machine housing the affected node. (You do not have to be
physically present at the machine—a remote login shell can
be used for this purpose.) You can verify that the cluster is
running by starting the NDB
management client ndb_mgm on the machine
housing the management node and issuing the
SHOW
or ALL STATUS
command.
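The startup order described above can be summarized in a short shell sketch. It only prints the command for each node role rather than running it, since each command has to be executed on the host where that node resides. The management server address and the configuration file path are placeholders, and the function name start_command is hypothetical.

```shell
# Placeholders (assumptions): adjust to match your own installation.
MGM_HOST="198.51.100.10"
CONFIG_FILE="/var/lib/mysql-cluster/config.ini"

# Print the start command for one node role. Each command is meant to be
# run in a shell on the host where that node lives, in the order
# mgm, data, sql.
start_command() {
    case "$1" in
        mgm)  echo "ndb_mgmd -f $CONFIG_FILE" ;;
        data) echo "ndbd -c $MGM_HOST" ;;
        sql)  echo "mysqld_safe --ndbcluster --ndb-connectstring=$MGM_HOST" ;;
        *)    echo "unknown role: $1" >&2; return 1 ;;
    esac
}

for role in mgm data sql; do
    start_command "$role"
done

# Once every node is started, check the cluster from the management host:
#   ndb_mgm -e "SHOW"
```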
To shut down a running cluster, issue the command
SHUTDOWN
in the management client.
Alternatively, you may enter the following command in a system
shell:
$> ndb_mgm -e "SHUTDOWN"
(The quotation marks in this example are optional, since there
are no spaces in the command string following the
-e
option; in addition, the
SHUTDOWN
command, like other management
client commands, is not case-sensitive.)
Either of these commands causes the ndb_mgm, ndb_mgmd, and any ndbd or ndbmtd processes to terminate gracefully. MySQL servers running as SQL nodes can be stopped using mysqladmin shutdown.
For more information, see Section 6.1, “Commands in the NDB Cluster Management Client”, and Section 3.6, “Safe Shutdown and Restart of NDB Cluster”.
MySQL Cluster Manager provides additional ways to handle starting and stopping of NDB Cluster nodes. See MySQL Cluster Manager 1.4.8 User Manual, for more information about this tool.
A.32: What happens to NDB Cluster data when the cluster is shut down?
The data that was held in memory by the cluster's data nodes is written to disk, and is reloaded into memory the next time that the cluster is started.
A.33: Is it a good idea to have more than one management node for an NDB Cluster?
It can be helpful as a fail-safe. Only one management node controls the cluster at any given time, but it is possible to configure one management node as primary, and one or more additional management nodes to take over in the event that the primary management node fails.
See Section 4.3, “NDB Cluster Configuration Files”, for information on how to configure NDB Cluster management nodes.
A.34: Can I mix different kinds of hardware and operating systems in one NDB Cluster?
Yes, as long as all machines and operating systems have the same “endianness” (all big-endian or all little-endian).
It is also possible to use software from different NDB Cluster releases on different nodes. However, we support such use only as part of a rolling upgrade procedure (see Section 6.5, “Performing a Rolling Restart of an NDB Cluster”).
A.35: Can I run two data nodes on a single host? Two SQL nodes?
Yes, it is possible to do this. In the case of multiple data nodes, it is advisable (but not required) for each node to use a different data directory. If you want to run multiple SQL nodes on one machine, each instance of mysqld must use a different TCP/IP port.
Running data nodes and SQL nodes together on the same host is possible, but you should be aware that the ndbd or ndbmtd processes may compete for memory with mysqld.
A.36: Can I use host names with NDB Cluster?
Yes, it is possible to use DNS and DHCP for cluster hosts. However, if your application requires “five nines” availability, you should use fixed (numeric) IP addresses, since making communication between Cluster hosts dependent on services such as DNS and DHCP introduces additional potential points of failure.
A.37: Does NDB Cluster support IPv6?
IPv6 is supported for connections between SQL nodes (MySQL servers), but connections between all other types of NDB Cluster nodes must use IPv4.
In practical terms, this means that you can use IPv6 for replication between NDB Clusters, but connections between nodes in the same NDB Cluster must use IPv4. For more information, see Section 7.3, “Known Issues in NDB Cluster Replication”.
A.38: How do I handle MySQL users in an NDB Cluster having multiple MySQL servers?
MySQL user accounts and privileges are normally not automatically propagated between different MySQL servers accessing the same NDB Cluster. MySQL NDB Cluster provides support for distributed privileges, which you can enable by following a procedure provided in the documentation; see Section 6.13, “Distributed Privileges Using Shared Grant Tables”, for more information.
The mechanism for handling users distributed or shared between NDB Cluster SQL nodes changed significantly in NDB 8.0; this implementation is not compatible with that in NDB 7.6 and earlier. See Privilege Synchronization and NDB_STORED_USER, for details.
A.39: How do I continue to send queries in the event that one of the SQL nodes fails?
MySQL NDB Cluster does not provide any sort of automatic failover between SQL nodes. Your application must be prepared to handle the loss of SQL nodes and to fail over between them.
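One simple way for an application or wrapper script to cope with a failed SQL node is to probe a list of candidate hosts and use the first one that responds. The sketch below assumes that the mysqladmin client is installed; the function names and host names are hypothetical. It illustrates the idea only and is not a substitute for a proper connection-failover layer.

```shell
# Probe one SQL node; succeeds if the MySQL server there answers a ping.
# Assumes the mysqladmin client is installed on the machine running this.
probe_sql_node() {
    mysqladmin --host="$1" --connect-timeout=2 ping >/dev/null 2>&1
}

# Print the first host in the argument list whose SQL node responds.
# Returns nonzero if none of them does.
first_available_sql_node() {
    for host in "$@"; do
        if probe_sql_node "$host"; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Example (placeholder host names):
#   node="$(first_available_sql_node sql1.example.com sql2.example.com)" &&
#       mysql --host="$node" -e "SELECT 1"
```

Because every SQL node attached to the cluster sees the same NDB data, the application can send its queries to whichever node the probe selects.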
A.40: How do I back up and restore an NDB Cluster?
You can use the NDB Cluster native backup and restore functionality in the NDB management client and the ndb_restore program. See Section 6.8, “Online Backup of NDB Cluster”, and Section 5.24, “ndb_restore — Restore an NDB Cluster Backup”.
You can also use the traditional functionality provided for this purpose in mysqldump and the MySQL server. See mysqldump — A Database Backup Program, for more information.
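A native backup and restore cycle might look like the following sketch, which prints the commands rather than executing them. The management server address, backup ID, data node IDs, and backup directory are placeholders, and the function name restore_command is hypothetical. ndb_restore is run once for each data node that took part in the backup, with table definitions (-m) restored only on the first run.

```shell
# Placeholders (assumptions): management host, backup ID, backup directory.
MGM_HOST="198.51.100.10"
BACKUP_ID=1
BACKUP_PATH="/var/lib/mysql-cluster/BACKUP/BACKUP-$BACKUP_ID"

# Print an ndb_restore command for one data node's part of the backup.
# The first argument is that data node's ID; pass "meta" as the second
# argument for the first node restored, so table definitions are restored
# once (-m) before the data (-r).
restore_command() {
    meta_opt=""
    [ "$2" = "meta" ] && meta_opt=" -m"
    echo "ndb_restore -c $MGM_HOST -n $1 -b $BACKUP_ID$meta_opt -r --backup-path=$BACKUP_PATH"
}

echo 'ndb_mgm -e "START BACKUP"'   # taking the backup
restore_command 2 meta             # restoring: first data node
restore_command 3                  # restoring: remaining data nodes
```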
A.41: What is an “angel process”?
The angel process monitors and, if necessary, attempts to restart the data node process. If you check the list of active processes on your system after starting ndbd, you can see that there are actually two processes running by that name, as shown here (we omit the output from ndb_mgmd and ndbd for brevity):
$> ./ndb_mgmd
$> ps aux | grep ndb
me 23002 0.0 0.0 122948 3104 ? Ssl 14:14 0:00 ./ndb_mgmd
me 23025 0.0 0.0 5284 820 pts/2 S+ 14:14 0:00 grep ndb
$> ./ndbmtd -c 127.0.0.1 --initial
$> ps aux | grep ndb
me 23002 0.0 0.0 123080 3356 ? Ssl 14:14 0:00 ./ndb_mgmd
me 23096 0.0 0.0 35876 2036 ? Ss 14:14 0:00 ./ndbmtd -c 127.0.0.1 --initial
me 23097 1.0 2.4 524116 91096 ? Sl 14:14 0:00 ./ndbmtd -c 127.0.0.1 --initial
me 23168 0.0 0.0 5284 812 pts/2 R+ 14:15 0:00 grep ndb
The ndbd process showing
0.0
for both memory and CPU usage is the
angel process (although it actually does use a very small amount
of each). This process merely checks to see if the main
ndbd or ndbmtd process
(the primary data node process which actually handles the data)
is running. If permitted to do so (for example, if the
StopOnError
configuration parameter is set to false), the
angel process tries to restart the primary data node process.