In the following section, we answer questions that are frequently
asked about MySQL Cluster and the
NDBCLUSTER storage engine.
- A.10.1. Which versions of the MySQL software support Cluster? Do I have to compile from source?
- A.10.2. What do “NDB” and “NDBCLUSTER” mean?
- A.10.3. What is the difference between using MySQL Cluster versus using MySQL Replication?
- A.10.4. Do I need any special networking to run MySQL Cluster? How do computers in a cluster communicate?
- A.10.5. How many computers do I need to run a MySQL Cluster, and why?
- A.10.6. What do the different computers do in a MySQL Cluster?
- A.10.7. With which operating systems can I use MySQL Cluster?
- A.10.8. What are the hardware requirements for running MySQL Cluster?
- A.10.9. How much RAM do I need to use MySQL Cluster? Is it possible to use disk memory at all?
- A.10.10. What file systems can I use with MySQL Cluster? What about network file systems or network shares?
- A.10.11. Can I run MySQL Cluster nodes inside virtual machines (such as those created by VMware, Parallels, or Xen)?
- A.10.12. I am trying to populate a MySQL Cluster database. The loading process terminates prematurely and I get an error message like this one:
- A.10.13. MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
- A.10.14. Do I have to learn a new programming or query language to use MySQL Cluster?
- A.10.15. What programming languages and APIs are supported by MySQL Cluster?
- A.10.16. Does MySQL Cluster include any management tools?
- A.10.17. How do I find out what an error or warning message means when using MySQL Cluster?
- A.10.18. Is MySQL Cluster transaction-safe? What isolation levels are supported?
- A.10.19. What storage engines are supported by MySQL Cluster?
- A.10.20. In the event of a catastrophic failure—say, for instance, the whole city loses power and my UPS fails—would I lose all my data?
- A.10.21. Is it possible to use FULLTEXT indexes with MySQL Cluster?
- A.10.22. Can I run multiple nodes on a single computer?
- A.10.23. Can I add data nodes to a MySQL Cluster without restarting it?
- A.10.24. Are there any limitations that I should be aware of when using MySQL Cluster?
- A.10.25. Does MySQL Cluster support foreign keys?
- A.10.26. How do I import an existing MySQL database into a MySQL Cluster?
- A.10.27. How do MySQL Cluster nodes communicate with one another?
- A.10.28. What is an arbitrator?
- A.10.29. What data types are supported by MySQL Cluster?
- A.10.30. How do I start and stop MySQL Cluster?
- A.10.31. What happens to MySQL Cluster data when the MySQL Cluster is shut down?
- A.10.32. Is it a good idea to have more than one management node for a MySQL Cluster?
- A.10.33. Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
- A.10.34. Can I run two data nodes on a single host? Two SQL nodes?
- A.10.35. Can I use host names with MySQL Cluster?
- A.10.36. How do I handle MySQL users in a MySQL Cluster having multiple MySQL servers?
- A.10.37. How do I continue to send queries in the event that one of the SQL nodes fails?
- A.10.38. How do I back up and restore a MySQL Cluster?
- A.10.39. What is an “angel process”?
Which versions of the MySQL software support Cluster? Do I have to compile from source?
MySQL Cluster is supported in all server binaries in the
5.0 release series for operating systems on which
MySQL Cluster is available. See Section 4.3.1, “mysqld — The MySQL Server”. You
can determine whether your server has
You can also obtain
You should use MySQL Cluster NDB 7.3 or MySQL Cluster NDB 7.4 for new deployments; if you are currently using an older version of MySQL Cluster, you should upgrade to one of these versions as soon as possible. For an overview of improvements made in MySQL Cluster NDB 7.3 and 7.4, see What is New in MySQL Cluster NDB 7.3, and What is New in MySQL Cluster NDB 7.4, respectively.
What do “NDB” and “NDBCLUSTER” mean?
“NDB” stands for
What is the difference between using MySQL Cluster versus using MySQL Replication?
In traditional MySQL replication, a master MySQL server updates
one or more slaves. Transactions are committed sequentially, and
a slow transaction can cause the slave to lag behind the master.
This means that if the master fails, it is possible that the
slave might not have recorded the last few transactions. If a
transaction-safe engine such as
In short, whereas standard MySQL replication is asynchronous, MySQL Cluster is synchronous.
We have implemented (asynchronous) replication for Cluster in MySQL 5.1 and later. MySQL Cluster Replication (also sometimes known as “geo-replication”) includes the capability to replicate both between two MySQL Clusters, and from a MySQL Cluster to a non-Cluster MySQL server. However, we do not plan to backport this functionality to MySQL 5.0. See MySQL Cluster Replication.
Do I need any special networking to run MySQL Cluster? How do computers in a cluster communicate?
MySQL Cluster is intended to be used in a high-bandwidth environment, with computers connecting using TCP/IP. Its performance depends directly upon the connection speed between the cluster's computers. The minimum connectivity requirements for MySQL Cluster include a typical 100-megabit Ethernet network or the equivalent. We recommend you use gigabit Ethernet whenever available.
How many computers do I need to run a MySQL Cluster, and why?
A minimum of three computers is required to run a viable cluster. However, the minimum recommended number of computers in a MySQL Cluster is four: one each to run the management and SQL nodes, and two computers to serve as data nodes. The purpose of the two data nodes is to provide redundancy; the management node must run on a separate machine to guarantee continued arbitration services in the event that one of the data nodes fails.
To provide increased throughput and high availability, you should use multiple SQL nodes (MySQL Servers connected to the cluster). It is also possible (although not strictly necessary) to run multiple management servers.
What do the different computers do in a MySQL Cluster?
A MySQL Cluster has both a physical and logical organization, with computers being the physical elements. The logical or functional elements of a cluster are referred to as nodes, and a computer housing a cluster node is sometimes referred to as a cluster host. There are three types of nodes, each corresponding to a specific role within the cluster. These are:
With which operating systems can I use MySQL Cluster?
MySQL Cluster is supported on most Unix-like operating systems. Beginning with MySQL Cluster NDB 7.1.3, MySQL Cluster is also supported in production on Microsoft Windows operating systems.
We do not intend to provide any level of support on Windows for MySQL Cluster in MySQL 5.0; you must use MySQL Cluster NDB 7.1.3 or later to obtain GA-level support for MySQL Cluster in a Windows environment. See What is New in MySQL Cluster NDB 7.1, for more information.
For more detailed information concerning the level of support which is offered for MySQL Cluster on various operating system versions, operating system distributions, and hardware platforms, please refer to http://www.mysql.com/support/supportedplatforms/cluster.html.
What are the hardware requirements for running MySQL Cluster?
MySQL Cluster should run on any platform for which
How much RAM do I need to use MySQL Cluster? Is it possible to use disk memory at all?
In MySQL 5.0, Cluster is in-memory only. This means that all table data (including indexes) is stored in RAM. Therefore, if your data takes up 1 GB of space and you want to replicate it once in the cluster, you need 2 GB of memory to do so (1 GB per replica). This is in addition to the memory required by the operating system and any applications running on the cluster computers.
If a data node's memory usage exceeds what is available in
RAM, then the system will attempt to use swap space up to the
limit set for
We have implemented disk data storage for MySQL Cluster in MySQL 5.1 and later but we have no plans to add this capability in MySQL 5.0. See MySQL Cluster Disk Data Tables, for more information.
You can use the following formula for obtaining a rough estimate of how much RAM is needed for each data node in the cluster:
(SizeofDatabase × NumberOfReplicas × 1.1) / NumberOfDataNodes
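As a rough illustration, the formula above can be written as a small Python function. This is only a sketch: the function and parameter names are made up for this example, and the 1.1 factor is simply the multiplier from the formula.

```python
def estimate_ram_per_data_node(database_size_gb, number_of_replicas,
                               number_of_data_nodes, overhead_factor=1.1):
    """Rough RAM estimate (in GB) per data node, following the FAQ formula.

    All names here are hypothetical; overhead_factor is the 1.1 multiplier
    that appears in the formula.
    """
    if number_of_data_nodes <= 0:
        raise ValueError("number_of_data_nodes must be positive")
    total = database_size_gb * number_of_replicas * overhead_factor
    return total / number_of_data_nodes


# Example: a 1 GB database kept in 2 replicas on 2 data nodes.
estimate = estimate_ram_per_data_node(1, 2, 2)
print(f"Estimated RAM per data node: {estimate:.2f} GB")
```

The result excludes memory needed by the operating system and by other applications on the same host, which must be added separately.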
Calculating the memory requirements more exactly requires determining, for each table in the cluster database, the storage space required per row (see Section 11.7, “Data Type Storage Requirements”, for details), and multiplying this by the number of rows. You must also remember to account for any column indexes as follows:
Creating MySQL Cluster tables with
When calculating Cluster memory requirements, you may find
the ndb_size.pl utility useful; it is
available in recent MySQL 5.0 releases. This Perl
script connects to a current (non-Cluster) MySQL database and
creates a report on how much space that database would require
if it used the
It is especially important to keep in mind that every
MySQL Cluster table must have a primary key. The
There is no easy way to determine exactly how much memory is
being used for storage of MySQL Cluster indexes at any given
time; however, warnings are written to the cluster log when 80%
What file systems can I use with MySQL Cluster? What about network file systems or network shares?
Generally, any file system that is native to the host operating system should work well with MySQL Cluster. If you find that a given file system works particularly well (or particularly poorly) with MySQL Cluster, we invite you to discuss your findings in the MySQL Cluster Forums.
We do not test MySQL Cluster with
MySQL Cluster is implemented as a shared-nothing solution; the idea behind this is that the failure of a single piece of hardware should not cause the failure of multiple cluster nodes, or possibly even the failure of the cluster as a whole. For this reason, the use of network shares or network file systems is not supported for MySQL Cluster. This also applies to shared storage devices such as SANs.
Can I run MySQL Cluster nodes inside virtual machines (such as those created by VMware, Parallels, or Xen)?
This is possible but not recommended for a production environment with MySQL Cluster versions prior to MySQL Cluster NDB 7.2.
For deployment in virtualized environments, you should use MySQL Cluster NDB 7.2 or later.
I am trying to populate a MySQL Cluster database. The loading process terminates prematurely and I get an error message like this one:
Why is this happening?
The cause is very likely to be that your setup does not provide
sufficient RAM for all table data and all indexes,
including the primary key required by the
It is also worth noting that all data nodes should have the same amount of RAM, since no data node in a cluster can use more memory than the least amount available to any individual data node. For example, if there are four computers hosting Cluster data nodes, and three of these have 3GB of RAM available to store Cluster data while the remaining data node has only 1GB RAM, then each data node can devote at most 1GB to MySQL Cluster data and indexes.
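The effect of unequal RAM can be sketched in Python as follows. The function names are hypothetical, and the figures are the ones from the example above (three hosts with 3GB and one with 1GB); this is a simplified illustration of the "least amount available" rule, not a sizing tool.

```python
def usable_memory_per_node_gb(ram_per_node_gb):
    """Memory each data node can devote to Cluster data and indexes.

    Per the rule above, no data node can use more than the smallest
    amount available to any individual data node.
    """
    if not ram_per_node_gb:
        raise ValueError("at least one data node is required")
    return min(ram_per_node_gb)


def total_usable_memory_gb(ram_per_node_gb):
    """Usable memory summed over all data nodes under the same rule."""
    return usable_memory_per_node_gb(ram_per_node_gb) * len(ram_per_node_gb)


available = [3, 3, 3, 1]  # GB available for Cluster data on each host
print("Per node:", usable_memory_per_node_gb(available), "GB")
print("Usable across the cluster:", total_usable_memory_gb(available),
      "GB out of", sum(available), "GB available")
```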
In some cases it is possible to get Table is
full errors in MySQL client applications even when
ndb_mgm -e "ALL REPORT MEMORYUSAGE" shows
For similar reasons, you can also sometimes encounter problems
with data node restarts on nodes that are heavily loaded with
data. In MySQL Cluster NDB 7.1 and later, the addition of the
MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
It is very unlikely that a cluster would perform reliably under such conditions, as MySQL Cluster was designed and implemented with the assumption that it would be run under conditions guaranteeing dedicated high-speed connectivity such as that found in a LAN setting using 100 Mbps or gigabit Ethernet—preferably the latter. We neither test nor warrant its performance using anything slower than this.
Also, it is extremely important to keep in mind that communications between the nodes in a MySQL Cluster are not secure; they are neither encrypted nor safeguarded by any other protective mechanism. The most secure configuration for a cluster is in a private network behind a firewall, with no direct access to any Cluster data or management nodes from outside. (For SQL nodes, you should take the same precautions as you would with any other instance of the MySQL server.) For more information, see Section 17.5.10, “MySQL Cluster Security Issues”.
Do I have to learn a new programming or query language to use MySQL Cluster?
No. Although some specialized commands are used to manage and configure the cluster itself, only standard (My)SQL statements are required for the following operations:
Some specialized configuration parameters and files are required to set up a MySQL Cluster—see Section 17.3.3, “MySQL Cluster Configuration Files”, for information about these.
A few simple commands are used in the MySQL Cluster management client (ndb_mgm) for tasks such as starting and stopping cluster nodes. See Section 17.5.2, “Commands in the MySQL Cluster Management Client”.
What programming languages and APIs are supported by MySQL Cluster?
MySQL Cluster 5.0 supports the same programming APIs and languages as the standard MySQL Server, including ODBC, .NET, the MySQL C API, and numerous drivers for popular scripting languages such as PHP, Perl, and Python. MySQL Cluster applications written using these APIs behave similarly to other MySQL applications; they transmit SQL statements to a MySQL Server (in the case of MySQL Cluster, an SQL node), and receive responses containing rows of data. For more information about these APIs, see Chapter 20, Connectors and APIs.
Does MySQL Cluster include any management tools?
MySQL Cluster includes a command line client for performing basic management functions. See Section 17.4.3, “ndb_mgm — The MySQL Cluster Management Client”, and Section 17.5.2, “Commands in the MySQL Cluster Management Client”.
How do I find out what an error or warning message means when using MySQL Cluster?
There are two ways in which this can be done:
Is MySQL Cluster transaction-safe? What isolation levels are supported?
What storage engines are supported by MySQL Cluster?
Clustering with MySQL is supported only by the
It is possible to create tables using other storage engines
In the event of a catastrophic failure—say, for instance, the whole city loses power and my UPS fails—would I lose all my data?
All committed transactions are logged. Therefore, although it is possible that some data could be lost in the event of a catastrophe, this should be quite limited. Data loss can be further reduced by minimizing the number of operations per transaction. (It is not a good idea to perform large numbers of operations per transaction in any case.)
Is it possible to use
Can I run multiple nodes on a single computer?
It is possible but not advisable. One of the chief reasons to run a cluster is to provide redundancy. To obtain the full benefits of this redundancy, each node should reside on a separate machine. If you place multiple nodes on a single machine and that machine fails, you lose all of those nodes. Given that MySQL Cluster can be run on commodity hardware loaded with a low-cost (or even no-cost) operating system, the expense of an extra machine or two is well worth it to safeguard mission-critical data. It is also worth noting that the requirements for a cluster host running a management node are minimal. This task can be accomplished with a 300 MHz Pentium or equivalent CPU and sufficient RAM for the operating system, plus a small amount of overhead for the ndb_mgmd and ndb_mgm processes.
It is acceptable to run multiple cluster data nodes on a single host for learning about MySQL Cluster, or for testing purposes; however, this is not generally supported for production use.
Can I add data nodes to a MySQL Cluster without restarting it?
Not in MySQL 5.0. While a rolling restart is all that is required for adding new management or API nodes to a MySQL Cluster (see Section 17.5.5, “Performing a Rolling Restart of a MySQL Cluster”), adding data nodes is more complex, and requires the following steps:
Beginning with MySQL Cluster NDB 6.4, it is possible to add new data nodes to a running MySQL Cluster without taking it offline. For more information, see Adding MySQL Cluster Data Nodes Online. However, we do not plan to add this capability in MySQL 5.0.
Are there any limitations that I should be aware of when using MySQL Cluster?
For a complete listing of limitations in MySQL Cluster, see Section 17.1.5, “Known Limitations of MySQL Cluster”. See also Section 22.214.171.124, “Previous MySQL Cluster Issues Resolved in MySQL 5.0”.
Does MySQL Cluster support foreign keys?
Foreign key support comparable to that found in the
How do I import an existing MySQL database into a MySQL Cluster?
You can import databases into MySQL Cluster much as you would
with any other version of MySQL. Apart from the limitations
mentioned elsewhere in this FAQ, the only special
requirement is that any tables to be included in the cluster
must use the
It is also possible to convert existing tables that use other
storage engines to
How do MySQL Cluster nodes communicate with one another?
Cluster nodes can communicate through any of three different transport mechanisms: TCP/IP, SHM (shared memory), and SCI (Scalable Coherent Interface). Where available, SHM is used by default between nodes residing on the same cluster host; however, this is considered experimental. SCI is a high-speed (1 gigabit per second and higher), high-availability protocol used in building scalable multi-processor systems; it requires special hardware and drivers. See Section 17.3.4, “Using High-Speed Interconnects with MySQL Cluster”, for more about using SCI as a transport mechanism for MySQL Cluster.
What is an arbitrator?
If one or more data nodes in a cluster fail, it is possible that not all cluster data nodes will be able to “see” one another. In fact, it is possible that two sets of data nodes might become isolated from one another in a network partitioning, also known as a “split-brain” scenario. This type of situation is undesirable because each set of data nodes tries to behave as though it is the entire cluster. An arbitrator is required to decide between the competing sets of data nodes.
When all data nodes in at least one node group are alive,
network partitioning is not an issue, because no single subset
of the cluster can form a functional cluster on its own. The
real problem arises when no single node group has all its nodes
alive, in which case network partitioning (the
“split-brain” scenario) becomes possible. Then an
arbitrator is required. All cluster nodes recognize the same
node as the arbitrator, which is normally the management server;
however, it is possible to configure any of the MySQL Servers in
the cluster to act as the arbitrator instead. The arbitrator
accepts the first set of cluster nodes to contact it, and tells
the remaining set to shut down. Arbitrator selection is
controlled by the
The role of arbitrator does not in and of itself impose any heavy demands upon the host so designated, and thus the arbitrator host does not need to be particularly fast or to have extra memory especially for this purpose.
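The first-come rule described in this answer (the arbitrator accepts the first set of nodes to contact it and tells the remaining set to shut down) can be modeled with a toy Python class. This is a conceptual sketch only: the class, its method, and the node IDs are invented for illustration and do not represent the actual NDB arbitration protocol or any MySQL Cluster API.

```python
class ToyArbitrator:
    """Toy model of first-come arbitration between partitioned node sets."""

    def __init__(self):
        self.accepted_set = None  # the first node set to make contact

    def request(self, node_ids):
        """Return the decision for the set of data nodes asking to continue."""
        node_set = frozenset(node_ids)
        if self.accepted_set is None:
            # The first set to reach the arbitrator wins.
            self.accepted_set = node_set
        if node_set == self.accepted_set:
            return "continue"
        return "shut down"


arbitrator = ToyArbitrator()
# Two isolated sets of data nodes (hypothetical node IDs) contact the
# arbitrator one after the other.
print("Nodes 2 and 3:", arbitrator.request({2, 3}))
print("Nodes 4 and 5:", arbitrator.request({4, 5}))
```

Because only one set can ever be accepted, at most one side of a partition keeps running, which is how the "split-brain" scenario is avoided.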
What data types are supported by MySQL Cluster?
In MySQL 5.0, MySQL Cluster supports all of the usual MySQL
data types, including (beginning with MySQL 5.0.16) those
associated with MySQL's spatial extensions; however, the
In MySQL 5.0, MySQL Cluster tables (that is, tables created
See Section 17.1.5, “Known Limitations of MySQL Cluster”, for more information about these issues.
How do I start and stop MySQL Cluster?
It is necessary to start each node in the cluster separately, in the following order:
Each of these commands must be run from a system shell on the
machine housing the affected node. (You do not have to be
physically present at the machine—a remote login shell can
be used for this purpose.) You can verify that the cluster is
running by starting the
To shut down a running cluster, issue the command
(The quotation marks in this example are optional, since there
are no spaces in the command string following the
For more information, see Section 17.5.2, “Commands in the MySQL Cluster Management Client”, and Section 17.2.5, “Safe Shutdown and Restart of MySQL Cluster”.
What happens to MySQL Cluster data when the MySQL Cluster is shut down?
The data that was held in memory by the cluster's data nodes is written to disk, and is reloaded into memory the next time that the cluster is started.
Is it a good idea to have more than one management node for a MySQL Cluster?
It can be helpful as a fail-safe. Only one management node controls the cluster at any given time, but it is possible to configure one management node as primary, and one or more additional management nodes to take over in the event that the primary management node fails.
See Section 17.3.3, “MySQL Cluster Configuration Files”, for information on how to configure MySQL Cluster management nodes.
Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
Yes, as long as all machines and operating systems have the same “endianness” (all big-endian or all little-endian).
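One simple way to check this is to read the byte order on each host and compare the results. The sketch below uses Python's standard sys.byteorder attribute for the local machine; the helper function and the idea of collecting the values by hand from each host are assumptions made for this example.

```python
import sys


def same_endianness(byte_orders):
    """Return True if every reported byte order is identical.

    byte_orders is a list of strings such as "little" or "big",
    gathered (by whatever means you prefer) from each cluster host.
    """
    return len(set(byte_orders)) <= 1


# Byte order of the machine running this script: "little" or "big".
print("This host is", sys.byteorder + "-endian")

# Hypothetical values collected from three hosts.
print("Hosts compatible:", same_endianness(["little", "little", "little"]))
```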
It is also possible to use software from different MySQL Cluster releases on different nodes. However, we support this only as part of a rolling upgrade procedure (see Section 17.5.5, “Performing a Rolling Restart of a MySQL Cluster”).
Can I run two data nodes on a single host? Two SQL nodes?
Yes, it is possible to do this. In the case of multiple data nodes, it is advisable (but not required) for each node to use a different data directory. If you want to run multiple SQL nodes on one machine, each instance of mysqld must use a different TCP/IP port. However, in MySQL 5.0, running more than one cluster node of a given type per machine is generally not encouraged or supported for production use.
Can I use host names with MySQL Cluster?
Yes, it is possible to use DNS and DHCP for cluster hosts. However, if your application requires “five nines” availability, you should use fixed (numeric) IP addresses, since making communication between Cluster hosts dependent on services such as DNS and DHCP introduces additional potential points of failure.
How do I handle MySQL users in a MySQL Cluster having multiple MySQL servers?
MySQL user accounts and privileges are not automatically propagated between different MySQL servers accessing the same MySQL Cluster. Therefore, you must make sure that these are copied between the SQL nodes yourself. You can do this manually, or automate the task with scripts.
How do I continue to send queries in the event that one of the SQL nodes fails?
MySQL Cluster does not provide any sort of automatic failover between SQL nodes. Your application must be prepared to handle the loss of SQL nodes and to fail over between them.
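One way an application might handle this is to try a list of SQL nodes in order until a connection succeeds. The following Python sketch shows the idea under stated assumptions: the host names are placeholders, and connect stands for whatever function your MySQL driver provides for opening a connection (here it is assumed to raise ConnectionError on failure, which a real driver may not do).

```python
def connect_with_failover(hosts, connect):
    """Try each SQL node in turn and return (host, connection).

    hosts   -- ordered list of SQL node addresses (placeholders here)
    connect -- callable taking a host and returning a connection;
               assumed to raise ConnectionError if the node is down
    """
    failed = []
    for host in hosts:
        try:
            return host, connect(host)
        except ConnectionError:
            failed.append(host)  # remember the failure, try the next node
    raise ConnectionError("no SQL node reachable; tried: " + ", ".join(failed))


def demo_connect(host):
    """Stand-in for a real driver call; pretends the first node is down."""
    if host == "sql1.example.com":
        raise ConnectionError(host + " is unavailable")
    return "connection to " + host


host, connection = connect_with_failover(
    ["sql1.example.com", "sql2.example.com"], demo_connect)
print("Using", host)
```

A production application would typically also retry after a delay and re-run any transaction that was interrupted when the SQL node failed.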
How do I back up and restore a MySQL Cluster?
You can use the NDB native backup and restore functionality in the MySQL Cluster management client and the ndb_restore program. See Section 17.5.3, “Online Backup of MySQL Cluster”, and Section 17.4.14, “ndb_restore — Restore a MySQL Cluster Backup”.
You can also use the traditional functionality provided for this purpose in mysqldump and the MySQL server. See Section 4.5.4, “mysqldump — A Database Backup Program”, for more information.
What is an “angel process”?
This process monitors and, if necessary, attempts to restart the data node process. If you check the list of active processes on your system after starting ndbd, you can see that there are actually two processes running by that name, as shown here (we omit the output from ndb_mgmd and ndbd for brevity):
The ndbd process showing 0 memory and CPU
usage is the angel process. It actually does use a very small
amount of each, of course. It simply checks to see if the main
ndbd process (the primary data node process
that actually handles the data) is running. If permitted to do
so (for example, if the