Achieving the best performance from an NDB Cluster depends on a number of factors including the following:
NDB Cluster software version
Numbers of data nodes and SQL nodes
Amount of data to be stored
Size and type of load under which the cluster is to operate
Therefore, obtaining an optimum configuration is likely to be an iterative process, the outcome of which can vary widely with the specifics of each NDB Cluster deployment. Changes in configuration are also likely to be indicated when changes are made in the platform on which the cluster is run, or in applications that use the NDB Cluster 's data. For these reasons, it is not possible to offer a single configuration that is ideal for all usage scenarios. However, in this section, we provide a recommended base configuration.
Starting config.ini file.
config.ini file is a
recommended starting point for configuring a cluster running
NDB Cluster 7.5:
# TCP PARAMETERS [tcp default] SendBufferMemory=2M ReceiveBufferMemory=2M # Increasing the sizes of these 2 buffers beyond the default values # helps prevent bottlenecks due to slow disk I/O. # MANAGEMENT NODE PARAMETERS [ndb_mgmd default] DataDir=path/to/management/server/data/directory # It is possible to use a different data directory for each management # server, but for ease of administration it is preferable to be # consistent. [ndb_mgmd] HostName=management-server-A-hostname # NodeId=management-server-A-nodeid [ndb_mgmd] HostName=management-server-B-hostname # NodeId=management-server-B-nodeid # Using 2 management servers helps guarantee that there is always an # arbitrator in the event of network partitioning, and so is # recommended for high availability. Each management server must be # identified by a HostName. You may for the sake of convenience specify # a NodeId for any management server, although one will be allocated # for it automatically; if you do so, it must be in the range 1-255 # inclusive and must be unique among all IDs specified for cluster # nodes. # DATA NODE PARAMETERS [ndbd default] NoOfReplicas=2 # Using 2 replicas is recommended to guarantee availability of data; # using only 1 replica does not provide any redundancy, which means # that the failure of a single data node causes the entire cluster to # shut down. We do not recommend using more than 2 replicas, since 2 is # sufficient to provide high availability, and we do not currently test # with greater values for this parameter. LockPagesInMainMemory=1 # On Linux and Solaris systems, setting this parameter locks data node # processes into memory. Doing so prevents them from swapping to disk, # which can severely degrade cluster performance. DataMemory=3072M IndexMemory=384M # The values provided for DataMemory and IndexMemory assume 4 GB RAM # per data node. However, for best results, you should first calculate # the memory that would be used based on the data you actually plan to # store (you may find the ndb_size.pl utility helpful in estimating # this), then allow an extra 20% over the calculated values. Naturally, # you should ensure that each data node host has at least as much # physical memory as the sum of these two values. # ODirect=1 # Enabling this parameter causes NDBCLUSTER to try using O_DIRECT # writes for local checkpoints and redo logs; this can reduce load on # CPUs. We recommend doing so when using NDB Cluster on systems running # Linux kernel 2.6 or later. NoOfFragmentLogFiles=300 DataDir=path/to/data/node/data/directory MaxNoOfConcurrentOperations=100000 SchedulerSpinTimer=400 SchedulerExecutionTimer=100 RealTimeScheduler=1 # Setting these parameters allows you to take advantage of real-time scheduling # of NDB threads to achieve increased throughput when using ndbd. They # are not needed when using ndbmtd; in particular, you should not set # RealTimeScheduler for ndbmtd data nodes. TimeBetweenGlobalCheckpoints=1000 TimeBetweenEpochs=200 RedoBuffer=32M # CompressedLCP=1 # CompressedBackup=1 # Enabling CompressedLCP and CompressedBackup causes, respectively, local checkpoint files and backup files to be compressed, which can result in a space savings of up to 50% over noncompressed LCPs and backups. # MaxNoOfLocalScans=64 MaxNoOfTables=1024 MaxNoOfOrderedIndexes=256 [ndbd] HostName=data-node-A-hostname # NodeId=data-node-A-nodeid LockExecuteThreadToCPU=1 LockMaintThreadsToCPU=0 # On systems with multiple CPUs, these parameters can be used to lock NDBCLUSTER # threads to specific CPUs [ndbd] HostName=data-node-B-hostname # NodeId=data-node-B-nodeid LockExecuteThreadToCPU=1 LockMaintThreadsToCPU=0 # You must have an [ndbd] section for every data node in the cluster; # each of these sections must include a HostName. Each section may # optionally include a NodeId for convenience, but in most cases, it is # sufficient to allow the cluster to allocate node IDs dynamically. If # you do specify the node ID for a data node, it must be in the range 1 # to 48 inclusive and must be unique among all IDs specified for # cluster nodes. # SQL NODE / API NODE PARAMETERS [mysqld] # HostName=sql-node-A-hostname # NodeId=sql-node-A-nodeid [mysqld] [mysqld] # Each API or SQL node that connects to the cluster requires a [mysqld] # or [api] section of its own. Each such section defines a connection # “slot”; you should have at least as many of these sections in the # config.ini file as the total number of API nodes and SQL nodes that # you wish to have connected to the cluster at any given time. There is # no performance or other penalty for having extra slots available in # case you find later that you want or need more API or SQL nodes to # connect to the cluster at the same time. # If no HostName is specified for a given [mysqld] or [api] section, # then any API or SQL node may use that slot to connect to the # cluster. You may wish to use an explicit HostName for one connection slot # to guarantee that an API or SQL node from that host can always # connect to the cluster. If you wish to prevent API or SQL nodes from # connecting from other than a desired host or hosts, then use a # HostName for every [mysqld] or [api] section in the config.ini file. # You can if you wish define a node ID (NodeId parameter) for any API or # SQL node, but this is not necessary; if you do so, it must be in the # range 1 to 255 inclusive and must be unique among all IDs specified # for cluster nodes.
Recommended my.cnf options for SQL nodes.
MySQL Servers acting as NDB Cluster SQL nodes must always be
started with the
--ndb-connectstring options, either on
the command line or in
addition, set the following options for all
mysqld processes in the cluster, unless
your setup requires otherwise: