It is possible to take a backup with multiple local data
managers (LDMs) acting in parallel on the data nodes. For this
to work, all data nodes in the cluster must use multiple LDMs,
and each data node must use the same number of LDMs. This means
that all data nodes must run ndbmtd
(ndbd is single-threaded and thus always has
only one LDM) and they must be configured to use multiple LDMs
before taking the backup; ndbmtd by default
runs in single-threaded mode. You can cause them to use multiple
LDMs by choosing an appropriate setting for one of the
multi-threaded data node configuration parameters
MaxNoOfExecutionThreads or ThreadConfig. Keep
in mind that changing these parameters requires a restart of the
cluster; this can be a rolling restart. In addition, the
EnableMultithreadedBackup
parameter must be set to 1 for each data node (this is the
default).
Depending on the number of LDMs and other factors, you may also
need to increase
NoOfFragmentLogParts.
If you are using large Disk Data tables, you may also need to
increase
DiskPageBufferMemory. As
with single-threaded backups, you may also want or need to make
adjustments to settings for
BackupDataBufferSize, BackupMemory, and other
configuration parameters relating to backups (see
Backup parameters).
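As a rough illustration of the settings discussed above, a config.ini fragment along the following lines would request multiple LDMs on every data node. The value 8 is an assumption made only for this example. The number of LDM threads that results from a given MaxNoOfExecutionThreads value depends on the NDB version, so check that parameter's documentation before choosing one.

```ini
[ndbd default]
# Illustrative value only: choose one that yields more than one LDM
# thread per data node in your NDB version.
MaxNoOfExecutionThreads=8
# 1 is already the default; it is listed here only to make the
# requirement explicit.
EnableMultithreadedBackup=1
```

Placing both parameters in the [ndbd default] section applies the same values to every data node, which is in line with the requirement that all data nodes use the same number of LDMs.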
Once all data nodes are using multiple LDMs, you can take the
parallel backup using the START BACKUP
command in the NDB management client just as
you would if the data nodes were running ndbd
(or ndbmtd in single-threaded mode); no
additional or special syntax is required, and you can specify a
backup ID, wait option, or snapshot option in any combination as
needed or desired.
Backups using multiple LDMs create subdirectories, one per LDM,
under the directory BACKUP/BACKUP-backup_id/
(which in turn resides under the BackupDataDir) on each
data node; these subdirectories are named
BACKUP-backup_id-PART-1-OF-N/,
BACKUP-backup_id-PART-2-OF-N/,
and so on, up to BACKUP-backup_id-PART-N-OF-N/,
where backup_id is the backup ID used
for this backup and N is the number
of LDMs per data node. Each of these subdirectories contains the
usual backup files
BACKUP-backup_id-0.node_id.Data,
BACKUP-backup_id.node_id.ctl,
and BACKUP-backup_id.node_id.log, where
node_id is the node ID of this data
node.
ndb_restore automatically checks for the presence of the subdirectories just described; if it finds them, it attempts to restore the backup in parallel. For information about restoring backups taken with multiple LDMs, see Restoring from a backup taken in parallel.
To force creation of a single-threaded backup, set
EnableMultithreadedBackup = 0
for all data nodes (you can do this by setting the
parameter in the [ndbd default] section of
the config.ini global configuration file).
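A minimal sketch of such a setting, assuming no other parameters in the [ndbd default] section need to change, might look like this:

```ini
[ndbd default]
# Force single-threaded backups on all data nodes.
EnableMultithreadedBackup=0
```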
It is also possible to restore a parallel backup to a cluster
running an older version of NDB. See
Restoring an NDB backup to a previous version of NDB Cluster for more
information.