ndb_import imports CSV-formatted data, such
as that produced by mysqldump
--tab, directly into
NDB using the NDB API.
ndb_import requires a connection to an NDB
management server (ndb_mgmd) to function; it
does not require a connection to a MySQL Server.
Usage
ndb_import db_name file_name options
ndb_import requires two arguments.
db_name is the name of the database
containing the table into which the data is to be imported;
file_name is the name of the CSV file
from which to read the data; this must include the path to this
file if it is not in the current directory. The name of the file
must match that of the table; the file's extension, if any,
is not taken into consideration. Options supported by
ndb_import include those for specifying field
separators, escapes, and line terminators, and are described
later in this section.
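Because the table name is taken from the file name, it can help to confirm which table a given file maps to before starting an import. The following is a minimal sketch in POSIX shell, not part of ndb_import itself: the function name table_name_for is hypothetical, and it assumes that only the directory part and the last extension are dropped (how a name containing several dots is treated is not stated here).

```shell
# Hypothetical helper: print the table name that a CSV file path maps to,
# assuming the directory part and the last extension (if any) are ignored.
table_name_for() {
    file=${1##*/}                 # drop the directory part
    printf '%s\n' "${file%.*}"    # drop the last ".ext", if present
}

table_name_for /tmp/myndb_table.csv    # prints: myndb_table
```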
ndb_import rejects any empty lines read from the CSV file.
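Since empty lines are rejected, checking the input file for them before importing may save a failed run. This is a minimal sketch, assuming lines are terminated by \n and that "empty" means a line with no characters at all; the function name count_empty_lines is hypothetical.

```shell
# Hypothetical helper: print the number of empty lines in a file.
count_empty_lines() {
    # grep -c prints the number of matching lines; it exits with status 1
    # when that number is 0, so "|| true" keeps the function's status at 0.
    grep -c '^$' "$1" || true
}

# Example use (the path is illustrative):
#   count_empty_lines /tmp/myndb_table.csv
```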
ndb_import must be able to connect to an NDB
Cluster management server; for this reason, there must be an
unused [api] slot in the cluster
config.ini file.
To duplicate an existing table that uses a different storage
engine, such as InnoDB, as an
NDB table, first use the mysql
client to export the existing table to a CSV file with a
SELECT INTO
OUTFILE statement. Next, execute a
CREATE TABLE
LIKE statement to create a new table having the same
structure as the existing table, and then perform
ALTER TABLE ...
ENGINE=NDB on the new table. Finally, from the
system shell, invoke ndb_import to load the
data into the new NDB table. For example, an
existing InnoDB table named
myinnodb_table in a database named
myinnodb can be exported into an
NDB table named
myndb_table in a database named
myndb as shown here, assuming that you are
already logged in as a MySQL user with the appropriate
privileges:
In the mysql client:
    mysql> USE myinnodb;

    mysql> SELECT * INTO OUTFILE '/tmp/myndb_table.csv'
         >  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
         >  LINES TERMINATED BY '\n'
         >  FROM myinnodb_table;

    mysql> CREATE DATABASE myndb;

    mysql> USE myndb;

    mysql> CREATE TABLE myndb_table LIKE myinnodb.myinnodb_table;

    mysql> ALTER TABLE myndb_table ENGINE=NDB;

    mysql> EXIT;
    Bye
    $>

Once the target database and table have been created, a running mysqld is no longer required. You can stop it using mysqladmin shutdown or another method before proceeding, if you wish.
In the system shell:
    # if you are not already in the MySQL bin directory:
    $> cd path-to-mysql-bin-dir

    $> ndb_import myndb /tmp/myndb_table.csv --fields-optionally-enclosed-by='"' \
        --fields-terminated-by="," --fields-escaped-by='\\'

The output should resemble what is shown here:
    job-1 import myndb.myndb_table from /tmp/myndb_table.csv
    job-1 [running] import myndb.myndb_table from /tmp/myndb_table.csv
    job-1 [success] import myndb.myndb_table from /tmp/myndb_table.csv
    job-1 imported 19984 rows in 0h0m9s at 2277 rows/s
    jobs summary: defined: 1 run: 1 with success: 1 with failure: 0
    $>
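A script that drives ndb_import may want to check the job summary instead of reading it by eye. The sketch below assumes that the summary line has exactly the shape shown in the sample output (the text says only that real output should resemble it); the function name failed_jobs is hypothetical.

```shell
# Hypothetical helper: read ndb_import output on standard input and print
# the number of failed jobs taken from the "jobs summary" line. Prints
# nothing if no line of the assumed shape is found.
failed_jobs() {
    sed -n 's/^jobs summary:.* with failure: \([0-9][0-9]*\).*$/\1/p'
}

# Feed it the last two lines of the sample output:
printf '%s\n' \
    'job-1 imported 19984 rows in 0h0m9s at 2277 rows/s' \
    'jobs summary: defined: 1 run: 1 with success: 1 with failure: 0' |
    failed_jobs    # prints: 0
```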
Options that can be used with ndb_import are shown in the following table. Additional descriptions follow the table.
Table 21.33 Command-line options used with the program ndb_import
| Format | Description | Added, Deprecated, or Removed |
|---|---|---|
| --abort-on-error | Dump core on any fatal error; used for debugging | ADDED: NDB 7.6.2 |
| --ai-increment=# | For table with hidden PK, specify autoincrement increment. See mysqld | ADDED: NDB 7.6.2 |
| --ai-offset=# | For table with hidden PK, specify autoincrement offset. See mysqld | ADDED: NDB 7.6.2 |
| --ai-prefetch-sz=# | For table with hidden PK, specify number of autoincrement values that are prefetched. See mysqld | ADDED: NDB 7.6.2 |
| --character-sets-dir=path | Directory containing character sets | ADDED: NDB 7.6.2 |
| --connect-retries=# | Number of times to retry connection before giving up | (Supported in all NDB releases based on MySQL 5.7) |
| --connect-retry-delay=# | Number of seconds to wait between attempts to contact management server | ADDED: NDB 7.6.2 |
| --connect-string=connection_string | Same as --ndb-connectstring | ADDED: NDB 7.6.2 |
| --connections=# | Number of cluster connections to create | ADDED: NDB 7.6.2 |
| --continue | When job fails, continue to next job | ADDED: NDB 7.6.2 |
| --core-file | Write core file on error; used in debugging | ADDED: NDB 7.6.2 |
| --csvopt=opts | Shorthand option for setting typical CSV option values. See documentation for syntax and other information | ADDED: NDB 7.6.2 |
| --db-workers=# | Number of threads, per data node, executing database operations | ADDED: NDB 7.6.2 |
| --defaults-extra-file=path | Read given file after global files are read | ADDED: NDB 7.6.2 |
| --defaults-file=path | Read default options from given file only | ADDED: NDB 7.6.2 |
| --defaults-group-suffix=string | Also read groups with concat(group, suffix) | ADDED: NDB 7.6.2 |
| --errins-type=name | Error insert type, for testing purposes; use "list" to obtain all possible values | ADDED: NDB 7.6.2 |
| --errins-delay=# | Error insert delay in milliseconds; random variation is added | ADDED: NDB 7.6.2 |
| --fields-enclosed-by=char | Same as FIELDS ENCLOSED BY option for LOAD DATA statements. For CSV input this is same as using --fields-optionally-enclosed-by | ADDED: NDB 7.6.2 |
| --fields-escaped-by=char | Same as FIELDS ESCAPED BY option for LOAD DATA statements | ADDED: NDB 7.6.2 |
| --fields-optionally-enclosed-by=char | Same as FIELDS OPTIONALLY ENCLOSED BY option for LOAD DATA statements | ADDED: NDB 7.6.2 |
| --fields-terminated-by=char | Same as FIELDS TERMINATED BY option for LOAD DATA statements | ADDED: NDB 7.6.2 |
| --help | Display help text and exit | ADDED: NDB 7.6.2 |
| --idlesleep=# | Number of milliseconds to sleep waiting for more to do | ADDED: NDB 7.6.2 |
| --idlespin=# | Number of times to retry before idlesleep | ADDED: NDB 7.6.2 |
| --ignore-lines=# | Ignore first # lines in input file. Used to skip a non-data header | ADDED: NDB 7.6.2 |
| --input-type=name | Input type: random or csv | ADDED: NDB 7.6.2 |
| --input-workers=# | Number of threads processing input. Must be 2 or more if --input-type is csv | ADDED: NDB 7.6.2 |
| --keep-state | State files (except non-empty *.rej files) are normally removed on job completion. Using this option causes all state files to be preserved instead | ADDED: NDB 7.6.4 |
| --lines-terminated-by=char | Same as LINES TERMINATED BY option for LOAD DATA statements | ADDED: NDB 7.6.2 |
| --login-path=path | Read given path from login file | ADDED: NDB 7.6.2 |
| --max-rows=# | Import only this number of input data rows; default is 0, which imports all rows | ADDED: NDB 7.6.2 |
| --monitor=# | Periodically print status of running job if something has changed (status, rejected rows, temporary errors). Value 0 disables. Value 1 prints any change seen. Higher values reduce status printing exponentially up to some pre-defined limit | ADDED: NDB 7.6.2 |
| --ndb-connectstring=connection_string | Set connect string for connecting to ndb_mgmd. Syntax: "[nodeid=id;][host=]hostname[:port]". Overrides entries in NDB_CONNECTSTRING and my.cnf | ADDED: NDB 7.6.2 |
| --ndb-mgmd-host=connection_string | Same as --ndb-connectstring | ADDED: NDB 7.6.2 |
| --ndb-nodeid=# | Set node ID for this node, overriding any ID set by --ndb-connectstring | ADDED: NDB 7.6.2 |
| --ndb-optimized-node-selection | Enable optimizations for selection of nodes for transactions. Enabled by default; use --skip-ndb-optimized-node-selection to disable | (Supported in all NDB releases based on MySQL 5.7) |
| --no-asynch | Run database operations as batches, in single transactions | ADDED: NDB 7.6.2 |
| --no-defaults | Do not read default options from any option file other than login file | ADDED: NDB 7.6.2 |
| --no-hint | Tells transaction coordinator not to use distribution key hint when selecting data node | ADDED: NDB 7.6.2 |
| --opbatch=# | A db execution batch is a set of transactions and operations sent to NDB kernel. This option limits NDB operations (including blob operations) in a db execution batch. Therefore it also limits number of asynch transactions. Value 0 is not valid | ADDED: NDB 7.6.2 |
| --opbytes=# | Limit bytes in execution batch (default 0 = no limit) | ADDED: NDB 7.6.2 |
| --output-type=name | Output type: ndb is default, null used for testing | ADDED: NDB 7.6.2 |
| --output-workers=# | Number of threads processing output or relaying database operations | ADDED: NDB 7.6.2 |
| --pagesize=# | Align I/O buffers to given size | ADDED: NDB 7.6.2 |
| --pagecnt=# | Size of I/O buffers as multiple of page size. CSV input worker allocates double-sized buffer | ADDED: NDB 7.6.2 |
| --polltimeout=# | Timeout per poll for completed asynchronous transactions; polling continues until all polls are completed, or error occurs | ADDED: NDB 7.6.2 |
| --print-defaults | Print program argument list and exit | ADDED: NDB 7.6.2 |
| --rejects=# | Limit number of rejected rows (rows with permanent error) in data load. Default is 0 which means that any rejected row causes a fatal error. The row exceeding the limit is also added to *.rej | ADDED: NDB 7.6.2 |
| --resume | If job aborted (temporary error, user interrupt), resume with rows not yet processed | ADDED: NDB 7.6.2 |
| --rowbatch=# | Limit rows in row queues (default 0 = no limit); must be 1 or more if --input-type is random | ADDED: NDB 7.6.2 |
| --rowbytes=# | Limit bytes in row queues (0 = no limit) | ADDED: NDB 7.6.2 |
| --state-dir=path | Where to write state files; current directory is default | ADDED: NDB 7.6.2 |
| --stats | Save performance related options and internal statistics in *.sto and *.stt files. These files are kept on successful completion even if --keep-state is not used | ADDED: NDB 7.6.4 |
| --tempdelay=# | Number of milliseconds to sleep between temporary errors | ADDED: NDB 7.6.2 |
| --temperrors=# | Number of times a transaction can fail due to a temporary error, per execution batch; 0 means any temporary error is fatal. Such errors do not cause any rows to be written to .rej file | ADDED: NDB 7.6.2 |
| --usage | Display help text and exit; same as --help | ADDED: NDB 7.6.2 |
| --verbose[=#] | Enable verbose output | ADDED: NDB 7.6.2 |
| --version | Display version information and exit | ADDED: NDB 7.6.2 |
- --abort-on-error
  Command-Line Format: --abort-on-error
  Introduced: 5.7.18-ndb-7.6.2

  Dump core on any fatal error; used for debugging only.
- --ai-increment=#
  Command-Line Format: --ai-increment=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1
  Minimum Value: 1
  Maximum Value: 4294967295

  For a table with a hidden primary key, specify the autoincrement increment, like the auto_increment_increment system variable does in the MySQL Server.

- --ai-offset=#
  Command-Line Format: --ai-offset=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1
  Minimum Value: 1
  Maximum Value: 4294967295

  For a table with a hidden primary key, specify the autoincrement offset. Similar to the auto_increment_offset system variable.

- --ai-prefetch-sz=#
  Command-Line Format: --ai-prefetch-sz=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1024
  Minimum Value: 1
  Maximum Value: 4294967295

  For a table with a hidden primary key, specify the number of autoincrement values that are prefetched. Behaves like the ndb_autoincrement_prefetch_sz system variable does in the MySQL Server.

- --character-sets-dir=path
  Command-Line Format: --character-sets-dir=path
  Introduced: 5.7.18-ndb-7.6.2

  Directory containing character sets.
- --connect-retries=#
  Command-Line Format: --connect-retries=#
  Type: Integer
  Default Value: 12
  Minimum Value: 0
  Maximum Value: 12

  Number of times to retry connection before giving up.

- --connect-retry-delay=#
  Command-Line Format: --connect-retry-delay=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 5
  Minimum Value: 0
  Maximum Value: 5

  Number of seconds to wait between attempts to contact management server.

- --connections=#
  Command-Line Format: --connections=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1
  Minimum Value: 1
  Maximum Value: 4294967295

  Number of cluster connections to create.
- --connect-string=connection_string
  Command-Line Format: --connect-string=connection_string
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Same as --ndb-connectstring.

- --continue
  Command-Line Format: --continue
  Introduced: 5.7.18-ndb-7.6.2

  When a job fails, continue to the next job.

- --core-file
  Command-Line Format: --core-file
  Introduced: 5.7.18-ndb-7.6.2

  Write core file on error; used in debugging.
- --csvopt=opts
  Command-Line Format: --csvopt=opts
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Provides a shortcut method for setting typical CSV import options. The argument to this option is a string consisting of one or more of the following parameters:

  - c: Fields terminated by comma
  - d: Use defaults, except where overridden by another parameter
  - n: Lines terminated by \n
  - q: Fields optionally enclosed by double quote characters (")
  - r: Line terminated by \r

  The order of the parameters makes no difference, except that if both n and r are specified, the one occurring last is the parameter which takes effect.

  This option is intended for use in testing under conditions in which it is difficult to transmit escapes or quotation marks.
- --db-workers=#
  Command-Line Format: --db-workers=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value (≥ 5.7.20-ndb-7.6.4): 4
  Default Value (≥ 5.7.18-ndb-7.6.2, ≤ 5.7.18-ndb-7.6.3): 1
  Minimum Value: 1
  Maximum Value: 4294967295

  Number of threads, per data node, executing database operations.

- --defaults-extra-file=path
  Command-Line Format: --defaults-extra-file=path
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Read given file after global files are read.

- --defaults-file=path
  Command-Line Format: --defaults-file=path
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Read default options from given file only.

- --defaults-group-suffix=string
  Command-Line Format: --defaults-group-suffix=string
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Also read groups with concat(group, suffix).
- --errins-type=name
  Command-Line Format: --errins-type=name
  Introduced: 5.7.18-ndb-7.6.2
  Type: Enumeration
  Default Value: [none]
  Valid Values: stopjob, stopall, sighup, sigint, list

  Error insert type; use list as the name value to obtain all possible values. This option is used for testing purposes only.

- --errins-delay=#
  Command-Line Format: --errins-delay=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1000
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: ms

  Error insert delay in milliseconds; random variation is added. This option is used for testing purposes only.
- --fields-enclosed-by=char
  Command-Line Format: --fields-enclosed-by=char
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  This works in the same way as the FIELDS ENCLOSED BY option does for the LOAD DATA statement, specifying a character to be interpreted as quoting field values. For CSV input, this is the same as --fields-optionally-enclosed-by.

- --fields-escaped-by=char
  Command-Line Format: --fields-escaped-by=char
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: \

  Specify an escape character in the same way as the FIELDS ESCAPED BY option does for the SQL LOAD DATA statement.

- --fields-optionally-enclosed-by=char
  Command-Line Format: --fields-optionally-enclosed-by=char
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  This works in the same way as the FIELDS OPTIONALLY ENCLOSED BY option does for the LOAD DATA statement, specifying a character to be interpreted as optionally quoting field values. For CSV input, this is the same as --fields-enclosed-by.

- --fields-terminated-by=char
  Command-Line Format: --fields-terminated-by=char
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: \t

  This works in the same way as the FIELDS TERMINATED BY option does for the LOAD DATA statement, specifying a character to be interpreted as the field separator.

- --help
  Command-Line Format: --help
  Introduced: 5.7.18-ndb-7.6.2

  Display help text and exit.
- --idlesleep=#
  Command-Line Format: --idlesleep=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1
  Minimum Value: 1
  Maximum Value: 4294967295
  Unit: ms

  Number of milliseconds to sleep waiting for more work to perform.

- --idlespin=#
  Command-Line Format: --idlespin=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295

  Number of times to retry before sleeping.
- --ignore-lines=#
  Command-Line Format: --ignore-lines=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295

  Cause ndb_import to ignore the first # lines of the input file. This can be employed to skip a file header that does not contain any data.

- --input-type=name
  Command-Line Format: --input-type=name
  Introduced: 5.7.18-ndb-7.6.2
  Type: Enumeration
  Default Value: csv
  Valid Values: random, csv

  Set the type of input. The default is csv; random is intended for testing purposes only.

- --input-workers=#
  Command-Line Format: --input-workers=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value (≥ 5.7.20-ndb-7.6.4): 4
  Default Value (≥ 5.7.18-ndb-7.6.2, ≤ 5.7.18-ndb-7.6.3): 2
  Minimum Value: 1
  Maximum Value: 4294967295

  Set the number of threads processing input.
- --keep-state
  Command-Line Format: --keep-state
  Introduced: 5.7.20-ndb-7.6.4

  By default, ndb_import removes all state files (except non-empty *.rej files) when it completes a job. Specify this option (no argument is required) to force the program to retain all state files instead.

- --lines-terminated-by=char
  Command-Line Format: --lines-terminated-by=char
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: \n

  This works in the same way as the LINES TERMINATED BY option does for the LOAD DATA statement, specifying a character to be interpreted as end-of-line.

- --login-path=path
  Command-Line Format: --login-path=path
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Read given path from login file.
- --log-level=#
  Command-Line Format: --log-level=#
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 2

  Performs internal logging at the given level. This option is intended primarily for internal and development use.

  In debug builds of NDB only, the logging level can be set using this option to a maximum of 4.
- --max-rows=#
  Command-Line Format: --max-rows=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: rows

  Import only this number of input data rows; the default is 0, which imports all rows.
- --monitor=#
  Command-Line Format: --monitor=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 2
  Minimum Value: 0
  Maximum Value: 4294967295

  Periodically print the status of a running job if something has changed (status, rejected rows, temporary errors). Set to 0 to disable this reporting. Setting to 1 prints any change that is seen. Higher values reduce the frequency of this status reporting.
- --ndb-connectstring=connection_string
  Command-Line Format: --ndb-connectstring=connection_string
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Set connect string for connecting to ndb_mgmd. Syntax: "[nodeid=id;][host=]hostname[:port]". Overrides entries in NDB_CONNECTSTRING and my.cnf.
- --ndb-mgmd-host=connection_string
  Command-Line Format: --ndb-mgmd-host=connection_string
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: [none]

  Same as --ndb-connectstring.

- --ndb-nodeid=#
  Command-Line Format: --ndb-nodeid=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: [none]

  Set node ID for this node, overriding any ID set by --ndb-connectstring.

- --ndb-optimized-node-selection
  Command-Line Format: --ndb-optimized-node-selection

  Enable optimizations for selection of nodes for transactions. Enabled by default; use --skip-ndb-optimized-node-selection to disable.

- --no-asynch
  Command-Line Format: --no-asynch
  Introduced: 5.7.18-ndb-7.6.2

  Run database operations as batches, in single transactions.
- --no-defaults
  Command-Line Format: --no-defaults
  Introduced: 5.7.18-ndb-7.6.2

  Do not read default options from any option file other than login file.

- --no-hint
  Command-Line Format: --no-hint
  Introduced: 5.7.18-ndb-7.6.2

  Do not use distribution key hinting to select a data node.
- --opbatch=#
  Command-Line Format: --opbatch=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 256
  Minimum Value: 1
  Maximum Value: 4294967295
  Unit: operations

  Set a limit on the number of operations (including blob operations), and thus the number of asynchronous transactions, per execution batch.

- --opbytes=#
  Command-Line Format: --opbytes=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: bytes

  Set a limit on the number of bytes per execution batch. Use 0 for no limit.
- --output-type=name
  Command-Line Format: --output-type=name
  Introduced: 5.7.18-ndb-7.6.2
  Type: Enumeration
  Default Value: ndb
  Valid Values: ndb, null

  Set the output type. ndb is the default. null is used only for testing.

- --output-workers=#
  Command-Line Format: --output-workers=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 2
  Minimum Value: 1
  Maximum Value: 4294967295

  Set the number of threads processing output or relaying database operations.
- --pagesize=#
  Command-Line Format: --pagesize=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 4096
  Minimum Value: 1
  Maximum Value: 4294967295
  Unit: bytes

  Align I/O buffers to the given size.

- --pagecnt=#
  Command-Line Format: --pagecnt=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 64
  Minimum Value: 1
  Maximum Value: 4294967295

  Set the size of I/O buffers as a multiple of page size. The CSV input worker allocates a buffer that is doubled in size.

- --polltimeout=#
  Command-Line Format: --polltimeout=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 1000
  Minimum Value: 1
  Maximum Value: 4294967295
  Unit: ms

  Set a timeout per poll for completed asynchronous transactions; polling continues until all polls are completed, or until an error occurs.

- --print-defaults
  Command-Line Format: --print-defaults
  Introduced: 5.7.18-ndb-7.6.2

  Print program argument list and exit.
- --rejects=#
  Command-Line Format: --rejects=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295

  Limit the number of rejected rows (rows with permanent errors) in the data load. The default is 0, which means that any rejected row causes a fatal error. Any rows causing the limit to be exceeded are added to the .rej file.

  The limit imposed by this option is effective for the duration of the current run. A run restarted using --resume is considered a “new” run for this purpose.

- --resume
  Command-Line Format: --resume
  Introduced: 5.7.18-ndb-7.6.2

  If a job is aborted (due to a temporary db error or when interrupted by the user), resume with any rows not yet processed.
- --rowbatch=#
  Command-Line Format: --rowbatch=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: rows

  Set a limit on the number of rows per row queue. Use 0 for no limit.

- --rowbytes=#
  Command-Line Format: --rowbytes=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 262144
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: bytes

  Set a limit on the number of bytes per row queue. Use 0 for no limit.
- --stats
  Command-Line Format: --stats
  Introduced: 5.7.20-ndb-7.6.4

  Save information about options related to performance and other internal statistics in files named *.sto and *.stt. These files are always kept on successful completion (even if --keep-state is not also specified).

- --state-dir=path
  Command-Line Format: --state-dir=path
  Introduced: 5.7.18-ndb-7.6.2
  Type: String
  Default Value: .

  Where to write the state files (tbl_name.map, tbl_name.rej, tbl_name.res, and tbl_name.stt) produced by a run of the program; the default is the current directory.

- --tempdelay=#
  Command-Line Format: --tempdelay=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 10
  Minimum Value: 0
  Maximum Value: 4294967295
  Unit: ms

  Number of milliseconds to sleep between temporary errors.
- --temperrors=#
  Command-Line Format: --temperrors=#
  Introduced: 5.7.18-ndb-7.6.2
  Type: Integer
  Default Value: 0
  Minimum Value: 0
  Maximum Value: 4294967295

  Number of times a transaction can fail due to a temporary error, per execution batch. The default is 0, which means that any temporary error is fatal. Temporary errors do not cause any rows to be added to the .rej file.

- --usage
  Command-Line Format: --usage
  Introduced: 5.7.18-ndb-7.6.2

  Display help text and exit; same as --help.

- --verbose[=#]
  Command-Line Format: --verbose[=#]
  Introduced: 5.7.18-ndb-7.6.2
  Type (≥ 5.7.20-ndb-7.6.4): Boolean
  Type (≥ 5.7.18-ndb-7.6.2, ≤ 5.7.18-ndb-7.6.3): Integer
  Default Value (≥ 5.7.20-ndb-7.6.4): false
  Default Value (≥ 5.7.18-ndb-7.6.2, ≤ 5.7.18-ndb-7.6.3): 0
  Minimum Value: 0
  Maximum Value: 2

  Enable verbose output.

  Note: Previously, this option controlled the internal logging level for debugging messages. In NDB 7.6, use the --log-level option for this purpose instead.

- --version
  Command-Line Format: --version
  Introduced: 5.7.18-ndb-7.6.2

  Display version information and exit.
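The parameter letters accepted by --csvopt can be spelled out as the long options they stand for. The following is a rough sketch only: the function name expand_csvopt is hypothetical, the values are printed in the same escaped notation used in the option descriptions (\n, \r), and d prints nothing because the defaults it selects are not listed in this section.

```shell
# Hypothetical helper: print the long options that a --csvopt string stands
# for, one per line. Order of letters does not matter, except that the last
# of n and r wins.
expand_csvopt() {
    opts=$1
    fields_term='' enclosed='' lines_term=''
    while [ -n "$opts" ]; do
        rest=${opts#?}            # everything after the first character
        letter=${opts%"$rest"}    # the first character
        opts=$rest
        case $letter in
            c) fields_term='--fields-terminated-by=,' ;;
            q) enclosed='--fields-optionally-enclosed-by="' ;;
            n) lines_term='--lines-terminated-by=\n' ;;   # a later r replaces this
            r) lines_term='--lines-terminated-by=\r' ;;   # a later n replaces this
            d) ;;                                         # defaults: nothing to print
            *) echo "unknown csvopt parameter: $letter" >&2; return 1 ;;
        esac
    done
    for opt in "$fields_term" "$enclosed" "$lines_term"; do
        [ -n "$opt" ] && printf '%s\n' "$opt"
    done
    return 0
}

# Example use:
#   expand_csvopt cqn
```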
As with LOAD DATA, options for
field and line formatting must match those used to create the
CSV file, whether this was done using
SELECT INTO ...
OUTFILE, or by some other means. There is no
equivalent to the LOAD DATA
statement LINES STARTING BY option.
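One way to keep the export and import formatting in step is to define the separator and quoting characters once and derive both the SQL clause and the ndb_import options from them. This is a minimal sketch under stated assumptions: the variable and function names are hypothetical, the values must not contain a single quote, and escape-character handling is left out because quoting rules for backslashes differ between SQL and the shell.

```shell
# Illustrative values; change them in one place so both sides agree.
FIELD_SEP=','
ENCLOSER='"'

# Print the FIELDS clause for a SELECT ... INTO OUTFILE statement.
outfile_fields_clause() {
    printf "FIELDS TERMINATED BY '%s' OPTIONALLY ENCLOSED BY '%s'\n" \
        "$FIELD_SEP" "$ENCLOSER"
}

# Print the matching ndb_import options, one per line.
import_field_options() {
    printf '%s\n' "--fields-terminated-by=$FIELD_SEP" \
        "--fields-optionally-enclosed-by=$ENCLOSER"
}

outfile_fields_clause
import_field_options
```

When passing the printed options to ndb_import on a command line, quote each one so that the shell does not interpret the enclosing character.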
ndb_import was added in NDB 7.6.