Documentation Home
MySQL Shell 8.0
Related Documentation Download this Manual
PDF (US Ltr) - 0.6Mb
PDF (A4) - 0.6Mb
HTML Download (TGZ) - 125.6Kb
HTML Download (Zip) - 136.7Kb


MySQL Shell 8.0  /  MySQL Shell Utilities  /  Dump Loading Utility

7.5 Dump Loading Utility

MySQL Shell's dump loading utility, introduced in MySQL Shell 8.0.21, supports the import into a MySQL Database Service DB System (a MySQL DB System, for short) or a MySQL Server instance of schemas dumped using MySQL Shell's Section 7.4, “Instance Dump Utility and Schema Dump Utility”. The dump loading utility provides data streaming from remote storage, parallel loading of tables or table chunks, progress state tracking, resume and reset capability, and the option of concurrent loading while the dump is still taking place.

For import into a MySQL DB System, the MySQL Shell instance where you run the dump loading utility must be installed on an Oracle Cloud Infrastructure Compute instance that has access to the MySQL DB System. If the dump files are in an Oracle Cloud Infrastructure Object Storage bucket, you can access the Object Storage bucket from the Compute instance. If the dump files are on your local system, you need to transfer them to the Oracle Cloud Infrastructure Compute instance using the copy utility of your choice, depending on the operating system you chose for your Compute instance. Ensure the dump was created with the ocimds option set to true in MySQL Shell's instance dump utility or schema dump utility, for compatibility with MySQL Database Service.

Note
  1. The dump loading utility uses the LOAD DATA LOCAL INFILE statement, so the global setting of the local_infile system variable on the target MySQL instance must be ON for the duration of the import. By default, this system variable is set to ON in a standard MySQL DB System configuration.

  2. On the target MySQL instance, the dump loading utility checks whether the sql_require_primary_key system variable is set to ON, and if it is, returns an error if there is a table in the dump files with no primary key. By default, this system variable is set to OFF in a standard MySQL DB System configuration.

  3. The dump loading utility does not apply the gtid_executed GTID set from the source MySQL instance on the target MySQL instance. The GTID set is included in the dump metadata from MySQL Shell's instance dump utility or schema dump utility, as the gtidExecuted field in the @.json dump file. To retain these GTIDs on the target MySQL instance for use with replication, set the dump loading utility's skipBinlog option to true for the import to prevent the generation of new GTIDs as the dump is imported. After the import, update the gtid_purged GTID set on the target MySQL instance using the gtid_executed GTID set from the source MySQL instance. (This is currently not supported on MySQL DB System.) For details, see the description for the skipBinlog option.

MySQL Shell's dump loading utility uses the DDL files and tab-separated .tsv data files output by the instance dump utility or schema dump utility to set up the server instance or schema in the target MySQL instance, then load the data. Dumps containing only the DDL files or only the data files can be used to perform these tasks separately. The dump loading utility also lets you separately apply the DDL files and data files from a regular dump that contains both sorts of files. You can customize the import with further options in the dump loading utility:

  • You can select individual tables or schemas to import or to exclude from the import.

  • Users and their roles and grants are excluded by default, but you can choose to import them.

  • You can specify a different character set for the data in the target MySQL instance to that used in the dump files.

  • You can update the ANALYZE TABLE histograms, even after the data has already been loaded.

  • You can choose to skip binary logging on the target MySQL instance during the course of the import using a SET sql_log_bin=0 statement.

You can carry out a dry run with your chosen set of dump loading options to show what actions would be performed when you run the utility for real with those options.

Progress state for an import is stored in a persistent progress state file, which records steps successfully completed and steps that were interrupted or failed. By default, the progress state file is named loadprogress.server_uuid.json and created in the dump directory, but you can choose a different name and location. The dump loading utility references the progress state file when you resume or retry the import for a dump, and skips completed steps. De-duplication is automatically managed for tables that were partially loaded. If you interrupt a dump in progress by using Ctrl + C, on the first use of that key combination, no new tasks are started by the utility but existing tasks continue. Pressing Ctrl + C again stops existing tasks, resulting in error messages. In either case, the utility can still resume the import from where it stopped.

You can choose to reset the progress state and start the import for a dump again from the beginning, but in this case the utility does not skip objects that were already created and does not manage de-duplication. If you do this, to ensure a correct import, you must manually remove from the target MySQL instance all previously loaded objects from that dump, including schemas, tables, users, views, triggers, routines, and events. Otherwise, the import stops with an error if an object in the dump files already exists in the target MySQL instance. With appropriate caution, you may use the ignoreExistingObjects option to make the utility report duplicate objects but skip them and continue with the import. Note that the utility does not check whether the contents of the object in the target MySQL instance and in the dump files are different, so it is possible for the resulting import to contain incorrect or invalid data.

Important

Do not change the data in the dump files between a dump stopping and a dump resuming. Resuming a dump after changing the data has undefined behavior and can lead to data inconsistency and data loss. If you need to change the data after partially loading a dump, manually drop all objects that were created during the partial import (as listed in the progress state file), then run the dump loading utility with the resetProgress option to start again from the beginning.

The DDL files for a dump are loaded by a single thread, but the data is loaded in parallel by the number of threads you select, which defaults to 4. If table data was chunked when the dump was created, multiple threads can be used for a table, otherwise each thread loads one table at a time. The dump loading utility schedules data imports across threads to maximize parallelism. If the dump files were compressed by MySQL Shell's instance dump utility and schema dump utility, the dump loading utility handles decompression for them.

By default, fulltext indexes for a table are created only after the table is completely loaded, which speeds up the import. You can choose to defer all index creation (except the primary index) until each table is completely loaded. You can also opt to create all indexes during the table import. You can also choose to disable index creation during the import, and create the indexes afterwards, for example if you want to make changes to the table structure after loading.

For an additional improvement to data loading performance, from MySQL 8.0.21, you can disable the InnoDB redo log on the target MySQL instance during the import. Note that this should only be done on a new MySQL Server instance (not a production system), and this feature is not available on MySQL DB System. For more information, see Disabling Redo Logging.

The waitDumpTimeout option lets you apply a dump that is still in the process of being created. Tables are loaded as they become available, and the utility waits for the specified number of seconds after new data stops arriving in the dump location. When the timeout elapses, the utility assumes the dump is complete and stops importing.

The dump loading utility uses the MySQL Shell global session to obtain the connection details of the target MySQL instance to which the dump is to be imported. You must open the global session (which can have an X Protocol connection or a classic MySQL protocol connection) before running the utility. The utility opens its own sessions for each thread, copying options such as connection compression and SSL options from the global session, and does not make any further use of the global session.

In the MySQL Shell API, the dump loading utility is a function of the util global object, and has the following signature:

util.loadDump(url[, options])

If you are importing a dump that is located in the Oracle Cloud Infrastructure Compute instance's filesystem where you are running the utility, url is a string specifying the path to a local directory containing the dump files. You can prefix a local directory path with the file:// schema. In this example, a dry run is carried out to check that there will be no issues when the dump files are loaded from a local directory into the connected MySQL instance:

shell-js> util.loadDump("/mnt/data/worlddump", {dryRun: true})

If you are importing a dump from an Oracle Cloud Infrastructure Object Storage bucket, url is the path prefix that the dump files have in the bucket, which was assigned using the outputUrl parameter when the dump was created. Use the osBucketName option to provide the name of the Object Storage bucket, and the osNamespace option to identify the namespace for the bucket. In this example, the dump prefixed worlddump is loaded from an Object Storage bucket into the connected MySQL DB System using 8 threads:

shell-js> util.loadDump("worlddump", {
        > threads: 8, osBucketName: "hanna-bucket", osNamespace: "idx28w1ckztq"})

The namespace for an Object Storage bucket is displayed in the Bucket Information tab of the bucket details page in the Oracle Cloud Infrastructure console, or can be obtained using the Oracle Cloud Infrastructure command line interface. A connection is established to the Object Storage bucket using the default profile in the default Oracle Cloud Infrastructure CLI configuration file, or alternative details that you specify using the ociConfigFile and ociProfile options. For instructions to set up a CLI configuration file, see SDK and CLI Configuration File

options is a dictionary of options that can be omitted if it is empty. The following options are available:

dryRun: [ true | false ]

Display information about what actions would be performed given the specified options and dump files, including any errors that would be returned based on the dump contents, but do not proceed with the import. The default is false.

osBucketName: "string"

The name of the Oracle Cloud Infrastructure Object Storage bucket where the dump files are located. By default, the [DEFAULT] profile in the Oracle Cloud Infrastructure CLI configuration file located at ~/.oci/config is used to establish a connection to the bucket. You can substitute an alternative profile to be used for the connection with the ociConfigFile and ociProfile options. For instructions to set up a CLI configuration file, see SDK and CLI Configuration File.

osNamespace: "string"

The Oracle Cloud Infrastructure namespace (tenancy name) where the Object Storage bucket named by osBucketName is located. The namespace for an Object Storage bucket is displayed in the Bucket Information tab of the bucket details page in the Oracle Cloud Infrastructure console, or can be obtained using the Oracle Cloud Infrastructure command line interface.

ociConfigFile: "string"

An Oracle Cloud Infrastructure CLI configuration file that contains the profile to use for the connection, instead of the one in the default location ~/.oci/config.

ociProfile: "string"

The profile name of the Oracle Cloud Infrastructure profile to use for the connection, instead of the [DEFAULT] profile in the Oracle Cloud Infrastructure CLI configuration file used for the connection.

threads: int

The number of parallel threads to use to upload chunks of data to the target MySQL instance. Each thread has its own connection to the MySQL instance. The default is 4. if the dump was created with chunking enabled (which is the default), the utility can use multiple threads to load data for a table; otherwise a thread is only used for one table.

progressFile: "string"

A local file location for the dump loading utility's progress state file, which persists progress state for an import. By default, the progress state file is named loadprogress.server_uuid.json and created in the dump directory, but you can change that using this option. Setting progressFile to an empty string disables progress state tracking, which means that the dump loading utility cannot resume a partially completed import.

showProgress: [ true | false ]

Display (true) or hide (false) progress information for the import. The default is true if stdout is a terminal (tty), such as when MySQL Shell is in interactive mode, and false otherwise. The progress information includes the number of active threads and their actions, the amount of data loaded so far, the percentage complete and the rate of throughput. When the progress information is not displayed, progress state is still recorded in the dump loading utility's progress state file.

resetProgress: [ true | false ]

Setting this option to true resets the progress state and starts the import again from the beginning. The default is false. Note that with this option, the dump loading utility does not skip objects that were already created and does not manage de-duplication. If you want to use this option, to ensure a correct import, you must first manually remove from the target MySQL instance all previously loaded objects, including schemas, tables, users, views, triggers, routines, and events from that dump. Otherwise, the import stops with an error if an object in the dump files already exists in the target MySQL instance. With appropriate caution, you may use the ignoreExistingObjects option to make the utility report duplicate objects but skip them and continue with the import.

waitDumpTimeout: int

Setting this option activates concurrent loading by specifying a timeout (in seconds) for which the utility waits for further data after all available tables in the dump location have been processed. This allows the utility to import the dump while it is still in the process of being created. Tables are loaded as they become available, and the import stops when the timeout is exceeded with no further data appearing in the dump location. The default setting, 0, means that the utility marks the dump as complete when all available tables have been loaded and does not wait for more data.

ignoreExistingObjects: [ true | false ]

Import the dump even if it contains objects that already exist in the target schema in the MySQL instance. The default is false, meaning that an error is issued and the import stops when a duplicate object is found, unless the import is being resumed from a previous attempt using a progress state file, in which case the check is skipped. When this option is set to true, duplicate objects are reported but no error is generated and the import proceeds. This option should be used with caution, because the utility does not check whether the contents of the object in the target MySQL instance and in the dump files are different, so it is possible for the resulting import to contain incorrect or invalid data. An alternative strategy is to use the excludeTables option to exclude tables that you have already loaded where you have verified the object in the dump files is identical with the imported object in the target MySQL instance. The safest choice is to remove duplicate objects from the target MySQL instance before restarting the dump.

ignoreVersion: [ true | false ]

Import the dump even if the major version number of the MySQL instance from which the data was dumped is different to the major version number of the MySQL instance to which the data will be uploaded. The default is false, meaning that an error is issued and the import does not proceed if the major version number is different. When this option is set to true, a warning is issued and the import proceeds. Note that the import will only be successful if the schemas in the dump files have no compatibility issues with the new major version. Before attempting an import using the ignoreVersion option, use MySQL Shell's upgrade checker utility checkForServerUpgrade() to check the schemas on the source MySQL instance. Fix any compatibility issues identified by the utility before dumping the schemas and importing them to the target MySQL instance.

skipBinlog: [ true | false ]

Skips binary logging on the target MySQL instance for the sessions used by the utility during the course of the import, by issuing a SET sql_log_bin=0 statement. The default is false, so binary logging is active by default. When GTIDs are in use on the target MySQL instance (gtid_mode=ON), setting this option to true prevents new GTIDs from being generated and assigned as the import is being carried out, so that the original GTID set from the source server can be used.

The dump loading utility does not apply the gtid_executed GTID set from the source MySQL instance on the target MySQL instance. This is currently not supported on MySQL DB System. For a MySQL Server instance, after the import, use MySQL Shell's \sql command (or enter SQL mode) to issue the following statement on the connected MySQL instance, copying the gtid_executed GTID set from the gtidExecuted field in the @.json dump file in the dump metadata:

shell-js> \sql SET @@GLOBAL.gtid_purged= "+gtidExecuted_set";

This statement, which works from MySQL 8.0, adds the source MySQL Server instance's gtid_executed GTID set to the target MySQL instance's gtid_purged GTID set, which holds the GTIDs of all transactions that have been applied on the server, but do not exist on any binary log file on the server. For MySQL 5.7, the plus sign (+) must be omitted, and the gtid_executed GTID set on the target MySQL instance must be empty. For more details, see the description of the gtid_purged system variable in the release of the target MySQL instance.

Note

The gtid_executed GTID set in the dump metadata includes the GTIDs of all transactions in the gtid_executed set on the source MySQL instance, including those relating to schemas that were not included in the dump. If you set the value of gtid_purged on the target MySQL instance, and later import a dump of another schema from the same source MySQL instance, do not issue the SET @@GLOBAL.gtid_purged statement again, as the GTIDs will already be present. Do continue to use skipBinlog: true for the import so that new GTIDs are not generated during the import.

loadIndexes: [ true | false ]

Create (true) or do not create (false) secondary indexes for tables. The default is true. When this option is set to false, secondary indexes are not created during the import, and you must create them afterwards. This can be useful if you are loading the DDL files and data files separately, and if you want to make changes to the table structure after loading the DDL files. Afterwards, you can create the secondary indexes by running the dump loading utility again with loadIndexes set to true and deferTableIndexes set to all.

deferTableIndexes: [ off | fulltext | all ]

Defer the creation of secondary indexes until after the table data is loaded. This can reduce loading times. off means all indexes are created during the table load. The default setting fulltext defers full-text indexes only. all defers all secondary indexes and only creates primary indexes during the table load, and also (from MySQL Shell 8.0.22) indexes defined on columns containing auto-increment values. In MySQL Shell 8.0.21, do not set all if you have any unique key columns containing auto-increment values.

analyzeTables: [ off | on | histogram ]

Execute ANALYZE TABLE for tables when they have been loaded. on analyzes all tables, and histogram analyzes only tables that have histogram information stored in the dump. The default is off. You can run the dump loading utility with this option to analyze the tables even if the data has already been loaded.

characterSet: "string"

The character set to be used for the import to the target MySQL instance, for example in the CHARACTER SET option of the LOAD DATA statement. The default is the character set given in the dump metadata that was used when the dump was created by MySQL Shell's instance dump utility or schema dump utility, which default to using utf8mb4. The character set must be permitted by the character_set_client system variable and supported by the MySQL instance.

excludeSchemas: array of strings

Exclude the named schemas from the import. Note that the information_schema, mysql, ndbinfo, performance_schema, and sys schemas are always excluded from a dump that is created by MySQL Shell's instance dump utility. If a named schema does not exist in the dump files, the utility ignores the item.

excludeTables: array of strings

Exclude the named tables from the import. Table names must be qualified with a valid schema name, and quoted with the backtick character if needed. Note that the data for the mysql.apply_status, mysql.general_log, mysql.schema, and mysql.slow_log tables is always excluded from a dump created by MySQL Shell's schema dump utility, although their DDL statements are included. Tables named by the excludeTables option are not uploaded to the target MySQL instance. If a named table does not exist in the schema or the schema does not exist in the dump files, the dump loading utility ignores the item.

includeSchemas: array of strings

Load only the named schemas from the dump files.

includeTables: array of strings

Load only the named tables from the dump files. Table names must be qualified with a valid schema name, and quoted with the backtick character if needed. You cannot combine this option with the includeSchemas option.

loadDdl: [ true | false ]

Setting this option to true imports only the DDL files from the dump, and does not import the data. The default is false.

loadData: [ true | false ]

Setting this option to true imports only the data files from the dump, and does not import the DDL files. The default is false.

loadUsers: [ true | false ]

Import (true) or do not import (false) users and their roles and grants into the target MySQL instance. Statements for the current user are skipped. The default is false, so users are not imported by default. Note that MySQL Shell's schema dump utility does not include users, roles, and grants in a dump, but the instance dump utility can, and does by default.