Instead of preparing and loading tables into HeatWave manually, consider using the Auto Parallel Load utility. See Section 2.2.3, “Loading Data Using Auto Parallel Load”. Users of MySQL HeatWave on AWS also have the option of loading data from the MySQL HeatWave Console. See Manage HeatWave Data with Workspaces, in the MySQL HeatWave on AWS Service Guide.
The loading of data into HeatWave can be classified into three types: Initial Bulk Load, Incremental Bulk Load, and Change Propagation.
Initial Bulk Load: Performed when loading data into HeatWave for the first time, or when reloading data. The best time to perform an initial bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.
Incremental Bulk Load: Performed when there is a substantial amount of data to load into tables that are already loaded in HeatWave. An incremental bulk load involves these steps:
Depending on the amount of data, an incremental bulk load may be a faster method of loading new data than waiting for change propagation to occur. It also provides greater control over when new data is loaded. As with initial build loads, the best time to perform an incremental bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.
Change Propagation: After tables are loaded into HeatWave, data changes are automatically propagated from
InnoDBtables on the MySQL DB System to their counterpart tables in HeatWave. See Section 2.2.7, “Change Propagation”.
Use the following strategies to improve load performance:
Increase the number of read threads
For medium to large tables, increase the number of read threads to 32 by setting the
innodb_parallel_read_threadsvariable on the MySQL DB System.
mysql> SET SESSION innodb_parallel_read_threads = 32;
If the MySQL DB System is not busy, you can increase the value to 64.Tip
The Auto Parallel Load utility automatically optimizes the number of parallel read threads for each table. See Section 2.2.3, “Loading Data Using Auto Parallel Load”. For users of MySQL HeatWave on AWS, the number of parallel read threads is also optimized when loading data from the MySQL HeatWave Console. See Manage HeatWave Data with Workspaces.
Load tables concurrently
If you have many small and medium tables (less than 20GB in size), load tables from multiple sessions:
Session 1: mysql> ALTER TABLE supplier SECONDARY_LOAD; Session 2: mysql> ALTER TABLE parts SECONDARY_LOAD; Session 3: mysql> ALTER TABLE region SECONDARY_LOAD; Session 4: mysql> ALTER TABLE partsupp SECONDARY_LOAD;
Avoid or reduce conflicting operations
Data load operations share resources with other OLTP DML and DDL operations on the MySQL DB System. To improve load performance, avoid or reduce conflicting DDL and DML operations. For example, avoid running DDL and large DML operations on the
LINEITEMtable while executing an
ALTER TABLE LINEITEM SECONDARY_LOADoperation.