Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.0Mb
PDF (A4) - 2.0Mb


HeatWave User Guide  /  ...  /  Loading Data

2.9.5 Loading Data

Instead of preparing and loading tables into HeatWave manually, consider using the Auto Parallel Load utility. See Section 2.2.3, “Loading Data Using Auto Parallel Load”. Users of HeatWave on AWS also have the option of loading data from the HeatWave Console. See Manage Data in HeatWave with Workspaces in the HeatWave on AWS Service Guide.

The loading of data into HeatWave can be classified into three types: Initial Bulk Load, Incremental Bulk Load, and Change Propagation.

  • Initial Bulk Load: Performed when loading data into HeatWave for the first time, or when reloading data. The best time to perform an initial bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.

  • Incremental Bulk Load: Performed when there is a substantial amount of data to load into tables that are already loaded in HeatWave. An incremental bulk load involves these steps:

    1. Performing a SECONDARY_UNLOAD operation to unload a table from HeatWave. See Section 2.5.1, “Unloading Tables”.

    2. Importing data into the table on the MySQL DB System node.

    3. Performing a SECONDARY_LOAD operation to reload the table into HeatWave. See Section 2.2, “Loading Data to HeatWave MySQL”.

    Depending on the amount of data, an incremental bulk load may be a faster method of loading new data than waiting for change propagation to occur. It also provides greater control over when new data is loaded. As with initial build loads, the best time to perform an incremental bulk load is during off-peak hours, as bulk load operations can affect OLTP performance on the MySQL DB System.

  • Change Propagation: After tables are loaded into HeatWave, data changes are automatically propagated from InnoDB tables on the MySQL DB System to their counterpart tables in HeatWave. See Section 2.2.7, “Change Propagation”.

Use the following strategies to improve load performance:

  • Increase the number of read threads

    For medium to large tables, increase the number of read threads to 32 by setting the innodb_parallel_read_threads variable on the MySQL DB System.

    mysql> SET SESSION innodb_parallel_read_threads = 32;

    If the MySQL DB System is not busy, you can increase the value to 64.

    Tip

    The Auto Parallel Load utility automatically optimizes the number of parallel read threads for each table. See Section 2.2.3, “Loading Data Using Auto Parallel Load”. For users of HeatWave on AWS, the number of parallel read threads is also optimized when loading data from the HeatWave Console. See Manage Data in HeatWave with Workspaces in the HeatWave on AWS Service Guide.

  • Load tables concurrently

    If you have many small and medium tables (less than 20GB in size), load tables from multiple sessions:

    Session 1: 
    mysql> ALTER TABLE supplier SECONDARY_LOAD;
    
    Session 2: 
    mysql> ALTER TABLE parts SECONDARY_LOAD;
    
    Session 3: 
    mysql> ALTER TABLE region SECONDARY_LOAD;
    
    Session 4: 
    mysql> ALTER TABLE partsupp SECONDARY_LOAD;
  • Avoid or reduce conflicting operations

    Data load operations share resources with other OLTP DML and DDL operations on the MySQL DB System. To improve load performance, avoid or reduce conflicting DDL and DML operations. For example, avoid running DDL and large DML operations on the LINEITEM table while executing an ALTER TABLE LINEITEM SECONDARY_LOAD operation.