Related Documentation Download this Manual
PDF (US Ltr) - 0.6Mb
PDF (A4) - 0.6Mb


HeatWave User Guide  /  Loading Data

Chapter 4 Loading Data

This section describes how to load data into HeatWave. Before attempting to load data, ensure that you have:

Otherwise, the table load operation is not permitted.

For related best practices, see Chapter 9, Best Practices.

Tip

Instead of preparing and loading tables into HeatWave manually, consider using the Auto Parallel Load utility. See Section 4.1, “Auto Parallel Load”.

To load a table into HeatWave, specify the SECONDARY_LOAD option in an ALTER TABLE statement.

mysql> ALTER TABLE orders SECONDARY_LOAD;

The time required to load a table depends on data size. You can monitor load progress by issuing the following query, which returns a percentage value indicating load progress.

mysql> SELECT VARIABLE_VALUE
       FROM performance_schema.global_status
       WHERE VARIABLE_NAME = 'rapid_load_progress';
+----------------+
| VARIABLE_VALUE |
+----------------+
| 100.000000     |
+----------------+
Note

If necessary, you can halt a load operation using Ctrl-C.

You can verify that tables are loaded by querying the LOAD_STATUS data from HeatWave Performance Schema tables. For example:

mysql> USE performance_schema;
mysql> SELECT NAME, LOAD_STATUS FROM rpd_tables,rpd_table_id
       WHERE rpd_tables.ID = rpd_table_id.ID;
+------------------------------+---------------------+
| NAME                         | LOAD_STATUS         |
+------------------------------+---------------------+
| tpch.supplier                | AVAIL_RPDGSTABSTATE |
| tpch.partsupp                | AVAIL_RPDGSTABSTATE |
| tpch.orders                  | AVAIL_RPDGSTABSTATE |
| tpch.lineitem                | AVAIL_RPDGSTABSTATE |
| tpch.customer                | AVAIL_RPDGSTABSTATE |
| tpch.nation                  | AVAIL_RPDGSTABSTATE |
| tpch.region                  | AVAIL_RPDGSTABSTATE |
| tpch.part                    | AVAIL_RPDGSTABSTATE |
+------------------------------+---------------------+

The AVAIL_RPDGSTABSTATE status indicates that the table is loaded. For information about load statuses, see Section 11.10.4, “The rpd_tables Table”.

When loading a table into HeatWave, data is read from InnoDB using batched, multi-threaded reads. Data is then converted into columnar format and sent over the network and distributed among the HeatWave nodes in horizontal slices. Data is partitioned by the table's primary key unless data placement keys are defined. See Section 3.4, “Defining Data Placement Keys”.

Concurrent DML operations and queries on the MySQL node are supported while a data load operation is in progress; however, concurrent operations on the MySQL node can affect load performance and vice versa.

After tables are loaded, changes to table data on the MySQL DB System node are automatically propagated to HeatWave. For more information, see Section 4.2, “Change Propagation”.

The SECONDARY_LOAD clause has these properties:

  • It is considered a local operation and is therefore omitted from the binary log.

  • Data is read using the READ COMMITTED isolation level.

The following limitations apply when loading tables:

  • Loading a table is not permitted if the primary key is absent. Primary key columns defined with column prefixes are not supported.

  • HeatWave supports a maximum of 470 columns per table.

  • Load time is affected if the primary key contains more than one column, or if the primary key column is not an INTEGER column. The impact on MySQL performance during load, change propagation, and query processing depends on factors such as data properties, available resources (compute, memory, and network), and the rate of transaction processing on the MySQL DB System.

  • DDL operations are not permitted on tables that are loaded in HeatWave. To alter the definition of a table, you must unload the table and remove the SECONDARY_ENGINE attribute before performing the DDL operation. See Chapter 10, Troubleshooting.