4.1.1 Estimating Cluster Size with HeatWave Autopilot

This topic describes how to estimate the optimal HeatWave Cluster size for your data.

HeatWave Autopilot generates the cluster size estimate using machine learning techniques: it analyzes the data on your MySQL DB System and recommends a cluster size. If you have not yet loaded data into your DB System and want an estimate, load the data before you create a HeatWave Cluster. See Importing Data.

Prerequisites:

  • The data you intend to load into the HeatWave Cluster must be available on the DB System.

  • Optionally, log in to your DB System and run ANALYZE TABLE on the tables you intend to load into the HeatWave Cluster. Estimates are generally valid without it, but running ANALYZE TABLE refreshes table statistics so that estimates are as accurate as possible.
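
As a minimal sketch of the optional ANALYZE TABLE prerequisite, the following Python helper builds a single ANALYZE TABLE statement for a list of tables. The schema and table names (`sales`, `orders`, `customers`) and the helper names are hypothetical and used only for illustration. The helper only produces the statement text, which you would then run from your own MySQL client session.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_analyze_statement(schema: str, tables: list[str]) -> str:
    """Build one ANALYZE TABLE statement that covers every table in the list."""
    if not tables:
        raise ValueError("at least one table name is required")
    targets = ", ".join(
        f"{quote_identifier(schema)}.{quote_identifier(table)}" for table in tables
    )
    return f"ANALYZE TABLE {targets};"


# Hypothetical schema and table names, for illustration only.
statement = build_analyze_statement("sales", ["orders", "customers"])
print(statement)
```

ANALYZE TABLE accepts a comma-separated list of tables, so one statement per schema is usually enough.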

To estimate a cluster size:

  1. In the Create HeatWave Cluster dialog, click Estimate Cluster Size.

    The Estimate Cluster Size with Autopilot dialog is displayed.

  2. Select the schemas and tables you want to include in the estimate. Schemas are displayed in the Schemas pane. Tables belonging to the selected schema appear in the Tables from selected schemas pane.

    When schemas and tables are selected, the Summary details are adjusted automatically.

    The Schemas pane provides the following information:

    • Name: The schema name.

    • HeatWave Cluster Memory Usage (GiB): The estimated amount of HeatWave Cluster memory used by the schema.

    • Tables Selected: The number of tables selected expressed as a fraction of the total number of tables.

    • Warnings: The number of table warnings.

    The Tables from selected schemas pane provides the following information:

    • Name: The table name.

    • Warnings: The number of table warnings. For a description of table warnings, see Cluster Size Estimate Table Warnings.

    • Memory Size Estimate (GiB): The estimated amount of HeatWave Cluster memory required for the table.

    • Rows Estimate: The estimated number of table rows.

  3. Review the Summary details, which include memory required by the schemas and tables selected, memory provided per node, HeatWave Cluster nodes required, and memory provided by the cluster.

  4. To apply the cluster size estimate, click Apply Cluster Size Estimate.

    You are returned to the Create HeatWave Cluster dialog where the estimate is applied to the Cluster Size field.
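
The relationship between the Summary values reviewed in step 3 can be sketched with simple arithmetic. This is an illustration only, under the assumption that the node count is the selected memory requirement divided by the per-node memory and rounded up. HeatWave Autopilot may account for additional overhead, and the table names, estimates, and per-node memory value below are hypothetical.

```python
import math


def summarize_cluster(table_estimates_gib: dict[str, float],
                      memory_per_node_gib: float) -> dict[str, float]:
    """Derive Summary-style values from per-table memory estimates (GiB)."""
    if memory_per_node_gib <= 0:
        raise ValueError("memory per node must be positive")
    # Memory required by the selected schemas and tables.
    required = sum(table_estimates_gib.values())
    # Assumption: round up to whole nodes, with a minimum of one node.
    nodes = max(1, math.ceil(required / memory_per_node_gib))
    return {
        "memory_required_gib": required,
        "memory_per_node_gib": memory_per_node_gib,
        "nodes_required": nodes,
        "memory_provided_gib": nodes * memory_per_node_gib,
    }


# Hypothetical per-table estimates and node memory, for illustration only.
summary = summarize_cluster(
    {"orders": 120.5, "customers": 300.0, "events": 75.25},
    memory_per_node_gib=256.0,
)
print(summary)
```

Deselecting a large table and recomputing shows why the Summary details change as the selection changes: the required memory drops, and the node count drops with it once a node boundary is crossed.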

Cluster Size Estimate Table Warnings

This topic describes the table warnings that may appear in the Tables from selected schemas pane of the Estimate Cluster Size with Autopilot dialog.

Table 4-1 Cluster Size Table Warnings

| Table Status Issue | Description |
| --- | --- |
| TOO MANY COLUMNS TO LOAD | The table has too many columns. The column limit is 1017. |
| ALL COLUMNS MARKED AS NOT SECONDARY | There are no columns to load. All table columns are defined as NOT SECONDARY. Columns defined as NOT SECONDARY are excluded from the estimate. For more information, see Excluding Table Columns, in the HeatWave User Guide. |
| CONTAINS VARLEN COLUMN WITH >65532 BYTES | A VARLEN column exceeds the 65532 byte limit. For more information on VARLEN, see Variable-length Encoding in the HeatWave User Guide. |
| ESTIMATION COULD NOT BE CALCULATED | The estimate could not be calculated. For example, a table estimate may not be available if statistics for VARLEN columns are unavailable. |
| UNABLE TO LOAD TABLE WITHOUT PRIMARY KEY | A table must be defined with a primary key before it can be loaded into HeatWave. |
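
The conditions behind most of these warnings can be checked against a table definition before requesting an estimate. The sketch below is a simplified, hypothetical model: the `Column` and `Table` classes and the `predict_warnings` function are not part of any HeatWave API, and it assumes the column limit counts every column in the table. It omits ESTIMATION COULD NOT BE CALCULATED, which depends on statistics that only the DB System has.

```python
from dataclasses import dataclass

MAX_COLUMNS = 1017        # column limit stated in the warnings table
MAX_VARLEN_BYTES = 65532  # VARLEN byte limit stated in the warnings table


@dataclass
class Column:
    name: str
    max_bytes: int               # maximum size of the column in bytes
    varlen: bool = False         # True if the column uses variable-length encoding
    not_secondary: bool = False  # True if the column is defined as NOT SECONDARY


@dataclass
class Table:
    name: str
    columns: list[Column]
    has_primary_key: bool = True


def predict_warnings(table: Table) -> list[str]:
    """Return the warnings a table definition would likely trigger."""
    warnings: list[str] = []
    # Assumption: the limit applies to the total column count.
    if len(table.columns) > MAX_COLUMNS:
        warnings.append("TOO MANY COLUMNS TO LOAD")
    if table.columns and all(c.not_secondary for c in table.columns):
        warnings.append("ALL COLUMNS MARKED AS NOT SECONDARY")
    if any(c.varlen and c.max_bytes > MAX_VARLEN_BYTES for c in table.columns):
        warnings.append("CONTAINS VARLEN COLUMN WITH >65532 BYTES")
    if not table.has_primary_key:
        warnings.append("UNABLE TO LOAD TABLE WITHOUT PRIMARY KEY")
    return warnings


# Hypothetical table: no primary key and one oversized VARLEN column.
events = Table(
    "events",
    [Column("id", 8), Column("payload", 70000, varlen=True)],
    has_primary_key=False,
)
print(predict_warnings(events))
```

A table that returns an empty list here can still show a warning in the dialog, because the dialog evaluates the actual table and its statistics rather than a simplified description of it.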