Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.1Mb
PDF (A4) - 2.1Mb


HeatWave User Guide  /  ...  /  Running Auto Encoding

2.7.4.1 Running Auto Encoding

To enable Auto Encoding, specify the auto_enc option in recommend mode. See Section 2.7.3, “HeatWave Autopilot Advisor Syntax”.

Note

To run Advisor for both encoding and data placement recommendations, run Auto Encoding first, apply the recommended encodings, rerun the queries, and then run Auto Data Placement. This sequence allows data placement performance benefits to be calculated with string column encodings in place, which provides for greater accuracy from Advisor internal models.

For Advisor to provide string column encoding recommendations, tables must be loaded in HeatWave and a query history must be available. Run the queries that you intend to use or run a representative set of queries. Failing to do so can affect query offload after Auto Encoding recommendations are implemented due to query constraints associated with dictionary encoding. For dictionary encoding limitations, see Section 2.14.2, “Dictionary Encoding”.

In the following example, Auto Encoding is run in recommend mode, which analyzes column data, checks the amount of memory on the MySQL node, and provides encoding recommendations intended to reduce the amount of space required on HeatWave nodes and optimize query performance. There is no target schema specified, so Auto Encoding runs on all schemas loaded in HeatWave

mysql> CALL sys.heatwave_advisor(JSON_OBJECT('auto_enc',JSON_OBJECT('mode','recommend')));

The fixed_enc option can be used in recommend mode to specify an encoding for specific columns. These columns are excluded from consideration when Auto Encoding generates recommendations. Manually encoded columns are also excluded from consideration. (For manual encoding instructions, see Section 2.7.1, “Encoding String Columns”.)

mysql> CALL sys.heatwave_advisor(JSON_OBJECT('auto_enc',JSON_OBJECT('mode','recommend','fixed_enc', 
          JSON_OBJECT('tpch.CUSTOMER.C_ADDRESS','varlen'))));

Advisor output provides information about each stage of Advisor execution, including recommended column encodings and estimated HeatWave Cluster memory savings.

mysql> CALL sys.heatwave_advisor(JSON_OBJECT('target_schema',JSON_ARRAY('tpch_1024'), 
          'auto_enc',JSON_OBJECT('mode','recommend')));
+-------------------------------+
| INITIALIZING HEATWAVE ADVISOR |
+-------------------------------+
| Version: 1.12                 |
|                               |
| Output Mode: normal           |
| Excluded Queries: 0           |
| Target Schemas: 1             |
|                               |
+-------------------------------+
6 rows in set (0.01 sec)

+---------------------------------------------------------+
| ANALYZING LOADED DATA                                   |
+---------------------------------------------------------+
| Total 8 tables loaded in HeatWave for 1 schemas         |
| Tables excluded by user: 0 (within target schemas)      |
|                                                         |
| SCHEMA                            TABLES        COLUMNS |
| NAME                              LOADED         LOADED |
| ------                            ------         ------ |
| `tpch_1024`                            8             61 |
|                                                         |
+---------------------------------------------------------+
8 rows in set (0.15 sec)

+-------------------------------------------------------------------------------------------+
| ENCODING SUGGESTIONS                                                                      |
+-------------------------------------------------------------------------------------------+
| Total Auto Encoding suggestions produced for 21 columns                                   |
| Queries executed: 200                                                                     |
|   Total query execution time: 28.82 min                                                   |
|   Most recent query executed on: Tuesday 8th June 2021 14:42:13                           |
|   Oldest query executed on: Tuesday 8th June 2021 14:11:45                                |
|                                                                                           |
|                                                        CURRENT           SUGGESTED        |
| COLUMN                                                  COLUMN              COLUMN        |
| NAME                                                  ENCODING            ENCODING        |
| ------                                                --------           ---------        |
| `tpch_1024`.`CUSTOMER`.`C_ADDRESS`                      VARLEN          DICTIONARY        |
| `tpch_1024`.`CUSTOMER`.`C_COMMENT`                      VARLEN          DICTIONARY        |
| `tpch_1024`.`CUSTOMER`.`C_MKTSEGMENT`                   VARLEN          DICTIONARY        |
| `tpch_1024`.`CUSTOMER`.`C_NAME`                         VARLEN          DICTIONARY        |
| `tpch_1024`.`LINEITEM`.`L_COMMENT`                      VARLEN          DICTIONARY        |
| `tpch_1024`.`LINEITEM`.`L_SHIPINSTRUCT`                 VARLEN          DICTIONARY        |
| `tpch_1024`.`LINEITEM`.`L_SHIPMODE`                     VARLEN          DICTIONARY        |
| `tpch_1024`.`NATION`.`N_COMMENT`                        VARLEN          DICTIONARY        |
| `tpch_1024`.`NATION`.`N_NAME`                           VARLEN          DICTIONARY        |
| `tpch_1024`.`ORDERS`.`O_CLERK`                          VARLEN          DICTIONARY        |
| `tpch_1024`.`ORDERS`.`O_ORDERPRIORITY`                  VARLEN          DICTIONARY        |
| `tpch_1024`.`PART`.`P_BRAND`                            VARLEN          DICTIONARY        |
| `tpch_1024`.`PART`.`P_COMMENT`                          VARLEN          DICTIONARY        |
| `tpch_1024`.`PART`.`P_CONTAINER`                        VARLEN          DICTIONARY        |
| `tpch_1024`.`PART`.`P_MFGR`                             VARLEN          DICTIONARY        |
| `tpch_1024`.`PARTSUPP`.`PS_COMMENT`                     VARLEN          DICTIONARY        |
| `tpch_1024`.`REGION`.`R_COMMENT`                        VARLEN          DICTIONARY        |
| `tpch_1024`.`REGION`.`R_NAME`                           VARLEN          DICTIONARY        |
| `tpch_1024`.`SUPPLIER`.`S_ADDRESS`                      VARLEN          DICTIONARY        |
| `tpch_1024`.`SUPPLIER`.`S_NAME`                         VARLEN          DICTIONARY        |
| `tpch_1024`.`SUPPLIER`.`S_PHONE`                        VARLEN          DICTIONARY        |
|                                                                                           |
| Applying the suggested encodings might improve query performance and cluster memory usage.|
|   Estimated HeatWave Cluster memory savings: 355.60 GiB                                   |
|                                                                                           |
+-------------------------------------------------------------------------------------------+
35 rows in set (18.18 sec)

+-------------------------------------------------------------------------------------------+
| SCRIPT GENERATION                                                                         |
+-------------------------------------------------------------------------------------------+
| Script generated for applying suggestions for 8 loaded tables                             |
|                                                                                           |
| Applying changes will take approximately 1.64 h                                           |
|                                                                                           |
| Retrieve script containing 61 generated DDL commands using the query below:               |
|   SELECT log->>"$.sql" AS "SQL Script" FROM sys.heatwave_advisor_report WHERE type = "sql"|
|   ORDER BY id;                                                                            |
|                                                                                           |
| Caution: Executing the generated script will alter the column comment and secondary engine|
| flags in the schema                                                                       |
|                                                                                           |
+-------------------------------------------------------------------------------------------+
9 rows in set (18.20 sec)

To inspect the load script, which includes the DDL statements required to implement the recommended encodings, query the heatwave_autopilot_report table.

mysql> SELECT log->>"$.sql" AS "SQL Script"
          FROM sys.heatwave_autopilot_report 
          WHERE type = "sql" 
          ORDER BY id;

To concatenate generated DDL statements into a single string that can be copied and pasted for execution, issue the statements that follow. The group_concat_max_len variable sets the result length in bytes for the GROUP_CONCAT() function to accommodate a potentially long string. (The default group_concat_max_len setting is 1024 bytes.)

mysql> SET SESSION group_concat_max_len = 1000000;
mysql> SELECT GROUP_CONCAT(log->>"$.sql" SEPARATOR ' ') 
          FROM sys.heatwave_advisor_report 
          WHERE type = "sql" 
          ORDER BY id;
Usage Notes:
  • Auto Encoding analyzes string columns (CHAR, VARCHAR, and TEXT type columns) of tables that are loaded in HeatWave. Automatically or manually excluded columns, columns greater than 65532 bytes, and columns with manually defined encodings are excluded from consideration. Auto Encoding also analyzes HeatWave query history to identify query constraints that preclude the use of dictionary encoding. Dictionary-encoded columns are not supported in JOIN operations, with string functions and operators, or in LIKE predicates. For dictionary encoding limitations, see Section 2.14.2, “Dictionary Encoding”.

  • The time required to generate encoding recommendations depends on the number of queries to be analyzed, the number of operators, and the complexity of each query.

  • Encoding recommendations for the same table may differ after changes to data or data statistics. For example, changes to table cardinality or the number of distinct values in a column can affect recommendations.

  • Auto Encoding does not generate recommendations for a given table if existing encodings do not require modification.

  • Auto Encoding only recommends dictionary encoding if it is expected to reduce the amount of memory required on HeatWave nodes.

  • If there is not enough MySQL node memory for the dictionaries of all columns that would benefit from dictionary encoding, the columns estimated to save the most memory are recommended for dictionary encoding.

  • Auto Encoding uses the current state of tables loaded in HeatWave when generating recommendations. Concurrent change propagation activity is not considered.

  • Encoding recommendations are based on estimates and are therefore not guaranteed to reduce the memory required on HeatWave nodes or improve query performance.