To enable Auto Encoding, specify the
auto_enc
option in
recommend
mode. See
Section 2.7.3, “HeatWave Autopilot Advisor Syntax”.
To run Advisor for both encoding and data placement recommendations, run Auto Encoding first, apply the recommended encodings, rerun the queries, and then run Auto Data Placement. This sequence allows data placement performance benefits to be calculated with string column encodings in place, which provides for greater accuracy from Advisor internal models.
For Advisor to provide string column encoding recommendations, tables must be loaded in HeatWave and a query history must be available. Run the queries that you intend to use or run a representative set of queries. Failing to do so can affect query offload after Auto Encoding recommendations are implemented due to query constraints associated with dictionary encoding. For dictionary encoding limitations, see Section 2.14.2, “Dictionary Encoding”.
In the following example, Auto Encoding is run in
recommend
mode, which analyzes column
data, checks the amount of memory on the MySQL node, and
provides encoding recommendations intended to reduce the
amount of space required on HeatWave nodes and optimize query
performance. There is no target schema specified, so Auto
Encoding runs on all schemas loaded in HeatWave
mysql> CALL sys.heatwave_advisor(JSON_OBJECT('auto_enc',JSON_OBJECT('mode','recommend')));
The fixed_enc
option can be used in
recommend
mode to specify an encoding for
specific columns. These columns are excluded from
consideration when Auto Encoding generates recommendations.
Manually encoded columns are also excluded from
consideration. (For manual encoding instructions, see
Section 2.7.1, “Encoding String Columns”.)
mysql> CALL sys.heatwave_advisor(JSON_OBJECT('auto_enc',JSON_OBJECT('mode','recommend','fixed_enc',
JSON_OBJECT('tpch.CUSTOMER.C_ADDRESS','varlen'))));
Advisor output provides information about each stage of Advisor execution, including recommended column encodings and estimated HeatWave Cluster memory savings.
mysql> CALL sys.heatwave_advisor(JSON_OBJECT('target_schema',JSON_ARRAY('tpch_1024'),
'auto_enc',JSON_OBJECT('mode','recommend')));
+-------------------------------+
| INITIALIZING HEATWAVE ADVISOR |
+-------------------------------+
| Version: 1.12 |
| |
| Output Mode: normal |
| Excluded Queries: 0 |
| Target Schemas: 1 |
| |
+-------------------------------+
6 rows in set (0.01 sec)
+---------------------------------------------------------+
| ANALYZING LOADED DATA |
+---------------------------------------------------------+
| Total 8 tables loaded in HeatWave for 1 schemas |
| Tables excluded by user: 0 (within target schemas) |
| |
| SCHEMA TABLES COLUMNS |
| NAME LOADED LOADED |
| ------ ------ ------ |
| `tpch_1024` 8 61 |
| |
+---------------------------------------------------------+
8 rows in set (0.15 sec)
+-------------------------------------------------------------------------------------------+
| ENCODING SUGGESTIONS |
+-------------------------------------------------------------------------------------------+
| Total Auto Encoding suggestions produced for 21 columns |
| Queries executed: 200 |
| Total query execution time: 28.82 min |
| Most recent query executed on: Tuesday 8th June 2021 14:42:13 |
| Oldest query executed on: Tuesday 8th June 2021 14:11:45 |
| |
| CURRENT SUGGESTED |
| COLUMN COLUMN COLUMN |
| NAME ENCODING ENCODING |
| ------ -------- --------- |
| `tpch_1024`.`CUSTOMER`.`C_ADDRESS` VARLEN DICTIONARY |
| `tpch_1024`.`CUSTOMER`.`C_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`CUSTOMER`.`C_MKTSEGMENT` VARLEN DICTIONARY |
| `tpch_1024`.`CUSTOMER`.`C_NAME` VARLEN DICTIONARY |
| `tpch_1024`.`LINEITEM`.`L_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`LINEITEM`.`L_SHIPINSTRUCT` VARLEN DICTIONARY |
| `tpch_1024`.`LINEITEM`.`L_SHIPMODE` VARLEN DICTIONARY |
| `tpch_1024`.`NATION`.`N_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`NATION`.`N_NAME` VARLEN DICTIONARY |
| `tpch_1024`.`ORDERS`.`O_CLERK` VARLEN DICTIONARY |
| `tpch_1024`.`ORDERS`.`O_ORDERPRIORITY` VARLEN DICTIONARY |
| `tpch_1024`.`PART`.`P_BRAND` VARLEN DICTIONARY |
| `tpch_1024`.`PART`.`P_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`PART`.`P_CONTAINER` VARLEN DICTIONARY |
| `tpch_1024`.`PART`.`P_MFGR` VARLEN DICTIONARY |
| `tpch_1024`.`PARTSUPP`.`PS_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`REGION`.`R_COMMENT` VARLEN DICTIONARY |
| `tpch_1024`.`REGION`.`R_NAME` VARLEN DICTIONARY |
| `tpch_1024`.`SUPPLIER`.`S_ADDRESS` VARLEN DICTIONARY |
| `tpch_1024`.`SUPPLIER`.`S_NAME` VARLEN DICTIONARY |
| `tpch_1024`.`SUPPLIER`.`S_PHONE` VARLEN DICTIONARY |
| |
| Applying the suggested encodings might improve query performance and cluster memory usage.|
| Estimated HeatWave Cluster memory savings: 355.60 GiB |
| |
+-------------------------------------------------------------------------------------------+
35 rows in set (18.18 sec)
+-------------------------------------------------------------------------------------------+
| SCRIPT GENERATION |
+-------------------------------------------------------------------------------------------+
| Script generated for applying suggestions for 8 loaded tables |
| |
| Applying changes will take approximately 1.64 h |
| |
| Retrieve script containing 61 generated DDL commands using the query below: |
| SELECT log->>"$.sql" AS "SQL Script" FROM sys.heatwave_advisor_report WHERE type = "sql"|
| ORDER BY id; |
| |
| Caution: Executing the generated script will alter the column comment and secondary engine|
| flags in the schema |
| |
+-------------------------------------------------------------------------------------------+
9 rows in set (18.20 sec)
To inspect the load script, which includes the DDL
statements required to implement the recommended encodings,
query the heatwave_autopilot_report
table.
mysql> SELECT log->>"$.sql" AS "SQL Script"
FROM sys.heatwave_autopilot_report
WHERE type = "sql"
ORDER BY id;
To concatenate generated DDL statements into a single string
that can be copied and pasted for execution, issue the
statements that follow. The
group_concat_max_len
variable sets the result length in bytes for the
GROUP_CONCAT()
function to
accommodate a potentially long string. (The default
group_concat_max_len
setting is 1024 bytes.)
mysql> SET SESSION group_concat_max_len = 1000000;
mysql> SELECT GROUP_CONCAT(log->>"$.sql" SEPARATOR ' ')
FROM sys.heatwave_advisor_report
WHERE type = "sql"
ORDER BY id;
Auto Encoding analyzes string columns (
CHAR
,VARCHAR
, andTEXT
type columns) of tables that are loaded in HeatWave. Automatically or manually excluded columns, columns greater than 65532 bytes, and columns with manually defined encodings are excluded from consideration. Auto Encoding also analyzes HeatWave query history to identify query constraints that preclude the use of dictionary encoding. Dictionary-encoded columns are not supported inJOIN
operations, with string functions and operators, or inLIKE
predicates. For dictionary encoding limitations, see Section 2.14.2, “Dictionary Encoding”.The time required to generate encoding recommendations depends on the number of queries to be analyzed, the number of operators, and the complexity of each query.
Encoding recommendations for the same table may differ after changes to data or data statistics. For example, changes to table cardinality or the number of distinct values in a column can affect recommendations.
Auto Encoding does not generate recommendations for a given table if existing encodings do not require modification.
Auto Encoding only recommends dictionary encoding if it is expected to reduce the amount of memory required on HeatWave nodes.
If there is not enough MySQL node memory for the dictionaries of all columns that would benefit from dictionary encoding, the columns estimated to save the most memory are recommended for dictionary encoding.
Auto Encoding uses the current state of tables loaded in HeatWave when generating recommendations. Concurrent change propagation activity is not considered.
Encoding recommendations are based on estimates and are therefore not guaranteed to reduce the memory required on HeatWave nodes or improve query performance.