Changes in HeatWave 8.0.31 (2022-10-11, General Availability)

HeatWave AutoML

  • HeatWave AutoML queries are now monitored and recorded in the Performance Schema tables rpd_query_stats and rpd_exec_stats. Where a single HeatWave AutoML query contains a number of sub-queries, there is one record in rpd_query_stats and multiple records in rpd_exec_stats. (WL #15243)

  • New functions have been added to HeatWave AutoML to help you manage models:

    • When you run the ML_TRAIN routine on a training dataset, you can now specify a model handle to use for the model instead of the generated one.

    • A new column notes has been added to the MODEL_CATALOG table, which you can use to record notes about the models in your model catalog.

    • The new column model_metadata in the MODEL_CATALOG table records metadata for models, such as the training score, training time, and information about the training dataset. If an error occurs during training or you cancel the training operation, HeatWave AutoML records the error status in this column.

    (WL #15243)

  • HeatWave AutoML now supports the upload of pre-trained models in ONNX (Open Neural Network Exchange) format to the model catalog. You can load them using the stored procedure ML_MODEL_IMPORT, which performs the conversion required to store the model in a MySQL table. (WL #15243)

  • A new stored procedure ML_EXPLAIN lets you train a variety of model explainers and prediction explainers for HeatWave AutoML, in addition to the default Permutation Importance model and prediction explainers:

    • The Partial Dependence model explainer shows how changing the values of one or more columns will change the value that the model predicts.

    • The SHAP model explainer produces global feature importance values based on Shapley values.

    • The Fast SHAP model explainer is a subsampling version of the SHAP model explainer which usually has a faster runtime.

    • The Permutation Importance prediction explainer explains the prediction for a single row or table.

    • The SHAP prediction explainer uses feature importance values to explain the prediction for a single row or table.

    When you use the ML_EXPLAIN_TABLE and ML_EXPLAIN_ROW stored procedures to generate explanations for a prediction, you can now use the SHAP prediction explainer as an alternative to the default Permutation Importance prediction explainer. SHAP produces feature importance values (explanations) based on Shapley values. (WL #15243)

  • HeatWave AutoML now supports timeseries forecasting using the existing stored procedures ML_TRAIN, ML_PREDICT_TABLE, and ML_SCORE. You can create a forecast for a single column (a univariate endogenous variable) with a numeric data type. The forecasting task is specified as a JSON object when you call the ML_TRAIN stored procedure. (WL #15243)
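
The SHAP explainers listed above are based on Shapley values. As a rough illustration of what a Shapley value is, the following Python sketch computes exact Shapley values for one row of a tiny model by enumerating every feature coalition. It is not HeatWave AutoML code and does not reflect how HeatWave computes explanations: the names shapley_values and toy_model, the all-zero baseline row, and the model itself are assumptions made up for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, row, baseline):
    """Exact Shapley values for one row, measured against a baseline row.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of columns.
    """
    n = len(row)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        x = [row[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(x):
    # Hypothetical model: two additive terms plus one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


row = [1.0, 3.0, 4.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference point for "feature absent"
phi = shapley_values(toy_model, row, baseline)
```

In this toy case the values sum to the difference between the prediction for the row and the prediction for the baseline, and the interaction term is split evenly between the two features that take part in it. Subsampling approaches, such as the one the Fast SHAP description mentions, exist because exact enumeration does not scale.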

Functionality Added or Changed

  • HeatWave uses dictionary encoding to compress string columns (CHAR, VARCHAR, TEXT). These dictionaries are built for each string column defined with the RAPID_COLUMN=ENCODING=SORTED keyword. HeatWave now supports 8.5 billion dictionary entries (up from 4 billion), so it can encode string columns with a number of distinct values (NDV) of up to 8.5 billion. (WL #14742)

  • MySQL HeatWave now uses zone maps to exclude data chunks that are not relevant to a query. The zone map stores per-chunk statistics for the minimum and maximum values of primary key columns. Queries that apply point or range filters can now be accelerated by HeatWave by an order of magnitude. This is particularly useful for improving range queries in OLAP and mixed workloads. (WL #14713)

  • A new hypergraph-based MySQL optimizer is introduced for HeatWave. It provides a holistic cost model across MySQL and HeatWave, creates better query plans based on statistics used in Autopilot, reduces compilation time, eliminates the need for query hints for join order, and improves join query performance. With the new optimizer, HeatWave can now run all 22 TPC-H queries without straight join hints. Before 8.0.31, a straight join hint was needed for 10 of the 22 TPC-H queries to reach peak performance. (WL #14449)

  • DDL statements such as ALTER TABLE, RENAME TABLE, and TRUNCATE TABLE are now permitted on a table that has RAPID defined as the secondary engine. If a DDL operation is successfully carried out on a table that is loaded to a HeatWave Cluster at the time, HeatWave automatically reloads the table from InnoDB. Note that if the DDL operation makes the table's structure incompatible with HeatWave, the table is unloaded from the HeatWave Cluster. (WL #15129)
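
The zone map item above describes skipping chunks by their minimum and maximum key values. The following Python sketch shows that general idea on an in-memory list. It is a conceptual illustration only, not HeatWave's implementation: the Chunk class, the build_chunks and range_scan helpers, and the chunk size of 10 are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    rows: list  # key values stored in this chunk
    lo: int     # zone map entry: minimum key in the chunk
    hi: int     # zone map entry: maximum key in the chunk


def build_chunks(keys, chunk_size):
    """Split keys into fixed-size chunks and record min/max for each one."""
    chunks = []
    for start in range(0, len(keys), chunk_size):
        rows = keys[start:start + chunk_size]
        chunks.append(Chunk(rows, min(rows), max(rows)))
    return chunks


def range_scan(chunks, lo, hi):
    """Return keys in [lo, hi] and the number of chunks actually read."""
    matches, scanned = [], 0
    for chunk in chunks:
        # Skip the chunk when its [min, max] range cannot overlap the filter.
        if chunk.hi < lo or chunk.lo > hi:
            continue
        scanned += 1
        matches.extend(k for k in chunk.rows if lo <= k <= hi)
    return matches, scanned


keys = list(range(1, 101))       # 100 ordered primary key values
chunks = build_chunks(keys, 10)  # 10 chunks of 10 keys each
matches, scanned = range_scan(chunks, 42, 57)
```

Here the filter 42 to 57 overlaps only two of the ten chunks, so the other eight are never read. The benefit depends on how well key order matches chunk layout: if keys were scattered randomly across chunks, most min/max ranges would overlap the filter and little would be skipped.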