HeatWave Advisor Auto Encoding, which recommends string column encodings, now provides encoding recommendations that optimize query performance. Recommendations are based on performance models that use query execution data. Previously, string column encoding recommendations were optimized for cluster memory usage only. A performance improvement estimate is provided with string column encoding recommendations. (Bug #34145862)
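As a minimal sketch of how such a recommendation might be requested, the call below invokes the Advisor with the auto_enc option in recommend mode. It assumes a running HeatWave cluster with tables already loaded and a representative set of queries already executed; the exact output columns are not shown here because they vary by release.

```sql
-- Sketch only: ask HeatWave Advisor for string column encoding recommendations.
-- Assumes tables are loaded into HeatWave and queries have been run against them,
-- so that query execution data is available to the performance models.
CALL sys.heatwave_advisor(
  JSON_OBJECT('auto_enc', JSON_OBJECT('mode', 'recommend'))
);
```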
You can now train HeatWave AutoML models on tables containing DATE, TIME, DATETIME, TIMESTAMP, and YEAR data types. (Bug #33895503)
HeatWave AutoML now generates a model explanation when you train a machine learning model. Model explanations help identify the features that are most important to a model. For more information, see The Model Catalog.
The following columns were added to the MODEL_CATALOG table:
- column_names: The feature columns used to train the model.
- last_accessed: The last time the model was accessed. HeatWave AutoML routines update this value to the current timestamp when accessing the model.
- model_explanation: The model explanation generated during training.
- model_type: The type of model (algorithm) selected by ML_TRAIN to build the model.
- task: The task type specified in the ML_TRAIN query (classification or regression).
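The new columns can be inspected with an ordinary query against the model catalog. The sketch below assumes the catalog lives in a per-user schema named ML_SCHEMA_user1, where user1 is a hypothetical user name; substitute the schema that belongs to the account that trained the model.

```sql
-- Sketch only: list the newly added MODEL_CATALOG columns for each trained model.
-- ML_SCHEMA_user1 is a hypothetical schema name used for illustration.
SELECT model_handle,
       model_type,
       task,
       column_names,
       last_accessed,
       model_explanation
FROM ML_SCHEMA_user1.MODEL_CATALOG;
```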
ML_PREDICT_* and ML_EXPLAIN_* routine performance was improved, resulting in faster prediction and explanation processing. (WL #15088, WL #15014)
The following HeatWave AutoML enhancements were implemented:
- ML_TRAIN options for advanced users. These options permit users to customize various aspects of the ML training pipeline, including algorithm selection, feature selection, and hyperparameter optimization.
  - The model_list option permits specifying the type of model to be trained.
  - The exclude_model_list option specifies model types to exclude from consideration during model selection.
  - The optimization_metric option specifies the scoring metric to optimize for when training a machine learning model.
  - The exclude_column_list option specifies feature columns to exclude from consideration when training a machine learning model.
  For more information, see Advanced ML_TRAIN Options.
- Support was added for Support Vector Machine SVC and LinearSVC classification and regression models. For a complete list of supported model types, see Model Types.
- The ML_TRAIN routine now reports a message if a trained model does not meet expected quality criteria.
- ML_EXPLAIN_ROW and ML_EXPLAIN_TABLE routines now provide information to help interpret explanations. The routines also report a warning when a model quality issue is detected, enabling users to revisit their data in order to improve model quality.
(WL #15089)
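To illustrate how the advanced ML_TRAIN options fit together, the following sketch trains a classification model restricted to the two Support Vector Machine model types. The table ml_data.census_train, the target column revenue, the excluded column fnlwgt, and the metric value balanced_accuracy are assumptions chosen for illustration; consult Advanced ML_TRAIN Options for the values valid in your release.

```sql
-- Sketch only: train a classifier using the advanced ML_TRAIN options.
-- Table, column, and session variable names are hypothetical.
CALL sys.ML_TRAIN(
  'ml_data.census_train',                                   -- training table (assumed)
  'revenue',                                                -- target column (assumed)
  JSON_OBJECT(
    'task', 'classification',
    'model_list', JSON_ARRAY('SVC', 'LinearSVC'),           -- only consider these model types
    'optimization_metric', 'balanced_accuracy',             -- metric to optimize (assumed value)
    'exclude_column_list', JSON_ARRAY('fnlwgt')             -- feature column to ignore (assumed)
  ),
  @census_model                                             -- receives the model handle
);
```

Using exclude_model_list instead of model_list narrows the search from the other direction, by removing model types from consideration rather than naming the ones to try.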
The amount of heap memory allocated on the MySQL node for each table loaded into HeatWave was reduced, increasing the maximum number of tables that can be loaded. For MySQL.HeatWave.VM.E3.Standard shapes, the maximum was raised from 100k tables to 400k tables. For MySQL.HeatWave.BM.E3.Standard shapes, the maximum was raised from 400k tables to 1600k tables. The actual number of tables that can be loaded depends on the table data. (Bug #33951708)
The performance_schema.rpd_column_id table was modified to remove redundant data. The NAME, SCHEMA_NAME, and TABLE_NAME columns were removed, and a TABLE_ID column was added. (Bug #33899183)
Support was added for the FROM_DAYS() temporal function, and the GREATEST() and LEAST() comparison and string functions now support DATE, DATETIME, TIME, and TIMESTAMP columns. (WL #14956)
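As a brief illustration of the temporal function support, the query below applies GREATEST(), LEAST(), and FROM_DAYS() to date columns. The orders_demo table and its columns are hypothetical, and the sketch assumes the table has already been loaded into HeatWave.

```sql
-- Sketch only: orders_demo and its columns are hypothetical.
-- order_date and ship_date are assumed to be DATE columns; day_number is an INT.
SELECT order_id,
       GREATEST(order_date, ship_date) AS latest_event,    -- later of the two dates
       LEAST(order_date, ship_date)    AS earliest_event,  -- earlier of the two dates
       FROM_DAYS(day_number)           AS day_as_date      -- day number converted to a DATE
FROM orders_demo;
```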
Support was added for built-in server-side data masking and de-identification to help protect sensitive data from unauthorized uses by hiding and replacing real values with substitutes. Data masking and de-identification operations are performed on the server, and queries involving data masking and de-identification functions are accelerated by HeatWave. The following data masking and de-identification functions are supported:
See Data Masking and De-Identification Functions. (WL #15143)
Optimizations were implemented to improve performance for JOIN and GROUP BY queries with execution plans involving multiple consecutive rounds of data partitioning. (WL #15143)
expr IN (value,...) comparisons, where the expression is a single value and compared values are constants of the same data type and encoding, have been optimized. For example, the following IN() comparison has been optimized:
SELECT * FROM Customers WHERE Country IN ('Germany', 'France', 'Spain');
(WL #14952)