6.2 Additional MySQL HeatWave AutoML Requirements

Model and Table Sizes

  • The table used to train a model cannot exceed 10 GB, 100 million rows, or 1017 columns.

  • Refer to the appropriate MySQL version for maximum MySQL HeatWave AutoML model sizes.

    • Before MySQL 9.0.0: The maximum model size is 900 MB.

    • MySQL 9.0.0 and later: The shape you set for the MySQL HeatWave cluster in the DB system defines the total memory available for training a model and for holding all loaded models. For imported models, we recommend that each individual model be 4 GB or smaller. To query all loaded models and their sizes, see ML_MODEL_ACTIVE.

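The table limits listed above can be checked before training. The following Python sketch is illustrative only and is not part of MySQL HeatWave. The function name training_table_violations is hypothetical, the caller is assumed to have already obtained the table's size, row count, and column count, and "10 GB" is assumed to mean 10 × 1024³ bytes.

```python
# Limits quoted from this section. Interpreting "10 GB" as binary
# gigabytes is an assumption made for this sketch.
MAX_TABLE_BYTES = 10 * 1024**3
MAX_ROWS = 100_000_000
MAX_COLUMNS = 1017


def training_table_violations(size_bytes, row_count, column_count):
    """Return a list of limit violations; the list is empty if the table fits."""
    violations = []
    if size_bytes > MAX_TABLE_BYTES:
        violations.append(f"table size {size_bytes} bytes exceeds {MAX_TABLE_BYTES}")
    if row_count > MAX_ROWS:
        violations.append(f"row count {row_count} exceeds {MAX_ROWS}")
    if column_count > MAX_COLUMNS:
        violations.append(f"column count {column_count} exceeds {MAX_COLUMNS}")
    return violations
```

An empty result means the table is within all three limits. Each entry in a non-empty result names one limit that was exceeded.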

Data Requirements

  • Each dataset must reside in a single table on the DB system, because MySQL HeatWave AutoML routines operate on one table at a time. See Load and Manage Data in MySQL HeatWave.

  • Table columns must use supported data types. See Supported Data Types for MySQL HeatWave AutoML to learn more.

  • NaN (Not a Number) values are not recognized by MySQL and should be replaced by NULL.

  • Refer to the following requirements for specific machine learning models.

    • Classification models: The target column must have at least two distinct values, and each distinct value should appear in at least five rows.

    • Regression models: The target column must be numeric.
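The data requirements above can be checked on the client side before the data is loaded. The following Python sketch is a hedged illustration, not MySQL HeatWave functionality. All function names are hypothetical, the classification rule is assumed to apply to the target column, and NULL target values (None in Python) are assumed to be skipped by the checks.

```python
import math
from collections import Counter


def nan_to_none(value):
    """Map a float NaN to None so that it is stored as SQL NULL."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def classification_target_problems(target_values, min_rows_per_class=5):
    """List the ways a classification target column breaks the stated rules."""
    # Assumption: NULL (None) entries are not counted as a class.
    counts = Counter(v for v in target_values if v is not None)
    problems = []
    if len(counts) < 2:
        problems.append("target column needs at least two distinct values")
    for label, n in sorted(counts.items(), key=lambda kv: str(kv[0])):
        if n < min_rows_per_class:
            problems.append(f"class {label!r} appears in only {n} row(s)")
    return problems


def is_numeric_target(target_values):
    """Regression check: every non-NULL target value is an int or a float."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in target_values
        if v is not None
    )
```

For example, a target column with five rows of 'a' and two rows of 'b' has two distinct values but is still reported, because 'b' appears in fewer than five rows.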

Note

The ML_TRAIN routine ignores columns that are missing more than 20% of their values and columns that have the same value in every row. Missing values in numerical columns are replaced with the average value of the column, standardized to a mean of 0 and with a standard deviation of 1. Missing values in categorical columns are replaced with the most frequent value, and either one-hot or ordinal encoding is used to convert categorical values to numeric values. ML_TRAIN does not modify the input data as it exists in the MySQL database.
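The preprocessing described in this note can be illustrated with a small Python sketch. It is not the ML_TRAIN implementation, and all function names are hypothetical. It makes three assumptions: a column counts as constant when all of its non-missing values are identical, mean imputation happens before standardization, and standardization uses the population standard deviation. Missing values are represented as None.

```python
from collections import Counter
from statistics import fmean, pstdev


def should_ignore(column, max_missing=0.20):
    """True if more than 20% of values are missing or the column is constant."""
    present = [v for v in column if v is not None]
    missing_fraction = (len(column) - len(present)) / len(column)
    # Assumption: "same value in each row" is judged on non-missing values.
    return missing_fraction > max_missing or len(set(present)) <= 1


def impute_and_standardize(column):
    """Fill None with the column mean, then scale to mean 0 and std 1."""
    present = [v for v in column if v is not None]
    mean = fmean(present)
    filled = [mean if v is None else v for v in column]
    std = pstdev(filled)  # non-zero as long as the column is not constant
    return [(v - mean) / std for v in filled]


def impute_most_frequent(column):
    """Fill None in a categorical column with its most frequent value."""
    present = [v for v in column if v is not None]
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in column]


def one_hot(column):
    """Encode each value as a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(column))
    return [[1 if v == c else 0 for c in categories] for v in column]
```

Every function returns a new list and leaves its input untouched, which mirrors the statement that the stored input data is not modified.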

MySQL User Names

To use MySQL HeatWave AutoML, ensure that the MySQL user name that trains a model does not contain a period character ("."). For example, a user named 'joesmith'@'%' is permitted to train a model, but a user named 'joe.smith'@'%' is not. The model catalog schema created by the ML_TRAIN routine incorporates the user name in the schema name (for example, ML_SCHEMA_joesmith), and a period is not a permitted schema name character.
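The user name rule can be expressed as a short check. The following Python sketch is illustrative only. The function names are hypothetical, and the argument is assumed to be the user-name part of the account, without the host part such as '%'.

```python
def can_train_models(user_name):
    """True if the user name contains no period character."""
    return "." not in user_name


def model_catalog_schema(user_name):
    """Return the model catalog schema name derived from the user name."""
    if not can_train_models(user_name):
        raise ValueError(f"user name {user_name!r} contains a period")
    # Naming pattern taken from the example in the text: ML_SCHEMA_joesmith.
    return f"ML_SCHEMA_{user_name}"
```

With these definitions, model_catalog_schema("joesmith") returns "ML_SCHEMA_joesmith", while model_catalog_schema("joe.smith") raises ValueError.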
