
3.5.1 Advanced ML_TRAIN Options

The ML_TRAIN routine provides advanced options to influence model selection and training.

  • The model_list option specifies the types of model to be trained. If more than one model type is specified, the best model type is selected from the list. For a list of supported model types, see Section 3.16.13, “Model Types”. This option cannot be used together with the exclude_model_list option.

    The following example trains either an XGBClassifier or LGBMClassifier model.

    mysql> CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue',
              JSON_OBJECT('task', 'classification', 'model_list',
              JSON_ARRAY('XGBClassifier', 'LGBMClassifier')), @census_model);

  • The exclude_model_list option specifies model types to exclude from consideration during model selection. For a list of model types you can specify, see Section 3.16.13, “Model Types”. This option cannot be used together with the model_list option.

    The following example excludes the LogisticRegression and GaussianNB models.

    mysql> CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue',
              JSON_OBJECT('task', 'classification',
              'exclude_model_list', JSON_ARRAY('LogisticRegression', 'GaussianNB')),
              @census_model);

  • The optimization_metric option specifies the scoring metric to optimize for when training a model. For the supported metrics, see Section 3.16.14, “Optimization and Scoring Metrics”.

    The following example optimizes for the neg_log_loss metric.

    mysql> CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue',
              JSON_OBJECT('task', 'classification', 'optimization_metric', 'neg_log_loss'),
              @census_model);

  • The exclude_column_list option specifies feature columns to exclude from consideration when training a model.

    The following example excludes the 'age' column from consideration when training a model for the census dataset.

    mysql> CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue', 
              JSON_OBJECT('task','classification', 'exclude_column_list', JSON_ARRAY('age')), 
              @census_model);
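
Each option above is a key in the same JSON options object, so it may be useful to see several supplied in one call. The following is a minimal sketch rather than an example from this guide. It assumes that model_list, optimization_metric, and exclude_column_list can be combined in a single JSON_OBJECT; the only restriction stated above is that model_list and exclude_model_list cannot be used together. The table, target column, and session variable are reused from the examples above.

```sql
-- Sketch (assumption: these options may be combined in one options object).
-- Restrict the candidates to two model types, optimize for neg_log_loss,
-- and leave the 'age' column out of training.
CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue',
     JSON_OBJECT('task', 'classification',
                 'model_list', JSON_ARRAY('XGBClassifier', 'LGBMClassifier'),
                 'optimization_metric', 'neg_log_loss',
                 'exclude_column_list', JSON_ARRAY('age')),
     @census_model);
```

If the combination is rejected, check each option against Section 3.16.13, “Model Types” and Section 3.16.14, “Optimization and Scoring Metrics”, because not every metric necessarily applies to every task or model type.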