3.9.1 The Model Catalog

HeatWave ML stores machine learning models in a model catalog in MySQL. A model catalog is a standard MySQL table named MODEL_CATALOG. HeatWave ML creates a model catalog for any user that creates a machine learning model.

The MODEL_CATALOG table is created in a schema named ML_SCHEMA_user_name, where the user_name is the name of the owning user.

When a user creates a model, the ML_TRAIN routine creates the model catalog schema and table if they do not exist. ML_TRAIN inserts the model as a row in the MODEL_CATALOG table at the end of training.
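
As a minimal sketch of this behavior, the following statements train a model and then confirm that its catalog row exists. The schema name ML_SCHEMA_user1 assumes a user named user1, and the training table name is borrowed from the model handle example in this section. The target column revenue and the session variable @census_model are illustrative assumptions.

```sql
-- Train a classification model. On first use, ML_TRAIN creates
-- ML_SCHEMA_user1 and its MODEL_CATALOG table if they do not exist.
-- The target column 'revenue' is an assumed name.
CALL sys.ML_TRAIN('heatwaveml_bench.census_train', 'revenue',
                  JSON_OBJECT('task', 'classification'), @census_model);

-- The session variable now holds the model handle.
SELECT @census_model;

-- Confirm the row that ML_TRAIN inserted at the end of training.
SELECT model_id, model_handle, model_owner
  FROM ML_SCHEMA_user1.MODEL_CATALOG
 WHERE model_handle = @census_model;
```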

A model catalog is accessible only to the owning user unless that user grants privileges on the model catalog to another user. As a result, a HeatWave ML routine can use only the models that are accessible to the user who runs it. For information about granting model catalog privileges, see Section 3.9.10, “Sharing Models”.
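
The following is a sketch of a read-only grant, not the full procedure from Section 3.9.10. It assumes the catalog belongs to user1, that 'user2'@'%' is a hypothetical recipient account, and that the statement is run by an account allowed to grant privileges on the table. The privileges actually required to use a shared model may be broader.

```sql
-- Give a second (hypothetical) account read access to user1's model catalog.
GRANT SELECT ON ML_SCHEMA_user1.MODEL_CATALOG TO 'user2'@'%';

-- Check what the recipient account has been granted.
SHOW GRANTS FOR 'user2'@'%';
```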

A database administrator can manage a model catalog table as they would a regular MySQL table.
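
For instance, ordinary table statements apply to the catalog. The sketch below assumes the catalog schema ML_SCHEMA_user1; the model_id value is illustrative.

```sql
-- Inspect the catalog definition like any other table.
SHOW CREATE TABLE ML_SCHEMA_user1.MODEL_CATALOG;

-- Remove a model that is no longer needed (illustrative model_id).
DELETE FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_id = 3;
```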

The Model Catalog Table

The MODEL_CATALOG table has the following columns:

  • model_id

    A unique auto-incrementing numeric identifier for the model.

  • model_handle

    A name for the model. The model handle must be unique in the model catalog. It is either generated by HeatWave ML or set by the user when the ML_TRAIN routine is executed on a training dataset. The generated model_handle format is schemaName_tableName_userName_No, as in the following example: heatwaveml_bench.census_train_user1_1636729526.

    Note

    The format of the generated model handle is subject to change.

  • model_object

    A string in JSON format containing the serialized HeatWave ML model.

  • model_owner

    The user who initiated the ML_TRAIN routine to create the model.

  • build_timestamp

    A timestamp indicating when the model was created (in UNIX epoch time). A model is created when the ML_TRAIN routine finishes executing.

  • target_column_name

    The name of the column in the training table that was specified as the target column.

  • train_table_name

    The name of the input table specified in the ML_TRAIN call.

  • model_object_size

    The model object size, in bytes.

  • model_type

    The type of model (algorithm) selected by ML_TRAIN to build the model.

  • task

    The task type specified in the ML_TRAIN call (classification or regression).

  • column_names

    The feature columns used to train the model.

  • model_explanations

    The model explanation generated during training. See Section 3.9.7, “Model Explanations”. This column was added in MySQL 8.0.29.

  • last_accessed

    The last time the model was accessed. HeatWave ML routines update this value to the current timestamp when accessing the model.

  • model_metadata

    Metadata for the model. If an error occurs during training or you cancel the training operation, HeatWave ML records the error status in this column. This column was added in MySQL 8.0.31. It contains the following metadata as key-value pairs in JSON format:

    status: Creating | Ready | Error

    The status of the model. Creating means that the model is still being created, Ready means that it is trained and active, and Error means that training was canceled or failed with an error. Any error message appears in the notes column.

    training_score: number

    The cross-validation score that the model achieved during training.

    n_rows: number

    The number of rows in the training table.

    n_columns: number

    The number of columns in the training table.

    optimization_metric: string

    The optimization metric used for training.

    n_selected_columns: number

    The number of columns selected by feature selection.

    algorithm_name: string

    The name of the chosen algorithm.

    n_selected_rows: number

    The number of rows selected by adaptive sampling.

    training_time: number

    The time in seconds taken to train the model.

    selected_column_names: JSON array

    The names of the columns selected by feature selection.

    format: string

    The model serialization format.

  • notes

    Use this column to record your own notes on the trained model. It is also used to store error messages that occur during model training.
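
The columns described above can be queried like those of any other table. The statements below are a sketch that assumes the catalog schema ML_SCHEMA_user1, that build_timestamp is stored in seconds, and that model_metadata is available (MySQL 8.0.31 or later). The note text is illustrative, and the handle is the example value shown earlier.

```sql
-- List models with a readable build time (build_timestamp is UNIX epoch time).
SELECT model_id, model_handle, task, model_type,
       FROM_UNIXTIME(build_timestamp) AS built_at,
       model_object_size
  FROM ML_SCHEMA_user1.MODEL_CATALOG
 ORDER BY build_timestamp DESC;

-- Find models whose training failed or was canceled, with the recorded message.
SELECT model_handle,
       JSON_UNQUOTE(JSON_EXTRACT(model_metadata, '$.status')) AS status,
       notes
  FROM ML_SCHEMA_user1.MODEL_CATALOG
 WHERE JSON_UNQUOTE(JSON_EXTRACT(model_metadata, '$.status')) = 'Error';

-- Record a note against one model.
UPDATE ML_SCHEMA_user1.MODEL_CATALOG
   SET notes = 'Baseline census classifier'
 WHERE model_handle = 'heatwaveml_bench.census_train_user1_1636729526';
```

JSON_EXTRACT with JSON_UNQUOTE is used here because it works whether the metadata is stored as a JSON column or as a JSON-formatted string.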