Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.1Mb
PDF (A4) - 2.1Mb


3.16.2 ML_EXPLAIN

Running the ML_EXPLAIN routine on a model and dataset trains a prediction explainer and model explainer, and adds a model explanation to the model catalog.

MySQL 9.0.0 introduces support for large models that changes how HeatWave AutoML stores models, see: Section 3.14.1, “The Model Catalog”. ML_EXPLAIN upgrades older models.

ML_EXPLAIN does not support recommendation models, and a call with a recommendation model will produce an error.

ML_EXPLAIN does not support anomaly detection, and a call with an anomaly detection model will produce an error.

ML_EXPLAIN Syntax

mysql> CALL sys.ML_EXPLAIN ('table_name', 'target_column_name',
          model_handle_variable, [options]);

options: {
     JSON_OBJECT("key","value"[,"key","value"] ...)
          "key","value": {
          ['model_explainer', {'permutation_importance'|'partial_dependence'|'shap'|'fast_shap'}| NULL]
          ['prediction_explainer', {'permutation_importance'|'shap'}]
          ['columns_to_explain', JSON_ARRAY('column'[,'column'] ...)]
          ['target_value', 'target_class']
          }
}

Run the ML_EXPLAIN routine before ML_EXPLAIN_ROW and ML_EXPLAIN_TABLE routines. The ML_TRAIN routine also runs the ML_EXPLAIN routine with the default Permutation Importance model. It is only necessary to use the ML_EXPLAIN routine to train prediction explainers and model explainers with a different model. See Section 3.6, “Training Explainers”.

ML_EXPLAIN parameters:

  • table_name: The name of the table that contains the labeled training dataset. The table name must be valid and fully qualified; that is, it must include the schema name (schema_name.table_name). Use NULL for help. Use the dataset that the model is trained on - running ML_EXPLAIN on a dataset that the model has not been trained on produces errors or unreliable explanations.

  • target_column_name: The name of the target column in the training dataset containing ground truth values.

  • model_handle: A string containing the model handle for the model in the model catalog. Use NULL for help. The model explanation is stored in this model metadata. The model must be loaded first, for example:

    mysql> CALL sys.ML_MODEL_LOAD('ml_data.iris_train_user1_1636729526', NULL);

    If you run ML_EXPLAIN again with the same model handle and model explainer, the model explanation field is overwritten with the new result.

  • options: Optional parameters specified as key-value pairs in JSON format. If an option is not specified, the default setting is used. If you specify NULL in place of the JSON argument, the default Permutation Importance model explainer is trained, and no prediction explainer is trained. The available options are:

    • model_explainer: Specifies the model explainer. Valid values are:

      • permutation_importance: The default model explainer.

      • shap: The SHAP model explainer, which produces global feature importance values based on Shapley values.

      • fast_shap: The Fast SHAP model explainer, which is a subsampling version of the SHAP model explainer that usually has a faster runtime.

      • partial_dependence: Explains how changing the values in one or more columns will change the value predicted by the model. The following additional arguments are required for the partial_dependence model explainer:

        • columns_to_explain: a JSON array of one or more column names in the table specified by table_name. The model explainer explains how changing the value in this column or columns affects the model.

        • target_value: a valid value that the target column containing ground truth values, as specified by target_column_name, can take.

    • prediction_explainer: Specifies the prediction explainer. Valid values are:

      • permutation_importance: The default prediction explainer.

      • shap: The SHAP prediction explainer, which produces global feature importance values based on Shapley values.

Syntax Examples

  • Load the model first:

    mysql> CALL sys.ML_MODEL_LOAD('ml_data.iris_train_user1_1636729526', NULL);
  • Running ML_EXPLAIN to train the SHAP prediction explainer and the Fast SHAP model explainer:

    mysql> CALL sys.ML_EXPLAIN('ml_data.iris_train', 'class',
              'ml_data.iris_train_user1_1636729526', 
              JSON_OBJECT('model_explainer', 'fast_shap', 'prediction_explainer', 'shap'));
  • Run ML_EXPLAIN and use NULL for the options trains the default Permutation Importance model explainer and no prediction explainer:

    mysql> CALL sys.ML_EXPLAIN('ml_data.iris_train', 'class',
              'ml_data.iris_train_user1_1636729526', NULL);
  • Running ML_EXPLAIN to train the Partial Dependence model explainer (which requires extra options) and the SHAP prediction explainer:

    mysql> CALL sys.ML_EXPLAIN('ml_data.iris_train', 'class', @iris_model, 
              JSON_OBJECT('columns_to_explain', JSON_ARRAY('sepal width'), 
              'target_value', 'Iris-setosa', 'model_explainer', 'partial_dependence', 
              'prediction_explainer', 'shap'));
  • Viewing the model explanation, in this case produced by the Permutation Importance model explainer:

    mysql> SELECT model_explanation FROM MODEL_CATALOG WHERE model_handle = @iris_model; 
    +----------------------------------------------------------------------------------------------------------------------+ 
    | model_explanation                                                                                                    | 
    +----------------------------------------------------------------------------------------------------------------------+ 
    | {"permutation_importance": {"petal width": 0.5926, "sepal width": 0.0, "petal length": 0.0423, "sepal length": 0.0}} | 
    +----------------------------------------------------------------------------------------------------------------------+
  • An ML_EXPLAIN example that stores the model in the model_object_catalog.

    mysql> SET @explain_option = JSON_OBJECT('model_explainer', 'shap', 'prediction_explainer', 'shap');  
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> CALL sys.ML_EXPLAIN('mlcorpus.iris_train', 'class', @iris_model, @explain_option);
    Query OK, 0 rows affected (11.51 sec)
    
    mysql> SELECT model_object, model_object_size 
              FROM ML_SCHEMA_user1.MODEL_CATALOG 
              WHERE model_handle=@iris_model;
    +--------------+-------------------+
    | model_object | model_object_size |
    +--------------+-------------------+
    | NULL         |            348954 |
    +--------------+-------------------+
    1 row in set (0.00 sec)
    
    
    mysql> SELECT model_metadata->>'$.format', model_metadata->>'$.chunks'
              FROM ML_SCHEMA_user1.MODEL_CATALOG 
              WHERE model_handle=@iris_model;
    +-----------------------------+-----------------------------+
    | model_metadata->>'$.format' | model_metadata->>'$.chunks' |
    +-----------------------------+-----------------------------+
    | HWMLv2.0                    | 1                           |
    +-----------------------------+-----------------------------+
    1 row in set (0.00 sec)
    
    
    mysql> SELECT chunk_id, length(model_object) 
              FROM ML_SCHEMA_user1.model_object_catalog 
              WHERE model_handle=@iris_model;
    +----------+----------------------+
    | chunk_id | length(model_object) |
    +----------+----------------------+
    |        1 |               348954 |
    +----------+----------------------+
    1 row in set (0.00 sec)