Related Documentation Download this Manual
PDF (US Ltr) - 1.8Mb
PDF (A4) - 1.8Mb


HeatWave User Guide  /  ...  /  ML_EXPLAIN_TABLE

3.15.7 ML_EXPLAIN_TABLE

ML_EXPLAIN_TABLE explains predictions for an entire table of unlabeled data and saves results to an output table.

ML_EXPLAIN_TABLE is a compute intensive process. Limiting operations to batches of 10 to 100 rows by splitting large tables into smaller tables is recommended. MySQL 8.2.0 adds batch processing with the batch_size option.

ML_EXPLAIN_TABLE limits explanations to the 100 most relevant features.

A loaded model, trained with a prediction explainer, is required to run ML_EXPLAIN_TABLE. See Section 3.13.3, “Loading Models” and Section 3.15.2, “ML_EXPLAIN”.

As of MySQL 8.0.32, the returned table includes a primary key:

  • If the input table has a primary key, the output table will have the same primary key.

  • If the input table does not have a primary key, the output table will have a new primary key named _id that auto increments.

The input table must not have a column with the name _id that is not a primary key.

MySQL 8.0.32 introduces anomaly detection. ML_EXPLAIN_TABLE does not support anomaly detection, and a call with an anomaly detection model will produce an error.

MySQL 8.0.32 allows a call to ML_EXPLAIN_TABLE to include columns that were not present during ML_TRAIN. A table can include extra columns, and still use the HeatWave AutoML model. This allows side by side comparisons of target column labels, ground truth, and explanations in the same table. ML_EXPLAIN_TABLE ignores any extra columns, and appends them to the results.

MySQL 8.0.33 introduces recommendation models. ML_EXPLAIN_TABLE does not support recommendation models, and a call with a recommendation model will produce an error.

MySQL 8.2.0 adds batch processing with the batch_size option.

ML_EXPLAIN_TABLE Syntax

MySQL 8.2.0 adds an option to support batch processing.

mysql> CALL sys.ML_EXPLAIN_TABLE(table_name, model_handle, output_table_name, [options]);
 
options: {
    JSON_OBJECT('key','value'[,'key','value'] ...)
        'key','value':
        |'prediction_explainer', {'permutation_importance'|'shap'}|NULL
        |'batch_size', 'N'
}

Before MySQL 8.2.0:

mysql> CALL sys.ML_EXPLAIN_TABLE(table_name, model_handle, output_table_name, [options]);
 
options: {
    JSON_OBJECT('key','value'[,'key','value'] ...)
        'key','value':
        |'prediction_explainer', {'permutation_importance'|'shap'}|NULL
}

ML_EXPLAIN_TABLE parameters:

  • table_name: Specifies the fully qualified name of the input table (schema_name.table_name). The input table should contain the same feature columns as the table used to train the model but no target column.

  • model_handle: Specifies the model handle or a session variable containing the model handle.

  • output_table_name: Specifies the table where explanation data is stored. The table is created if it does not exist. A fully qualified table name must be specified (schema_name.table_name). If the table already exists, an error is returned.

  • options: A set of options in JSON format. The available options are:

    • prediction_explainer: The name of the prediction explainer that you have trained for this model using ML_EXPLAIN. Valid values are:

      • permutation_importance: The default prediction explainer.

      • shap: The SHAP prediction explainer, which produces global feature importance values based on Shapley values.

  • batch_size: The size of each batch. 1 ≤ batch_size ≤ 100. The default is 100, and this provides the best results.

Syntax Examples

  • The following example generates explanations for a table of data with the default Permutation Importance prediction explainer. The ML_EXPLAIN_TABLE call specifies the fully qualified name of the table to generate explanations for, the session variable containing the model handle, and the fully qualified output table name.

    mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.iris_test', @iris_model, 
              'ml_data.iris_explanations', 
              JSON_OBJECT('prediction_explainer', 'permutation_importance'));

    To view ML_EXPLAIN_TABLE results, query the output table. The SELECT statement retrieves explanation data from the output table. The table includes the primary key, _id, and the ml_results column, which uses JSON format:

    mysql> SELECT * FROM ml_data.iris_explanations LIMIT 5;
    +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | _id | sepal length | sepal width | petal length | petal width | class           | Prediction      | Notes                                                                                                                                                                         | petal length_attribution | petal width_attribution | ml_results                                                                                                                                                                                                                                                                                            |
    +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |   1 |          7.3 |         2.9 |          6.3 |         1.8 | Iris-virginica  | Iris-virginica  | petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica                                                                                     |                     0.57 |                    0.73 | {'attributions': {'petal length': 0.57, 'petal width': 0.73}, 'predictions': {'class': 'Iris-virginica'}, 'notes': 'petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica'}                                                                                       |
    |   2 |          6.1 |         2.9 |          4.7 |         1.4 | Iris-versicolor | Iris-versicolor | petal width (1.399999976158142) had the largest impact towards predicting Iris-versicolor                                                                                     |                     0.14 |                     0.6 | {'attributions': {'petal length': 0.14, 'petal width': 0.6}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.399999976158142) had the largest impact towards predicting Iris-versicolor'}                                                                                       |
    |   3 |          6.3 |         2.8 |          5.1 |         1.5 | Iris-virginica  | Iris-versicolor | petal width (1.5) had the largest impact towards predicting Iris-versicolor, whereas petal length (5.099999904632568) contributed the most against predicting Iris-versicolor |                    -0.25 |                    0.31 | {'attributions': {'petal length': -0.25, 'petal width': 0.31}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.5) had the largest impact towards predicting Iris-versicolor, whereas petal length (5.099999904632568) contributed the most against predicting Iris-versicolor'} |
    |   4 |          6.3 |         3.3 |          4.7 |         1.6 | Iris-versicolor | Iris-versicolor | petal width (1.600000023841858) had the largest impact towards predicting Iris-versicolor                                                                                     |                     0.14 |                    0.58 | {'attributions': {'petal length': 0.14, 'petal width': 0.58}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.600000023841858) had the largest impact towards predicting Iris-versicolor'}                                                                                      |
    |   5 |          6.1 |           3 |          4.9 |         1.8 | Iris-virginica  | Iris-virginica  | petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica                                                                                     |                     0.38 |                    0.61 | {'attributions': {'petal length': 0.38, 'petal width': 0.61}, 'predictions': {'class': 'Iris-virginica'}, 'notes': 'petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica'}                                                                                       |
    +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    5 rows in set (0.00 sec)

    Before MySQL 8.0.32, the output table does not include the ml_results column:

    mysql> SELECT * FROM ml_data.iris_explanations LIMIT 3;
    *************************** 1. row ***************************
                sepal length: 7.3
                 sepal width: 2.9
                petal length: 6.3
                 petal width: 1.8
                  Prediction: Iris-virginica
    petal length_attribution: 0.57
     petal width_attribution: 0.73
    *************************** 2. row ***************************
                sepal length: 6.1
                 sepal width: 2.9
                petal length: 4.7
                 petal width: 1.4
                  Prediction: Iris-versicolor
    petal length_attribution: 0.14
     petal width_attribution: 0.6
    *************************** 3. row ***************************
                sepal length: 6.3
                 sepal width: 2.8
                petal length: 5.1
                 petal width: 1.5
                  Prediction: Iris-virginica
    petal length_attribution: -0.25
     petal width_attribution: 0.31
    3 rows in set (0.0006 sec)
  • Run ML_EXPLAIN_TABLE with the SHAP prediction explainer:

    mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.`iris_test_temp`', @model, 
              'ml_data.`iris_explanations`', 
              JSON_OBJECT('prediction_explainer', 'shap'));

See also: Section 3.14, “Progress tracking”.