ML_EXPLAIN_TABLE explains predictions for an entire table of unlabeled data and saves the results to an output table.
ML_EXPLAIN_TABLE is a compute-intensive process. Limiting operations to batches of 10 to 100 rows by splitting large tables into smaller tables is recommended.
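For example, a large table can be split into smaller tables by primary key range and ML_EXPLAIN_TABLE called on each piece. The following is a minimal sketch; the ml_data.iris_large table, its id primary key, the batch table, and the output table are hypothetical names:

mysql> CREATE TABLE ml_data.iris_large_batch1 AS SELECT * FROM ml_data.iris_large WHERE id BETWEEN 1 AND 100;
mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.iris_large_batch1', @iris_model, 'ml_data.iris_explanations_batch1', JSON_OBJECT('prediction_explainer', 'permutation_importance'));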
ML_EXPLAIN_TABLE limits explanations to the 100 most relevant features.
A loaded model, trained with a prediction explainer, is required to run ML_EXPLAIN_TABLE. See Section 3.14.3, “Loading Models”, and Section 3.16.2, “ML_EXPLAIN”.
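For example, the following sketch loads a trained model and trains the default prediction explainer before ML_EXPLAIN_TABLE is called; the ml_data.iris_train training table, its class target column, and the @iris_model session variable are assumed names, not requirements:

mysql> CALL sys.ML_MODEL_LOAD(@iris_model, NULL);
mysql> CALL sys.ML_EXPLAIN('ml_data.iris_train', 'class', @iris_model, JSON_OBJECT('prediction_explainer', 'permutation_importance'));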
The returned table includes a primary key:

- If the input table has a primary key, the output table has the same primary key.

- If the input table does not have a primary key, the output table has a new auto-incrementing primary key column.

  As of MySQL 8.4.1, the name of the new primary key column is _4aad19ca6e_pk_id. The input table must not have a column named _4aad19ca6e_pk_id that is not a primary key.

  Before MySQL 8.4.1, the name of the new primary key column is _id. The input table must not have a column named _id that is not a primary key.
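To check whether an input table already defines a primary key, and therefore which of the two cases above applies, a query such as the following sketch can be used (ml_data.iris_test is an assumed table name):

mysql> SHOW KEYS FROM ml_data.iris_test WHERE Key_name = 'PRIMARY';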
ML_EXPLAIN_TABLE does not support anomaly detection; a call with an anomaly detection model produces an error.

ML_EXPLAIN_TABLE does not support recommendation models; a call with a recommendation model produces an error.
A call to ML_EXPLAIN_TABLE can include columns that were not present during ML_TRAIN. A table can include extra columns and still use the HeatWave AutoML model, which allows side-by-side comparisons of target column labels, ground truth, and explanations in the same table. ML_EXPLAIN_TABLE ignores any extra columns and appends them to the results.
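For example, an input table that keeps its ground-truth target column can be passed to ML_EXPLAIN_TABLE unchanged; the label values are carried over to the output table next to the generated Prediction and Notes columns. In the following sketch, ml_data.iris_test_labeled is a hypothetical copy of the test table that retains the class column:

mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.iris_test_labeled', @iris_model, 'ml_data.iris_explanations_labeled', JSON_OBJECT('prediction_explainer', 'permutation_importance'));
mysql> SELECT `class`, `Prediction`, `Notes` FROM ml_data.iris_explanations_labeled LIMIT 5;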
mysql> CALL sys.ML_EXPLAIN_TABLE(table_name, model_handle, output_table_name, [options]);

options: {
    JSON_OBJECT("key","value"[,"key","value"] ...)
        "key","value": {
            ['prediction_explainer', {'permutation_importance'|'shap'}|NULL]
            ['batch_size', 'N']
        }
}
ML_EXPLAIN_TABLE parameters:

- table_name: Specifies the fully qualified name of the input table (schema_name.table_name). The input table should contain the same feature columns as the table used to train the model, but no target column.

- model_handle: Specifies the model handle or a session variable containing the model handle.

- output_table_name: Specifies the table where explanation data is stored. The table is created if it does not exist. A fully qualified table name must be specified (schema_name.table_name). If the table already exists, an error is returned.

- options: A set of options in JSON format. The available options are:

  - prediction_explainer: The name of the prediction explainer that you have trained for this model using ML_EXPLAIN. Valid values are:

    - permutation_importance: The default prediction explainer.

    - shap: The SHAP prediction explainer, which produces global feature importance values based on Shapley values.

  - batch_size: The size of each batch. 1 ≤ batch_size ≤ 100. The default is 100, which provides the best results. An explicit batch size can be set as shown in the sketch after this list.
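The following is a minimal sketch of an explicit batch_size setting; it reuses hypothetical table and model names and assumes batch_size accepts an integer literal:

mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.iris_test', @iris_model, 'ml_data.iris_explanations_batched', JSON_OBJECT('prediction_explainer', 'permutation_importance', 'batch_size', 10));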
The following example generates explanations for a table of data with the default Permutation Importance prediction explainer. The ML_EXPLAIN_TABLE call specifies the fully qualified name of the table to generate explanations for, the session variable containing the model handle, and the fully qualified output table name.

mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.iris_test', @iris_model, 'ml_data.iris_explanations', JSON_OBJECT('prediction_explainer', 'permutation_importance'));
To view ML_EXPLAIN_TABLE results, query the output table. The SELECT statement retrieves explanation data from the output table. The table includes the primary key, _4aad19ca6e_pk_id, and the ml_results column, which uses JSON format:

mysql> SELECT * FROM ml_data.iris_explanations LIMIT 5;
+-------------------+--------------+-------------+--------------+-------------+-----------------+-----------------+-------+--------------------------+-------------------------+------------+
| _4aad19ca6e_pk_id | sepal length | sepal width | petal length | petal width | class           | Prediction      | Notes | petal length_attribution | petal width_attribution | ml_results |
+-------------------+--------------+-------------+--------------+-------------+-----------------+-----------------+-------+--------------------------+-------------------------+------------+
| 1 | 7.3 | 2.9 | 6.3 | 1.8 | Iris-virginica | Iris-virginica | petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica | 0.57 | 0.73 | {'attributions': {'petal length': 0.57, 'petal width': 0.73}, 'predictions': {'class': 'Iris-virginica'}, 'notes': 'petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica'} |
| 2 | 6.1 | 2.9 | 4.7 | 1.4 | Iris-versicolor | Iris-versicolor | petal width (1.399999976158142) had the largest impact towards predicting Iris-versicolor | 0.14 | 0.6 | {'attributions': {'petal length': 0.14, 'petal width': 0.6}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.399999976158142) had the largest impact towards predicting Iris-versicolor'} |
| 3 | 6.3 | 2.8 | 5.1 | 1.5 | Iris-virginica | Iris-versicolor | petal width (1.5) had the largest impact towards predicting Iris-versicolor, whereas petal length (5.099999904632568) contributed the most against predicting Iris-versicolor | -0.25 | 0.31 | {'attributions': {'petal length': -0.25, 'petal width': 0.31}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.5) had the largest impact towards predicting Iris-versicolor, whereas petal length (5.099999904632568) contributed the most against predicting Iris-versicolor'} |
| 4 | 6.3 | 3.3 | 4.7 | 1.6 | Iris-versicolor | Iris-versicolor | petal width (1.600000023841858) had the largest impact towards predicting Iris-versicolor | 0.14 | 0.58 | {'attributions': {'petal length': 0.14, 'petal width': 0.58}, 'predictions': {'class': 'Iris-versicolor'}, 'notes': 'petal width (1.600000023841858) had the largest impact towards predicting Iris-versicolor'} |
| 5 | 6.1 | 3 | 4.9 | 1.8 | Iris-virginica | Iris-virginica | petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica | 0.38 | 0.61 | {'attributions': {'petal length': 0.38, 'petal width': 0.61}, 'predictions': {'class': 'Iris-virginica'}, 'notes': 'petal width (1.7999999523162842) had the largest impact towards predicting Iris-virginica'} |
+-------------------+--------------+-------------+--------------+-------------+-----------------+-----------------+-------+--------------------------+-------------------------+------------+
5 rows in set (0.00 sec)
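Because the ground-truth class column and the generated Prediction column sit side by side in the output table, ordinary SQL can be used to compare them. The following sketch lists the rows where the prediction disagrees with the label, such as row 3 above:

mysql> SELECT `_4aad19ca6e_pk_id`, `class`, `Prediction`, `Notes` FROM ml_data.iris_explanations WHERE `class` <> `Prediction`;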
Run ML_EXPLAIN_TABLE with the SHAP prediction explainer:

mysql> CALL sys.ML_EXPLAIN_TABLE('ml_data.`iris_test_temp`', @model, 'ml_data.`iris_explanations`', JSON_OBJECT('prediction_explainer', 'shap'));
See also: Section 3.15, “Progress tracking”.