After running the ML_TRAIN routine, use the ML_EXPLAIN routine to train model explainers for AutoML. By default, the ML_TRAIN routine trains the Permutation Importance model explainer.
Review the following:
Explanations help you understand which features have the most influence on a prediction. Feature importance is presented as a value ranging from -1 to 1. A positive value indicates that a feature contributed toward the prediction. A negative value indicates that the feature contributed toward a different prediction. For example, if a feature in a loan approval model with two possible predictions ('approve' and 'reject') has a negative value for an 'approve' prediction, that feature would have a positive value for a 'reject' prediction. A value of 0 or near 0 indicates that the feature value has no impact on the prediction to which it applies.
Model explainers are used when you run the ML_EXPLAIN routine to explain what the model learned from the training dataset. The model explainer provides a list of feature importance values to show which features the model considered important based on the entire training dataset. The ML_EXPLAIN routine can train these model explainers:
- The Permutation Importance model explainer, specified as permutation_importance, is the default model explainer. ML_TRAIN generates this model explainer when it runs.
- The Partial Dependence model explainer, specified as partial_dependence, shows how changing the values of one or more columns changes the value that the model predicts. When you train this model explainer, you need to specify some additional options, as shown in the sketch after this list. See ML_EXPLAIN to learn more.
- The SHAP model explainer, specified as shap, produces feature importance values based on Shapley values.
- The Fast SHAP model explainer, specified as fast_shap, is a subsampling version of the SHAP model explainer, which usually has a faster runtime.
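For example, a Partial Dependence run might look like the following sketch. It uses placeholder names throughout, and the columns_to_explain and target_value option names are assumptions here, so confirm the exact options in ML_EXPLAIN before using them.
mysql> -- Illustrative sketch only: placeholder names; confirm option names in ML_EXPLAIN
mysql> CALL sys.ML_EXPLAIN('schema_name.table_name', 'target_column_name', model_handle, JSON_OBJECT('model_explainer', 'partial_dependence', 'columns_to_explain', JSON_ARRAY('column_name'), 'target_value', 'class_label'));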
The model explanation is stored in the model catalog along with the machine learning model in the model_explanation column. See The Model Catalog. If you run ML_EXPLAIN again for the same model handle and model explainer, the field is overwritten with the new result.
You cannot generate model explanations for the following model types:
- Forecasting
- Recommendation
- Anomaly detection
- Anomaly detection for logs
- Topic modeling
Before running ML_EXPLAIN, you must train and then load the model you want to use.
- The following example trains a model on a dataset with the classification machine learning task.
mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);
- The following example loads the trained model.
mysql> CALL sys.ML_MODEL_LOAD(@census_model, NULL);
For more information about training and loading models, see Train a Model and Load a Model.
After training and loading the model, you can generate model explanations. For option and parameter descriptions, see ML_EXPLAIN.
After training and loading a model, you can retrieve the default model explanation generated by the permutation_importance explainer from the model catalog. See The Model Catalog.
mysql> SELECT column_name FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=model_handle;
The following example retrieves the model_explanation column from the model catalog for the previously trained model. The JSON_PRETTY function displays the output in an easily readable format.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
"permutation_importance": {
"age": 0.0292,
"sex": 0.0023,
"race": 0.0019,
"fnlwgt": 0.0038,
"education": 0.0008,
"workclass": 0.0068,
"occupation": 0.0223,
"capital-gain": 0.0479,
"capital-loss": 0.0117,
"relationship": 0.0234,
"education-num": 0.0352,
"hours-per-week": 0.0148,
"marital-status": 0.024,
"native-country": 0.0
}
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
Replace user1 and @census_model with your own user name and session variable.
The explanation displays the permutation importance value for each column.
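If you only need the importance value for a single feature, you can extract it from the stored JSON directly instead of reading the whole document. A minimal sketch, assuming the same user name and session variable as above and using the capital-gain column purely as an example:
mysql> SELECT JSON_EXTRACT(model_explanation, '$.permutation_importance."capital-gain"') FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;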
To generate a model explanation, run the ML_EXPLAIN routine.
mysql> CALL sys.ML_EXPLAIN('table_name', 'target_column_name', model_handle, [options]);
The following example generates a model explanation on the trained and loaded model with the shap model explainer.
mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('model_explainer', 'shap'));
Where:
- census_data.census_train is the fully qualified name of the table that contains the training dataset (schema_name.table_name).
- revenue is the name of the target column, which contains ground truth values.
- @census_model is the session variable for the trained model.
- model_explainer is set to shap for the SHAP model explainer.
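The same call pattern applies to the other explainers. For example, the following sketch selects the Fast SHAP model explainer by setting model_explainer to fast_shap; it reuses the table, target column, and session variable from the previous example for illustration.
mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('model_explainer', 'fast_shap'));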
After running ML_EXPLAIN, you can view the model explanation in the model catalog. See The Model Catalog. The following example retrieves the model explanation generated by the previous command. It shows the importance value that the shap explainer computed for each column.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
"shap": {
"age": 0.0467,
"sex": 0.033,
"race": 0.0155,
"fnlwgt": 0.0185,
"education": 0.016,
"workclass": 0.0255,
"occupation": 0.0001,
"capital-gain": 0.0217,
"capital-loss": 0.0001,
"relationship": 0.0426,
"education-num": 0.0186,
"hours-per-week": 0.0148,
"marital-status": 0.024,
"native-country": 0.0
},
"permutation_importance": {
"age": -0.0057,
"sex": 0.0002,
"race": 0.0001,
"fnlwgt": 0.0103,
"education": 0.0108,
"workclass": 0.0189,
"occupation": 0.0,
"capital-gain": 0.0304,
"capital-loss": 0.0,
"relationship": 0.0195,
"education-num": 0.0152,
"hours-per-week": 0.0235,
"marital-status": 0.0099,
"native-country": 0.0
}
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
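Because both explanations are stored in the same model_explanation document, you can also compare them for a single column with MySQL's JSON functions. A minimal sketch, assuming the user name and session variable from the earlier examples:
mysql> SELECT JSON_EXTRACT(model_explanation, '$.shap."capital-gain"') AS shap_value, JSON_EXTRACT(model_explanation, '$.permutation_importance."capital-gain"') AS permutation_value FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;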
Review ML_EXPLAIN for parameter descriptions and options.
Learn how to Generate Prediction Explanations.
Learn more about The Model Catalog.