MySQL :: MySQL AI 9.4 :: 4.6.2.4 Query Model Explanation and Generate Prediction Explanations for a Regression Model

MySQL AI 9.4 / ... / Query Model Explanation and Generate Prediction Explanations for a Regression Model

4.6.2.4 Query Model Explanation and Generate Prediction Explanations for a Regression Model

After training a regression model, you can query the default model explanation or query new model explanations. You can also generate prediction explanations. Explanations help you understand which features had the most influence on generating predictions.

Feature importance is presented as an attribution value ranging from -1 to 1. A positive value indicates that a feature contributed toward the prediction. A negative value indicates that the feature contributes positively towards one of the other possible predictions.

Before You Begin

Complete the following tasks:

Generating the Model Explanation

After training a model, you can query the default model explanation with the Permutation Importance explainer.

To generate explanations for other model explainers, see Generate Model Explanations and ML_EXPLAIN.

Query the model_explanation column from the model catalog and define the model handle previously created. Update user1 with your own user name. Use JSON_PRETTY to view the output in an easily readable format.

mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG 
                     WHERE model_handle='regression_use_case';
+------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation)                                                                                         |
+------------------------------------------------------------------------------------------------------------------------+
| {
  "permutation_importance": {
    "id": 0.0257,
    "state": 0.0278,
    "address": 0.0,
    "house_size": 2.3762
  }
} |
+------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)

Feature importance values display for each column.

Generating Prediction Explanations for a Table

After training a model, you can generate a table of prediction explanations on the house_price_testing dataset by using the default Permutation Importance prediction explainer.

To generate explanations for other model explainers, see Generate Prediction Explanations and ML_EXPLAIN_TABLE.

If not already done, load the model. You can use the session variable for the model that is valid for the duration of the connection. Alternatively, you can use the model handle previously set. For the option to set the user name, you can set it to NULL.

The following example uses the session variable.
```
mysql> CALL sys.ML_MODEL_LOAD(@model, NULL);
```
The following example uses the model handle.
```
mysql> CALL sys.ML_MODEL_LOAD('regression_use_case', NULL);
```
Use the ML_EXPLAIN_TABLE routine to generate explanations for predictions made in the test dataset.
```
mysql> CALL sys.ML_EXPLAIN_TABLE(table_name, model_handle, output_table_name, [options]);
```
Replace table_name, model_handle, and output_table_name with your own values. Add options as needed.

You have the option to specify the input table and output table as the same table if specific conditions are met. See Input Tables and Output Tables to learn more.

The following example runs ML_EXPLAIN_TABLE on the testing dataset previously created.
```
mysql> CALL sys.ML_EXPLAIN_TABLE('regression_data.house_price_testing', 'regression_use_case', 'regression_data.regression_explanations', 
                                  JSON_OBJECT('prediction_explainer', 'permutation_importance'));
```
Where:
- regression_data.house_price_testing is the fully qualified name of the test dataset.
- regression_use_case is the model handle for the trained table.
- regression_data.regression_explanations is the fully qualified name of the output table with explanations.
- permutation_importance is the selected prediction explainer to use to generate explanations.

Query Notes and ml_results from the output table to review which column contributed the most against or had the largest impact towards the prediction. You can also review individual attribution values for each column. Use \G to view the output in an easily readable format.

mysql> SELECT Notes, ml_results FROM regression_data.regression_explanations\G
*************************** 1. row ***************************
     Notes: house_size (1400) increased the value the model predicted the most, whereas state (Nevada) reduced the value the model predicted the most
ml_results: {"attributions": {"house_size": 101328.28, "state": -1037.94, "id": -300.23}, "predictions": {"price": 534371.5625}, "notes": "house_size (1400) increased the value the model predicted the most, whereas state (Nevada) reduced the value the model predicted the most"}
*************************** 2. row ***************************
     Notes: house_size (1900) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 235996.83, "state": 16140.48, "id": 0.06}, "predictions": {"price": 669040.125}, "notes": "house_size (1900) increased the value the model predicted the most"}
*************************** 3. row ***************************
     Notes: house_size (1600) increased the value the model predicted the most, whereas state (Colorado) reduced the value the model predicted the most
ml_results: {"attributions": {"house_size": 79633.12, "state": -1220.23, "id": 5602.78}, "predictions": {"price": 512676.40625}, "notes": "house_size (1600) increased the value the model predicted the most, whereas state (Colorado) reduced the value the model predicted the most"}
*************************** 4. row ***************************
     Notes: house_size (2200) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 361015.72, "state": 9903.62, "id": 12578.75}, "predictions": {"price": 794059.0}, "notes": "house_size (2200) increased the value the model predicted the most"}
*************************** 5. row ***************************
     Notes: house_size (1300) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 31384.31, "state": 226.31, "id": 30184.16}, "predictions": {"price": 489206.0}, "notes": "house_size (1300) increased the value the model predicted the most"}
*************************** 6. row ***************************
     Notes: house_size (1700) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 80747.0, "state": 7330.35, "id": 24427.78}, "predictions": {"price": 534239.8125}, "notes": "house_size (1700) increased the value the model predicted the most"}
*************************** 7. row ***************************
     Notes: house_size (1500) increased the value the model predicted the most, whereas state (Washington) reduced the value the model predicted the most
ml_results: {"attributions": {"house_size": 79051.12, "state": -1316.08, "id": 28659.66}, "predictions": {"price": 532543.9375}, "notes": "house_size (1500) increased the value the model predicted the most, whereas state (Washington) reduced the value the model predicted the most"}
*************************** 8. row ***************************
     Notes: house_size (1800) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 245256.83, "state": 8604.06, "id": 12578.75}, "predictions": {"price": 698539.9375}, "notes": "house_size (1800) increased the value the model predicted the most"}
*************************** 9. row ***************************
     Notes: id (9) increased the value the model predicted the most, whereas state (Illinois) reduced the value the model predicted the most
ml_results: {"attributions": {"house_size": -0.03, "state": -0.03, "id": 21232.22}, "predictions": {"price": 454275.5}, "notes": "id (9) increased the value the model predicted the most, whereas state (Illinois) reduced the value the model predicted the most"}
*************************** 10. row ***************************
     Notes: house_size (2100) increased the value the model predicted the most
ml_results: {"attributions": {"house_size": 339783.47, "state": 10981.75, "id": 12411.04}, "predictions": {"price": 794059.0}, "notes": "house_size (2100) increased the value the model predicted the most"}

To generate prediction explanations for one or more rows of data, see Generate Prediction Explanations for a Row of Data.

What's Next

Learn how to Score a Regression Model.

PREV HOME UP NEXT