MySQL :: HeatWave User Guide :: 10.2.7 ML_EXPLAIN

HeatWave User Guide / ... / ML_EXPLAIN_ROW

10.2.7 ML_EXPLAIN_ROW

The ML_EXPLAIN_ROW routine generates explanations for one or more rows of unlabeled data. Invoke ML_EXPLAIN_ROW with a SELECT statement. It limits explanations to the 100 most relevant features.

A loaded and trained model with the appropriate prediction explainer is required to run ML_EXPLAIN_ROW. See Generate Prediction Explanations for a Row of Data.

ML_EXPLAIN_ROW does not support recommendation, anomaly detection and topic modeling models. A call with one of these models produces an error.

A call to ML_EXPLAIN_ROW can include columns that were not present during ML_TRAIN. A table can include extra columns, and still use the MySQL HeatWave AutoML model. This allows side by side comparisons of target column labels, ground truth, and explanations in the same table. ML_EXPLAIN_ROW ignores any extra columns, and appends them to the results.

ML_EXPLAIN_ROW does not support the following model types:

Forecasting
Recommendation
Anomaly detection
Anomaly detection for logs
Topic modeling

This topic has the following sections.

ML_EXPLAIN_ROW Syntax

Press CTRL+C to copy
 
mysql> SELECT sys.ML_EXPLAIN_ROW(input_data, model_handle, [options]);

options: {
 JSON_OBJECT("key","value"[,"key","value"] ...)
      "key","value": {
      ['prediction_explainer', {'permutation_importance'|'shap'}|NULL]
      }
}

Required ML_EXPLAIN_ROW Parameters

Set the following required parameters:

input_data: Define the data to generate explanations for. The column names must match the feature column names in the table used to train the model. You can define the input data in the following ways:

Specify a single row of data in JSON format:
```
Press CTRL+C to copy
 
mysql> SELECT sys.ML_EXPLAIN_ROW(JSON_OBJECT("column_name", value, "column_name", value, ...)', model_handle, options);
```
To run ML_EXPLAIN_ROW on multiple rows of data, specify the columns in JSON key-value format and select from an input table:
```
Press CTRL+C to copy
 
mysql> SELECT sys.ML_EXPLAIN_ROW(JSON_OBJECT("output_col_name", schema.`input_col_name`, output_col_name", schema.`input_col_name`, ...), model_handle, options) 
        FROM input_table_name LIMIT N;
```
model_handle: Specifies the model handle or a session variable containing the model handle. See Work with Model Handles.

ML_EXPLAIN_ROW Options

You can set the following option in JSON format as needed:

prediction_explainer: The name of the prediction explainer that you have trained for this model using ML_EXPLAIN.
- permutation_importance: The default prediction explainer.
- shap: The SHAP prediction explainer, which produces feature importance values based on Shapley values.

Syntax Examples

The following example generates a prediction explainer on a single row of data with the default Permutation Importance prediction explainer. The results include the ml_results field, which uses JSON format. Optionally, use \G to display the output in an easily readable format.

Press CTRL+C to copy
 
mysql> SET @row_input = JSON_OBJECT( 
          "age", 31, 
          "workclass", "Private", 
          "fnlwgt", 45781, 
          "education", "Masters", 
          "education-num", 14, 
          "marital-status", "Married-civ-spouse", 
          "occupation", "Prof-specialty", 
          "relationship", "Not-in-family", 
          "race", "White", 
          "sex", "Female", 
          "capital-gain", 14084, 
          "capital-loss", 2042, 
          "hours-per-week", 40, 
          "native-country", "India");
mysql> SELECT sys.ML_EXPLAIN_ROW(@row_input, @census_model, JSON_OBJECT('prediction_explainer', 'permutation_importance'))\G
*************************** 1. row ***************************
sys.ML_EXPLAIN_ROW(@row_input, @census_model, 
          JSON_OBJECT('prediction_explainer', 'permutation_importance')): 
          {
                "age": 31,
                "sex": "Female",
                "race": "White",
                "Notes": "capital-gain (14084) had the largest impact towards predicting >50K",
                "fnlwgt": 45781,
                "education": "Masters",
                "workclass": "Private",
                "Prediction": ">50K",
                "ml_results": {
                   "notes": "capital-gain (14084) had the largest impact towards predicting >50K",
                   "predictions": {
                       "revenue": ">50K"
                    },
                    "attributions": {
                        "age": 0.34,
                        "sex": 0,
                        "race": 0,
                        "fnlwgt": 0,
                        "education": 0,
                        "workclass": 0,
                        "occupation": 0,
                        "capital-gain": 0.97,
                        "capital-loss": 0,
                        "relationship": 0,
                        "education-num": 0.04,
                        "hours-per-week": 0,
                        "marital-status": 0
                    }
                },
                "occupation": "Prof-specialty",
                "capital-gain": 14084,
                "capital-loss": 2042,
                "relationship": "Not-in-family",
                "education-num": 14,
                "hours-per-week": 40,
                "marital-status": "Married-civ-spouse",
                "native-country": "India",
                "age_attribution": 0.34,
                "sex_attribution": 0,
                "race_attribution": 0,
                "fnlwgt_attribution": 0,
                "education_attribution": 0,
                "workclass_attribution": 0,
                "occupation_attribution": 0,
                "capital-gain_attribution": 0.97,
                "capital-loss_attribution": 0,
                "relationship_attribution": 0,
                "education-num_attribution": 0.04,
                "hours-per-week_attribution": 0,
                "marital-status_attribution": 0
          }
1 row in set (6.3072 sec)

The following example generates prediction explainers on two rows of the input table with the SHAP prediction explainer.

Press CTRL+C to copy
 
mysql> SELECT sys.ML_EXPLAIN_ROW(JSON_OBJECT(
	"age", census_train.`age`,
	"workclass", census_train.`workclass`,
	"fnlwgt", census_train.`fnlwgt`,
	"education", census_train.`education`,
	"education-num", census_train.`education-num`,
	"marital-status", census_train.`marital-status`,
	"occupation", census_train.`occupation`,
	"relationship", census_train.`relationship`,
	"race", census_train.`race`,
	"sex", census_train.`sex`,
	"capital-gain", census_train.`capital-gain`,
	"capital-loss", census_train.`capital-loss`,
	"hours-per-week", census_train.`hours-per-week`,
	"native-country", census_train.`native-country`),
	@census_model, JSON_OBJECT('prediction_explainer', 'shap'))FROM census_data.census_train LIMIT 2\G
*************************** 1. row ***************************
sys.ML_EXPLAIN_ROW(JSON_OBJECT( "age", census_train.`age`, "workclass", census_train.`workclass`, "fnlwgt", census_train.`fnlwgt`, "education", census_train.`education`, "education-num", census_train.`education-num`, "marital-status", census_train.`marita: {
    "age": 22,
    "sex": "Female",
    "race": "Black",
    "fnlwgt": 310380,
    "education": "HS-grad",
    "workclass": "Private",
    "Prediction": "<=50K",
    "ml_results": {
        "predictions": {
            "revenue": "<=50K"
        },
        "attributions": {
            "age_attribution": 0.055990096751945995,
            "sex_attribution": 0.011676016319165776,
            "race_attribution": 0.005258734090653583,
            "fnlwgt_attribution": 0,
            "education_attribution": 0,
            "workclass_attribution": 0,
            "occupation_attribution": 0.0036531218497025536,
            "capital-gain_attribution": 0.017052572967215754,
            "capital-loss_attribution": 0,
            "relationship_attribution": 0.03019321048408115,
            "education-num_attribution": 0.01749651048882997,
            "hours-per-week_attribution": 0.003671861337781857,
            "marital-status_attribution": 0.03869036669327783
        }
    },
    "occupation": "Adm-clerical",
    "capital-gain": 0,
    "capital-loss": 0,
    "relationship": "Unmarried",
    "education-num": 9,
    "hours-per-week": 40,
    "marital-status": "Never-married",
    "native-country": "United-States",
    "age_attribution": 0.0559900968,
    "sex_attribution": 0.0116760163,
    "race_attribution": 0.0052587341,
    "fnlwgt_attribution": 0,
    "education_attribution": 0,
    "workclass_attribution": 0,
    "occupation_attribution": 0.0036531218,
    "capital-gain_attribution": 0.017052573,
    "capital-loss_attribution": 0,
    "relationship_attribution": 0.0301932105,
    "education-num_attribution": 0.0174965105,
    "hours-per-week_attribution": 0.0036718613,
    "marital-status_attribution": 0.0386903667
}
*************************** 2. row ***************************
sys.ML_EXPLAIN_ROW(JSON_OBJECT( "age", census_train.`age`, "workclass", census_train.`workclass`, "fnlwgt", census_train.`fnlwgt`, "education", census_train.`education`, "education-num", census_train.`education-num`, "marital-status", census_train.`marita: {
    "age": 45,
    "sex": "Male",
    "race": "White",
    "fnlwgt": 182100,
    "education": "Bachelors",
    "workclass": "Local-gov",
    "Prediction": ">50K",
    "ml_results": {
        "predictions": {
            "revenue": ">50K"
        },
        "attributions": {
            "age_attribution": 0.10591945090998228,
            "sex_attribution": 0.013172526260700925,
            "race_attribution": 0.007606345008707882,
            "fnlwgt_attribution": 0.018097167152459265,
            "education_attribution": -0.007944704365873384,
            "workclass_attribution": 0.01615429281764716,
            "occupation_attribution": 0.08573874801531925,
            "capital-gain_attribution": -0.003364275424074914,
            "capital-loss_attribution": 0,
            "relationship_attribution": 0.099373669980131,
            "education-num_attribution": 0.1380689603088001,
            "hours-per-week_attribution": 0.0124334565747376,
            "marital-status_attribution": 0.0938256104928338
        }
    },
    "occupation": "Sales",
    "capital-gain": 0,
    "capital-loss": 0,
    "relationship": "Husband",
    "education-num": 13,
    "hours-per-week": 40,
    "marital-status": "Married-civ-spouse",
    "native-country": "United-States",
    "age_attribution": 0.1059194509,
    "sex_attribution": 0.0131725263,
    "race_attribution": 0.007606345,
    "fnlwgt_attribution": 0.0180971672,
    "education_attribution": -0.0079447044,
    "workclass_attribution": 0.0161542928,
    "occupation_attribution": 0.085738748,
    "capital-gain_attribution": -0.0033642754,
    "capital-loss_attribution": 0,
    "relationship_attribution": 0.09937367,
    "education-num_attribution": 0.1380689603,
    "hours-per-week_attribution": 0.0124334566,
    "marital-status_attribution": 0.0938256105
}
2 rows in set (5.5382 sec)

10.2.7 ML_EXPLAIN_ROW

ML_EXPLAIN_ROW Syntax

Required ML_EXPLAIN_ROW Parameters

ML_EXPLAIN_ROW Options

Syntax Examples

See Also