ML_PREDICT_ROW
generates
predictions for one or more rows of unlabeled data specified
in JSON
format. Invoke
ML_PREDICT_ROW
with a
SELECT
statement.
ML_PREDICT_ROW
requires a
loaded model to run. See
Section 3.14.3, “Loading Models”.
ML_PREDICT_ROW
supports data
drift detection for classification and regression models with
the following:
The
options
parameter includes theadditional_details
boolean value.The
ml_results
column includes thedrift
JSON object literal.
See: Section 3.14.11, “Data Drift Detection”.
mysql> SELECT sys.ML_PREDICT_ROW(input_data, model_handle), [options]);
options: {
JSON_OBJECT("key","value"[,"key","value"] ...)
"key","value": {
['threshold', 'N']
['topk', 'N']
['recommend', {'ratings'|'items'|'users'|'users_to_items'|'items_to_users'|'items_to_items'|'users_to_users'}|NULL]
['remove_seen', {'true'|'false'}]
['additional_details', {'true'|'false'}]
}
}
A call to ML_PREDICT_ROW
can
include columns that were not present during
ML_TRAIN
. A table can include
extra columns, and still use the HeatWave AutoML model. This
allows side by side comparisons of target column labels,
ground truth, and predictions in the same table.
ML_PREDICT_ROW
ignores any
extra columns, and appends them to the results.
ML_PREDICT_ROW
parameters:
-
input_data
: Specifies the data to generate predictions for.Specify a single row of data in
JSON
format:mysql> SELECT sys.ML_PREDICT_ROW(JSON_OBJECT("column_name", value, "column_name", value, ...), model_handle, options);
To run
ML_PREDICT_ROW
on multiple rows of data, specify the columns as key-value pairs inJSON
format and select from a table:mysql> SELECT sys.ML_PREDICT_ROW(JSON_OBJECT("output_col_name", schema.`input_col_name`, "output_col_name", schema.`input_col_name`, ...), model_handle, options) FROM input_table_name LIMIT N;
model_handle
: Specifies the model handle or a session variable that contains the model handle.-
options
: A set of options inJSON
format.This parameter only supports the
recommendation
andanomaly_detection
tasks. For all other tasks, set this parameter toNULL
.threshold
: The optional threshold for use with theanomaly_detection
task to convert anomaly scores to1
: an anomaly or0
: normal. 0 <threshold
< 1. The default value is (1 -contamination
)-th percentile of all the anomaly scores.topk
: Use with therecommendation
task to specify the number of recommendations to provide. A positive integer. The default is3
.-
recommend
: Use with therecommendation
task to specify what to recommend. Permitted values are:-
ratings
: Use this option to predict ratings. This is the default value.The target column is
prediction
, and the values arefloat
.The input table must contain at least two columns with the same names as the user column and item column from the training model.
-
items
: Use this option to recommend items for users.The target column is
item_recommendation
, and the values are:JSON_OBJECT("column_item_id_name", JSON_ARRAY("item_1", ... , "item_k"), "column_rating_name" , JSON_ARRAY(rating_1, ..., rating_k))
The input table must at least contain a column with the same name as the user column from the training model.
-
users
: Use this option to recommend users for items.The target column is
user_recommendation
, and the values are:JSON_OBJECT("column_user_id_name", JSON_ARRAY("user_1", ... , "user_k"), "column_rating_name" , JSON_ARRAY(rating_1, ..., rating_k))
The input table must at least contain a column with the same name as the item column from the training model.
users_to_items
: This is the same asitems
.items_to_users
: This is the same asusers
.-
items_to_items
: Use this option to recommend similar items for items.The target column is
item_recommendation
, and the values are:JSON_OBJECT("column_item_id_name", JSON_ARRAY("item_1", ... , "item_k"))
The input table must at least contain a column with the same name as the item column from the training model.
-
users_to_users
: Use this option to recommend similar users for users.The target column is
user_recommendation
, and the values are:JSON_OBJECT("column_user_id_name", JSON_ARRAY("user_1", ... , "user_k"))
The input table must at least contain a column with the same name as the user column from the training model.
-
remove_seen
: If the input table overlaps with the training table, andremove_seen
istrue
, then the model will not repeat existing interactions. The default istrue
. Setremove_seen
tofalse
to repeat existing interactions from the training table.additional_details
: Set totrue
forml_results
to include the JSON object literal,drift
.
-
To run
ML_PREDICT_ROW
on a single row of data use a select statement. The results include theml_results
field, which usesJSON
format:mysql> SELECT sys.ML_PREDICT_ROW(JSON_OBJECT("sepal length", 7.3, "sepal width", 2.9, "petal length", 6.3, "petal width", 1.8), @iris_model, NULL); +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"sepal length": 7.3, "sepal width": 2.9, "petal length": 6.3, "petal width": 1.8}', @iris_model, NULL) | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | {"Prediction": "Iris-virginica", "ml_results": "{'predictions': {'class': 'Iris-virginica'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 0.13, 'Iris-virginica': 0.87}}", "petal width": 1.8, "sepal width": 2.9, "petal length": 6.3, "sepal length": 7.3} | +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (1.12 sec)
-
To run
ML_PREDICT_ROW
on five rows of data selected from an input table:mysql> SELECT sys.ML_PREDICT_ROW(JSON_OBJECT("sepal length", iris_test.`sepal length`, "sepal width", iris_test.`sepal width`, "petal length", iris_test.`petal length`, "petal width", iris_test.`petal width`), @iris_model, NULL) FROM ml_data.iris_test LIMIT 5;
See also: