ML_PREDICT_TABLE
generates
predictions for an entire table of unlabeled data and saves
the results to an output table. HeatWave AutoML performs the
predictions in parallel.
ML_PREDICT_TABLE
is a compute
intensive process. Limiting operations to batches of 10 to 100
rows by splitting large tables into smaller tables is
recommended.
A loaded model is required to run
ML_PREDICT_TABLE
. See
Section 3.12.3, “Loading Models”.
As of MySQL 8.0.32, the returned table includes a primary key:
If the input table has a primary key, the output table will have the same primary key.
If the input table does not have a primary key, the output table will have a new primary key named
_id
that auto increments.
The input table must not have a column with the name
_id
that is not a primary key.
MySQL 8.1.0 added more options to support the
recommendation
task.
mysql> CALL sys.ML_PREDICT_TABLE(table_name, model_handle, output_table_name), [options]);
options: {
JSON_OBJECT('key','value'[,'key','value'] ...)
'key','value':
|'threshold', 'N'
|'topk', 'N'
|'recommend', {'ratings'|'items'|'users'|'users_to_items'|'items_to_users'|'items_to_items'|'users_to_users'}|NULL
}
MySQL 8.0.33 added options that support the
recommendation
task.
mysql> CALL sys.ML_PREDICT_TABLE(table_name, model_handle, output_table_name), [options]);
options: {
JSON_OBJECT('key','value'[,'key','value'] ...)
'key','value':
|'threshold', 'N'
|'topk', 'N'
|'recommend', {'ratings'|'items'|'users'}|NULL
}
MySQL 8.0.32 added an options parameter in
JSON
format that supports the
anomaly_detection
task. For all other
tasks, set this parameter to NULL
.
MySQL 8.0.32 allows a call to
ML_PREDICT_TABLE
to include
columns that were not present during
ML_TRAIN
. A table can include
extra columns, and still use the HeatWave AutoML model. This allows
side by side comparisons of target column labels, ground
truth, and predictions in the same table.
ML_PREDICT_TABLE
ignores any
extra columns, and appends them to the results.
mysql> CALL sys.ML_PREDICT_TABLE(table_name, model_handle, output_table_name), [options]);
options: {
JSON_OBJECT('key','value'[,'key','value'] ...)
'key','value':
|'threshold', 'N'
|'topk', 'N'
}
Before MySQL 8.0.32:
mysql> CALL sys.ML_PREDICT_TABLE(table_name, model_handle, output_table_name);
ML_PREDICT_TABLE
parameters:
table_name
: Specifies the fully qualified name of the input table (schema_name.table_name
). The input table should contain the same feature columns as the training dataset but no target column.model_handle
: Specifies the model handle or a session variable containing the model handleoutput_table_name
: Specifies the table where predictions are stored. The table is created if it does not exist. A fully qualified table name must be specified (schema_name.table_name
). If the table already exists, an error is returned.-
options
: A set of options inJSON
format.As of MySQL 8.0.33, this parameter supports the
recommendation
andanomaly_detection
tasks.As of MySQL 8.0.32, this parameter only supports the
anomaly_detection
task.For all other tasks, set this parameter to
NULL
. Before MySQL 8.0.32, ignore this parameter.threshold
: The optional threshold for use with theanomaly_detection
task to convert anomaly scores to1
: an anomaly or0
: normal. 0 <threshold
< 1. The default value is (1 -contamination
)-th percentile of all the anomaly scores.-
topk
: Use with therecommendation
andanomaly_detection
tasks. A positive integer between 1 and the table length.For the
recommendation
task, the number of recommendations to provide. The default is3
.For the
anomaly_detection
task, the results include the topk rows with the highest anomaly scores. Iftopk
is not set,ML_PREDICT_TABLE
usesthreshold
.Do not set both
threshold
andtopk
. Usethreshold
ortopk
, or setoptions
toNULL
. -
recommend
: Use with therecommendation
task to specify what to recommend. Permitted values are:-
ratings
: Use this option to predict ratings. This is the default value.The target column is
prediction
, and the values arefloat
.The input table must contain at least two columns with the same names as the user column and item column from the training model.
-
items
: Use this option to recommend items for users.The target column is
item_recommendation
, and the values are:JSON_OBJECT("column_item_id_name", JSON_ARRAY("item_1", ... , "item_k"), "column_rating_name" , JSON_ARRAY(rating_1, ..., rating_k))
The input table must at least contain a column with the same name as the user column from the training model.
-
users
: Use this option to recommend users for items.The target column is
user_recommendation
, and the values are:JSON_OBJECT("column_user_id_name", JSON_ARRAY("user_1", ... , "user_k"), "column_rating_name" , JSON_ARRAY(rating_1, ..., rating_k))
The input table must at least contain a column with the same name as the item column from the training model.
users_to_items
: This is the same asitems
.items_to_users
: This is the same asusers
.-
items_to_items
: Use this option to recommend similar items for items.The target column is
item_recommendation
, and the values are:JSON_OBJECT("column_item_id_name", JSON_ARRAY("item_1", ... , "item_k"))
The input table must at least contain a column with the same name as the item column from the training model.
-
users_to_users
: Use this option to recommend similar users for users.The target column is
user_recommendation
, and the values are:JSON_OBJECT("column_user_id_name", JSON_ARRAY("user_1", ... , "user_k"))
The input table must at least contain a column with the same name as the user column from the training model.
-
-
A typical usage example that specifies the fully qualified name of the table to generate predictions for, the session variable containing the model handle, and the fully qualified output table name:
mysql> CALL sys.ML_PREDICT_TABLE('ml_data.iris_test', @iris_model, 'ml_data.iris_predictions', NULL);
To view
ML_PREDICT_TABLE
results, query the output table. The table shows the predictions and the feature column values used to make each prediction. The table includes the primary key,_id
, and theml_results
column, which usesJSON
format:mysql> SELECT * from ml_data.iris_predictions LIMIT 5; +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------+ | _id | sepal length | sepal width | petal length | petal width | class | Prediction | ml_results | +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------+ | 1 | 7.3 | 2.9 | 6.3 | 1.8 | Iris-virginica | Iris-virginica | {'predictions': {'class': 'Iris-virginica'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 0.13, 'Iris-virginica': 0.87}} | | 2 | 6.1 | 2.9 | 4.7 | 1.4 | Iris-versicolor | Iris-versicolor | {'predictions': {'class': 'Iris-versicolor'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 1.0, 'Iris-virginica': 0.0}} | | 3 | 6.3 | 2.8 | 5.1 | 1.5 | Iris-virginica | Iris-versicolor | {'predictions': {'class': 'Iris-versicolor'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 0.6, 'Iris-virginica': 0.4}} | | 4 | 6.3 | 3.3 | 4.7 | 1.6 | Iris-versicolor | Iris-versicolor | {'predictions': {'class': 'Iris-versicolor'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 0.99, 'Iris-virginica': 0.01}} | | 5 | 6.1 | 3 | 4.9 | 1.8 | Iris-virginica | Iris-virginica | {'predictions': {'class': 'Iris-virginica'}, 'probabilities': {'Iris-setosa': 0.0, 'Iris-versicolor': 0.32, 'Iris-virginica': 0.68}} | +-----+--------------+-------------+--------------+-------------+-----------------+-----------------+---------------------------------------------------------------------------------------------------------------------------------------+ 5 rows in set (0.00 sec)
Before MySQL 8.0.32, the
ML_PREDICT_TABLE
routine does not include options, and the results do not include theml_results
column:mysql> CALL sys.ML_PREDICT_TABLE('ml_data.iris_test', @iris_model, 'ml_data.iris_predictions'); mysql> SELECT * FROM ml_data.iris_predictions LIMIT 3; *************************** 1. row *************************** sepal length: 7.3 sepal width: 2.9 petal length: 6.3 petal width: 1.8 Prediction: Iris-virginica *************************** 2. row *************************** sepal length: 6.1 sepal width: 2.9 petal length: 4.7 petal width: 1.4 Prediction: Iris-versicolor *************************** 3. row *************************** sepal length: 6.3 sepal width: 2.8 petal length: 5.1 petal width: 1.5 Prediction: Iris-virginica
See also: