ML_SCORE
scores a model by generating predictions using the feature
columns in a labeled dataset as input and comparing the
predictions to ground truth values in the target column of the
labeled dataset. The dataset used with
ML_SCORE
should have the same feature columns as the dataset used to
train the model but the data should be different. For example,
you might reserve 20 to 30 percent of the labeled training data
for scoring.
ML_SCORE
returns a computed metric indicating the quality of the model.
mysql> CALL sys.ML_SCORE(table_name, target_column_name, model_handle, metric, score, [options]);
options: {
JSON_OBJECT("key","value"[,"key","value"] ...)
"key","value": {
['threshold', 'N']
['topk', 'N']
['remove_seen', {'true'|'false'}]
}
}
Set the following required parameters.
table_name
: Specifies the fully qualified name of the table used to compute model quality (database_name.table_name
). The table must contain the same columns as the training dataset.target_column_name
: If scoring a supervised or semi-supervised model, specify the name of the target column containing ground truth values. If scoring an unsupervised model, set toNULL
. See MySQL HeatWave AutoML Learning Types.model_handle
: Specifies the model handle or a session variable containing the model handle. See Work with Model Handles.metric
: Specifies the name of the metric. The metric selected must be compatible with thetask
type used for training the model. See Optimization and Scoring Metrics.score
: Specifies the user-defined variable name for the computed score. TheML_SCORE
routine populates the variable. User variables are written as@
. Any valid name for a user-defined variable is permitted.var_name
The following options in JSON
format are
available for recommendation and anomaly detection models.
Set the following options as needed for recommendation models.
threshold
: The optional threshold that defines positive feedback, and a relevant sample. Only use with ranking metrics. It can be used for either explicit or implicit feedback.-
topk
: The optional top number of recommendations to provide. The default is3
. Set a positive integer between 1 and the number of rows in the table.A
recommendation
task and ranking metrics can use boththreshold
andtopk
. remove_seen
: If the input table overlaps with the training table, andremove_seen
istrue
, then the model will not repeat existing interactions. The default istrue
. Setremove_seen
tofalse
to repeat existing interactions from the training table.
Set the following options as needed for anomaly detection models.
threshold
: The threshold you set on anomaly detection models determines which rows in the output table are labeled as anomalies with an anomaly score of1
, or normal with an anomaly score of0
. The value for the threshold is the degree to which a row of data or log segment is considered for anomaly detection. Any sample with an anomaly score above the threshold is classified an anomaly. The default value is (1 -contamination
)-th percentile of all the anomaly scores.-
topk
: The optional top K rows to display with the highest anomaly scores. Set a positive integer between 1 and the number of rows in the table. Iftopk
is not set,ML_SCORE
usesthreshold
.Do not set both
threshold
andtopk
. Usethreshold
ortopk
, or setoptions
toNULL
.
-
The following example runs generates a score by using the
balanced_accuracy
metric. Query the score with the session variable for theML_SCORE
routine.mysql> CALL sys.ML_SCORE('census_data.census_train', 'revenue', 'census_data.census_train_admin_1745439945171', 'balanced_accuracy', @score, NULL); Query OK, 0 rows affected (3.0536 sec) mysql> SELECT @score; +--------------------+ | @score | +--------------------+ | 0.8151071071624756 | +--------------------+ 1 row in set (0.0411 sec)
-
The following example uses the
accuracy
metric with athreshold
set to 90%.mysql> CALL sys.ML_SCORE('mlcorpus_anomaly_detection.volcanoes-b3_anomaly_train', 'target', @anomaly, 'accuracy', @score, JSON_OBJECT('threshold', 0.9)); Query OK, 0 rows affected (1.86 sec) mysql> SELECT @score; +--------------------+ | @score | +--------------------+ | 0.9791129231452942 | +--------------------+ 1 row in set (0.00 sec)
-
The following example uses the
precision_at_k
metric with atopk
value of 10.mysql> CALL sys.ML_SCORE('mlcorpus_anomaly_detection.volcanoes-b3_anomaly_train', 'target', @anomaly, 'precision_at_k', @score, JSON_OBJECT('topk', 10)); Query OK, 0 rows affected (5.84 sec) mysql> SELECT @score; +---------------------+ | @score | +---------------------+ | 0 | +---------------------+ 1 row in set (0.00 sec)
-
The following example overrides the
ensemble_score
value from theML_TRAIN
routine to a new value of 0.5.mysql> CALL sys.ML_SCORE('mlcorpus.anomaly_train_with_target', "target", @semsup_gknn, 'precision_at_k', @semsup_score_gknn_weighted, CAST('{"topk": 10, "experimental": {"semisupervised": {"supervised_submodel_weight": 0.5}}}' as JSON));
See also: