The ML_TRAIN routine includes the optimization_metric option, and the ML_SCORE routine includes the metric option. Both of these options define a metric that must be compatible with the task type and the target data.
Section 3.15.12, “Model Metadata”, includes the optimization_metric field.
For more information about scoring metrics, see: scikit-learn.org. For more information about forecasting metrics, see: sktime.org and statsmodels.org.
- Classification metrics
  - Binary-only metrics
  - Binary and multi-class metrics
- Regression metrics
- Forecasting metrics
- Anomaly detection metrics
  ML_SCORE only. Not supported for ML_TRAIN.
  - No threshold or topk options. Do not specify the threshold or topk options.
  - threshold option. Uses the threshold option. Do not specify the topk option.
  - topk option. Requires the topk option. Do not specify the threshold option. precision_k is an Oracle implementation of a common metric for fraud detection and lead scoring.
- Recommendation model metrics
  - Rating metrics to use with recommendation models that use explicit feedback.
  - Ranking metrics to use with recommendation models that use implicit feedback.
    ML_SCORE only. Not supported for ML_TRAIN.
    If a user and item combination in the input table is not unique, the input table is grouped by the user and item columns, and the result is the average of the rankings.
    If the input table overlaps with the training table, and remove_seen is true, which is the default setting, then the model does not repeat a recommendation and ignores the overlapping items.
    - precision_at_k is the number of relevant topk recommended items divided by the total topk recommended items for a particular user:
      precision_at_k = (relevant topk recommended items) / (total topk recommended items)
      For example, if topk is 10 and 7 of the 10 recommended items are relevant for a user, then precision_at_k is 70%.
      The precision_at_k value for the input table is the average for all users. If remove_seen is true, the default setting, then the average only includes users for whom the model can make a recommendation. If a user has implicitly ranked every item in the training table, the model cannot recommend any more items for that user, and that user is excluded from the average calculation if remove_seen is true.
    - recall_at_k is the number of relevant topk recommended items divided by the total relevant items for a particular user:
      recall_at_k = (relevant topk recommended items) / (total relevant items)
      For example, suppose there is a total of 20 relevant items for a user. If topk is 10, and 7 of the 10 recommended items are relevant, then recall_at_k is 7 / 20 = 35%.
      The recall_at_k value for the input table is the average for all users.
    - hit_ratio_at_k is the number of relevant topk recommended items divided by the total relevant items for all users:
      hit_ratio_at_k = (relevant topk recommended items, all users) / (total relevant items, all users)
      The average of hit_ratio_at_k for the input table is recall_at_k. If there is only one user, hit_ratio_at_k is the same as recall_at_k.
    - ndcg_at_k is normalized discounted cumulative gain, which is the discounted cumulative gain of the relevant topk recommended items divided by the discounted cumulative gain of the relevant topk items for a particular user.
      The discounted gain of an item is the true rating divided by log2(r+1), where r is the ranking of this item in the relevant topk items. If a user prefers a particular item, the rating is higher, and the ranking is lower.
      The ndcg_at_k value for the input table is the average for all users.
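The precision_at_k, recall_at_k, and hit_ratio_at_k definitions above can be sketched in Python as follows. This is only an illustration of the formulas as stated in the text, not the HeatWave implementation. The function names, the list and set inputs, and the choice to return None for a user who cannot be scored are all assumptions made for the example.

```python
def relevant_hits(recommended, relevant, k):
    """Count how many of the first k recommended items are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant)


def precision_at_k(recommended, relevant, k):
    """(relevant topk recommended items) / (total topk recommended items)."""
    top = recommended[:k]
    if not top:
        return None  # assumption: no recommendation possible for this user
    return relevant_hits(recommended, relevant, k) / len(top)


def recall_at_k(recommended, relevant, k):
    """(relevant topk recommended items) / (total relevant items)."""
    if not relevant:
        return None  # assumption: user has no relevant items to recall
    return relevant_hits(recommended, relevant, k) / len(relevant)


def hit_ratio_at_k(users, k):
    """Pooled ratio over all users; users is a list of (recommended, relevant)."""
    hits = sum(relevant_hits(rec, rel, k) for rec, rel in users)
    total = sum(len(rel) for _, rel in users)
    return hits / total if total else None


def mean_over_users(values):
    """Average per-user scores, skipping users that could not be scored."""
    scored = [v for v in values if v is not None]
    return sum(scored) / len(scored) if scored else None


# Worked example from the text: topk is 10, 7 of the 10 recommended items
# are relevant, and the user has 20 relevant items in total.
recommended = list(range(1, 11))                    # items 1..10
relevant = set(range(1, 8)) | set(range(101, 114))  # 7 recommended + 13 others
p = precision_at_k(recommended, relevant, 10)       # 7 / 10
r = recall_at_k(recommended, relevant, 10)          # 7 / 20
h = hit_ratio_at_k([(recommended, relevant)], 10)   # one user, so same as recall
```

Skipping None values in mean_over_users mirrors the described behavior when remove_seen is true, where users for whom no recommendation can be made are left out of the average.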
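The ndcg_at_k description can be sketched in the same way. The example below follows the stated discounted gain, true rating / log2(r+1), and normalizes by the discounted cumulative gain of the best possible ordering of the relevant items. The function names, the dictionary of true ratings, and the treatment of non-relevant items as having zero gain are assumptions for illustration, not details confirmed by the documentation.

```python
import math


def dcg(ratings):
    """Sum of rating / log2(r + 1), where r is the 1-based ranking."""
    return sum(rating / math.log2(rank + 1)
               for rank, rating in enumerate(ratings, start=1))


def ndcg_at_k(recommended, true_ratings, k):
    """DCG of the top-k recommendations divided by the ideal top-k DCG.

    true_ratings maps each relevant item to its true rating. Items that are
    not in the mapping contribute a gain of 0 (assumption).
    """
    gains = [true_ratings.get(item, 0.0) for item in recommended[:k]]
    ideal = sorted(true_ratings.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        return None  # assumption: user has no relevant items to rank
    return dcg(gains) / ideal_dcg


# Hypothetical ratings: the user prefers "a" over "b" over "c".
true_ratings = {"a": 3.0, "b": 2.0, "c": 1.0}
best = ndcg_at_k(["a", "b", "c"], true_ratings, 3)   # ideal ordering
worst = ndcg_at_k(["c", "b", "a"], true_ratings, 3)  # reversed ordering
```

An ideal ordering scores 1, and any ordering that places preferred items lower in the list scores less than 1. The value for an input table would then be the average of the per-user results.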