Once you train and load a recommendation model, you can start generating predictions (rows or tables) and scores for the model.
To generate predictions on the trained model, run the
ML_PREDICT_ROW
or
ML_PREDICT_TABLE
routine. Run
the routines on the data with the same columns as the
training model.
When generating predictions:
A table with the same name as the output table for
ML_PREDICT_TABLE
must not already exist.NULL
values for any row in theusers
oritems
columns will cause an error.
When generating scores for a recommendation model, run the
ML_SCORE
routine. You can use
any of the recommendation metrics to score a recommendation
model. You can use the metric
parameter
to specify a ratings metric for a recommendation model that
uses explicit feedback, or a ranking metric to use with a
recommendation model that uses implicit or explicit
feedback. See:
Recommendation Model Metrics.
For instructions on generating predictions and scores, see
Section 3.7, “Predictions”, and
Section 3.14.6, “Scoring Models”. For the complete
list of option descriptions for predictions and scores, see
ML_PREDICT_ROW
,
ML_PREDICT_TABLE
, and
ML_SCORE
.
ML_EXPLAIN
,
ML_EXPLAIN_ROW
and
ML_EXPLAIN_TABLE
do not
support recommendation models. A call to any of these
routines with a recommendation model will produce an error.
The options
for
ML_PREDICT_ROW
and
ML_PREDICT_TABLE
include the
following:
topk
: The number of recommendations to provide. The default is3
.-
recommend
: Specifies what to recommend. Permitted values are:ratings
: Predicts ratings that users will give. This is the default value.items
: Recommends items for users.users
: Recommends users for items.users_to_items
: This is the same asitems
.items_to_users
: This is the same asusers
.items_to_items
: Recommends similar items for items.users_to_users
: Recommends similar users for users.
remove_seen
: Iftrue
, the model will not repeat existing interactions from the training table. It only applies to the recommendationsitems
,users
,users_to_items
, anditems_to_users
.
The options
for
ML_SCORE
include the
following:
threshold
: The optional threshold that defines positive feedback, and a relevant sample. Only use with ranking metrics. It can be used for either explicit or implicit feedback.topk
: The optional top K rows to recommend. Only use with ranking metrics.remove_seen
: Iftrue
, the model will not repeat existing interactions from the training table.
Recommendation models can recommend the following for explicit and implicit feedback:
-
The rating or ranking that a user will give to an item.
For known users and known items, the output includes the predicted rating or ranking a user will give for an item for a given pair of
user_id
anditem_id
.For a known user with a new item, the prediction is the global average rating or ranking. The routines can add a user bias if the model includes it.
For a new user with a known item, the prediction is the global average rating or ranking. The routines can add an item bias if the model includes it.
For a new user with a new item, the prediction is the global average rating or ranking.
-
Users that will like an item.
For known users and known items, the output includes a list of users that will most likely give a high rating to an item and will also predict the ratings.
-
For a new item, and an explicit feedback model, the prediction is the global top K users who have provided the average highest ratings.
For a new item, and an implicit feedback model, the prediction is the global top K users with the highest number of interactions.
For an item that has been tried by all known users, the prediction is an empty list because it is not possible to recommend any other users. Set
remove_seen
tofalse
to repeat existing interactions from the training table.
-
Items that a user will like.
For known users and known items, the output includes a list of items that the user will most likely give a high rating and the predicted rating.
-
For a new user, and an explicit feedback model, the prediction is the global top K items that received the average highest ratings.
For a new user, and an implicit feedback model, the prediction is the global top K items with the highest number of interactions.
For a user who has tried all known items, the prediction is an empty list because it is not possible to recommend any other items. Set
remove_seen
tofalse
to repeat existing interactions from the training table.
-
Items similar to another item.
For known items, the output includes a list of predicted items that have similar ratings and are appreciated by similar users.
The predictions are expressed in cosine similarity, and range from 0, very dissimilar, to 1, very similar.
For a new item, there is no information to provide a prediction. This will produce an error.
-
Users similar to another user.
For known users, the output includes a list of predicted users that have similar behavior and taste.
The predictions are expressed in cosine similarity, and range from 0, very dissimilar, to 1, very similar.
For a new user, there is no information to provide a prediction. This will produce an error.
Complete the steps for Section 3.11.2, “Training a Recommendation Model”
Once the model is trained, run the
ML_MODEL_LOAD
routine.
-
An
ML_PREDICT_TABLE
example that predicts the ratings for particular users and items. This is the default option forrecommend
, withoptions
set toNULL
.mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.retailrocket-transactionto_to_predict', @model, 'mlcorpus.table_predictions', NULL); Query OK, 0 rows affected (0.7589 sec) mysql> SELECT * FROM table_predictions; +-------------------+---------------+---------+---------+--------+-----------------------------------+ | _4aad19ca6e_pk_id | timestamp | user_id | item_id | rating | ml_results | +-------------------+---------------+---------+---------+--------+-----------------------------------+ | 1 | 1436670000000 | 836347 | 64154 | 1 | {"predictions": {"rating": 1.0}} | | 2 | 1441250000000 | 435603 | 335366 | 1 | {"predictions": {"rating": 1.04}} | | 3 | 1439670000000 | 1150086 | 314062 | 1 | {"predictions": {"rating": 1.03}} | +-------------------+---------------+---------+---------+--------+-----------------------------------+ 3 rows in set (0.00 sec)
The output table displays the following recommendations:
User 836347 is predicted to give a rating of 1.0 for item 64154.
User 435603 is predicted to give a rating of 1.04 for item 335366.
User 1150086 is predicted to give a rating of 1.03 for item 314062.
The values in the
rating
column refer to the past rating theuser_id
gave to theitem_id
. They are not relevant to the values inml_results
. -
A more complete example for the top 3 users that will like particular items.
mysql> SELECT * FROM train_table; +---------+------------+--------+ | user_id | item_id | rating | +---------+------------+--------+ | user_1 | good_movie | 5 | | user_1 | bad_movie | 1 | | user_2 | bad_movie | 1 | | user_3 | bad_movie | 0 | | user_4 | bad_movie | 0 | +---------+------------+--------+ 5 rows in set (0.00 sec) mysql> CALL sys.ML_TRAIN('mlcorpus.train_table', 'rating', JSON_OBJECT('task', 'recommendation', 'users', 'user_id', 'items', 'item_id'), @model); Query OK, 0 rows affected (11.39 sec) mysql> CALL sys.ML_MODEL_LOAD(@model, NULL); Query OK, 0 rows affected (0.98 sec) mysql> SELECT * FROM items_table; +------------+ | item_id | +------------+ | good_movie | | bad_movie | | new_movie | +------------+ 3 rows in set (0.00 sec) mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.items_table', @model, 'mlcorpus.user_recommendation', JSON_OBJECT("recommend", "users", "topk", 3)); Query OK, 0 rows affected (1.21 sec) mysql> SELECT * FROM user_recommendation; +-------------------+------------+-------------------------------------------------------------------------------------------+ | _4aad19ca6e_pk_id | item_id | ml_results | +-------------------+------------+-------------------------------------------------------------------------------------------+ | 1 | good_movie | {"predictions": {"user_id": ["user_2", "user_3", "user_4"], "rating": [1.84, 1.71, 1.7]}} | | 2 | bad_movie | {"predictions": {"user_id": [], "rating": []}} | | 3 | new_movie | {"predictions": {"user_id": ["user_1", "user_2", "user_3"], "rating": [3.0, 1.0, 0.0]}} | +-------------------+------------+-------------------------------------------------------------------------------------------+ 3 rows in set (0.0004 sec)
The training table shows that users have given a rating with a scale of 1 to 5 for a
good_movie
and abad_movie
. There is an additionalnew_movie
item in theitem_id
column. After runningPREDICT_TABLE
, the predicted ratings for the users are generated forgood_movie
andnew_movie
. There are no generated predictions forbad_movie
because all the users have already rated the movie and theremove_seen
option is set to the default value oftrue
. To generate predictions forbad_movie
, setremove_seen
tofalse
. -
An
ML_PREDICT_TABLE
example for the top 3 users that will like particular items with theitems_to_users
option.mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.ml-100k', @model, 'mlcorpus.item_to_users_recommendation', JSON_OBJECT("recommend", "items_to_users", "topk", 3)); Query OK, 0 rows affected (21.2070 sec) mysql> SELECT * FROM mlcorpus.item_to_users_recommendation LIMIT 5; +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+ | _4aad19ca6e_pk_id | user_id | item_id | rating | timestamp | ml_results | +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+ | 1 | 846 | 524 | 3 | 883948000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.05, 3.94, 3.91]}} | | 2 | 138 | 474 | 5 | 879024000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.58, 4.47, 4.45]}} | | 3 | 840 | 609 | 4 | 891205000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.25, 4.14, 4.13]}} | | 4 | 660 | 402 | 3 | 891201000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.09, 3.98, 3.96]}} | | 5 | 154 | 89 | 5 | 879139000 | {"predictions": {"user_id": ["7", "164", "151"], "rating": [4.28, 4.18, 4.15]}} | +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+
The output table displays the following recommendations:
Users 7, 164, and 894 are predicted to like items 524, 474, 609, and 402.
Users 7, 164, and 151 are predicted to like item 89.
The predicted ratings each user will give for the respective items display after the predicted users. For example, for item 524, user 7 is predicted to give a rating of 4.05, user 164 is predicted to give a rating of 3.94, and user 894 is predicted to give a rating of 3.91.
The values in the
rating
column refer to past ratings for respective user-item pairs. They are not relevant to the values inml_results
. -
An
ML_PREDICT_ROW
example for the top 3 items that a particular user will like.mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "836347"}', @model, JSON_OBJECT("recommend", "items", "topk", 3)); +------------------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"user_id": "836347"}', @model, JSON_OBJECT("recommend", "items", "topk", 3)) | +------------------------------------------------------------------------------------------------------------------------------------+ | {"user_id": "836347", "ml_results": "{"predictions": {"item_id": ["119736", "396042", "224549"], "rating": [2.18, 1.53, 1.53]}}"} | +------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.31 sec)
User
836347
is predicted to like items119736
,396042
, and224549
. The predicted ratings that the user will give for each item are also included afterrating
. -
An
ML_PREDICT_ROW
example for the top 3 items similar to another item.mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "524"}', @model, JSON_OBJECT("recommend", "items_to_items", "topk", 3)); +------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"item_id": "524"}', @model, JSON_OBJECT("recommend", "items_to_items", "topk", 3)) | +------------------------------------------------------------------------------------------------------------------------+ | {"item_id": "524", "ml_results": "{"predictions": {"item_id": ["665", "633", "378"], "similarity": [1.0, 1.0, 1.0]}}"} | +------------------------------------------------------------------------------------------------------------------------+
Item
524
is predicted to be most similar to items665
,633
, and378
. The items have the highest similarity value of 1.0. -
An
ML_PREDICT_ROW
example for the top 3 users similar to another user.mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "846"}', @model, JSON_OBJECT("recommend", "users_to_users", "topk", 3)); +--------------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"user_id": "846"}', @model, JSON_OBJECT("recommend", "users_to_users", "topk", 3)) | +--------------------------------------------------------------------------------------------------------------------------------+ | {"user_id": "846", "ml_results": "{"predictions": {"user_id": ["62", "643", "172"], "similarity": [0.7413, 0.7275, 0.6709]}}"} | +--------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.2373 sec)
User
846
is predicted to be most similar to users62
,643
, and172
. The respective similarity values for each user are included aftersimilarity
. -
An
ML_SCORE
example:mysql> CALL sys.ML_SCORE('mlcorpus.ipinyou-click_test', 'rating', @model, 'neg_mean_squared_error', @score, NULL); Query OK, 0 rows affected (1 min 18.29 sec) mysql> SELECT @score; +----------------------+ | @score | +----------------------+ | -0.23571448028087616 | +----------------------+ 1 row in set (0.00 sec)
-
An
ML_PREDICT_ROW
example that predicts the ranking for a particular user and item.mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "836", "item_id": "226"}', @model, NULL); +---------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"user_id": "836", "item_id": "226"}', @model, NULL) | +---------------------------------------------------------------------------------------+ | {"item_id": "226", "user_id": "836", "ml_results": {"predictions": {"rating": 2.46}}} | +---------------------------------------------------------------------------------------+ 1 row in set (0.1390 sec)
The predicted ranking of item
226
and user836
is 2.46. -
An
ML_PREDICT_TABLE
example that predicts the rankings for particular users and items.mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.test_table', @model, 'mlcorpus.table_predictions', NULL); mysql> SELECT * FROM mlcorpus.table_predictions LIMIT 10; +-------------------+---------+---------+--------+------------------------------------+ | _4aad19ca6e_pk_id | user_id | item_id | rating | ml_results | +-------------------+---------+---------+--------+------------------------------------+ | 1 | 1026 | 13763 | 1 | {"predictions": {"rating": 1.25}} | | 2 | 992 | 16114 | 1 | {"predictions": {"rating": -0.15}} | | 3 | 1863 | 4527 | 1 | {"predictions": {"rating": 0.42}} | | 4 | 3725 | 3981 | 1 | {"predictions": {"rating": 3.11}} | | 5 | 3436 | 5854 | 1 | {"predictions": {"rating": 0.45}} | | 6 | 2236 | 13608 | 1 | {"predictions": {"rating": -0.35}} | | 7 | 5230 | 1181 | 1 | {"predictions": {"rating": 1.96}} | | 8 | 1684 | 10140 | 1 | {"predictions": {"rating": -0.42}} | | 9 | 3438 | 8022 | 1 | {"predictions": {"rating": -0.11}} | | 10 | 1536 | 7578 | 1 | {"predictions": {"rating": 0.1}} | +-------------------+---------+---------+--------+------------------------------------+ 10 rows in set (0.0004 sec)
The predicted rankings are displayed in
ml_results
. The values in therating
column refer to past interactions for the user-item pair. They are not relevant to the values inml_results
. -
An
ML_PREDICT_ROW
example that recommends the top 3 users that will like a particular item and includes existing interactions from the training table.mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model, JSON_OBJECT("recommend", "users", "topk", 3, "remove_seen", false)); +--------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model, JSON_OBJECT("recommend", "users", "topk", 3, "remove_seen", false)) | +--------------------------------------------------------------------------------------------------------------------------+ | {"item_id": "13763", "ml_results": {"predictions": {"rating": [1.26, 1.26, 1.26], "user_id": ["4590", "1822", "3585"]}}} | +--------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.3098 sec)
Users
4590
,1822
, and3585
are predicted to like item13763
. The respective ranking of each user is included afterrating
. -
An
ML_PREDICT_ROW
example that recommends the top 3 items that a particular user will like.mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "1026"}', @model, JSON_OBJECT("recommend", "items", "topk", 3)); +-------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"user_id": "1026"}', @model, JSON_OBJECT("recommend", "items", "topk", 3)) | +-------------------------------------------------------------------------------------------------------------------+ | {"user_id": "1026", "ml_results": {"predictions": {"rating": [3.43, 3.37, 3.18], "item_id": ["10", "14", "11"]}}} | +-------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.6586 sec)
User
1026
is predicted to like items10
,14
, and11
. The respective ranking for each item is included afterrating
. -
An
ML_PREDICT_ROW
example for the top 3 items similar to another item.mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model, JSON_OBJECT("recommend", "items_to_items", "topk", 3)); +------------------------------------------------------------------------------------------------------------------------------------+ | sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model, JSON_OBJECT("recommend", "items_to_items", "topk", 3)) | +------------------------------------------------------------------------------------------------------------------------------------+ | {"item_id": "13763", "ml_results": {"predictions": {"item_id": ["13751", "13711", "13668"], "similarity": [1.0, 0.9999, 0.9999]}}} | +------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.4607 sec)
Item
13763
is predicted to be most similar to items13751
,13711
, and13668
. The respective similarity scores for each item is included aftersimilarity
. -
ML_SCORE
examples for the four metrics suitable for a recommendation model with implicit feedback, withthreshold
set to3
,topk
set to50
and including existing interactions from the training table withremove_seen
set tofalse
.mysql> CALL sys.ML_MODEL_LOAD(@model, NULL); mysql> SET @options = JSON_OBJECT('threshold', 3, 'topk', 50, 'remove_seen', false); Query OK, 0 rows affected (0.00 sec) mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'precision_at_k', @score, @options); Query OK, 0 rows affected (2.03 sec) mysql> SELECT @score; +---------------------+ | @score1 | +---------------------+ | 0.03488215431571007 | +---------------------+ 1 row in set (0.00 sec) mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'recall_at_k', @score, @options); Query OK, 0 rows affected (2.35 sec) mysql> SELECT @score; +---------------------+ | @score | +---------------------+ | 0.24304823577404022 | +---------------------+ 1 row in set (0.00 sec) mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'hit_ratio_at_k', @score, @options); Query OK, 0 rows affected (2.30 sec) mysql> SELECT @score; +---------------------+ | @score | +---------------------+ | 0.18799902498722076 | +---------------------+ 1 row in set (0.00 sec) mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'ndcg_at_k', @score, @options); Query OK, 0 rows affected (2.35 sec) mysql> SELECT @score; +---------------------+ | @score | +---------------------+ | 0.12175655364990234 | +---------------------+ 1 row in set (0.00 sec)