Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.0Mb
PDF (A4) - 2.0Mb


HeatWave User Guide  /  ...  /  Using a Recommendation Model

3.11.3 Using a Recommendation Model

Once you train and load a recommendation model, you can start generating predictions (rows or tables) and scores for the model.

Generating Predictions and Scores

To generate predictions on the trained model, run the ML_PREDICT_ROW or ML_PREDICT_TABLE routine. Run the routines on the data with the same columns as the training model.

When generating predictions:

  • A table with the same name as the output table for ML_PREDICT_TABLE must not already exist.

  • NULL values for any row in the users or items columns will cause an error.

When generating scores for a recommendation model, run the ML_SCORE routine. You can use any of the recommendation metrics to score a recommendation model. You can use the metric parameter to specify a ratings metric for a recommendation model that uses explicit feedback, or a ranking metric to use with a recommendation model that uses implicit or explicit feedback. See: Recommendation Model Metrics.

For instructions on generating predictions and scores, see Section 3.7, “Predictions”, and Section 3.14.6, “Scoring Models”. For the complete list of option descriptions for predictions and scores, see ML_PREDICT_ROW, ML_PREDICT_TABLE, and ML_SCORE.

ML_EXPLAIN, ML_EXPLAIN_ROW and ML_EXPLAIN_TABLE do not support recommendation models. A call to any of these routines with a recommendation model will produce an error.

Options for Generating Predictions and Scores

The options for ML_PREDICT_ROW and ML_PREDICT_TABLE include the following:

  • topk: The number of recommendations to provide. The default is 3.

  • recommend: Specifies what to recommend. Permitted values are:

    • ratings: Predicts ratings that users will give. This is the default value.

    • items: Recommends items for users.

    • users: Recommends users for items.

    • users_to_items: This is the same as items.

    • items_to_users: This is the same as users.

    • items_to_items: Recommends similar items for items.

    • users_to_users: Recommends similar users for users.

  • remove_seen: If true, the model will not repeat existing interactions from the training table. It only applies to the recommendations items, users, users_to_items, and items_to_users.

The options for ML_SCORE include the following:

  • threshold: The optional threshold that defines positive feedback, and a relevant sample. Only use with ranking metrics. It can be used for either explicit or implicit feedback.

  • topk: The optional top K rows to recommend. Only use with ranking metrics.

  • remove_seen: If true, the model will not repeat existing interactions from the training table.

Output Values

Recommendation models can recommend the following for explicit and implicit feedback:

  • The rating or ranking that a user will give to an item.

    • For known users and known items, the output includes the predicted rating or ranking a user will give for an item for a given pair of user_id and item_id.

    • For a known user with a new item, the prediction is the global average rating or ranking. The routines can add a user bias if the model includes it.

    • For a new user with a known item, the prediction is the global average rating or ranking. The routines can add an item bias if the model includes it.

    • For a new user with a new item, the prediction is the global average rating or ranking.

  • Users that will like an item.

    • For known users and known items, the output includes a list of users that will most likely give a high rating to an item and will also predict the ratings.

    • For a new item, and an explicit feedback model, the prediction is the global top K users who have provided the average highest ratings.

      For a new item, and an implicit feedback model, the prediction is the global top K users with the highest number of interactions.

    • For an item that has been tried by all known users, the prediction is an empty list because it is not possible to recommend any other users. Set remove_seen to false to repeat existing interactions from the training table.

  • Items that a user will like.

    • For known users and known items, the output includes a list of items that the user will most likely give a high rating and the predicted rating.

    • For a new user, and an explicit feedback model, the prediction is the global top K items that received the average highest ratings.

      For a new user, and an implicit feedback model, the prediction is the global top K items with the highest number of interactions.

    • For a user who has tried all known items, the prediction is an empty list because it is not possible to recommend any other items. Set remove_seen to false to repeat existing interactions from the training table.

  • Items similar to another item.

    • For known items, the output includes a list of predicted items that have similar ratings and are appreciated by similar users.

    • The predictions are expressed in cosine similarity, and range from 0, very dissimilar, to 1, very similar.

    • For a new item, there is no information to provide a prediction. This will produce an error.

  • Users similar to another user.

    • For known users, the output includes a list of predicted users that have similar behavior and taste.

    • The predictions are expressed in cosine similarity, and range from 0, very dissimilar, to 1, very similar.

    • For a new user, there is no information to provide a prediction. This will produce an error.

Before You Begin

  1. Complete the steps for Section 3.11.2, “Training a Recommendation Model”

  2. Once the model is trained, run the ML_MODEL_LOAD routine.

Syntax Examples for Explicit Feedback

  • An ML_PREDICT_TABLE example that predicts the ratings for particular users and items. This is the default option for recommend, with options set to NULL.

    mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.retailrocket-transactionto_to_predict', 
              @model, 'mlcorpus.table_predictions', NULL);
    Query OK, 0 rows affected (0.7589 sec)
    
    mysql> SELECT * FROM  table_predictions;
    +-------------------+---------------+---------+---------+--------+-----------------------------------+
    | _4aad19ca6e_pk_id | timestamp     | user_id | item_id | rating | ml_results                        |
    +-------------------+---------------+---------+---------+--------+-----------------------------------+
    |                 1 | 1436670000000 | 836347  | 64154   |      1 | {"predictions": {"rating": 1.0}}  |
    |                 2 | 1441250000000 | 435603  | 335366  |      1 | {"predictions": {"rating": 1.04}} |
    |                 3 | 1439670000000 | 1150086 | 314062  |      1 | {"predictions": {"rating": 1.03}} |
    +-------------------+---------------+---------+---------+--------+-----------------------------------+
    3 rows in set (0.00 sec)

    The output table displays the following recommendations:

    • User 836347 is predicted to give a rating of 1.0 for item 64154.

    • User 435603 is predicted to give a rating of 1.04 for item 335366.

    • User 1150086 is predicted to give a rating of 1.03 for item 314062.

    The values in the rating column refer to the past rating the user_id gave to the item_id. They are not relevant to the values in ml_results.

  • A more complete example for the top 3 users that will like particular items.

    mysql> SELECT * FROM train_table;
    +---------+------------+--------+
    | user_id | item_id    | rating |
    +---------+------------+--------+
    | user_1  | good_movie |      5 |
    | user_1  | bad_movie  |      1 |
    | user_2  | bad_movie  |      1 |
    | user_3  | bad_movie  |      0 |
    | user_4  | bad_movie  |      0 |
    +---------+------------+--------+
    5 rows in set (0.00 sec)
    
    mysql> CALL sys.ML_TRAIN('mlcorpus.train_table', 'rating', 
              JSON_OBJECT('task', 'recommendation', 'users', 'user_id', 'items', 'item_id'), 
              @model);
    Query OK, 0 rows affected (11.39 sec)
    
    mysql> CALL sys.ML_MODEL_LOAD(@model, NULL);
    Query OK, 0 rows affected (0.98 sec)
    
    mysql> SELECT * FROM items_table;
    +------------+
    | item_id    |
    +------------+
    | good_movie |
    | bad_movie  |
    | new_movie  |
    +------------+
    3 rows in set (0.00 sec)
    
    mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.items_table', 
              @model, 'mlcorpus.user_recommendation', 
              JSON_OBJECT("recommend", "users", "topk", 3));
    Query OK, 0 rows affected (1.21 sec)
    
    mysql> SELECT * FROM user_recommendation;
    +-------------------+------------+-------------------------------------------------------------------------------------------+
    | _4aad19ca6e_pk_id | item_id    | ml_results                                                                                |
    +-------------------+------------+-------------------------------------------------------------------------------------------+
    |                 1 | good_movie | {"predictions": {"user_id": ["user_2", "user_3", "user_4"], "rating": [1.84, 1.71, 1.7]}} |
    |                 2 | bad_movie  | {"predictions": {"user_id": [], "rating": []}}                                            |
    |                 3 | new_movie  | {"predictions": {"user_id": ["user_1", "user_2", "user_3"], "rating": [3.0, 1.0, 0.0]}}   |
    +-------------------+------------+-------------------------------------------------------------------------------------------+
    3 rows in set (0.0004 sec)

    The training table shows that users have given a rating with a scale of 1 to 5 for a good_movie and a bad_movie. There is an additional new_movie item in the item_id column. After running PREDICT_TABLE, the predicted ratings for the users are generated for good_movie and new_movie. There are no generated predictions for bad_movie because all the users have already rated the movie and the remove_seen option is set to the default value of true. To generate predictions for bad_movie, set remove_seen to false.

  • An ML_PREDICT_TABLE example for the top 3 users that will like particular items with the items_to_users option.

    mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.ml-100k', 
              @model, 'mlcorpus.item_to_users_recommendation',  
              JSON_OBJECT("recommend", "items_to_users", "topk", 3));
    Query OK, 0 rows affected (21.2070 sec)
    
    mysql> SELECT * FROM mlcorpus.item_to_users_recommendation LIMIT 5;
    +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+
    | _4aad19ca6e_pk_id | user_id | item_id | rating | timestamp | ml_results                                                                      |
    +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+
    |                 1 | 846     | 524     |      3 | 883948000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.05, 3.94, 3.91]}} |
    |                 2 | 138     | 474     |      5 | 879024000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.58, 4.47, 4.45]}} |
    |                 3 | 840     | 609     |      4 | 891205000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.25, 4.14, 4.13]}} |
    |                 4 | 660     | 402     |      3 | 891201000 | {"predictions": {"user_id": ["7", "164", "894"], "rating": [4.09, 3.98, 3.96]}} |
    |                 5 | 154     | 89      |      5 | 879139000 | {"predictions": {"user_id": ["7", "164", "151"], "rating": [4.28, 4.18, 4.15]}} |
    +-------------------+---------+---------+--------+-----------+---------------------------------------------------------------------------------+

    The output table displays the following recommendations:

    • Users 7, 164, and 894 are predicted to like items 524, 474, 609, and 402.

    • Users 7, 164, and 151 are predicted to like item 89.

    • The predicted ratings each user will give for the respective items display after the predicted users. For example, for item 524, user 7 is predicted to give a rating of 4.05, user 164 is predicted to give a rating of 3.94, and user 894 is predicted to give a rating of 3.91.

    The values in the rating column refer to past ratings for respective user-item pairs. They are not relevant to the values in ml_results.

  • An ML_PREDICT_ROW example for the top 3 items that a particular user will like.

    mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "836347"}', @model,  
              JSON_OBJECT("recommend", "items", "topk", 3));
    +------------------------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"user_id": "836347"}', @model,  JSON_OBJECT("recommend", "items", "topk", 3))                                 |
    +------------------------------------------------------------------------------------------------------------------------------------+
    | {"user_id": "836347", "ml_results": "{"predictions": {"item_id": ["119736", "396042", "224549"], "rating": [2.18, 1.53, 1.53]}}"}  |
    +------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.31 sec)

    User 836347 is predicted to like items 119736, 396042, and 224549. The predicted ratings that the user will give for each item are also included after rating.

  • An ML_PREDICT_ROW example for the top 3 items similar to another item.

    mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "524"}', @model,  
              JSON_OBJECT("recommend", "items_to_items", "topk", 3));
    +------------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"item_id": "524"}', @model,  JSON_OBJECT("recommend", "items_to_items", "topk", 3))               |
    +------------------------------------------------------------------------------------------------------------------------+
    | {"item_id": "524", "ml_results": "{"predictions": {"item_id": ["665", "633", "378"], "similarity": [1.0, 1.0, 1.0]}}"} |
    +------------------------------------------------------------------------------------------------------------------------+

    Item 524 is predicted to be most similar to items 665, 633, and 378. The items have the highest similarity value of 1.0.

  • An ML_PREDICT_ROW example for the top 3 users similar to another user.

    mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "846"}', @model,  
              JSON_OBJECT("recommend", "users_to_users", "topk", 3));
    +--------------------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"user_id": "846"}', @model,  JSON_OBJECT("recommend", "users_to_users", "topk", 3))                       |
    +--------------------------------------------------------------------------------------------------------------------------------+
    | {"user_id": "846", "ml_results": "{"predictions": {"user_id": ["62", "643", "172"], "similarity": [0.7413, 0.7275, 0.6709]}}"} |
    +--------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.2373 sec)

    User 846 is predicted to be most similar to users 62, 643, and 172. The respective similarity values for each user are included after similarity.

  • An ML_SCORE example:

    mysql> CALL sys.ML_SCORE('mlcorpus.ipinyou-click_test', 
              'rating', @model, 'neg_mean_squared_error', @score, NULL);
    Query OK, 0 rows affected (1 min 18.29 sec)  
     
    mysql> SELECT @score;
    +----------------------+
    | @score               |
    +----------------------+
    | -0.23571448028087616 |
    +----------------------+
    1 row in set (0.00 sec)

Syntax Examples for Implicit Feedback

  • An ML_PREDICT_ROW example that predicts the ranking for a particular user and item.

    mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "836", "item_id": "226"}', @model,  NULL);
    +---------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"user_id": "836", "item_id": "226"}', @model,  NULL)             |
    +---------------------------------------------------------------------------------------+
    | {"item_id": "226", "user_id": "836", "ml_results": {"predictions": {"rating": 2.46}}} |
    +---------------------------------------------------------------------------------------+
    1 row in set (0.1390 sec)

    The predicted ranking of item 226 and user 836 is 2.46.

  • An ML_PREDICT_TABLE example that predicts the rankings for particular users and items.

    mysql> CALL sys.ML_PREDICT_TABLE('mlcorpus.test_table', @model, 'mlcorpus.table_predictions', NULL);
    
    mysql> SELECT * FROM mlcorpus.table_predictions LIMIT 10;
    +-------------------+---------+---------+--------+------------------------------------+
    | _4aad19ca6e_pk_id | user_id | item_id | rating | ml_results                         |
    +-------------------+---------+---------+--------+------------------------------------+
    |                 1 | 1026    | 13763   |      1 | {"predictions": {"rating": 1.25}}  |
    |                 2 | 992     | 16114   |      1 | {"predictions": {"rating": -0.15}} |
    |                 3 | 1863    | 4527    |      1 | {"predictions": {"rating": 0.42}}  |
    |                 4 | 3725    | 3981    |      1 | {"predictions": {"rating": 3.11}}  |
    |                 5 | 3436    | 5854    |      1 | {"predictions": {"rating": 0.45}}  |
    |                 6 | 2236    | 13608   |      1 | {"predictions": {"rating": -0.35}} |
    |                 7 | 5230    | 1181    |      1 | {"predictions": {"rating": 1.96}}  |
    |                 8 | 1684    | 10140   |      1 | {"predictions": {"rating": -0.42}} |
    |                 9 | 3438    | 8022    |      1 | {"predictions": {"rating": -0.11}} |
    |                10 | 1536    | 7578    |      1 | {"predictions": {"rating": 0.1}}   |
    +-------------------+---------+---------+--------+------------------------------------+
    10 rows in set (0.0004 sec)

    The predicted rankings are displayed in ml_results. The values in the rating column refer to past interactions for the user-item pair. They are not relevant to the values in ml_results.

  • An ML_PREDICT_ROW example that recommends the top 3 users that will like a particular item and includes existing interactions from the training table.

    mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model,  JSON_OBJECT("recommend", "users", "topk", 3, "remove_seen", false));
    +--------------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model,  JSON_OBJECT("recommend", "users", "topk", 3, "remove_seen", false))  |
    +--------------------------------------------------------------------------------------------------------------------------+
    | {"item_id": "13763", "ml_results": {"predictions": {"rating": [1.26, 1.26, 1.26], "user_id": ["4590", "1822", "3585"]}}} |
    +--------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.3098 sec)

    Users 4590, 1822, and 3585 are predicted to like item 13763. The respective ranking of each user is included after rating.

  • An ML_PREDICT_ROW example that recommends the top 3 items that a particular user will like.

    mysql> SELECT sys.ML_PREDICT_ROW('{"user_id": "1026"}', @model,  JSON_OBJECT("recommend", "items", "topk", 3));
    +-------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"user_id": "1026"}', @model,  JSON_OBJECT("recommend", "items", "topk", 3))                  |
    +-------------------------------------------------------------------------------------------------------------------+
    | {"user_id": "1026", "ml_results": {"predictions": {"rating": [3.43, 3.37, 3.18], "item_id": ["10", "14", "11"]}}} |
    +-------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.6586 sec)

    User 1026 is predicted to like items 10, 14, and 11. The respective ranking for each item is included after rating.

  • An ML_PREDICT_ROW example for the top 3 items similar to another item.

    mysql> SELECT sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model,  JSON_OBJECT("recommend", "items_to_items", "topk", 3));
    +------------------------------------------------------------------------------------------------------------------------------------+
    | sys.ML_PREDICT_ROW('{"item_id": "13763"}', @model,  JSON_OBJECT("recommend", "items_to_items", "topk", 3))                         |
    +------------------------------------------------------------------------------------------------------------------------------------+
    | {"item_id": "13763", "ml_results": {"predictions": {"item_id": ["13751", "13711", "13668"], "similarity": [1.0, 0.9999, 0.9999]}}} |
    +------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.4607 sec)

    Item 13763 is predicted to be most similar to items 13751, 13711, and 13668. The respective similarity scores for each item is included after similarity.

  • ML_SCORE examples for the four metrics suitable for a recommendation model with implicit feedback, with threshold set to 3, topk set to 50 and including existing interactions from the training table with remove_seen set to false.

    mysql> CALL sys.ML_MODEL_LOAD(@model, NULL);
    
    mysql> SET @options = JSON_OBJECT('threshold', 3, 'topk', 50, 'remove_seen', false);
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'precision_at_k', @score, @options);
    Query OK, 0 rows affected (2.03 sec)
     
    mysql> SELECT @score;
    +---------------------+
    | @score1             |
    +---------------------+
    | 0.03488215431571007 |
    +---------------------+
    1 row in set (0.00 sec)
    
    mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'recall_at_k', @score, @options);
    Query OK, 0 rows affected (2.35 sec)
    
    mysql> SELECT @score;
    +---------------------+
    | @score              |
    +---------------------+
    | 0.24304823577404022 |
    +---------------------+
    1 row in set (0.00 sec)
    
    mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'hit_ratio_at_k', @score, @options);
    Query OK, 0 rows affected (2.30 sec)
    
    mysql> SELECT @score;
    +---------------------+
    | @score              |
    +---------------------+
    | 0.18799902498722076 |
    +---------------------+
    1 row in set (0.00 sec)
    
    mysql> CALL sys.ML_SCORE('mlcorpus.recsys_item_1', 'target', @model, 'ndcg_at_k', @score, @options);
    Query OK, 0 rows affected (2.35 sec)
    
    mysql> SELECT @score;
    +---------------------+
    | @score              |
    +---------------------+
    | 0.12175655364990234 |
    +---------------------+
    1 row in set (0.00 sec)