This topic describes how to prepare data for a recommendation machine learning model that uses explicit feedback. It uses a data sample generated by OCI GenAI. To prepare the data for this use case, you set up a training dataset and a testing dataset. The training dataset has 86 records, and the testing dataset has 40 records. In a real-life use case, you should prepare a larger number of records for training and testing, and ensure the predictions are valid and reliable before testing on unlabeled data. To ensure reliable predictions, you should create an additional validation dataset. You can reserve 20% of the records in the training dataset to create the validation dataset.
Learn how to Prepare Data.
To prepare the data for the recommendation model:
Connect to the MySQL Server.
Create and use the database to store the data.
mysql> CREATE DATABASE recommendation_data;
mysql> USE recommendation_data;
Create the table to insert the sample data into. This is the training dataset. The columns for users and items (user_id and item_id) must use a string data type.
mysql> CREATE TABLE training_dataset (
    user_id VARCHAR(3),
    item_id VARCHAR(3),
    rating DECIMAL(3, 1),
    PRIMARY KEY (user_id, item_id)
);
Insert the training sample data into the table. Copy and paste the following command.
INSERT INTO training_dataset (user_id, item_id, rating) VALUES
    (1, 1, 5.0), (1, 3, 8.0), (1, 5, 2.5), (1, 7, 6.5), (1, 9, 4.0), (1, 11, 7.5), (1, 13, 3.0), (1, 15, 9.0), (1, 17, 1.5), (1, 19, 5.5),
    (2, 2, 4.5), (2, 4, 7.5), (2, 6, 2.0), (2, 8, 5.5), (2, 10, 9.0), (2, 12, 3.5), (2, 14, 6.0), (2, 16, 1.0), (2, 18, 4.5), (2, 20, 8.5),
    (3, 1, 3.5), (3, 4, 6.5), (3, 7, 2.5), (3, 9, 5.0), (3, 11, 8.5), (3, 13, 1.0), (3, 15, 4.0), (3, 17, 7.0), (3, 19, 2.5),
    (4, 2, 5.5), (4, 5, 8.5), (4, 8, 3.0), (4, 10, 6.5), (4, 12, 9.5), (4, 14, 2.0), (4, 16, 4.5), (4, 18, 7.5),
    (5, 3, 7.0), (5, 6, 1.5), (5, 8, 4.0), (5, 11, 6.0), (5, 13, 8.0), (5, 15, 2.5), (5, 17, 5.5), (5, 19, 9.0),
    (6, 1, 4.5), (6, 4, 7.5), (6, 6, 3.0), (6, 9, 5.5), (6, 12, 8.0), (6, 14, 1.5), (6, 16, 4.0), (6, 18, 6.5),
    (7, 2, 6.0), (7, 5, 3.5), (7, 7, 5.0), (7, 10, 7.5), (7, 12, 2.0), (7, 14, 4.5), (7, 16, 7.0), (7, 18, 9.5),
    (8, 3, 8.5), (8, 6, 2.5), (8, 8, 5.0), (8, 11, 3.5), (8, 13, 6.5), (8, 15, 1.0), (8, 17, 4.5), (8, 19, 7.0),
    (9, 2, 5.0), (9, 5, 8.0), (9, 7, 1.5), (9, 10, 4.0), (9, 12, 6.5), (9, 14, 9.0), (9, 16, 2.5), (9, 18, 5.5),
    (10, 1, 6.5), (10, 4, 3.0), (10, 6, 5.5), (10, 8, 8.0), (10, 11, 2.0), (10, 13, 4.5), (10, 15, 7.0), (10, 17, 9.5), (10, 19, 1.5);
Create the table to use for generating predictions. This is the testing dataset. It has the same columns as the training dataset.
mysql> CREATE TABLE testing_dataset (
    user_id VARCHAR(3),
    item_id VARCHAR(3),
    rating DECIMAL(3, 1),
    PRIMARY KEY (user_id, item_id)
);
Insert the testing sample data into the table. Copy and paste the following command.
INSERT INTO testing_dataset (user_id, item_id, rating) VALUES
    (1, 2, 4.0), (1, 4, 7.0), (1, 6, 1.5), (1, 8, 3.5),
    (2, 1, 5.0), (2, 3, 8.0), (2, 5, 2.5), (2, 7, 6.5),
    (3, 2, 3.5), (3, 5, 6.5), (3, 8, 2.5), (3, 18, 7.0),
    (4, 1, 5.5), (4, 3, 8.5), (4, 6, 2.0), (4, 7, 5.5),
    (5, 2, 7.0), (5, 4, 1.5), (5, 6, 4.0), (5, 12, 5.0),
    (6, 3, 6.0), (6, 5, 1.5), (6, 7, 4.5), (6, 8, 7.0),
    (7, 1, 6.5), (7, 4, 3.0), (7, 5, 5.5), (7, 9, 8.0),
    (8, 2, 8.5), (8, 4, 2.5), (8, 6, 5.0), (8, 9, 3.5),
    (9, 1, 5.0), (9, 3, 8.0), (9, 7, 2.5), (9, 8, 5.5),
    (10, 2, 6.5), (10, 5, 3.0), (10, 6, 5.5), (10, 18, 1.5);
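The introduction suggests reserving 20% of the training records as a validation dataset but does not show how. The following is a minimal, optional sketch of one way to do that in MySQL after both tables are loaded. The table names validation_dataset and training_subset are hypothetical and not part of the original procedure, and the random split with ORDER BY RAND() is only one possible approach. The sketch copies rows rather than deleting them, so training_dataset keeps all 86 records.

```sql
-- Optional sketch. validation_dataset and training_subset are
-- hypothetical table names chosen for illustration.
USE recommendation_data;

-- Sanity check: the sample data should load as 86 and 40 rows.
SELECT COUNT(*) AS training_rows FROM training_dataset;
SELECT COUNT(*) AS testing_rows FROM testing_dataset;

-- Create empty tables with the same columns and primary key
-- as the training table.
CREATE TABLE validation_dataset LIKE training_dataset;
CREATE TABLE training_subset LIKE training_dataset;

-- 20% of 86 records is about 17. Pick that many rows at random.
INSERT INTO validation_dataset
SELECT user_id, item_id, rating
FROM training_dataset
ORDER BY RAND()
LIMIT 17;

-- Copy the remaining 69 rows (those not chosen for validation)
-- into the reduced training table.
INSERT INTO training_subset
SELECT t.user_id, t.item_id, t.rating
FROM training_dataset AS t
LEFT JOIN validation_dataset AS v
  ON v.user_id = t.user_id AND v.item_id = t.item_id
WHERE v.user_id IS NULL;
```

With a dataset this small, a purely random split can leave a user or an item with very few rows in training_subset, so it may be worth inspecting the split before relying on it.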
Learn how to Train a Recommendation Model.