10.2.13 NL2ML

MySQL 9.4.0 introduces the NL2ML (natural language to machine learning) routine, which provides step-by-step guidelines on using MySQL HeatWave AutoML for a particular business problem. Use this routine to ask general questions about MySQL HeatWave AutoML and to generate AutoML queries you can copy and run.

Note

Oracle Cloud Infrastructure Generative AI is available in specific regions. To confirm if your region has Generative AI, see Regions with Generative AI.

NL2ML Syntax

mysql> CALL sys.NL2ML (query, response);

NL2ML parameters:

  • query: Enter a question in natural language related to MySQL HeatWave AutoML. For example, "What are the different types of machine learning models I can create?".

  • response: The name of the JSON object session variable that contains the response to the question.

To view the chat history of the last 20 questions and responses and the current Generative AI model_id, query the @nl2ml_options session variable. Review Syntax Examples to see how to reset the chat history and change the model_id. To review available options for model_id, see Pretrained Foundational Models in Generative AI.

Syntax Examples

  • Ask a question about using MySQL HeatWave AutoML:

    mysql> CALL sys.NL2ML("Is there any limit on the size of training table?", @output);
    Query OK, 0 rows affected (11.0086 sec)
    
    mysql> SELECT @output;
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | @output |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {"text": "Yes, there are limits on the size of the training table for HeatWave AutoML models. The table used to train a model cannot exceed 10 GB in size, 100 million rows, or 1017 columns."} |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
  • Ask a question, and then ask a follow-up question based on the first one:

    mysql> CALL sys.NL2ML("What is the purpose of task in ML_TRAIN?", @output);
    Query OK, 0 rows affected (32.4279 sec)
    
    mysql> SELECT @output;
    +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | @output |
    +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {"text": "The purpose of 'task' in ML_TRAIN is to specify the machine learning task, such as classification, regression, forecasting, anomaly detection, recommendation, or topic modeling."} |
    +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
    mysql> CALL sys.NL2ML("Can you give me an example of using that?", @output);
    Query OK, 0 rows affected (11.7221 sec)
    
    mysql> SELECT @output;
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | @output |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {"text": "Here's an example of specifying a machine learning task in ML_TRAIN:
               \n\n```sql\nCALL sys.ML_TRAIN('mlcorpus.emails', 'spam_label', JSON_OBJECT('task', 'classification'), @model);
               \n```\n\nIn this example, the machine learning task is specified as 'classification' and the target column is 'spam_label' in the 'emails' table."} |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
  • Ask for an appropriate table in the DB System to use for a training query:

    mysql> CALL sys.NL2ML("Can you give me a training query for email spam detection?", @output);
    Query OK, 0 rows affected (8.5084 sec)
    
    mysql> SELECT @output;
    +-------------------------------------------------------------------------------------------------------------------------------------------+
    | @output |
    +-------------------------------------------------------------------------------------------------------------------------------------------+
    | {"text": "```sql\nCALL sys.ML_TRAIN('mlcorpus.emails', 'spam_label', \n JSON_OBJECT('task', 'classification'), @email_spam_model);\n```"} |
    +-------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
  • Request debugging assistance for an error message:

    mysql> CALL sys.NL2ML("When I tried to train a model I got this error: can not train a regression or recommendation task with non-numeric targte column What does this mean? How can I resolve the issue?", @output);
    Query OK, 0 rows affected (18.0766 sec)
    
    mysql> SELECT @output;
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | @output                                                                                                                                                                                                                                                                                                                                |
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {"text": "The error \"can not train a regression or recommendation task with non-numeric target column\" means that you are trying to train a regression or recommendation model on a dataset where the target column (the column you are trying to predict) is not numeric.
                \n\nRegression models require a numeric target column, as they are designed to predict continuous values. Recommendation models also require numeric target columns, as they are designed to predict ratings or scores.
                \n\nTo resolve this issue, you need to either:\n\n1. Choose a different machine learning task that can handle non-numeric target columns, such as classification.\n2. Convert your target column to a numeric data type, if possible. 
                For example, if your target column is a categorical variable with a limited number of categories, you can use one-hot encoding or label encoding to convert it to a numeric representation.\n3. Select a different column as the target column that is numeric.
                \n\nIn your case, since you are using the \"food_delivery\" table, you may want to check the data type of the column you are trying to predict and see if it can be converted to a numeric data type. If not, you may want to choose a different machine learning task or select a different column as the target column."} |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
  • Retrieve the chat history of questions and responses with the @nl2ml_options session variable:

    mysql> SELECT JSON_PRETTY(@nl2ml_options);
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | JSON_PRETTY(@nl2ml_options) |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {
    "model_id": "meta.llama-3.1-405b-instruct",
    "chat_history": [
    {
    "user_message": "Is there any limit on the size of training table?",
    "chat_bot_message": "Yes, there is a size limit for training tables in HeatWave. The table used to train a model cannot exceed 10 GB, 100 million rows, or 1017 columns."
    }
    ]
    } |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.0003 sec)
  • Use one of the following commands to reset the chat history:

    mysql> SET @nl2ml_options = NULL;
    
    mysql> SET @nl2ml_options = JSON_REMOVE(@nl2ml_options, '$.chat_history');
  • Change the current model_id and reset the chat history:

    mysql> SET @nl2ml_options = JSON_OBJECT("model_id", "meta.llama-3.1-405b-instruct");
  • Change the current model_id and maintain the current chat history:

    mysql> SET @nl2ml_options = JSON_SET(@nl2ml_options, '$.model_id', "meta.llama-3.1-405b-instruct");