10.3.9 NL_SQL

Generates SQL queries using natural-language statements. The routine also runs the generated SQL statement and displays the result set. You can use this routine for generating and running SQL queries only for databases and tables that you have access to.

The LLM-generated SQL statements might contain syntax errors. The routine automatically detects these errors, and retries the SQL generation until a syntactically valid SQL statement is generated, with a maximum of 3 generation attempts.

This routine is available as of MySQL 9.4.1.

NL_SQL Syntax

mysql> CALL sys.NL_SQL("NaturalLanguageStatement", @output[, options]);

  options: JSON_OBJECT(keyvalue[, keyvalue]...)
keyvalue: 
{
  'execute', {true|false}
  | 'schemas', JSON_ARRAY('DBName'[, 'DBName'] ...)
  | 'tables', JSON_ARRAY(TableJSON[, TableJSON] ...)
  | 'model_id', 'ModelID'
  | 'verbose', {0|1|2}
  | 'include_comments', {true|false}
  | 'use_retry', {true|false}
}

Following are NL_SQL parameters:

  • NaturalLanguageQuery: natural-language query pertaining to your data available in MySQL HeatWave that you want to convert to an SQL query.

  • @output: output parameter that includes the list of tables and databases considered for generating the SQL query, Model ID of the Large Language Model (LLM) used for generating the query, the generated SQL query, and whether the generated SQL query is valid.

  • options: specifies optional parameters as key-value pairs in JSON format. It can include the following parameters:

    • execute: specifies whether the procedure automatically runs the generated SQL statement. Default value is true.

    • schemas: specifies the databases to consider for generating and running SQL queries. You can specify up to five databases. By default, databases that the routine finds most relevant to the entered natural-language statement are considered.

    • tables: specifies the tables to consider for generating and running SQL queries in JSON format. You can specify up to 50 tables. By default, tables that the routine finds most relevant to the entered natural-language statement are considered.

      TableJSON: JSON_OBJECT('schema_name', 'DBName', 'table_name', 'TableName')
      Note

      You can either use the schemas option to specify the databases to consider or the tables option to specify the tables to consider. If you set both these options, the routine fails.

    • model_id: specifies the LLM to use for generating the SQL query. Default value is meta.llama-3.3-70b-instruct. Possible values are meta.llama-3.3-70b-instruct, llama3.1-8b-instruct-v1, and llama3.2-3b-instruct-v1.

    • verbose: specifies whether to print an output. Possible values are: 0: prints nothing, 1: prints the generated SQL statement, and 2: prints debugging information. Default value is 1.

    • include_comments: specifies whether comments are to be included during metadata collection for columns or tables. Default value is true.

    • use_retry: specifies whether generation retries for syntactically invalid SQL statements can be attempted. Default value is true.

Syntax Examples

The examples in this topic uses a sample database, airport. To learn how to download and then load airportdb in MySQL HeatWave Cluster, see Section 3.2, “Quickstart: AirportDB Analytics”.

  • Following example specifies the database to consider for the SQL query:

    mysql> CALL sys.NL_SQL("How many flights are there in total?",@output, JSON_OBJECT('schemas',JSON_ARRAY('airportdb'),'model_id','llama3.1-8b-instruct-v1'));
    +-----------------------------------------------------+
    | Executing generated SQL statement...                |
    +-----------------------------------------------------+
    | SELECT COUNT(`flight_id`) FROM `airportdb`.`flight` |
    +-----------------------------------------------------+
    1 row in set (1 min 39.4060 sec)
    
    +--------------------+
    | COUNT(`flight_id`) |
    +--------------------+
    |             462553 |
    +--------------------+

    View the value stored in the variable @output:

    mysql> SELECT JSON_PRETTY(@output);
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | JSON_PRETTY(@output)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {
      "tables": [
        "airportdb.airline",
        "airportdb.airplane",
        "airportdb.airplane_type",
        "airportdb.airport",
        "airportdb.airport_geo",
        "airportdb.airport_reachable",
        "airportdb.booking",
        "airportdb.employee",
        "airportdb.flight",
        "airportdb.flight_log",
        "airportdb.flightschedule",
        "airportdb.passenger",
        "airportdb.passengerdetails",
        "airportdb.weatherdata"
      ],
      "schemas": [
        "airportdb"
      ],
      "model_id": "llama3.1-8b-instruct-v1",
      "sql_query": "SELECT COUNT(`flight_id`) FROM `airportdb`.`flight`",
      "is_sql_valid": 1
    } |
    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  • Following example specifies the tables to consider for the SQL query:

    mysql> CALL sys.NL_SQL("List five airlines that have the highest number of Airbus A330 aircrafts with the total number of the Airbus A330 aircrafts they have.",
      @output, JSON_OBJECT('tables', 
      JSON_ARRAY(JSON_OBJECT("schema_name","airportdb","table_name","airlines"), 
      JSON_OBJECT("schema_name","airportdb","table_name","airplane"), 
      JSON_OBJECT("schema_name","airportdb","table_name","airplane_type")),
      'model_id','llama3.1-8b-instruct-v1'));
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Executing generated SQL statement...                                                                                                                                                                                                                 |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | SELECT `T1`.`airline_id`, COUNT(*) FROM `airportdb`.`airplane` AS `T1` JOIN `airportdb`.`airplane_type` AS `T2` ON `T1`.`type_id` = `T2`.`type_id` WHERE `T2`.`identifier` = 'Airbus A330' GROUP BY `T1`.`airline_id` ORDER BY COUNT(*) DESC LIMIT 5 |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (1 min 40.3538 sec)
    
    +------------+----------+
    | airline_id | COUNT(*) |
    +------------+----------+
    |         78 |       13 |
    |         55 |       11 |
    |         46 |       11 |
    |         73 |       10 |
    |         33 |       10 |
    +------------+----------+

    View the value stored in the variable @output:

    mysql> SELECT JSON_PRETTY(@output);
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | JSON_PRETTY(@output)                                                                                                                                                                                                                                                                                                                                                                                                                                   |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {
      "tables": [
        "airportdb.airplane",
        "airportdb.airplane_type"
      ],
      "schemas": [
        "airportdb"
      ],
      "model_id": "llama3.1-8b-instruct-v1",
      "sql_query": "SELECT `T1`.`airline_id`, COUNT(*) FROM `airportdb`.`airplane` AS `T1` JOIN `airportdb`.`airplane_type` AS `T2` ON `T1`.`type_id` = `T2`.`type_id` WHERE `T2`.`identifier` = 'Airbus A330' GROUP BY `T1`.`airline_id` ORDER BY COUNT(*) DESC LIMIT 5",
      "is_sql_valid": 1
    } |
    +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  • Following example does not specify any additional options:

    mysql> CALL sys.NL_SQL('Find five most frequent weekly flights from United States to United Kingdom. First deduce what frequency refers to wrt the WEEKLY part', @output, NULL);
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Executing generated SQL statement...                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | SELECT `T1`.`flightno`, `T2`.`name` AS `departure_airport`, `T3`.`name` AS `arrival_airport`, SUM(`T1`.`monday` + `T1`.`tuesday` + `T1`.`wednesday` + `T1`.`thursday` + `T1`.`friday` + `T1`.`saturday` + `T1`.`sunday`) AS `frequency` FROM `airportdb`.`flightschedule` AS `T1` JOIN `airportdb`.`airport` AS `T2` ON `T1`.`from` = `T2`.`airport_id` JOIN `airportdb`.`airport` AS `T3` ON `T1`.`to` = `T3`.`airport_id` JOIN `airportdb`.`airport_geo` AS `T4` ON `T2`.`airport_id` = `T4`.`airport_id` JOIN `airportdb`.`airport_geo` AS `T5` ON `T3`.`airport_id` = `T5`.`airport_id` WHERE `T4`.`country` = 'United States' AND `T5`.`country` = 'United Kingdom' GROUP BY `T1`.`flightno`, `T2`.`name`, `T3`.`name` ORDER BY `frequency` DESC LIMIT 5 |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (27.2586 sec)
    
    +----------+---------------------+-----------------+-----------+
    | flightno | departure_airport   | arrival_airport | frequency |
    +----------+---------------------+-----------------+-----------+
    | UZ6906   | SANTA YNEZ          | BRISTOL         |         6 |
    | ZI7472   | ENID WOODRING REGL  | ST ATHAN AB     |         6 |
    | BR4282   | KINGS LAND O'LAKES  | SIBSON          |         6 |
    | GR1230   | TIPTON              | WADDINGTON AB   |         6 |
    | AZ4913   | BIG PINEY-MARBLETON | LUTON           |         5 |
    +----------+---------------------+-----------------+-----------+
    5 rows in set (27.2586 sec)

    View the value stored in the variable @output:

    mysql> SELECT JSON_PRETTY(@output);
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | JSON_PRETTY(@output)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | {
      "tables": [
        "airportdb.flightschedule",
        "airportdb.flight",
        "airportdb.airport_reachable",
        "airportdb.booking",
        "airportdb.airplane",
        "airportdb.flight_log",
        "airportdb.airport_geo",
        "airportdb.passengerdetails",
        "airportdb.weatherdata",
        "airportdb.airline",
        "airportdb.passenger",
        "airportdb.airplane_type",
        "airportdb.airport",
        "airportdb.employee"
      ],
      "schemas": [
        "airportdb"
      ],
      "model_id": "meta.llama-3.3-70b-instruct",
      "sql_query": "SELECT `T1`.`flightno`, `T2`.`name` AS `departure_airport`, `T3`.`name` AS `arrival_airport`, SUM(`T1`.`monday` + `T1`.`tuesday` + `T1`.`wednesday` + `T1`.`thursday` + `T1`.`friday` + `T1`.`saturday` + `T1`.`sunday`) AS `frequency` FROM `airportdb`.`flightschedule` AS `T1` JOIN `airportdb`.`airport` AS `T2` ON `T1`.`from` = `T2`.`airport_id` JOIN `airportdb`.`airport` AS `T3` ON `T1`.`to` = `T3`.`airport_id` JOIN `airportdb`.`airport_geo` AS `T4` ON `T2`.`airport_id` = `T4`.`airport_id` JOIN `airportdb`.`airport_geo` AS `T5` ON `T3`.`airport_id` = `T5`.`airport_id` WHERE `T4`.`country` = 'United States' AND `T5`.`country` = 'United Kingdom' GROUP BY `T1`.`flightno`, `T2`.`name`, `T3`.`name` ORDER BY `frequency` DESC LIMIT 5",
      "is_sql_valid": 1
    } |
    +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+