MySQL :: HeatWave User Guide :: 4.3.2 Summarizing Content

Before You Begin

- Review the Requirements.
- Connect to your HeatWave Database System.
- For Running Batch Queries, add the natural-language queries to a column in a new or existing table.
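
The batch-query prerequisite can be sketched as follows. This is a minimal example that assumes a hypothetical database `demo_db` and a table `input_table` with a text column `Input`. The column types and the `id` key are illustrative choices, not requirements stated in this guide.

```sql
-- Hypothetical schema: names and column types are assumptions for illustration.
CREATE DATABASE IF NOT EXISTS demo_db;

CREATE TABLE IF NOT EXISTS demo_db.input_table (
  id INT AUTO_INCREMENT PRIMARY KEY,  -- surrogate key, added here for convenience
  Input TEXT                          -- one passage to summarize per row
);

-- Each row holds one piece of text to summarize.
INSERT INTO demo_db.input_table (Input) VALUES
  ('First passage of text to summarize.'),
  ('Second passage of text to summarize.');
```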

Summarizing Content

To summarize text, perform the following steps:

To load the LLM in HeatWave memory, use the ML_MODEL_LOAD routine:
```
call sys.ML_MODEL_LOAD("LLM", NULL);
```
Replace LLM with the name of the LLM that you want to use. Summarization supports HeatWave In-Database LLMs only.

For example:
```
call sys.ML_MODEL_LOAD("mistral-7b-instruct-v1", NULL);
```
This step is optional. If the LLM is not already loaded, the ML_GENERATE routine loads it automatically, but the first call then takes longer because it must load the LLM before generating the output.

To define the text that you want to summarize, set the @text variable:

set @text="TextToSummarize";

Replace TextToSummarize with the text that you want to summarize.

For example:

set @text="Artificial Intelligence (AI) is a rapidly growing field that has the potential to
revolutionize how we live and work. AI refers to the development of computer systems that can
perform tasks that typically require human intelligence, such as visual perception, speech
recognition, decision-making, and language translation.\n\nOne of the most significant developments in
AI in recent years has been the rise of machine learning, a subset of AI that allows computers to learn
from data without being explicitly programmed. Machine learning algorithms can analyze vast amounts
of data and identify patterns, making them increasingly accurate at predicting outcomes and making
decisions.\n\nAI is already being used in a variety of industries, including healthcare, finance, and
transportation. In healthcare, AI is being used to develop personalized treatment plans for patients
based on their medical history and genetic makeup. In finance, AI is being used to detect fraud and make
investment recommendations. In transportation, AI is being used to develop self-driving cars and improve
traffic flow.\n\nDespite the many benefits of AI, there are also concerns about its potential impact on
society. Some worry that AI could lead to job displacement, as machines become more capable of performing
tasks traditionally done by humans. Others worry that AI could be used for malicious ";
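
If the text to summarize is already stored in a table, a standard MySQL `SELECT ... INTO` statement can populate the variable instead of a string literal. A minimal sketch, assuming a hypothetical table `demo_db.articles` with columns `id` and `body`:

```sql
-- Hypothetical table and column names; adjust them to your schema.
SELECT body INTO @text
FROM demo_db.articles
WHERE id = 1;   -- the query must return at most one row

-- Optional sanity check: confirm that the variable holds some text.
SELECT CHAR_LENGTH(@text) AS text_length;
```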

To generate the text summary, pass the original text to the LLM using the ML_GENERATE routine, with the task parameter set to summarization:

select sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "LLM", "language", "Language"));

Replace the following:

- LLM: the LLM to use. If you loaded an LLM in the first step, specify the same one. To view the lists of supported LLMs, see HeatWave In-Database LLMs and OCI Generative AI Service LLMs.
- Language: the two-letter ISO 639-1 code for the language you want to use. The default is en (English). To view the list of supported languages, see Languages.

The language parameter is supported in MySQL 9.0.1-u1 and later versions.
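
To see which version the server reports before relying on the language parameter, the standard `VERSION()` function can be queried. The exact format of the returned string on a HeatWave system is not described in this section, so treat the comparison against 9.0.1-u1 as a manual check.

```sql
-- Returns the server version string for the current connection.
SELECT VERSION();
```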

For example:

select sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1", "language", "en"));

The routine prints the text summary that the LLM generates. The output looks similar to the following:

| {"text": " Artificial Intelligence (AI) is a rapidly growing field with the potential to revolutionize
how we live and work. It refers to computer systems that can perform tasks requiring human intelligence, such
as visual perception, speech recognition, decision-making, and language translation. Machine learning, a
subset of AI, allows computers to learn from data without being explicitly programmed, making them increasingly
accurate at predicting outcomes and making decisions. AI is already being used in healthcare, finance, and
transportation industries for personalized treatment plans, fraud detection, and self-driving cars. However,
there are concerns about its potential impact on society, including job displacement and malicious use."} |
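
To work with the summary as plain text rather than as a JSON document, the standard MySQL JSON functions can extract it. The sketch below assumes that the result has the shape shown above, with the summary stored under a top-level `text` key. The variable name `@result` is arbitrary.

```sql
-- Capture the routine's result in a user variable.
SELECT sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1")) INTO @result;

-- Extract the "text" field and strip the surrounding JSON quotes.
SELECT JSON_UNQUOTE(JSON_EXTRACT(@result, '$.text')) AS summary;
```

The language option is omitted here, so the routine falls back to its default of en.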

Running Batch Queries

To run multiple summarization queries in parallel, use the ML_GENERATE_TABLE routine. This method is faster than running the ML_GENERATE routine multiple times.

Note

To alter an existing table or create a new table, MySQL requires you to set the sql_require_primary_key system variable to 0.
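
A minimal sketch of changing the variable follows. Setting it at session scope is an assumption made here to keep the change local to the current connection. In MySQL, setting the session value of this variable is a restricted operation, so the account needs sufficient privileges.

```sql
-- Check the current setting (1 = primary keys required, 0 = not required).
SELECT @@SESSION.sql_require_primary_key;

-- Turn the requirement off for the current session only.
SET SESSION sql_require_primary_key = 0;
```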

The ML_GENERATE_TABLE routine is supported in MySQL 9.0.1-u1 and later versions.

To run batch queries using ML_GENERATE_TABLE, perform the following steps:

To load the LLM in HeatWave memory, use the ML_MODEL_LOAD routine:
```
call sys.ML_MODEL_LOAD("LLM", NULL);
```
Replace LLM with the name of the LLM that you want to use. To view the lists of supported LLMs, see HeatWave In-Database LLMs and OCI Generative AI Service LLMs.

For example:
```
call sys.ML_MODEL_LOAD("mistral-7b-instruct-v1", NULL);
```
This step is optional. If the LLM is not already loaded, the ML_GENERATE_TABLE routine loads it automatically, but the first call then takes longer because it must load the LLM before generating the output.

In the ML_GENERATE_TABLE routine, specify the table column that contains the input queries and the table column where you want to store the generated text summaries:
```
call sys.ML_GENERATE_TABLE("InputDBName.InputTableName.InputColumn", "OutputDBName.OutputTableName.OutputColumn", JSON_OBJECT("task", "summarization", "model_id", "LLM", "language", "Language"));
```
Replace the following:
- InputDBName: the name of the database that contains the table column where your input queries are stored.
- InputTableName: the name of the table that contains the column where your input queries are stored.
- InputColumn: the name of the column that contains input queries.
- OutputDBName: the name of the database that contains the table where you want to store the generated outputs. This can be the same as the input database.
- OutputTableName: the name of the table where you want to create a new column to store the generated outputs. This can be the same as the input table. If the specified table doesn't exist, a new table is created.
- OutputColumn: the name for the new column where you want to store the output generated for the input queries.
- LLM: LLM to use, which must be the same as the LLM you loaded in the previous step.
- Language: the two-letter ISO 639-1 code for the language you want to use. Default language is en, which is English. To view the list of supported languages, see Languages.
For example:
```
call sys.ML_GENERATE_TABLE("demo_db.input_table.Input", "demo_db.output_table.Output", JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1", "language", "en"));
```
To learn more about the available routine options, see ML_GENERATE_TABLE Syntax.
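
After the routine finishes, the generated summaries can be inspected with ordinary queries. The sketch below reuses the table and column names from the example call. The layout of the output table and the format of the stored values are not described in this section, so `DESCRIBE` is used first to see what was created.

```sql
-- Show the structure of the output table.
DESCRIBE demo_db.output_table;

-- Preview a few generated summaries.
SELECT Output
FROM demo_db.output_table
LIMIT 5;
```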