4.3.2 Summarizing Content

The following sections in this topic describe how to summarize existing content using HeatWave GenAI.

Before You Begin

  • Review the Requirements.

  • Connect to your HeatWave Database System.

  • For Running Batch Queries, add the text passages that you want to summarize to a column in a new or existing table.

Summarizing Content

To summarize text, perform the following steps:

  1. To define the text that you want to summarize, set the @text variable:

    SET @text="TextToSummarize";

    Replace TextToSummarize with the text that you want to summarize.

    For example:

    SET @text="Artificial Intelligence (AI) is a rapidly growing field that has the potential to
    revolutionize how we live and work. AI refers to the development of computer systems that can
    perform tasks that typically require human intelligence, such as visual perception, speech
    recognition, decision-making, and language translation.\n\nOne of the most significant developments in
    AI in recent years has been the rise of machine learning, a subset of AI that allows computers to learn
    from data without being explicitly programmed. Machine learning algorithms can analyze vast amounts
    of data and identify patterns, making them increasingly accurate at predicting outcomes and making
    decisions.\n\nAI is already being used in a variety of industries, including healthcare, finance, and
    transportation. In healthcare, AI is being used to develop personalized treatment plans for patients
    based on their medical history and genetic makeup. In finance, AI is being used to detect fraud and make
    investment recommendations. In transportation, AI is being used to develop self-driving cars and improve
    traffic flow.\n\nDespite the many benefits of AI, there are also concerns about its potential impact on
    society. Some worry that AI could lead to job displacement, as machines become more capable of performing
    tasks traditionally done by humans. Others worry that AI could be used for malicious ";

  2. To generate the text summary, pass the original text to the LLM using the ML_GENERATE routine, with the task parameter set to summarization:

    SELECT sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "LLM", "language", "Language"));

    Replace the following:

    • LLM: the LLM to use. To view the list of available LLMs, see HeatWave In-Database LLMs and OCI Generative AI Service LLMs.

    • Language: the two-letter ISO 639-1 code for the language you want to use. The default language is en (English). To view the list of supported languages, see Languages.

      The language parameter is supported as of MySQL 9.0.1-u1.

    For example:

    SELECT sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1", "language", "en"));

    The text summary generated by the LLM is printed as output. It looks similar to the following:

    | {"text": " Artificial Intelligence (AI) is a rapidly growing field with the potential to revolutionize
    how we live and work. It refers to computer systems that can perform tasks requiring human intelligence, such
    as visual perception, speech recognition, decision-making, and language translation. Machine learning, a
    subset of AI, allows computers to learn from data without being explicitly programmed, making them increasingly
    accurate at predicting outcomes and making decisions. AI is already being used in healthcare, finance, and
    transportation industries for personalized treatment plans, fraud detection, and self-driving cars. However,
    there are concerns about its potential impact on society, including job displacement and malicious use."} |
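
If only the summary text is needed, the JSON wrapper can be removed with standard MySQL JSON functions. The following is a minimal sketch, not part of the documented procedure. It assumes that the routine returns a JSON object with a text key, as in the output shown above, and that @text is set as in step 1:

```sql
-- Return only the value of the "text" key from the ML_GENERATE result.
-- Assumes the result is a JSON object shaped like the sample output above.
SELECT JSON_UNQUOTE(JSON_EXTRACT(
  sys.ML_GENERATE(@text, JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1", "language", "en")),
  '$.text')) AS summary;
```

JSON_EXTRACT selects the value at the given path, and JSON_UNQUOTE removes the surrounding double quotes so that the summary is returned as plain text.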

Running Batch Queries

To run multiple summarization queries in parallel, use the ML_GENERATE_TABLE routine. This method is faster than running the ML_GENERATE routine multiple times.

Note

In versions older than MySQL 9.2.1, you must set the sql_require_primary_key system variable to 0 before you alter an existing table or create a new table.

The ML_GENERATE_TABLE routine is supported as of MySQL 9.0.1-u1.
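
On versions older than MySQL 9.2.1, the setting mentioned in the note can be applied with a SET statement, where the variable name is written with underscores (sql_require_primary_key). The following is a sketch that assumes session scope is sufficient for your setup. Changing this variable might require additional privileges, depending on how your account is configured:

```sql
-- Only relevant on versions older than MySQL 9.2.1, per the note above.
-- Check the current value for this session.
SELECT @@SESSION.sql_require_primary_key;

-- Assumption: session scope is enough; your environment might need a different scope.
SET SESSION sql_require_primary_key = 0;
```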

To run the steps in this section, create a new database demo_db and table input_table:

CREATE DATABASE demo_db;
USE demo_db;
CREATE TABLE input_table (id INT AUTO_INCREMENT, Input TEXT, PRIMARY KEY (id));
INSERT INTO input_table (Input) VALUES('MySQL is a widely used open-source relational database management system or RDBMS that is based on the SQL standard. It is designed to be highly scalable, reliable, and secure, making it an ideal choice for businesses of all sizes. MySQL uses a client-server architecture, where the server stores and manages the data, while clients connect to the server to access and manipulate the data. The MySQL server can be installed on a variety of operating systems, including Linux, Windows, and macOS. One of the key features of MySQL is its support for stored procedures, which allow developers to create reusable blocks of code that can be executed multiple times. This makes it easier to manage complex database operations and reduces the amount of code that needs to be written. MySQL also supports a wide range of data types, including integers, floating-point numbers, dates, and strings. It also has built-in support for encryption, which helps to protect sensitive data from unauthorized access. Another important feature of MySQL is its ability to handle large amounts of data. It can scale horizontally by adding more servers to the cluster, or vertically by upgrading the hardware.');
INSERT INTO input_table (Input) VALUES('Artificial Intelligence or AI refers to the simulation of human intelligence in machines that are programmed to think and act like humans. The goal of AI is to create systems that can function intelligently and independently, exhibiting traits associated with human intelligence such as reasoning, problem-solving, perception, learning, and understanding language. There are two main types of AI: narrow or weak AI, and general or strong AI. Narrow AI is designed for a specific task and is limited in its abilities, while general AI has the capability to understand or learn any intellectual task that a human being can. AI technologies include machine learning, which allows systems to improve their performance based on data, and deep learning, which involves the use of neural networks to model complex patterns. Other AI techniques include natural language processing, robotics, and expert systems. AI has numerous applications across various industries, including healthcare, finance, transportation, and education. It has the potential to revolutionize the way we live and work by automating tasks, improving efficiency, and enabling new innovations. However, there are also concerns about the impact of AI on employment, privacy, and safety.');
INSERT INTO input_table (Input) VALUES('Machine learning is a subset of artificial intelligence that involves the development of algorithms and statistical models that enable systems to improve their performance on a specific task over time by learning from data. At its core, machine learning is about using data to train machines to make predictions or decisions without being explicitly programmed to do so. There are many different types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, the algorithm is trained on labeled data, meaning that the input data has been categorized or classified by a human. The goal of supervised learning is to enable the machine to make predictions based on this training data. Unsupervised learning, on the other hand, involves training the algorithm on unlabeled data. In this case, the algorithm must identify patterns and relationships in the data on its own. This type of learning is often used for tasks such as clustering or anomaly detection. Reinforcement learning involves an agent interacting with an environment and learning by trial and error. The agent receives feedback in the form of rewards or punishments, which it uses to improve its behavior over time. This type of learning is often used in game playing or robotics.');
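
Before running the batch routine, it can help to confirm that the input rows were inserted as expected. The following optional check is a sketch that uses only the demo_db.input_table table created above. The column aliases input_chars and preview are illustrative names:

```sql
-- List each input row with its length and a short preview of the text.
SELECT id,
       CHAR_LENGTH(Input) AS input_chars,
       LEFT(Input, 60) AS preview
FROM demo_db.input_table
ORDER BY id;
```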

To run batch queries using ML_GENERATE_TABLE, perform the following steps:

  1. In the ML_GENERATE_TABLE routine, specify the table column that contains the input queries and the table column where you want to store the generated text summaries:

    CALL sys.ML_GENERATE_TABLE("InputDBName.InputTableName.InputColumn", "OutputDBName.OutputTableName.OutputColumn", JSON_OBJECT("task", "summarization", "model_id", "LLM", "language", "Language"));

    Replace the following:

    • InputDBName: the name of the database that contains the table column where your input queries are stored.

    • InputTableName: the name of the table that contains the column where your input queries are stored.

    • InputColumn: the name of the column that contains input queries.

    • OutputDBName: the name of the database that contains the table where you want to store the generated outputs. This can be the same as the input database.

    • OutputTableName: the name of the table where you want to create a new column to store the generated outputs. This can be the same as the input table. If the specified table doesn't exist, a new table is created.

    • OutputColumn: the name for the new column where you want to store the output generated for the input queries.

    • LLM: the LLM to use.

    • Language: the two-letter ISO 639-1 code for the language you want to use. The default language is en (English). To view the list of supported languages, see Languages.

    For example:

    CALL sys.ML_GENERATE_TABLE("demo_db.input_table.Input", "demo_db.output_table.Output", JSON_OBJECT("task", "summarization", "model_id", "mistral-7b-instruct-v1", "language", "en"));

  2. View the contents of the output table:

    SELECT * FROM output_table;
    | id | Output                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
    +----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |  1 | {"text": " MySQL is an open-source RDBMS that is widely used for its scalability, reliability, and security. It uses a client-server architecture and supports stored procedures, multiple data types, encryption, and large amounts of data.", "error": null}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
    |  2 | {"text": " AI refers to the development of machines that can think and act like humans. The goal is to create systems that can function independently and exhibit human-like intelligence traits such as reasoning, problem-solving, perception, learning, and language understanding. There are two types of AI: narrow and general. Narrow AI is designed for a specific task and has limited abilities, while general AI can understand any intellectual task that a human can. AI technologies include machine learning, deep learning, natural language processing, robotics, and expert systems. AI has numerous applications across various industries and has the potential to revolutionize how we live and work. However, there are concerns about its impact on employment, privacy, and safety.", "error": null} |
    |  3 | {"text": " Machine learning is a subset of AI that uses algorithms and statistical models to improve performance on tasks by learning from data. It involves supervised, unsupervised, and reinforcement learning methods where the algorithm is trained on labeled or unlabeled data, identifies patterns, and learns by trial and error.", "error": null}                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
    +----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

    As of MySQL 9.3.0, the output table generated using the ML_GENERATE_TABLE routine contains additional details for error reporting. If the routine fails to generate output for specific rows, details of the errors encountered and the default values used are added to the output column for those rows.
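
To work with the summaries as plain text, or to find rows that reported an error, the output column can be queried with standard MySQL JSON functions. The following is a minimal sketch. It assumes that each value in the Output column is a JSON object with the text and error keys shown in the sample output above. The format of the error details for failed rows is not shown in this topic, so the second query only detects that the error value is something other than JSON null:

```sql
-- Show each row's id alongside the plain-text summary.
SELECT id,
       JSON_UNQUOTE(JSON_EXTRACT(Output, '$.text')) AS summary
FROM demo_db.output_table
ORDER BY id;

-- List rows whose "error" key holds something other than JSON null.
-- Rows without an "error" key are not returned.
SELECT id,
       JSON_EXTRACT(Output, '$.error') AS error_details
FROM demo_db.output_table
WHERE JSON_TYPE(JSON_EXTRACT(Output, '$.error')) <> 'NULL';
```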

If you created a new database for testing the steps in this section, ensure that you delete the database to avoid being billed for it:

DROP DATABASE demo_db;

To learn more about the available routine options, see ML_GENERATE_TABLE Syntax.