MySQL :: MySQL AI 9.5 :: 8.2.8 ML_EMBED

9.6
9.4

MySQL AI 9.5 / ... / ML_EMBED_TABLE

8.2.8 ML_EMBED_TABLE

The ML_EMBED_TABLE routine runs multiple embedding generations in a batch, in parallel.

This topic contains the following sections:

To learn about the privileges you need to run this routine, see Section 5.3, “Required Privileges for using GenAI”.

ML_EMBED_TABLE Syntax

mysql> CALL sys.ML_EMBED_TABLE('InputTableColumn', 'OutputTableColumn'[, options]);
    
options: JSON_OBJECT(keyvalue[, keyvalue] ...)
keyvalue: 
{
  'model_id', {'ModelID'}
  |'truncate', {true|false}
  |'batch_size', BatchSize
  |'details_column', 'ErrorDetailsColumnName'
}

Following are ML_EMBED_TABLE parameters:

InputTableColumn: specifies the names of the input database, table, and column that contains the text to encode. The InputTableColumn is specified in the following format: DBName.TableName.ColumnName.
- The specified input table can be an internal or external table.
- The specified input table must already exist, must not be empty, and must have a primary key.
- The input column must already exist and must contain text or varchar values.
- The input column must not be a part of the primary key and must not have NULL values or empty strings.
- There must be no backticks used in the DBName, TableName, or ColumnName and there must be no period used in the DBName or TableName.
OutputTableColumn: specifies the names of the database, table, and column where the generated embeddings are stored. The OutputTableColumn is specified in the following format: DBName.TableName.ColumnName.
- The specified output table must be an internal table.
- If the specified output table already exists, then it must be the same as the input table. And, the specified output column must not already exist in the input table. A new VECTOR column is added to the table. External tables are read only. So if input table is an external table, then it cannot be used to store the output.
- If the specified output table doesn't exist, then a new table is created. The new output table has key columns which contains the same primary key values as the input table and a VECTOR column that stores the generated embeddings.
- There must be no backticks used in the DBName, TableName, or ColumnName and there must be no period used in the DBName or TableName.
options: specifies optional parameters as key-value pairs in JSON format. It can include the following parameters:
- model_id: specifies the embedding model to use for encoding the text. Default value is multilingual-e5-small. Possible values are:
  - all_minilm_l12_v2 or minilm
  - multilingual-e5-small
  To view the lists of available embedding models, see In-Database Embedding Model.
- truncate: specifies whether to truncate inputs longer than the maximum token size. Default value is true.
- batch_size: specifies the batch size for the routine. This option is supported for internal tables only. Default value is 1000. Possible values are integer values between 1 and 1000.
- details_column: specifies a name for the output table column that is created for adding details of errors encountered for rows that aren't processed successfully by the routine. Ensure that a column by the specified name does not already exist in the table. Default value is details.

Syntax Examples

Generate embeddings for text stored in demo_db.input_table.Input using the all_minilm_l12_v2 embedding model, and save the generated embeddings in the output table demo_db.output_table.Output:

mysql> CALL sys.ML_EMBED_TABLE("demo_db.input_table.Input", "demo_db.output_table.Output", JSON_OBJECT("model_id", "all_minilm_l12_v2"));

8.2.8 ML_EMBED_TABLE

ML_EMBED_TABLE Syntax

Syntax Examples

See Also