4.6.7 ML_EMBED_ROW

The ML_EMBED_ROW routine uses the specified embedding model to encode the specified text or query into a vector embedding. The routine returns a VECTOR that contains a numerical representation of the specified text.

ML_EMBED_ROW Syntax

mysql> select sys.ML_EMBED_ROW('Text'[, options]);

options: {
  JSON_OBJECT('key','value'[,'key','value'] ...)
    'key','value': {
    ['model_id', {'all_minilm_l12_v2'|'multilingual-e5-small'}]
    ['truncate', {true|false}]
    }
}

The ML_EMBED_ROW parameters are as follows:

  • Text: specifies the text to encode.

  • options: specifies optional parameters as key-value pairs in JSON format. It can include the following parameters:

    • model_id: specifies the embedding model to use for encoding the text. Default value is all_minilm_l12_v2. Possible values are:

      • all_minilm_l12_v2: for encoding English text.

      • multilingual-e5-small: for encoding text in supported languages other than English (en). This embedding model is available in HeatWave 9.0.1-u1 and later versions.

      To view the list of supported models, see Embedding Models. To view the list of supported languages, see Languages.

    • truncate: specifies whether to truncate inputs longer than the maximum token size. Default value is true.
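
The options described above can be combined in a single JSON_OBJECT call. The following is a minimal sketch, not an authoritative example: the French input text and the @fr_embedding variable name are illustrative, and it assumes that French is among the supported languages and that the multilingual-e5-small model is available on the system (HeatWave 9.0.1-u1 or later).

```sql
-- Sketch: encode non-English text with the multilingual model.
-- 'truncate' is set explicitly to its documented default (true), so input
-- longer than the maximum token size is truncated.
-- Assumptions: French is a supported language; @fr_embedding is a
-- hypothetical session variable name.
select sys.ML_EMBED_ROW(
  'Qu''est-ce que l''intelligence artificielle ?',
  JSON_OBJECT('model_id', 'multilingual-e5-small', 'truncate', true)
) into @fr_embedding;
```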

Syntax Examples

  • Embed an English query using the all_minilm_l12_v2 embedding model, and store the generated embedding in the @text_embedding session variable:

    select sys.ML_EMBED_ROW("What is artificial intelligence?", JSON_OBJECT("model_id", "all_minilm_l12_v2")) into @text_embedding;
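
  • Inspect a stored embedding. This is a hedged sketch rather than a documented usage: it assumes that @text_embedding was populated by a call such as the one in the preceding example, and that the MySQL 9.0 vector functions VECTOR_DIM and VECTOR_TO_STRING accept the value held in the session variable:

```sql
-- Sketch: report the number of dimensions in the stored embedding.
select VECTOR_DIM(@text_embedding);

-- Sketch: show the embedding as a readable list of floating-point values.
select VECTOR_TO_STRING(@text_embedding);
```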