This section describes how to generate vector embeddings for files or folders, and load the embeddings into a vector store table.
The following sections in this topic describe how to ingest files into a vector store:
-
Review the GenAI requirements and privileges.
-
Place the files that you want to load in the vector store directory that you specified in the MySQL AI installer.
Vector store can ingest files in the following formats: PDF, PPTX, PPT, TXT, HTML, DOCX, and DOC.
To test the steps in this topic, create a folder named demo-directory inside the vector store directory, /var/lib/mysql-files, for storing the files that you want to ingest into the vector store. Then, download the MySQL HeatWave user guide PDF and place it in the demo-directory folder.
-
To create and store vector store tables using the steps described in this topic, you can create a new database named demo_db:
mysql> CREATE DATABASE demo_db;
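Before copying files into the vector store directory, it can help to check that each one uses a format from the supported list above. The following Python sketch is a client-side convenience only and is not part of MySQL AI. The helper name is_supported is hypothetical, and matching on the file extension (case-insensitively) is an assumption, because this topic does not describe how the vector store detects file formats.

```python
from pathlib import PurePosixPath

# Formats listed in this topic as ingestible by the vector store.
SUPPORTED_EXTENSIONS = {".pdf", ".pptx", ".ppt", ".txt", ".html", ".docx", ".doc"}


def is_supported(filename):
    """Return True if the file name ends in one of the listed extensions.

    Assumption: the extension is compared case-insensitively. Only the
    final suffix is checked, so "archive.tar.gz" is treated as ".gz".
    """
    return PurePosixPath(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, is_supported("heatwave-en.pdf") returns True, while is_supported("notes.md") returns False.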
The VECTOR_STORE_LOAD routine creates vector embeddings and loads them asynchronously into the vector store. To ingest the source files into the vector store, perform the following steps:
-
To create the vector store table, use a new or existing database:
mysql> USE DBName;
Replace DBName with the database name. For example:
mysql> USE demo_db;
-
Optionally, to specify a name for the vector store table and the language to use, set the @options variable:
mysql> SET @options = JSON_OBJECT("table_name", "VectorStoreTableName", "language", "Language");
Replace the following:
VectorStoreTableName: the name you want for the vector store table.
Language: the two-letter ISO 639-1 code for the language you want to use. The default language is en, which is English. To view the list of supported languages, see Languages.
For example:
mysql> SET @options = JSON_OBJECT("table_name", "demo_embeddings", "language", "en");
To learn more about the available routine options, see VECTOR_STORE_LOAD Syntax.
-
To import a file from the local filesystem and create a vector store table, use the VECTOR_STORE_LOAD routine:
mysql> CALL sys.VECTOR_STORE_LOAD("file://FilePath", @options);
Replace FilePath with the uniform resource identifier (URI) of the files or directories to be ingested into the vector store. A URI is considered to be one of the following:
A glob pattern, if it contains at least one unescaped ? or * character.
A prefix, if it is not a pattern and ends with a / character, like a folder path.
A file path, if it is neither a glob pattern nor a prefix.
Note: Ensure that the documents to be loaded are present in the directory that you specified for loading documents into the vector store during installation or by using the secure_file_priv server system variable.
For example:
mysql> CALL sys.VECTOR_STORE_LOAD("file:///var/lib/mysql-files/demo-directory/heatwave-en.pdf", @options);
This creates an asynchronous task that runs in the background and loads the specified file or files from the specified directory into the vector store table. The output of the VECTOR_STORE_LOAD routine contains the following:
An ID of the task that is created.
A task query that you can use to track the progress of the asynchronous task.
A task query that you can use to view the asynchronous task logs.
-
After the task is completed, verify that embeddings are loaded in the vector store table:
mysql> SELECT COUNT(*) FROM VectorStoreTableName;
For example:
mysql> SELECT COUNT(*) FROM demo_embeddings;
If the output shows a nonzero count, the embeddings were loaded into the vector store table.
-
To view the details of the vector store table, use the following statement:
mysql> DESCRIBE demo_embeddings;
+-------------------+---------------+------+-----+---------+-------+
| Field             | Type          | Null | Key | Default | Extra |
+-------------------+---------------+------+-----+---------+-------+
| document_name     | varchar(1024) | NO   |     | NULL    |       |
| metadata          | json          | NO   |     | NULL    |       |
| document_id       | int unsigned  | NO   | PRI | NULL    |       |
| segment_number    | int unsigned  | NO   | PRI | NULL    |       |
| segment           | varchar(1024) | NO   |     | NULL    |       |
| segment_embedding | vector(384)   | NO   |     | NULL    |       |
+-------------------+---------------+------+-----+---------+-------+
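The three URI categories described in the VECTOR_STORE_LOAD step above (glob pattern, prefix, file path) can be sketched as a small Python function. This illustrates the documented rules only and is not the routine's actual implementation. The function names are hypothetical, and treating the backslash as the escape character is an assumption, because this topic does not name one.

```python
def has_unescaped_wildcard(uri):
    """Return True if uri contains a ? or * that is not backslash-escaped.

    Assumption: a backslash escapes the character that follows it.
    """
    escaped = False
    for ch in uri:
        if escaped:
            # This character was escaped by the previous backslash.
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch in "?*":
            return True
    return False


def classify_uri(uri):
    """Classify a URI using the three rules, checked in the documented order."""
    if has_unescaped_wildcard(uri):
        return "glob pattern"
    if uri.endswith("/"):
        return "prefix"
    return "file path"
```

With this sketch, the example URI used in this topic, file:///var/lib/mysql-files/demo-directory/heatwave-en.pdf, is classified as a file path. The same URI shortened to end in demo-directory/ is a prefix, and one ending in demo-directory/*.pdf is a glob pattern.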
If you created a new database for testing the steps in this topic, delete the database to free up space:
mysql> DROP DATABASE demo_db;
Learn how to Update the Vector Store.
Learn how to Perform Vector Search With Retrieval-Augmented Generation.
Learn how to Start a Conversational Chat.