You can use Auto Parallel Load to set up the automatic creation of an
external table along with the loading of data from files in
Object Storage. We recommend setting up Auto Parallel Load with an
input_list variable, and then using the
HEATWAVE_LOAD command, which creates the
external table and loads the data in one operation. To learn
more about loading data, see
Load
Structured Data Using Lakehouse Auto Parallel Load.
This topic contains the following sections:
-
Prepare to create external tables by reviewing the following:
-
Review the relevant syntax for creating external tables and loading data:
Auto Parallel Load Syntax for the HEATWAVE_LOAD command
The following example shows how to create an external table
with Auto Parallel Load in one command using an
OCIFS URI.
Since no dialect parameters are included, Lakehouse
specifies a CSV file by default with all default settings.
Options are set to NULL.
mysql> CALL sys.HEATWAVE_LOAD(
'[{"db_name": "db_1",
"tables": [{
"table_name": "table_1",
"engine_attribute": {
"file": [{"uri": "oci://bucket_1@tenant_1/data_file_1.csv"}]
}
}]
}]', NULL);
While you can specify all parameters in one command, it
might be easier for you to first create the
input_list and store it in a variable.
You can then pass the variable in the
HEATWAVE_LOAD command when you load the
data.
The following example uses the same command to create the
external table, but creates the variable with the
input_list, and then passes that in the
HEATWAVE_LOAD command to load the data.
mysql> SET @input_list = '[{
"db_name": "db_1",
"tables": [{
"table_name": "table_1",
"engine_attribute": {
"file": [{"uri": "oci://bucket_1@tenant_1/data_file_1.csv"}]
}
}]
}]';
mysql> CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);
To demonstrate how to create an external table using Lakehouse Auto Parallel Load, the following example specifies a single CSV file by using an OCIFS URI.
The CSV file in this example is from
Bank
Marketing. To use this file, visit
Bank
Marketing and download the
bank+marketing.zip file. Unzip the file,
and then unzip the bank.zip file. Refer
to the bank.csv file.
To load external data using Lakehouse Auto Parallel Load:
Prepare the files to load in the proper format. See Supported File Formats.
Upload the files to load into Object Storage. See Uploading an Object Storage Object to a Bucket in Oracle Cloud Infrastructure Documentation.
-
Select the method to specify the files that will load the data in the external table: uniform resource identifier (URI), resource principals, or PAR. To learn more about each method, see the following:
-
Create a session variable with the characteristics of the
input_listto create the external table and specify the file to use to load data into the table. To review all syntax options, see HEATWAVE_LOAD.mysql> SET @input_list = '[{ "db_name": "bank_marketing", "tables": [{ "table_name": "bank_train", "engine_attribute": { "dialect": {"format": "csv", "has_header": true}, "file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_1.csv"}] } }] }]';Where:
@input_listis the name of the session variable.db_nameidentifies the database name to store the table:bank_marketing. The database is automatically created if it does not exist.table_namesets the table name to store the data:bank_train. The table is automatically created if it does not exist.engine_attributedefines the parameters of the external file.formatdefines the format of the external file:csv.has_headeridentifies a header in the external file. Auto Parallel Load then infers the column names from the first row in the file.urisets the URI link to access the file. Replacemybucket@mynamespace/data_files/data_file_1.csvwith your own link.
If you are on MySQL 9.1.2 and earlier, you need to update
dialectwith thefield delimiterandrecord delimiterparameters. As of MySQL 9.2.0, MySQL HeatWave Lakehouse can automatically detect these values. See Lakehouse External Table Syntax to learn more. -
Run the
HEATWAVE_LOADcommand with the characteristics of theinput_listinnormalmode to automatically create the external table and load data into the table. See Load Structured Data Using Lakehouse Auto Parallel Load for more information about loading data indryrunmode to test out the creation of the table and the loading of data beforehand. Keep in mind that theinput_listvariable is valid for the current session. If you end the session, you can no longer use the variable.mysql> CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), JSON_OBJECT('mode', 'normal'));
-
Review different ways to specify files when creating external tables by using Lakehouse Auto Parallel Load:
Review how to Load Structured Data Using Lakehouse Auto Parallel Load.