As of MySQL 9.3.1, and only on MySQL HeatWave on OCI, you can use uniform resource identifiers (URI) to specify files for loading data when creating external tables manually. Review the following examples for different ways to create external tables.
This topic contains the following sections:
-
To use URIs, you require the same privileges to use pre-authenticated requests and resource prinicipals.
To use PAR URIs, make sure you have the required PAR Privileges.
To use Native URIs and OCIFS URIs, make sure you have the required Resource Principal Privileges.
-
Create the external tables that will ingest the loaded data by either using SQL syntax or JSON syntax.
CREATE TABLE Statement for SQL syntax. Also review relevant examples showing how to create external tables.
CREATE TABLE Statement for JSON syntax. Also review relevant examples showing how to create external tables.
Review how to Create an External Table Manually.
There are three types of URIs supported: OCIFS, pre-authenticated request (PAR), and Native URI, and you can specify a name of a file as your source in Object Storage, or a prefix, or a pattern.
When you define the object_path
of a URI,
the name, prefix, or glob pattern is inferred by the
following.
Name: If the
object_path
is neither a glob pattern nor prefix.Prefix: If the
object_path
is not a glob pattern and ends with an unencoded/
character, such as a folder path.Glob pattern: If the
object_path
contains at least one of the following unencoded characters:?
,*
, or[
. To use these characters as literals, you need to escape them or encode them as needed depending on the URI syntax. Regex patterns are not supported. See Glob Patterns from the Oracle Cloud Infrastructure Documentation to learn more.
Depending on the type of URI you use, you need to ensure that the following parameters are present in the URI.
bucket_name
: Buckets allow you to store objects in a compartment. This parameter is needed for all three types of URIs. To learn more about buckets, see Object Storage Buckets in Oracle Cloud Infrastructure Documentation.namespace_name
: This is the top-level container for all buckets and objects. This parameter is needed for all three types of URIs. To learn how to view your namespace, see Object Storage Namespaces in Oracle Cloud Infrastructure Documentation.region_name
: The region identifier for your OCI tenancy. This parameter is needed for PAR URI and Native URI. For OCIFS URI, it is set to the region of the instance running Lakehouse. To view the list of available regions and region identifiers, see Regions and Availability Domains in Oracle Cloud Infrastructure Documentation.
If you are on MySQL 9.1.2 and earlier, you need to update
dialect
with the field
delimiter
and record
delimiter
parameters. As of MySQL 9.2.0, MySQL HeatWave Lakehouse
can automatically detect these values. See
Lakehouse
External Table Syntax to learn more.
To review all syntax options for loading external files, see Lakehouse External Table Syntax.
The format of an OCIFS URI is the following:
oci://bucket_name@namespace_name/object_path
The object_path
in an OCIFS URI cannot be
empty. It also does not have a region parameter. The
instance's associated region is used as the default region.
See
OCIFS
Documentation to learn more about OCIFS and its
specifications.
The following examples specify a single file as a name since
the object_path
is not a glob pattern or
prefix. The example uses the
data_file_1.csv
file to load the external
table.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_1.csv"}]}';
In the following examples the object_path
is encoded, so it is treated as a name instead of a prefix.
The original file name is
data_file_[1].csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_%5B1%5D.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_%5B1%5D.csv"}]}';
The following examples specify files as a prefix because the
object_path
ends with a
/
character and does not contain any of
the following characters: *
,
?
, or [
. The examples
use all files in the data_files/
Object
Storage folder.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/"}]}';
The following examples use a glob pattern and specify two
files, data_file_1.csv
and
data_file_2.csv
, to load the external
table. The object_path
is a pattern
because it has a [
character.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-2].csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_[1-2].csv"}]}';
The following examples use a glob pattern because they have
an unencoded *
character. The examples
specify all CSV files that start with
data_file_
to load the external table.
For example, they use the files
data_file_1.csv
,
data_file_2.csv
, and
data_file_3.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_*.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_*.csv"}]}';
The following examples use a glob pattern because they have
an unencoded ?
character. The examples
specify all CSV files that start with
data_file_
and have any single character
followed by 0
to load the external table.
For example, they use the files
data_file_10.csv
,
data_file_20.csv
, and
data_file_30.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_?0.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "oci://mybucket@mynamespace/data_files/data_file_?0.csv"}]}';
The format of a PAR URI is the following if using dedicated endpoints:
https://namespace_name.objectstorage.region_name.oci.customer-oci.com/p/PAR_token/n/namespace_name/b/bucket_name/o/object_path
The format of a PAR URI can also be the following:
https://objectstorage.region_name.oraclecloud.com/p/PAR_token/n/namespace_name/b/bucket_name/o/object_path
The object_path
in a PAR URI can be
empty. If the object_path
is empty, the
PAR URI is treated as a PAR with no additional pattern,
prefix, or name target.
For more information about pre-authenticated requests, see Object Storage Pre-Authenticated Requests in Oracle Cloud Infrastructure Documentation.
The following examples specify a single file as a name since
the object_path
is not a glob pattern or
prefix. The examples use the file
data_file_2.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_2.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_2.csv"}]}';
In the following examples the object_path
is encoded, so it is treated as a name instead of a prefix.
The original file name is
data_file_[1].csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_%5B1%5D.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_%5B1%5D.csv"}]}';
In the following examples, the
object_path
is empty, which means that no
additional target is specified. Therefore, the examples load
all the files under the PAR.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/"}]}';
The following examples specify files as a prefix because the
object_path
ends with a
/
character and does not contain any of
the following characters: *
,
?
, or [
. The examples
use any files in the data_files/
Object
Storage folder to load the external table.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/"}]}';
The following examples use a glob pattern and load two
files, data_file_2
and
data_file_3.csv
, to load the external
table. The object_path
is a pattern
because it has a [
character.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_[2-3].csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_[2-3].csv"}]}';
The following examples use a glob pattern because they have
an unencoded *
character. The examples
specify all CSV files that start with
data_file_
to load the external table.
For example, they use the files
data_file_1.csv
,
data_file_2.csv
, and
data_file_3.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_*.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_*.csv"}]}';
The following examples use a glob pattern because they have
an unencoded ?
character. The examples
specify all CSV files that start with
data_file_
and have any single character
followed by 0
to load the external table.
For example, they use the files
data_file_10.csv
,
data_file_20.csv
, and
data_file_30.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_?0.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/.../n/mynamespace/b/mybucket/o/data_files/data_file_?0.csv"}]}';
The format of a Native URI is the following if using dedicated endpoints:
https://namespace_name.objectstorage.region_name.oci.customer-oci.com/n/namespace_name/b/bucket_name/o/object_path
The format of a Native URI can also be the following:
https://objectstorage.region_name.oraclecloud.com/n/namespace_name/b/bucket_name/o/object_path
The object_path
cannot be empty.
The following examples specify a single file as a name since
the object_path
is not a glob pattern or
prefix. The examples use the
data_file_1.csv
file.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_file_1.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_file_1.csv"}]}';
In the following examples the object_path
is encoded, so it is treated as a name instead of a prefix.
The original file name is
data_file_[1].csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_file_%5B1%5D.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_file_%5B1%5D.csv"}]}';
The following examples specify files as a prefix because the
object_path
ends with a
/
character and does not contain any of
the following characters: *
,
?
, or [
. The examples
use any files in the data_files/
Object
Storage folder to load the external table.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/"}]}';
The following examples use a glob pattern and specify two
files, data_file_1.csv
and
data_file_2.csv
, to load the external
table. The object_path
is a pattern
because it has a [
character.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_[1-2].csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_[1-2].csv"}]}';
The following examples use a glob pattern because they have
an unencoded *
character. The examples
specify all CSV files that start with
data_file_
to load the external table.
For example, they use the files
data_file_1.csv
,
data_file_2.csv
, and
data_file_3.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_*.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_*.csv"}]}';
The following examples use a glob pattern because they have
an unencoded ?
character. The examples
specify all CSV files that start with
data_file_
and have any single character
followed by 0
to load the external table.
For example, they use the files
data_file_10.csv
,
data_file_20.csv
, and
data_file_30.csv
.
SQL syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_?0.csv');
JSON syntax example:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
ENGINE_ATTRIBUTE='{"dialect": {"format": "csv"},
"file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/data_files/data_file_?0.csv"}]}';
After successfully creating external tables manually and specifying the files to load data into the table, learn how to Load Structured Data Manually.