Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.5Mb
PDF (A4) - 2.5Mb


HeatWave User Guide  /  ...  /  Uniform Resource Identifiers

5.6.3 Uniform Resource Identifiers

As of MySQL 9.3.1, HeatWave Lakehouse lets you use Uniform Resource Identifier (URI) in a unified way to specify Object Storage resources for loading data into HeatWave Cluster. This feature is supported in HeatWave on OCI only.

There are three types of URIs supported: OCIFS, PAR, and Native URI, and you can specify a name of a file as your source in Object Storage, or a prefix, or a pattern.

When you define the object_path of a URI, the name, prefix, or glob pattern is inferred by the following.

  • Name: If the object_path is neither a glob pattern nor prefix.

  • Prefix: If the object_path is not a glob pattern and ends with an unencoded / character, such as a folder path.

  • Glob pattern: If the object_path contains at least one of the following unencoded characters: ?, *, or [. To use these characters as literals, you need to escape them or encode them as needed depending on the URI syntax. Regex patterns are not supported. See Glob Patterns from the Oracle Cloud Infrastructure Documentation to learn more.

Depending on the type of URI you use, you need to ensure that the following parameters are present in the URI.

  • bucket_name: Buckets allow you to store objects in a compartment. This parameter is needed for all three types of URIs. To learn more about buckets, see Object Storage Buckets in the HeatWave on OCI Service Guide.

  • namespace_name: This is the top-level container for all buckets and objects. This parameter is needed for all three types of URIs. To learn how to view your namespace, see Object Storage Namespaces in the HeatWave on OCI Service Guide.

  • region_name: The region identifier for your OCI tenancy. This parameter is needed for PAR URI and Native URI. For OCIFS URI, it is set to the region of the instance running Lakehouse. To view the list of available regions and region identifiers, see Regions and Availability Domains in the HeatWave on OCI Service Guide.

The following examples load files using URI with Auto Parallel Load. See Load Data Using Auto Parallel Load to learn more. To review examples that load files manually using URI, see Load Data Manually with URI.

OCIFS URI

The format of an OCIFS URI is the following:

oci://bucket_name@namespace_name/object_path

The object_path in an OCIFS URI cannot be empty. See OCIFS Documentation to learn more about OCIFS and its specifications.

The following example loads files as a glob pattern and loads two files, file1 and file2. The object_path is a pattern because it uses a [ character.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table1",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "oci://mybucket@mynamespace/folder1/file[1-2]"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads files as a prefix because the object_path ends with a / character and does not contain any of the following characters: *, ?, or [. All files in the folder1/ directory are loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table2",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "oci://mybucket@mynamespace/folder1/"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads a single file as a name since the object_path is not a glob pattern or prefix. The file file1.csv is loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table3",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "oci://mybucket@mynamespace/folder1/file1.csv"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

PAR URI

The format of a PAR URI is the following if using dedicated endpoints:

https://namespace_name.objectstorage.region_name.oci.customer-oci.com/p/PAR_token/n/namespace_name/b/bucket_name/o/object_path

The format of a PAR URI can also be the following:

https://objectstorage.region_name.oraclecloud.com/p/PAR_token/n/namespace_name/b/bucket_name/o/object_path

The object_path in a PAR URI can be empty. If the object_path is empty, the PAR URI is treated as a PAR with no additional pattern, prefix, or name target.

In the following example, the object_path is empty, which means that no additional target is specified. Therefore, all the files under the PAR are loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table4",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/myaccessuri/n/mynamespace/b/mybucket/o/"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads files as a glob pattern and loads two files, file2 and file3. The object_path is a pattern because it uses a [ character.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table5",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/myaccessuri/n/mynamespace/b/mybucket/o/path/file[2-3]"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads files as a prefix because the object_path ends with a / character and does not contain any of the following characters: *, ?, or [. Any files in the folder1/ directory are loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table6",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/myaccessuri/n/mynamespace/b/mybucket/o/folder1/"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads a single file as a name since the object_path is not a glob pattern or prefix. The file file2.csv is loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table7",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/p/myaccessuri/n/mynamespace/b/mybucket/o/path/file2.csv"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

Native URI

The format of a Native URI is the following if using dedicated endpoints:

https://namespace_name.objectstorage.region_name.oci.customer-oci.com/n/namespace_name/b/bucket_name/o/object_path

The format of a Native URI can also be the following:

https://objectstorage.region_name.oraclecloud.com/n/namespace_name/b/bucket_name/o/object_path

The object_path cannot be empty.

The following example loads files as a prefix because the object_path ends with a / character and does not contain any of the following characters: *, ?, or [. Any files in the folder1/ directory are loaded.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table8",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/folder1/"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

The following example loads files as a glob pattern and loads two files, file2 and file3. The object_path is a pattern because it uses an unencoded [ character.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table9",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/path%2Ffile[2-3]"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);

In the following example the object_path is encoded, so it is treated as a name instead of a prefix.

mysql>SET @input_list = '[{
  "db_name": "db_files_1",
  "tables": [{
    "table_name": "table10",
    "engine_attribute": {"dialect": {"format": "csv"},
                         "file": [{"uri": "https://mynamespace.objectstorage.myregion.oci.customer-oci.com/n/mynamespace/b/mybucket/o/folder%2F"}]}
}]}]';
mysql>CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), NULL);