For the full Auto Parallel Load syntax, see Section 2.2.3, “Loading Data Using Auto Parallel Load”. HeatWave Lakehouse extends Auto Parallel Load with the external_tables option. This is a JSON array that includes one or more db_object:
db_object: {
  "db_name": "name",
  "tables": JSON_ARRAY(table [, table] ...)
}

table: {
  "table_name": "name",
  "sampling": true|false,
  "file": JSON_ARRAY(file_section [, file_section] ...),
  "dialect": {dialect_section}
}
- db_object: the details of one or more tables. Each db_object contains the following:
  - db_name: the name of the database. If the database does not exist, Lakehouse Auto Parallel Load creates it during the load process.
  - tables: a JSON array of table. Each table contains the following:
    - table_name: the name of the table to load.
    - sampling: if set to true, the default setting, Lakehouse Auto Parallel Load infers the schema and collects statistics by sampling the data.
      If set to false, Lakehouse Auto Parallel Load performs a full scan to infer the schema and collect statistics. Depending on the size of the data, this can take a long time.
      Auto Parallel Load uses the inferred schema to generate CREATE TABLE statements. The statistics are used to estimate storage requirements and load times.
    - dialect: details about the file format. See the dialect parameter in Section 4.3.1, “Lakehouse External Table Syntax”.
    - file: the location of the data in Object Storage. This can use a pre-authenticated request or a resource principal, and can be a path to a file, a file prefix, or a file pattern. See the file parameter in Section 4.3.1, “Lakehouse External Table Syntax”, and Section 4.5, “Access Object Storage”.
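
The structure described above can be sketched as a complete call. This is a minimal, hypothetical example, not a definitive one: the database name, table name, and pre-authenticated request URL are placeholders, and the "format", "has_header", and "par" keys are assumed from the dialect and file parameters documented in Section 4.3.1. It also assumes the sys.heatwave_load procedure and its "mode" option as described in Section 2.2.3.

```sql
-- Hypothetical database name; replace with your own.
SET @db_list = '["lakehouse_db"]';

-- external_tables: a JSON array holding one db_object with one table.
-- The PAR URL is a placeholder, not a working address.
SET @ext_tables = '[{
  "db_name": "lakehouse_db",
  "tables": [{
    "table_name": "orders",
    "sampling": true,
    "dialect": {"format": "csv", "has_header": true},
    "file": [{"par": "https://objectstorage.<region>.oraclecloud.com/p/<token>/n/<namespace>/b/<bucket>/o/orders.csv"}]
  }]
}]';

-- Assumed option names: "mode" and "external_tables".
-- "dryrun" is intended to generate the load script without loading data.
SET @options = JSON_OBJECT(
  'mode', 'dryrun',
  'external_tables', CAST(@ext_tables AS JSON)
);

CALL sys.heatwave_load(@db_list, @options);
```

Under the same assumptions, changing "mode" to "normal" would run the load itself, and setting "sampling" to false would trade a longer schema inference step for a full scan of the data.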