MySQL 8.1.0 introduces a bulk load extension to the
LOAD DATA
statement which is only
available in HeatWave. It adds the following features:
The optimized loading of data sorted by primary key.
The optimized loading of unsorted data.
The optimized loading of data from an object store.
The optimized loading of data from a series of files.
LOAD DATA
[LOW_PRIORITY | CONCURRENT]
[FROM]
INFILE | URL | S3 'file_prefix' [COUNT N]
[IN PRIMARY KEY ORDER]
INTO TABLE tbl_name
[CHARACTER SET charset_name]
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[TERMINATED BY 'string']
]
[IGNORE number {LINES | ROWS}]
[PARALLEL = number]
[MEMORY = M]
[ALGORITHM = BULK]
The additional LOAD DATA
clauses
are:
FROM
: Makes the statement more readable.-
URL
: The Oracle Cloud Infrastructure file location.This requires a pre-authenticated request, PAR. See: Section 4.5.1, “Pre-Authenticated Requests”. It also requires the user privilege
LOAD_FROM_URL
.See: Section 2.18.8, “Bulk Ingest Data to MySQL Server Limitations”.
-
S3
: The AWS S3 file location.This requires the user privilege
LOAD_FROM_S3
.See: Section 2.18.8, “Bulk Ingest Data to MySQL Server Limitations”.
-
COUNT
: The number of files in a series of files.For
COUNT 5
andfile_prefix
set todata.csv.
, the five files would be:data.csv.1
,data.csv.2
,data.csv.3
,data.csv.4
, anddata.csv.5
. -
IN PRIMARY KEY ORDER
: Use when the data is already sorted. The values should be in ascending order within the file.For a file series, the primary keys in each file must be disjoint and in ascending order from one file to the next.
-
PARALLEL
: The number of concurrent threads to use. A typical value might be 16, 32 or 48. The default value is 16.PARALLEL
does not requireCONCURRENT
. MEMORY
: The amount of memory to use. A typical value might be 512M or 4G. The default value is 1G.ALGORITHM
: Set toBULK
for bulk load. The file format is CSV.
LOAD DATA
with
ALGORITHM=BULK
does not support these
clauses:
LOAD DATA
[LOCAL]
[REPLACE | IGNORE]
[PARTITION (partition_name [, partition_name] ...)]
]
[LINES
[STARTING BY 'string']
]
[(col_name_or_user_var
[, col_name_or_user_var] ...)]
[SET col_name={expr | DEFAULT}
[, col_name={expr | DEFAULT}] ...]
LOAD DATA
with
ALGORITHM=BULK
supports the following data
types:
For the data types that HeatWave supports, see: Section 2.10, “Supported Data Types”
See: Section 2.18.8, “Bulk Ingest Data to MySQL Server Limitations”.
-
An example that loads unsorted data from AWS S3 with 48 concurrent threads and 4G of memory.
mysql> GRANT LOAD_FROM_S3 ON *.* TO load_user@localhost; mysql> LOAD DATA FROM S3 's3-us-east-1://innodb-bulkload-dev-1/lineitem.tbl' INTO TABLE lineitem FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' PARALLEL = 48 MEMORY = 4G ALGORITHM=BULK;
-
An example that loads eight files of sorted data from AWS S3. The
file_prefix
ends with a period. The files arelineitem.tbl.1
,lineitem.tbl.2
, ...lineitem.tbl.8
.mysql> GRANT LOAD_FROM_S3 ON *.* TO load_user@localhost; mysql> LOAD DATA FROM S3 's3-us-east-1://innodb-bulkload-dev-1/lineitem.tbl.' COUNT 8 IN PRIMARY KEY ORDER INTO TABLE lineitem FIELDS TERMINATED BY "|" OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' ALGORITHM=BULK;