After creating an external table manaully and specifying the files to load into the table, you have the option (as of MySQL 9.6.0) to validate a load command before loading data into the external table.
This topic contains the following sections:
Complete the steps to Create an External Table Manually.
To load an external table into MySQL HeatWave Cluster with data from
files, specify the SECONDARY_LOAD clause in
an ALTER TABLE statement.
As of MySQL 9.6.0, you have the option to validate a load
operation before loading data. To do this, specify the
SECONDARY_LOAD and VALIDATE
ONLY clauses in an ALTER
TABLE statement.
Use the following syntax:
ALTER TABLE table_name SECONDARY_LOAD VALIDATE [{n | ALL} ROWS] ONLY [GUIDED {ON | OFF}];
Specify the following clauses in the
ALTER TABLE statement:
table_name: Specify the name of the table to load to MySQL HeatWave.VALIDATE ONLY: Available as of MySQL 9.6.0. Allows you to validate the load. Lakehouse validates the rows in the external table, but does not load the data. See Validate Load.GUIDED: Available as of MySQL 9.4.1. Allows you to disable and enable Guided Load as needed. Set toOFFto disable Guided Load. By default, Guided Load is enabled. See Guided Load.
Available as of MySQL 8.2.0, the Guided Load feature performs a set of checks and validations before loading data.
These checks include the following:
Automatically detect tables and columns that cannot be loaded. If there are tables and columns that are not compatible, stop the load.
Automatically set
SECONDARY_ENGINEtorapid.Detect any errors with
ENGINE_ATTRIBUTEand report them.Infer the table definition and make any necessary adjustments before loading data. These adjustments are similar to those performed by Autopilot during Lakehouse Auto Parallel Load. See: About Lakehouse Auto Parallel Load Schema Inference. If the inferred table definition is not compatible, stop the load.
Predict the amount of memory required for loading data. If the required memory is not available, stop the load.
Infer the record and field delimiters for CSV files and make the necessary adjustments.
As of MySQL 9.4.1, you have the option to disable Guided Load if you want to skip these checks.
The following example manually creates an external table, and then loads the table into MySQL HeatWave with Guided Load disabled:
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql> ALTER TABLE table_1 SECONDARY_LOAD GUIDED OFF;
Available as of MySQL 9.6.0, the VALIDATE
ONLY clause allows you to validate a load without
loading any data.
You have the option to specify the number of rows to validate, or use
ALL ROWSto validate all rows.If the number of rows is not specified, or
ALL ROWSis not included, then the default is to validate all rows.Review any errors or warning messages before running the load command.
If
GUIDED ONis specified, or the statement does not contain theGUIDEDclause, then Guided Load is enabled for the statement. With Guided Load enabled, the table definition may be modified.
The following example manually creates an external table, and then validates 1000 rows from the external table. No data is loaded. Guided Load is enabled by default.
mysql> CREATE EXTERNAL TABLE table_1(col_1 int, col_2 int, col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql> ALTER TABLE table_1 SECONDARY_LOAD VALIDATE 1000 ROWS ONLY;
To demonstrate how to load data manually, the following example uses the same file in the Create External Table Manually Example.
-
Create the external table. For this example, use the following command:
mysql> CREATE EXTERNAL TABLE bank_marketing( age int, job varchar(255), marital varchar(255), education varchar(255), default1 varchar(255), balance float, housing varchar(255), loan varchar(255), contact varchar(255), day int, month varchar(255), duration float, campaign int, pdays float, previous float, poutcome varchar(255), y varchar(255) ) FILE_FORMAT = (FORMAT csv HEADER ON) FILES = (URI = 'oci://mybucket@mynamespace/bucket_1/bank.csv'); -
Optionally, validate the load before loading the external table.
mysql> ALTER TABLE bank_marketing SECONDARY_LOAD VALIDATE ONLY; Warning (code 3877): Command executed during preprocessing: 'ALTER TABLE `tests`.`bank_marketing` ENGINE_ATTRIBUTE='{"file": [{"uri": "oci://Carlos_bucket@mysql2/bank.csv"}], "dialect": {"format": "csv", "has_header": true, "field_delimiter": ";", "record_delimiter": "\\n"}}''.The command runs with Guided Load enabled, so changes to the table definition are included in a warning message. Review errors and warnings during validation before proceeding to load the external table.
-
Use the
ALTER TABLEclause in theSECONDARY_LOADcommand to load the data into the table.mysql> ALTER TABLE bank_marketing SECONDARY_LOAD; Warning (code 3877): Command executed during preprocessing: 'ALTER TABLE `bank_data`.`bank_marketing` ENGINE_ATTRIBUTE='{"file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenant_1/b/bucket_1/o/bank.csv"}], "dialect": {"format": "csv", "has_header": true, "field_delimiter": ";", "record_delimiter": "\\n"}}''.Review the message to confirm the external table is successfully created. For MySQL 9.1.2 and later, you can also review the
field_delimiterandrecord_delimitervalues that MySQL HeatWave automatically detected for the table. -
Optionally, query five rows of the table to confirm the data is loaded and accessible.
mysql> SELECT * FROM bank_marketing.bank_train LIMIT 5; +-----+-------------+----------+-----------+----------+---------+---------+------+-----------+-----+-------+----------+----------+-------+----------+----------+----+ | age | job | marital | education | default1 | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | y | +-----+-------------+----------+-----------+----------+---------+---------+------+-----------+-----+-------+----------+----------+-------+----------+----------+----+ | 37 | services | married | secondary | no | 4760 | yes | no | cellular | 8 | may | 182 | 2 | 169 | 2 | failure | no | | 32 | technician | single | secondary | no | 2979 | no | no | cellular | 25 | may | 156 | 1 | -1 | 0 | unknown | no | | 43 | management | married | tertiary | no | 690 | yes | no | cellular | 6 | aug | 171 | 3 | -1 | 0 | unknown | no | | 50 | blue-collar | divorced | secondary | no | 203 | yes | no | telephone | 19 | nov | 265 | 1 | 127 | 4 | other | no | | 34 | blue-collar | married | secondary | no | 322 | yes | no | cellular | 20 | apr | 10 | 3 | -1 | 0 | unknown | no | +-----+-------------+----------+-----------+----------+---------+---------+------+-----------+-----+-------+----------+----------+-------+----------+----------+----+
After successfully loading the table to MySQL HeatWave Cluster with data from the specified files, learn how to do the following: