Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 3.8Mb
PDF (A4) - 3.8Mb


4.3.6.2 Lakehouse External Table SQL Syntax

As of MySQL 9.4.0, you can set external table options using SQL syntax. The following options apply to CREATE TABLE and ALTER TABLE statements. For ALTER TABLE statements, ADD COLUMN is not supported for external tables.

This topic contains the following sections:

CREATE TABLE Statement

The following syntax shows how to specify these external table options with SQL syntax.

CREATE EXTERNAL TABLE table_name 
(col_name data_type ['Col_Options_SQL'], ...) Tbl_Options_SQL;

The CREATE EXTERNAL TABLE statement automatically sets ENGINE to lakehouse and SECONDARY_ENGINE to rapid. If you use the CREATE TABLE statement, you need to manually set ENGINE to lakehouse and SECONDARY_ENGINE tov rapid.

External Table Options with SQL Syntax
Tbl_Options_SQL
  (table_option [table_option]...)

table_option: {
  | FILE_FORMAT [=] format_spec
  | FILES [=] (file_spec[, file_spec] ...)
  | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT}
  | AUTO_REFRESH_SOURCE [=] {'string' | NONE}
  | VERIFY_KEY_CONSTRAINTS [=] {0 | 1 | DEFAULT}
  | STRICT_LOAD [=] {0 | 1 | DEFAULT}
}

format_spec: {
  [general_options]
  [field_options]
  [LINES [TERMINATED BY 'string']]
  [IGNORE number {LINES | ROWS}]
}
  general_options: {
    general_option [general_option] ...}
    general_option: {
        [FORMAT {csv | avro | json | parquet}]
      | [COMPRESSION {zip | gzip | bzip2 | auto}]
      | [CHARACTER SET charset name]
      | [HEADER {ON | OFF}]
}
  field_options:{
    {FIELDS | COLUMNS} field_option [field_option]...}
    field_option:{
        DATE FORMAT 'date_format'
      | DATETIME FORMAT 'datetime_format'
      | TIME FORMAT 'time_format'
      | ESCAPED BY 'char'
      | TERMINATED BY 
      | {  [OPTIONALLY] ENCLOSED BY 'char'}
}
file_spec: {
  file_attribute [file_attribute] ...}
  file_attribute: {
      {URI | URL} [=] 'uri_string'
    | FILE_NAME [=] 'object_name'
    | FILE_PATTERN [=] 'pattern'
    | FILE_PREFIX [=] 'prefix_name'
    | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT}
    | STRICT_LOAD [=] {0 | 1 | DEFAULT}
}
Note

For a complete list of table options, see CREATE TABLE.

For FILES options, you can set the following:

You cannot use resource principals to specify details for Object Storage files. To use resource principals, you must use External Table ENGINE_ATTRIBUTE Options. Alternatively, you can convert resource principals to URIs.

Column Definitions

You can override some options at the column level by specifying column-level parameters with the EXTERNAL FORMAT statement. You can set DATE, TIME, DATETIME, and TIMESTAMP as optional column-level parameters.

  EXTERNAL_FORMAT {'date_format' | 'time_format' | 'timestamp_format'}

The following example loads a table from a file specified with a uniform resource identifier (URI) and sets the date_format, time_format, and timestamp_format at the column level.

mysql> CREATE EXTERNAL TABLE table_1(
        col_1 date EXTERNAL FORMAT '%W %M %Y',
        col_2 time EXTERNAL FORMAT 'hh:mm:ss',
        col_3 timestamp EXTERNAL FORMAT '%Y-%m-%d %H:%i:%s',
        col_4 int)
        FILE_FORMAT = (FORMAT csv)
        FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
mysql> ALTER TABLE table_1 SECONDARY_LOAD;

For more information about column definitions, see File Format Options.

Table Parameters

The following table parameters are optional and specify options at the table level.

  • ALLOW_MISSING_FILES: Ignores the missing files and completes the data loading based on the selected option. This overrides the table parameter STRICT_LOAD for missing files. The supported options are:

    A missing file is defined as:

    • With the FILE_NAME parameter: There is no file with that name.

    • With the FILE_PATTERN parameter: There are no files that match the pattern.

    • With the FILE_PREFIX parameter: There are no files with that prefix.

    The following example loads a table and sets ALLOW_MISSING_FILES to 1 in the table options. If any of the specified files are not available, an error does not occur and data continues loading for the existing files. If all files are missing, an error occurs. If a file-specific setting is set, this may override the ALLOW_MISSING_FILES setting.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv',
                     URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv',
                     URI = 'oci://mybucket@mynamespace/data_files/data_file_3.csv')
            ALLOW_MISSING_FILES = 1;
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
  • AUTO_REFRESH_SOURCE: Supported as of MySQL 9.4.1. Specifies a Stream OCID for event-based incremental load. This string value is verified as a valid Stream resource identifier. Learn more about Resource Identifiers from Oracle Cloud Infrastructure Documentation. Learn how to Refresh Data Using Event-Based Incremental Load.

    The following example specifies a Stream OCID to set up event-based incremental load.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')        
            AUTO_REFRESH_SOURCE = 'ocid1.stream.oc1.iad...';
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
  • STRICT_LOAD: Loads data in strict mode or non-strict mode, based on the selected option. By default, STRICT_LOAD is set to the sql_mode value. See Strict SQL Mode. The FILES parameter STRICT_LOAD can override this setting. The supported options are:

    This setting overrides the global sql_mode setting for handling missing files. The default is the value of sql_mode. See Strict SQL Mode. The FILES parameter STRICT_LOAD can override this setting.

    The following example loads a table from a file specified with a uniform resource identifier (URI) and sets STRICT_LOAD to 0 in the table options.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')        
            STRICT_LOAD = 0;
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
  • VERIFY_KEY_CONSTRAINTS: Lakehouse validates primary key and unique key constraints during the initial load based on the selected option. The supported options are:

    • 1: The default value. Lakehouse validates primary key and unique key constraints only during the initial load of the table. If there are subsequent loads or refreshes of the table, validation does not occur.

    • 0: Lakehouse does not validate primary key and unique key constraints.

    • DEFAULT: Lakehouse does not validate primary key and unique key constraints.

    The following example loads a table from a file specified with a uniform resource identifier (URI) and sets VERIFY_KEY_CONSTRAINTS to 0 in the table options.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')        
            VERIFY_KEY_CONSTRAINTS = 0;
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;

Refer to the following table to review how different combinations of settings for ALLOW_MISSING_FILES and STRICT_LOAD affect errors for missing files.

Table 4.6 Combinations of Settings for allow_missing_files and is_strict_mode

allow_missing_files is_strict_mode Possible Errors
T T No error generated unless all files are missing.
T F
T Not set
Not set F
F T Error generated for any missing files.
F F
F Not set
Not set T
Not set Not set Error generation depends on setting for sql_mode.

File Specifications

File specifications are enclosed in FILES.

A FILES clause is required if you specify any additional external table options. The following parameters are available:

  • FILES: Defines the Object Storage files. You can define the file locations using uniform resource identifier (URI) or PAR URL.

    Lakehouse supports a maximum of 170 file locations. To define more than the maximum number of files, store the files under the same bucket or use FILE_PREFIX or FILE_PATTERN.

  • FILES parameters for uniform resource identifiers:

    • URI: The URL of the URI you specify. Do not provide a pattern, prefix, or name parameter with the URI. These details are inferred by the object_path you define.

      The following examples use the URI syntax for OCIFS:

      oci://bucket_name@namespace_name/object_path
      • Name: If the object_path is neither a glob pattern nor prefix.

        The following example loads a table with a single file specified with an OCIFS URI.

        mysql> CREATE EXTERNAL TABLE table_1(
                col_1 int,
                col_2 int,
                col_3 int)
                FILE_FORMAT = (FORMAT csv)
                FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
        mysql> ALTER TABLE table_1 SECONDARY_LOAD;
      • Prefix: If the object_path is not a glob pattern and ends with an unencoded / character, such as a folder path.

        The following example loads a table with a prefix from a file specified with an OCIFS URI. The prefix specifies files in an Object Storage folder.

        mysql> CREATE EXTERNAL TABLE table_1(
                col_1 int,
                col_2 int,
                col_3 int)
                FILE_FORMAT = (FORMAT csv)
                FILES = (URI = 'oci://mybucket@mynamespace/data_files/');
        mysql> ALTER TABLE table_1 SECONDARY_LOAD;
      • Glob pattern: If the object_path contains at least one of the following characters: ?, *, or [. To use these characters as literals, you need to escape them or encode them as needed depending on the URI syntax. Regex patterns are not supported. See Glob Patterns from the Oracle Cloud Infrastructure Documentation to learn more.

        The following example loads a table by using an unencoded * character to create a glob pattern.

        mysql> CREATE EXTERNAL TABLE table_1(
                col_1 int,
                col_2 int,
                col_3 int)
                FILE_FORMAT = (FORMAT csv)
                FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_files/data_file_*.csv');
        mysql> ALTER TABLE table_1 SECONDARY_LOAD;

    Review additional examples that use URIs to load data manually.

  • FILES parameters for pre-authenticated requests:

    • URL: The PAR URL. Review PAR Recommendations.

      The following example loads a table with a single file specified with a PAR.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv)
              FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • FILE_NAME: A specific Object Storage file name. For pre-authenticated requests, the file name provided in the PAR URL is considered.

      The following example uses the FILE_NAME parameter to specify a file in an Object Storage bucket. The PAR points to the Object Storage bucket.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv)
              FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
                       FILE_NAME = 'data_files/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • FILE_PATTERN: A regular expression that defines a set of Object Storage files.

      The following example uses a pattern to load files that have a numerical suffix.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv)
              FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
                       FILE_PATTERN = 'data_files/data_file_\\\\d+\\\\.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;

      See the following to learn more:

    • FILE_PREFIX: The prefix for a set of Object Storage files. The prefix or bucket name present in the PAR URL is considered.

      The following example uses a prefix to specify files to load in an Object Storage folder.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv)
              FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
                       FILE_PREFIX = 'data_files/');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;

      Review additional examples that use PARs to load data manually.

The following parameters enclosed in FILES are optional:

  • ALLOW_MISSING_FILES: Ignores the missing files and completes the data loading based on the selected option. This overrides the table option parameter STRICT_LOAD for missing files. To learn about the default value, see the ALLOW_MISSING_FILES table option parameter in Table Parameters. The supported options are:

    • 1: If any file is missing, no error occurs and data loading continues with the existing files. An error occurs if all the files are not available.

    • 0: If any file is missing, an error occurs and data is not loaded.

    The following example loads a table from a glob pattern specified with a uniform resource identifier (URI) and sets ALLOW_MISSING_FILES to 1 in the FILES parameters. If any of the specified files in the pattern are not available, an error does not occur and data continues loading for the existing files. If all files are missing, an error occurs.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-10].csv'
                     ALLOW_MISSING_FILES = 1);
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;

    The following example loads a table and sets ALLOW_MISSING_FILES to 0 for the first file and ALLOW_MISSING_FILES to 1 for the second file. If the first file is missing, an error occurs. If the second file is missing, the load continues with the loading of the first file.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv' ALLOW_MISSING_FILES = 0,
                     URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv' ALLOW_MISSING_FILES = 1);
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
  • STRICT_LOAD: Loads data in strict mode or non-strict mode, based on the selected option. This overrides the STRICT_LOAD table option parameter. To learn about the default value, see the STRICT_LOAD table option parameter in Table Parameters. The supported options are:

    • 1: Loads the data in strict mode. The data loading stops if there is an error due to missing files, empty columns, formatting errors or parsing errors.

    • 0: Loads the data in non-strict mode. Missing files, empty columns, formatting errors or parsing errors display a warning, and data is loaded.

    The following example loads a table from a file specified with a uniform resource identifier (URI) and sets STRICT_LOAD to 0 in the FILES parameters.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv'
                     STRICT_LOAD = 0);
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
File Format Options

File format options are enclosed in FILE_FORMAT.

The following file format options are available:

  • FORMAT: The file format defined in a table. You can define only one file format per table. The supported file formats are:

    • csv: The default file format.

      The following example loads a CSV file into a table with a uniform resource identifier (URI).

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv)
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • json: Only Newline Delimited JSON files are supported. Tables created with json format must only have a single column that conforms to the JSON data type.

      The following example loads a JSON file into a table with a uniform resource identifier (URI).

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 json)
              FILE_FORMAT = (FORMAT json)
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • parquet: The Parquet data type.

      The following example loads a Parquet file into a table with a uniform resource identifier (URI).

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT parquet)
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • avro: The Avro data type.

      The following example loads an Avro file into a table with a uniform resource identifier (URI).

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT avro)
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
  • COMPRESSION: Use this to load compressed files.

    Compression is supported for all file formats:

    • csv: The default value is no compression. You can set compression to zip, gzip, or bzip2.

      The following example loads a table from a file specified with a uniform resource identifier (URI) and uses a CSV file compressed in zip format.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv COMPRESSION 'zip')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • json: The default value is no compression. You can set compression to zip, gzip, or bzip2.

      The following example loads a table from a file specified with a uniform resource identifier (URI) and uses a JSON file compressed in gzip format.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 json)
              FILE_FORMAT = (FORMAT json COMPRESSION 'gzip')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json.gz');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • avro: The default value is auto. You cannot set any other options for COMPRESSION, as Avro files declare their compression format in the file metadata.

      The following example loads a table from a file specified with a uniform resource identifier (URI) and uses an Avro file with the auto compression option.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT avro COMPRESSION 'auto')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • parquet: The default value is auto. You cannot set any other options for COMPRESSION, as Parquet files declar their compression format in the file metadata.

      The following example loads a table from a file specified with a uniform resource identifier (URI) and uses a Parquet file with the auto compression option.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT parquet COMPRESSION 'auto')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;

    Note the following when loading compressed files:

    • Each compressed file must only contain a single original file. You cannot provide a compressed file with archives or folders with multiple files.

    • Files in zip format must be compressed with the Deflate algorithm.

    • If loading multiple files, all the files must be in the same compression format.

    • When loading multiple compressed files, it is best to prepare files of similar size, and in the range of 80-120 MB each. Otherwise, you may experience longer loading times.

  • CHARACTER SET: Defines the character encoding. The default is 'utf8mb4'.

  • HEADER: Adds a header row to the CSV file, based on the selected option (ON or OFF). The default is OFF.

    If HEADER and IGNORE are both defined, Lakehouse first skips the number of rows, and then uses the next row as the header row.

The following field options are available:

  • The FIELDS and COLUMNS clauses define the format for fields and columns. The following parameters are available:

    • DATE FORMAT: The date format, see: date_format. You can also set date formats for each column by using the EXTERNAL_FORMAT statement, which overrides the format in the table parameter.

    • TIME FORMAT: The time format, see: String and Numeric Literals in Date and Time Context. You can also set time formats for each column by using the EXTERNAL_FORMAT statement, which overrides the format in the table parameter.

    • DATETIME FORMAT: The datetime format, see: date_format. You can also set datetime formats for each column by using the EXTERNAL_FORMAT statement, which overrides the format in the table parameter.

    • The following example loads a table from a file specified with a uniform resource identifier (URI) and sets the DATE FORMAT, TIME FORMAT, and DATETIME FORMAT.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 date,
              col_2 time,
              col_3 timestamp)
              FILE_FORMAT = (FORMAT csv FIELDS DATE FORMAT '%W %M %Y' TIME FORMAT 'hh:mm:ss' DATETIME FORMAT '%Y-%m-%d %H:%i:%s')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • TERMINATED BY: Defines one or more characters used to enclose fields. The maximum field delimiter length is 64 characters. You can add any string as a custom_field_delimiter.

      When FIELDS TERMINATED BY is set to auto, Lakehouse automatically detects the field delimiters.

      The following field delimiters are detected automatically by Lakehouse:

      • |: Pipe.

      • ,: Comma

      • \t: Tab

      • ;: Semicolon

      The following example loads a table from a file specified with a uniform resource identifier (URI) and sets , and sets FIELDS TERMINATED BY and LINES TERMINATED BY.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv FIELDS TERMINATED BY '\r\n' LINES TERMINATED BY ';')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
    • [OPTIONALLY] ENCLOSED BY: Defines one or more characters used to enclose fields. The default is '\''. You can add any string as a custom_quotation_marks.

    • ESCAPED BY: Defines one or more characters used to escape special characters. The default is '\'. You might have to use an additional escape character depending on the client you use. You can add any string as a custom_escape_character.

      The following example loads a table from a file specified with a uniform resource identifier (URI) and sets ESCAPED BY, and [OPTIONALLY] ENCLOSED BY.

      mysql> CREATE EXTERNAL TABLE table_1(
              col_1 int,
              col_2 int,
              col_3 int)
              FILE_FORMAT = (FORMAT csv FIELDS ESCAPED BY '\\' OPTIONALLY ENCLOSED BY '\"')
              FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
      mysql> ALTER TABLE table_1 SECONDARY_LOAD;
Additional File Format Options

The following additional file format options are available:

  • The LINES clause defines the format for each new line. The following parameters are available:

    • TERMINATED BY: Defines one or more characters used to delimit records. The maximum record delimiter length is 64 characters. You can add any string as a delimiter.

      The default record delimiter for is '\n'.

      For CSV files, if you set LINES TERMINATED BY to auto, Lakehouse automatically detects the record delimiters and composite record delimiters with field delimiters as prefixes.

      The following record delimiters detected automatically by Lakehouse:

      • \r: Carriage return.

      • \n: Line feed. This is the default.

      • \r\n : Carriage return and line feed

  • IGNORE: The number of rows to skip at the start of the file. The default value is 0 and the maximum value is 20.

    If HEADER and IGNORE are both defined, Lakehouse first skips the number of rows, and then uses the next row as the header row.

    The following example loads a table from a file specified with a uniform resource identifier (URI) and sets IGNORE and HEADER. Since both parameters are set, the first row is skipped and the second row is set as the header row.

    mysql> CREATE EXTERNAL TABLE table_1(
            col_1 int,
            col_2 int,
            col_3 int)
            FILE_FORMAT = (FORMAT csv HEADER ON IGNORE 1 LINES)
            FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
    mysql> ALTER TABLE table_1 SECONDARY_LOAD;
Additional Syntax Examples

The following examples load tables with additional options and show the proper order of options in the commands.

mysql> CREATE EXTERNAL TABLE table_1(
        col_1 int,
        col_2 int,
        col_3 DATE EXTERNAL_FORMAT '%e/%c %Y')
        FILE_FORMAT = (FORMAT csv COMPRESSION 'zip' CHARACTER SET utf8mb4 HEADER OFF
                       FIELDS DATE FORMAT '%Y%M%D' TERMINATED BY ',' LINES TERMINATED BY '\\n' IGNORE 2 LINES)
        FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip' ALLOW_MISSING_FILES = 0);
mysql> ALTER TABLE table_1 SECONDARY_LOAD;
mysql> CREATE EXTERNAL TABLE table_1(
        col_1 int,
        col_2 int,
        col_3 int)
        FILE_FORMAT = (FORMAT csv HEADER ON FIELDS ESCAPED BY '\\' IGNORE 1 LINES)
        FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-5].csv' STRICT_LOAD = 1)
        ALLOW_MISSING_FILES = 1;
mysql> ALTER TABLE table_1 SECONDARY_LOAD;
mysql> CREATE EXTERNAL TABLE table_1(
        col_1 int,
        col_2 int,
        col_3 int)
        FILE_FORMAT = (FORMAT csv HEADER ON FIELDS TERMINATED BY ','
                       OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES)
        FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')
        VERIFY_KEY_CONSTRAINTS = 0;
mysql> ALTER TABLE table_1 SECONDARY_LOAD;