
5.8 Exporting Query Results to Object Storage

As of MySQL 9.1.1-u1, you can export the result set generated by queries that run on HeatWave and Lakehouse to Object Storage as CSV or Parquet files. After the results are exported successfully, the number of rows written is reported to the user.

You can also periodically move historical data from HeatWave to Object Storage, which reduces billing costs.

Prerequisites

Ensure that you have the following roles and privileges assigned before you export query results:

  • EXPORT_QUERY_RESULTS privilege to export query results to Object Storage. A user with database administrator rights can grant this privilege by executing the following MySQL command:

    mysql> GRANT EXPORT_QUERY_RESULTS ON *.* TO user;

    To check whether you have been granted the privilege to export query results, run the following MySQL command:

    mysql> SHOW GRANTS;
  • Lakehouse role to access the HeatWave on AWS DB System. This role is required even when Lakehouse is disabled.

Non-JSON Syntax For Exporting Query Results

Export query results from HeatWave and Lakehouse using the following non-JSON SELECT INTO syntax.

SELECT ... INTO OUTFILE
URL uri
[FORMAT {CSV | PARQUET}]
[COMPRESSION string]
[CHARACTER SET charset_name]
[HEADER {ON | OFF}]
[{FIELDS | COLUMNS}
    [NULL AS null_char]
    [TERMINATED BY 'string']
    [[OPTIONALLY] ENCLOSED BY 'char']
    [ESCAPED BY 'char']]
[LINES [TERMINATED BY 'string'] [STARTING BY 'string']]
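
For illustration, the following statement is a minimal sketch of the non-JSON syntax with explicit FIELDS and LINES options. The lineitem table and its columns, and the tenancy, bucket, and object prefix in the URL, are placeholders reused from the samples later in this section.

mysql> SELECT l_orderkey, l_quantity FROM lineitem  -- placeholder table and columns
    INTO OUTFILE URL
    'https://objectstorage.region1.oraclecloud.com/n/tenancy1/b/aa-test/o/summary/2024/Mar/'
    FORMAT CSV
    HEADER ON
    FIELDS NULL AS 'NULL' TERMINATED BY '|'
    LINES TERMINATED BY '\n';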

JSON Syntax for Exporting Query Results Using a PAR URL

Export the query results by specifying the Pre-Authenticated Request (PAR) URL in the following extended SELECT INTO syntax. The prefix parameter is optional.

SELECT ... INTO OUTFILE
    WITH PARAMETERS
'{"file": [{"par": ..., "prefix": ...}],
  "dialect":
  {
    "format": "csv" | "parquet",
    "field_delimiter": ",",
    "record_delimiter": "\n",
    "encoding": "utf8mb4",
    "escape_character": "\\\\",
    "quotation_marks": "\"",
    "null_value": "\\N",
    "has_header": true | false,
    "compression": "UNCOMPRESSED" | "SNAPPY" | "GZIP" | "BROTLI" | "ZSTD" | "LZ4"
  }
 }'

PAR pattern ("https?*/p/?*/namespace/?*/bucket_name/?*/object_identifier/*") is added to a list of colon-separated patterns in order for MySQL to ignore statements with PAR for logging purposes. Now by default, the pattern list is: "*IDENTIFIED*:*PASSWORD*: https?*/p/?*/namespace/?*/bucket_name/?*/object_identifier/*".

JSON Syntax for Exporting Query Results Using Resource Principals

Export query results by specifying the resource principals in the following SELECT INTO syntax.

SELECT ... INTO OUTFILE
    WITH PARAMETERS
'{"file": [{"bucket": ..., "prefix": ..., "namespace": ..., "region": ...}],
  "dialect":
  {
    "format": "csv" | "parquet",
    "field_delimiter": ",",
    "record_delimiter": "\n",
    "encoding": "utf8mb4",
    "escape_character": "\\\\",
    "quotation_marks": "\"",
    "null_value": "\\N",
    "has_header": true | false,
    "compression": "UNCOMPRESSED" | "SNAPPY" | "GZIP" | "BROTLI" | "ZSTD" | "LZ4"
  }
 }'
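
For example, the following statement is a minimal sketch that exports Parquet files with SNAPPY compression using resource principals. The lineitem query and the region, namespace, bucket, and prefix values are placeholders reused from the samples later in this section.

mysql> SELECT l_returnflag, SUM(l_quantity) FROM lineitem GROUP BY l_returnflag  -- placeholder query
    INTO OUTFILE WITH PARAMETERS
    '{"file": [{"region": "region1",
                "namespace": "org_namespace",
                "bucket": "aa-test",
                "prefix": "export/summary/"}],
      "dialect": {"format": "parquet",
                  "compression": "SNAPPY"}}';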

Export Query Result Parameters

The following parameters describe the JSON and non-JSON syntax for exporting query results. An illustrative mapping between the two forms follows this list.

  • URL: The uri value is mandatory in the non-JSON syntax for exporting query results. You can use a HeatWave on OCI non-PAR or PAR URL, or a HeatWave on AWS URL, to access Object Storage.

    When you create the PAR URL using the AWS HeatWave Console, enable the Permit object reads and writes and Enable Object Listing options.

    PAR pattern ("*https?*/p/?*/n/?*/b/?*/o/*") is added to a list of colon-separated patterns in order for MySQL to ignore statements with PAR for logging purposes. Now by default, the pattern list is: "*IDENTIFIED*:*PASSWORD*:*https?*/p/?*/n/?*/b/?*/o/*"

  • format: The file format for the exported query results, either CSV or Parquet. By default, the results are exported to a CSV file.

  • compression: The file compression type for the Parquet format. Represented as COMPRESSION in the non-JSON syntax. The supported compression types are UNCOMPRESSED, SNAPPY, GZIP, BROTLI, ZSTD, and LZ4 (as supported by Lakehouse load).

  • encoding: Defines the character encoding. The default is "utf8mb4". The encoding value is ignored with a warning for export.

  • has_header: Specifies whether the CSV file has a header row. The default is false. Represented as HEADER ON or HEADER OFF in the non-JSON syntax.

  • null_value: The string that represents a null value. null_value is supported for the CSV format. Represented as NULL AS in the non-JSON syntax. The supported null values are "NULL" and "\N".

  • field_delimiter: Defines the single character used to separate fields. The default delimiter for the CSV format is ",". When you export query results to Object Storage, COLUMNS or FIELDS TERMINATED BY supports only single-character values.

  • record_delimiter: Defines one or more characters used to delimit records. The maximum record delimiter length is 64 characters. The default is "\n".

  • escape_character: Defines one or more characters used to escape special characters. The default is "\\". Escape characters are ignored with a warning for export.

  • quotation_marks: Defines a single character used to enclose fields. Represented as FIELDS ENCLOSED BY in the non-JSON syntax. The default is "\"". Quotation marks are ignored with a warning for export.
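
The following pair of statements is a minimal sketch of how the JSON dialect options map to the non-JSON clauses described above. The lineitem table and its columns, and the region, namespace, bucket, prefix, and URL values, are placeholders reused from the samples in this section; both statements are intended to write equivalent CSV content, to the destinations given in the URL and file parameters respectively.

mysql> SELECT l_orderkey, l_comment FROM lineitem  -- placeholder table and columns
    INTO OUTFILE URL
    'https://objectstorage.region1.oraclecloud.com/n/tenancy1/b/aa-test/o/export/summary/'
    FORMAT CSV
    HEADER ON
    FIELDS NULL AS 'NULL' TERMINATED BY ';';

mysql> SELECT l_orderkey, l_comment FROM lineitem  -- same query, JSON dialect options
    INTO OUTFILE WITH PARAMETERS
    '{"file": [{"region": "region1",
                "namespace": "org_namespace",
                "bucket": "aa-test",
                "prefix": "export/summary/"}],
      "dialect": {"format": "csv",
                  "has_header": true,
                  "null_value": "NULL",
                  "field_delimiter": ";"}}';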

Output File Name Format

The query results are exported to one or more CSV or Parquet files. The number of output files and the size of each file depend on the amount of data generated by the query.

The names for the output files are auto-generated and based on the following format:

file_prefix/universal unique number/node_ID-thread_ID-Chunk_ID.compression.file_format

File Name Parameters

  • File_prefix: The prefix for the file name, provided by the user.

    After the query results are exported to Object Storage, the export file prefix is recorded in the table described in Section 7.3.7, “The rpd_query_stats Table”. To view the export file prefix, execute the following command:

    mysql> SELECT QUERY_ID, EXPORT_FILE_PREFIX FROM performance_schema.rpd_query_stats WHERE EXPORT_FILE_PREFIX IS NOT NULL;
  • Universal unique number: The unique ID generated for each query.

  • Node_ID: The node ID of the file.

  • Thread_ID: The thread ID of the file.

  • Chunk_ID: The chunk ID of the file.

  • Compression: The file compression type selected by the user.

  • File_format: The output file format selected by the user.

Sample Object Names

The following object names are generated for the user-defined prefix outfile-basic, with:

universal unique number: 976b11a5-4499-11ef-acb7-0200171701b (the unique number generated for each query),
compression: snappy,
format: parquet
  - outfile-basic/976b11a5-4499-11ef-acb7-0200171701b/n0-t1-c1.snappy.parquet
  - outfile-basic/976b11a5-4499-11ef-acb7-0200171701b/n0-t0-c1.snappy.parquet
  - outfile-basic/976b11a5-4499-11ef-acb7-0200171701b/n0-t0-c2.snappy.parquet
  - outfile-basic/976b11a5-4499-11ef-acb7-0200171701b/n0-t1-c2.snappy.parquet

JSON/Non-JSON Examples to Export Query Results

The following JSON and non-JSON samples export query results to a CSV file using a PAR URL or resource principals.

The input values used in these samples are:

  • FORMAT: CSV or Parquet. The default format is CSV.

  • URL: The URL used to export the query results. The URL contains the following details:

    • Object Storage: The Object Storage location with the region name region1.

    • Tenancy: tenancy1

    • Bucket: aa-test

    • Directory: summary/2024/Mar/

  • PAR URL: The PAR URL to export the query results. The PAR value contains the following details:

    • Object Storage: The Object Storage location with the region name (region1).

    • Tenancy: tenancy1

    • Bucket: aa-test

    • Prefix: summary/2025

  • Region: region1

  • Namespace: org_namespace

  • Bucket: aa-test

  • Prefix: export/summary/

  • HEADER: ON (non-JSON syntax) and true (JSON syntax).

  • COMPRESSION: snappy

The following example uses the default delimiters and exports the query results to a CSV file.

mysql> select sum(l_quantity) from lineitem 
into outfile URL 
'https://objectstorage.region1.oraclecloud.com/n/tenancy1/b/aa-test/o/summary/2024/Mar/' 
;

The following example uses the default delimiters and exports the query results to a CSV file with a header row.

mysql> select sum(l_quantity) from lineitem 
into outfile URL 
'https://objectstorage.region1.oraclecloud.com/n/tenancy1/b/aa-test/o/summary/2024/Mar/' FORMAT CSV
HEADER ON;

The following example uses the default delimiters and exports the query results to a Parquet file with the SNAPPY compression method.

mysql> select sum(l_quantity) from lineitem 
into outfile URL 
'https://objectstorage.region1.oraclecloud.com/n/tenancy1/b/aa-test/o/summary/2024/Mar/' FORMAT parquet
COMPRESSION snappy;

The following example uses the default delimiters and exports the query results as a CSV file with a header row, using the specified PAR URL.

mysql> select sum(l_quantity) from lineitem limit 1000
INTO OUTFILE WITH PARAMETERS
'{"file": [{"par": "https://objectstorage.region1.oraclecloud.com/p/PTn9XsVegDBbSz1v-FgDma4G_e_78fJjv1GK04zlSqD9975MPZydieUJAGUypxQb/n/tenancy1/b/aa-test/o/",
            "prefix": "summary/2025"}],
  "dialect": {"has_header": true,
              "format": "csv"}
}';

The following example exports the query results as a csv file, using the specified resource principals.

This sample uses the default values, so the dialect object is not required.

mysql> select sum(l_quantity) from lineitem limit 1000
INTO OUTFILE WITH PARAMETERS
'{"file":[{"region": region1, 
		  "namespace": org_namespace, 
		  "bucket": aa-test,
		  "prefix": export/summary/
		}]
}' ;

The following example exports the query results as a csv file with the header row, using the specified resource principals.

mysql> select sum(l_quantity) from lineitem limit 1000
INTO OUTFILE WITH PARAMETERS
'{"file":[{"region": region1, 
		  "namespace": org_namespace, 
		  "bucket": aa-test,
		  "prefix": export/summary/
  }]
		, "dialect": {"has_header": true, "format": "csv"}
}' ;