MySQL HeatWave Lakehouse has the following limitations for Parquet files.
NaN values in Parquet files are loaded as NULL.
-
As of MySQL 9.3.2, Lakehouse supports the
VECTORdata type. Consider the following limitations:-
To load Parquet files with the
VECTORdata type (from Parquet strings or from Parquet (numeric) Lists), you must load the table manually withis_strict_modeset tofalse. For loading simple lists asJSON, you can load the data manually or automatically. To learn more, see Vector Data Type and External Table Syntax.TipIf you need to manually create a table but do not have the appropriate table definition, you can load the table automatically in
dryrunmode and review the load script. You also need to convert columns fromJSONtoVECTORas needed. See the example commands below to do this.mysql> SET @input_list = '[{ "db_name": "data_db", "tables": [{ "table_name": "table_1", "engine_attribute": {"dialect": {"format": "parquet" "embed_model_id": "minilm"}, "file": [{"par": "https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/data_file_1.parquet"}]} }]}]'; mysql> CALL sys.HEATWAVE_LOAD(CAST(@input_list AS JSON), JSON_OBJECT('mode', 'dryrun')); mysql> SELECT log->>"$.sql" AS "Load Script" FROM sys.heatwave_autopilot_report WHERE type = "sql" ORDER BY id; You cannot give the
VECTORdata type a default value other thanNULL.-
You cannot directly convert
VECTORtoJSONorJSONtoVECTOR.-
To convert
VECTORtoJSON, first convertVECTORto a STRING data type by using the VECTOR_TO_STRING() function. Then, cast this string asJSON.For example, if
col_1intable_1is aVECTORcolumn that you want to convert toJSON, you can do the following:SELECT CAST(VECTOR_TO_STRING(col_1) as JSON) FROM table_1; -
To convert
JSONtoVECTOR, first convertJSONto a STRING data type, and then use the STRING_TO_VECTOR() function to convert toVECTOR.For example, if
col_1intable_1is aJSONcolumn that you want to convert toVECTOR, you can do the following:SELECT STRING_TO_VECTOR(CAST(col_1 as NCHAR)) FROM table_1;
-
If you run
Vector_to_string(col)as a Vector_to_string() function, it fails if there are any0x(default for invalid vectors) values. To avoid errors, filter out these values.
-
As of MySQL 9.1.2, you can load Parquet files with row groups that total up to 10GB with large MySQL HeatWave Cluster shapes.
-
Lakehouse does not support the following data types in Parquet files. Lakehouse marks columns with these data types as
NOT SECONDARY, and does not load them.BSONENUMInterval-
ListNot supported before MySQL 9.3.2. As of MySQL 9.3.2, limited support is available for Parquet Lists in Lakehouse. Simple (non-nested) lists of the following Parquet data types are now supported:
INT_8,INT_16,INT_32,INT_64,UINT_8,UINT_16,UINT_32,UINT_64,FLOAT(32-bit),DOUBLE,STRING, andBOOL.
MapUnknownUUID
Do not use strict SQL mode if the inferred schema differs from the table schema.