MySQL HeatWave User Guide

Abstract

This document describes how to use MySQL HeatWave. It covers how to load data, run queries, optimize analytics workloads, and use HeatWave machine learning capabilities.

For information about creating and managing a HeatWave Cluster on Oracle Cloud Infrastructure (OCI), see the MySQL HeatWave Service documentation.

For information about creating and managing a HeatWave Cluster on Amazon Web Services (AWS), see the MySQL HeatWave on AWS Service Guide.

For information about creating and managing a HeatWave Cluster on Oracle Database Service for Azure (ODSA), see the Oracle Database Service for Azure documentation.

For MySQL Server documentation, refer to the MySQL Reference Manual.

For information about the latest MySQL HeatWave features and updates, refer to the HeatWave Release Notes.

For legal information, see the Legal Notices.

For help with using MySQL, please visit the MySQL Forums, where you can discuss your issues with other MySQL users.

Document generated on: 2024-03-01 (revision: 77981)

Table of Contents

Preface and Legal Notices
1 Overview
1.1 HeatWave Architectural Features
1.2 HeatWave Usage
1.3 HeatWave AutoML
1.4 HeatWave Lakehouse
1.5 MySQL Autopilot
1.6 MySQL Server Extensions for HeatWave
2 HeatWave
2.1 Before You Begin
2.2 Loading Data to HeatWave
2.2.1 Prerequisites
2.2.2 Loading Data Manually
2.2.3 Loading Data Using Auto Parallel Load
2.2.4 Monitoring Load Progress
2.2.5 Checking Load Status
2.2.6 Data Compression
2.2.7 Change Propagation
2.3 Running Queries
2.3.1 Query Prerequisites
2.3.2 Running Queries
2.3.3 Auto Scheduling
2.3.4 Auto Query Plan Improvement
2.3.5 Debugging Queries
2.3.6 Query Runtimes and Estimates
2.3.7 CREATE TABLE ... SELECT Statements
2.3.8 INSERT ... SELECT Statements
2.3.9 Using Views
2.4 Modifying Tables
2.5 Unloading Data from HeatWave
2.5.1 Unloading Tables
2.5.2 Unloading Data Using Auto Unload
2.6 Table Load and Query Example
2.7 Workload Optimization
2.7.1 Encoding String Columns
2.7.2 Defining Data Placement Keys
2.8 Workload Optimization using Advisor
2.8.1 Advisor Syntax
2.8.2 Auto Encoding
2.8.3 Auto Data Placement
2.8.4 Query Insights
2.8.5 Unload Advisor
2.8.6 Advisor Command-line Help
2.8.7 Advisor Report Table
2.9 Best Practices
2.9.1 Preparing Data
2.9.2 Provisioning
2.9.3 Importing Data into the MySQL DB System
2.9.4 Inbound Replication
2.9.5 Loading Data
2.9.6 Auto Encoding and Auto Data Placement
2.9.7 Running Queries
2.9.8 Monitoring
2.9.9 Reloading Data
2.10 Supported Data Types
2.11 Supported SQL Modes
2.12 Supported Functions and Operators
2.12.1 Aggregate Functions
2.12.2 Arithmetic Operators
2.12.3 Cast Functions and Operators
2.12.4 Comparison Functions and Operators
2.12.5 Control Flow Functions and Operators
2.12.6 Data Masking and De-Identification Functions
2.12.7 Date and Time Functions
2.12.8 Encryption and Compression Functions
2.12.9 JSON Functions
2.12.10 Logical Operators
2.12.11 Mathematical Functions
2.12.12 String Functions and Operators
2.12.13 Window Functions
2.13 String Column Encoding Reference
2.13.1 Variable-length Encoding
2.13.2 Dictionary Encoding
2.13.3 Column Limits
2.14 Troubleshooting
2.15 Metadata Queries
2.15.1 Secondary Engine Definitions
2.15.2 Excluded Columns
2.15.3 String Column Encoding
2.15.4 Data Placement
2.16 Bulk Ingest Data to MySQL Server
2.17 HeatWave Limitations
2.17.1 Change Propagation Limitations
2.17.2 Data Type Limitations
2.17.3 Functions and Operator Limitations
2.17.4 Index and Optimizer Hints
2.17.5 Join Limitations
2.17.6 Variable Limitations
2.17.7 Bulk Ingest Data to MySQL Server Limitations
2.17.8 Other Limitations
3 HeatWave AutoML
3.1 HeatWave AutoML Features
3.1.1 HeatWave AutoML Supervised Learning
3.1.2 HeatWave AutoML Ease of Use
3.1.3 HeatWave AutoML Workflow
3.1.4 Oracle AutoML
3.2 Before You Begin
3.3 Getting Started
3.4 Preparing Data
3.4.1 Labeled Data
3.4.2 Unlabeled Data
3.4.3 General Data Requirements
3.4.4 Example Data
3.4.5 Example Text Data
3.5 Training a Model
3.5.1 Advanced ML_TRAIN Options
3.6 Training Explainers
3.7 Predictions
3.7.1 Row Predictions
3.7.2 Table Predictions
3.8 Explanations
3.8.1 Row Explanations
3.8.2 Table Explanations
3.9 Forecasting
3.9.1 Training a Forecasting Model
3.9.2 Using a Forecasting Model
3.10 Anomaly Detection
3.10.1 Training an Anomaly Detection Model
3.10.2 Using an Anomaly Detection Model
3.11 Recommendations
3.11.1 Training a Recommendation Model
3.11.2 Using a Recommendation Model
3.12 HeatWave AutoML and Lakehouse
3.13 Managing Models
3.13.1 The Model Catalog
3.13.2 ONNX Model Import
3.13.3 Loading Models
3.13.4 Unloading Models
3.13.5 Viewing Models
3.13.6 Scoring Models
3.13.7 Model Explanations
3.13.8 Model Handles
3.13.9 Deleting Models
3.13.10 Sharing Models
3.14 Progress tracking
3.15 HeatWave AutoML Routines
3.15.1 ML_TRAIN
3.15.2 ML_EXPLAIN
3.15.3 ML_MODEL_IMPORT
3.15.4 ML_PREDICT_ROW
3.15.5 ML_PREDICT_TABLE
3.15.6 ML_EXPLAIN_ROW
3.15.7 ML_EXPLAIN_TABLE
3.15.8 ML_SCORE
3.15.9 ML_MODEL_LOAD
3.15.10 ML_MODEL_UNLOAD
3.15.11 Model Types
3.15.12 Model Metadata
3.15.13 Optimization and Scoring Metrics
3.16 Supported Data Types
3.17 HeatWave AutoML Error Messages
3.18 HeatWave AutoML Limitations
4 HeatWave Lakehouse
4.1 Overview
4.2 Prerequisites
4.3 Pre-Authenticated Requests
4.3.1 Pre-Authenticated Request Examples
4.4 Resource Principals
4.4.1 Configuring a Tenancy for Resource Principal Data Loading
4.5 Data Types
4.6 HeatWave Lakehouse Limitations
4.7 Lakehouse External Table Syntax
4.8 Lakehouse Auto Parallel Load
4.9 Loading Data from External Storage Using Auto Parallel Load
4.9.1 Load Configuration
4.9.2 Loading Lakehouse Data
4.9.3 Generated Load Script
4.10 Manually Loading Data from External Storage
4.11 External Table Recovery
4.12 HeatWave Lakehouse Error Messages
5 System and Status Variables
5.1 System Variables
5.2 Status Variables
6 HeatWave Performance and Monitoring
6.1 HeatWave Autopilot Report Table
6.2 HeatWave Monitoring
6.2.1 HeatWave Node Status Monitoring
6.2.2 HeatWave Memory Usage Monitoring
6.2.3 Data Load Progress and Status Monitoring
6.2.4 Change Propagation Monitoring
6.2.5 Query Execution Monitoring
6.2.6 Query History and Statistics Monitoring
6.2.7 Scanned Data Monitoring
6.3 HeatWave AutoML Monitoring
6.4 HeatWave Performance Schema Tables
6.4.1 The rpd_column_id Table
6.4.2 The rpd_columns Table
6.4.3 The rpd_exec_stats Table
6.4.4 The rpd_nodes Table
6.4.5 The rpd_preload_stats Table
6.4.6 The rpd_query_stats Table
6.4.7 The rpd_table_id Table
6.4.8 The rpd_tables Table
7 HeatWave Quickstarts
7.1 tpch Analytics Quickstart
7.1.1 tpch Prerequisites
7.1.2 Generating tpch Sample Data
7.1.3 Creating the tpch Sample Database and Importing Data
7.1.4 Loading tpch Data Into HeatWave
7.1.5 Running tpch Queries
7.1.6 Additional tpch Queries
7.1.7 Unloading tpch Tables
7.2 AirportDB Analytics Quickstart
7.2.1 AirportDB Prerequisites
7.2.2 Installing AirportDB
7.2.3 Loading AirportDB into HeatWave
7.2.4 Running AirportDB Queries
7.2.5 Additional AirportDB Queries
7.2.6 Unloading AirportDB Tables
7.3 Iris Data Set Machine Learning Quickstart