HeatWave User Guide

Abstract

This document describes how to use HeatWave. It covers how to load data, run queries, optimize analytics workloads, and use HeatWave machine learning capabilities.

For information about creating and managing a HeatWave Cluster on Oracle Cloud Infrastructure (OCI), see the HeatWave on OCI Service Guide.

For information about creating and managing a HeatWave Cluster on Amazon Web Services (AWS), see the HeatWave on AWS Service Guide.

For information about creating and managing a HeatWave Cluster on Oracle Database Service for Azure (ODSA), see the HeatWave for Azure Service Guide.

For MySQL Server documentation, refer to the MySQL Reference Manual.

For information about the latest HeatWave features and updates, refer to the HeatWave Release Notes.

For legal information, see the Legal Notices.

For help with using MySQL, please visit the MySQL Forums, where you can discuss your issues with other MySQL users.

Document generated on: 2024-11-19 (revision: 80260)

Table of Contents

Preface and Legal Notices
1 Overview
1.1 HeatWave Architectural Features
1.2 HeatWave MySQL
1.3 HeatWave AutoML
1.4 HeatWave GenAI
1.5 HeatWave Lakehouse
1.6 HeatWave Autopilot
1.7 MySQL Functionality for HeatWave
2 HeatWave MySQL
2.1 Before You Begin
2.2 Loading Data to HeatWave MySQL
2.2.1 Prerequisites
2.2.2 Loading Data Manually
2.2.3 Loading Data Using Auto Parallel Load
2.2.4 Monitoring Load Progress
2.2.5 Checking Load Status
2.2.6 Data Compression
2.2.7 Change Propagation
2.2.8 Reload Tables
2.3 Running Queries
2.3.1 Query Prerequisites
2.3.2 Running Queries
2.3.3 Auto Scheduling
2.3.4 Auto Query Plan Improvement
2.3.5 Dynamic Query Offload
2.3.6 Debugging Queries
2.3.7 Query Runtimes and Estimates
2.3.8 CREATE TABLE ... SELECT Statements
2.3.9 INSERT ... SELECT Statements
2.3.10 Using Views
2.4 Modifying Tables
2.5 Unloading Data from HeatWave MySQL
2.5.1 Unloading Tables
2.5.2 Unloading Partitions
2.5.3 Unloading Data Using Auto Unload
2.5.4 Unload All Tables
2.6 Table Load and Query Example
2.7 Workload Optimization for OLAP
2.7.1 Encoding String Columns
2.7.2 Defining Data Placement Keys
2.7.3 HeatWave Autopilot Advisor Syntax
2.7.4 Auto Encoding
2.7.5 Auto Data Placement
2.7.6 Auto Query Time Estimation
2.7.7 Unload Advisor
2.7.8 Advisor Command-line Help
2.7.9 Autopilot Report Table
2.7.10 Advisor Report Table
2.8 Workload Optimization for OLTP
2.8.1 Autopilot Indexing
2.9 Best Practices
2.9.1 Preparing Data
2.9.2 Provisioning
2.9.3 Importing Data into the MySQL DB System
2.9.4 Inbound Replication
2.9.5 Loading Data
2.9.6 Auto Encoding and Auto Data Placement
2.9.7 Running Queries
2.9.8 Monitoring
2.9.9 Reloading Data
2.10 Supported Data Types
2.11 Supported SQL Modes
2.12 Supported Functions and Operators
2.12.1 Aggregate Functions
2.12.2 Arithmetic Operators
2.12.3 Cast Functions and Operators
2.12.4 Comparison Functions and Operators
2.12.5 Control Flow Functions and Operators
2.12.6 Data Masking and De-Identification Functions
2.12.7 Encryption and Compression Functions
2.12.8 JSON Functions
2.12.9 Logical Operators
2.12.10 Mathematical Functions
2.12.11 String Functions and Operators
2.12.12 Temporal Functions
2.12.13 Vector Functions
2.12.14 Window Functions
2.13 SELECT Statement
2.14 String Column Encoding Reference
2.14.1 Variable-length Encoding
2.14.2 Dictionary Encoding
2.14.3 Column Limits
2.15 Troubleshooting
2.16 Metadata Queries
2.16.1 Secondary Engine Definitions
2.16.2 Excluded Columns
2.16.3 String Column Encoding
2.16.4 Data Placement
2.17 Bulk Ingest Data to MySQL Server
2.18 HeatWave MySQL Limitations
2.18.1 Change Propagation Limitations
2.18.2 Data Type Limitations
2.18.3 Functions and Operator Limitations
2.18.4 Index Hint and Optimizer Hint Limitations
2.18.5 Join Limitations
2.18.6 Partition Selection Limitations
2.18.7 Variable Limitations
2.18.8 Bulk Ingest Data to MySQL Server Limitations
2.18.9 Other Limitations
3 HeatWave AutoML
3.1 HeatWave AutoML Features
3.1.1 HeatWave AutoML Supervised Learning
3.1.2 HeatWave AutoML Ease of Use
3.1.3 HeatWave AutoML Workflow
3.1.4 Oracle AutoML
3.2 HeatWave AutoML Prerequisites
3.3 Getting Started
3.4 Preparing Data
3.4.1 Labeled Data
3.4.2 Unlabeled Data
3.4.3 General Data Requirements
3.4.4 Example Data
3.4.5 Example Text Data
3.5 Training a Model
3.5.1 Advanced ML_TRAIN Options
3.6 Training Explainers
3.7 Predictions
3.7.1 Row Predictions
3.7.2 Table Predictions
3.8 Explanations
3.8.1 Row Explanations
3.8.2 Table Explanations
3.9 Forecasting
3.9.1 Training a Forecasting Model
3.9.2 Using a Forecasting Model
3.9.3 Prediction Intervals
3.10 Anomaly Detection
3.10.1 Anomaly Detection Model Types
3.10.2 Training an Anomaly Detection Model
3.10.3 Using an Anomaly Detection Model
3.11 Recommendations
3.11.1 Recommendation Model Types
3.11.2 Training a Recommendation Model
3.11.3 Using a Recommendation Model
3.11.4 Generating Recommendations for Ratings and Rankings
3.11.5 Generating Item Recommendations for Users
3.11.6 Generating User Recommendations for Items
3.11.7 Generating Recommendations for Similar Items
3.11.8 Generating Recommendations for Similar Users
3.12 HeatWave AutoML and Lakehouse
3.13 Topic Modeling
3.13.1 Training a Model with Topic Modeling
3.13.2 Table Predictions with Topic Modeling
3.13.3 Row Predictions with Topic Modeling
3.14 Managing Models
3.14.1 The Model Catalog
3.14.2 ONNX Model Import
3.14.3 Loading Models
3.14.4 Unloading Models
3.14.5 Viewing Models
3.14.6 Scoring Models
3.14.7 Model Explanations
3.14.8 Model Handles
3.14.9 Deleting Models
3.14.10 Sharing Models
3.14.11 Data Drift Detection
3.15 Progress Tracking
3.16 HeatWave AutoML Routines
3.16.1 ML_TRAIN
3.16.2 ML_EXPLAIN
3.16.3 ML_MODEL_EXPORT
3.16.4 ML_MODEL_IMPORT
3.16.5 ML_PREDICT_ROW
3.16.6 ML_PREDICT_TABLE
3.16.7 ML_EXPLAIN_ROW
3.16.8 ML_EXPLAIN_TABLE
3.16.9 ML_SCORE
3.16.10 ML_MODEL_LOAD
3.16.11 ML_MODEL_UNLOAD
3.16.12 ML_MODEL_ACTIVE
3.16.13 Model Types
3.16.14 Optimization and Scoring Metrics
3.17 Supported Data Types
3.18 HeatWave AutoML Error Messages
3.19 HeatWave AutoML Limitations
4 HeatWave GenAI
4.1 HeatWave GenAI Overview
4.2 Getting Started with HeatWave GenAI
4.2.1 Requirements
4.2.2 Supported Languages, Embedding Models, and LLMs
4.2.3 Authenticating OCI Generative AI Service
4.2.4 Quickstart: Setting Up a Help Chat
4.3 Generating Text-Based Content
4.3.1 Generating New Content
4.3.2 Summarizing Content
4.4 Performing a Vector Search
4.4.1 HeatWave Vector Store Overview
4.4.2 Setting Up a Vector Store
4.4.3 Updating the Vector Store
4.4.4 Running Retrieval-Augmented Generation
4.5 Running HeatWave Chat
4.5.1 Running HeatWave GenAI Chat
4.5.2 Viewing Chat Session Details
4.6 Generating Vector Embeddings
4.7 HeatWave GenAI Routines
4.7.1 ML_GENERATE
4.7.2 ML_GENERATE_TABLE
4.7.3 VECTOR_STORE_LOAD
4.7.4 ML_RAG
4.7.5 ML_RAG_TABLE
4.7.6 HEATWAVE_CHAT
4.7.7 ML_EMBED_ROW
4.7.8 ML_EMBED_TABLE
4.8 Troubleshooting Issues and Errors
5 HeatWave Lakehouse
5.1 Overview
5.1.1 External Tables
5.1.2 Lakehouse Engine
5.1.3 Data Storage
5.2 Loading Structured Data to HeatWave Lakehouse
5.2.1 System Requirements
5.2.2 Lakehouse External Table Syntax
5.2.3 Loading Data Manually
5.2.4 Loading Data Using Auto Parallel Load
5.2.5 How to Load Data from External Storage Using Auto Parallel Load
5.2.6 Lakehouse Incremental Load
5.3 Loading Unstructured Data to HeatWave Lakehouse
5.4 Access Object Storage
5.4.1 Pre-Authenticated Requests
5.4.2 Resource Principals
5.5 External Table Recovery
5.6 Data Types
5.6.1 Parquet Data Type Conversions
5.7 HeatWave Lakehouse Error Messages
5.8 HeatWave Lakehouse Limitations
5.8.1 Lakehouse Limitations for all File Formats
5.8.2 Lakehouse Limitations for the Avro Format Files
5.8.3 Lakehouse Limitations for the CSV File Format
5.8.4 Lakehouse Limitations for the JSON File Format
5.8.5 Lakehouse Limitations for the Parquet File Format
6 System and Status Variables
6.1 System Variables
6.2 Status Variables
7 HeatWave Performance and Monitoring
7.1 HeatWave MySQL Monitoring
7.1.1 HeatWave Node Status Monitoring
7.1.2 HeatWave Memory Usage Monitoring
7.1.3 Data Load Progress and Status Monitoring
7.1.4 Change Propagation Monitoring
7.1.5 Query Execution Monitoring
7.1.6 Query History and Statistics Monitoring
7.1.7 Scanned Data Monitoring
7.2 HeatWave AutoML Monitoring
7.3 HeatWave Performance Schema Tables
7.3.1 The rpd_column_id Table
7.3.2 The rpd_columns Table
7.3.3 The rpd_exec_stats Table
7.3.4 The rpd_ml_stats Table
7.3.5 The rpd_nodes Table
7.3.6 The rpd_preload_stats Table
7.3.7 The rpd_query_stats Table
7.3.8 The rpd_table_id Table
7.3.9 The rpd_tables Table
8 HeatWave Quickstarts
8.1 HeatWave Quickstart Prerequisites
8.2 tpch Analytics Quickstart
8.3 AirportDB Analytics Quickstart
8.4 Iris Data Set Machine Learning Quickstart