HeatWave Autopilot automates many of the most important and often challenging aspects of achieving exceptional query performance at scale, including cluster provisioning, loading data, query processing, and failure handling. It uses advanced techniques to sample data, collect statistics on data and queries, and build machine learning models to model memory usage, network load, and execution time. The machine learning models are used by HeatWave Autopilot to execute its core capabilities. HeatWave Autopilot makes the HeatWave query optimizer increasingly intelligent as more queries are executed, resulting in continually improving system performance.
The following sections describe some of the features of HeatWave Autopilot:
-
Auto Provisioning
Estimates the number of HeatWave nodes required by sampling the data, which means that manual cluster size estimations are not necessary.
For HeatWave on OCI, see Generating a Node Count Estimate in the HeatWave on OCI Service Guide.
For HeatWave on AWS, see Estimating Cluster Size with HeatWave Autopilot in the HeatWave on AWS Service Guide.
For HeatWave for Azure, see Provisioning HeatWave Nodes in the HeatWave for Azure Service Guide.
-
Auto Shape Prediction
For HeatWave on AWS, the Auto Shape Prediction feature in HeatWave Autopilot uses MySQL statistics for the DB System workloads to assess the suitability of the current DB System shape. Auto Shape Prediction provides prompts to upsize the DB System shape and improve system performance, or to downsize the shape if the system is under-utilized. To learn more, see Autopilot Shape Advisor in the HeatWave on AWS Service Guide.
-
Automated Backups
By default, HeatWave creates an automatic backup of the DB System once a day and retains each backup for 7 days. When you create a DB System, you can disable automatic backups or specify how long to retain them. The retention period for automatic backups can be between 1 and 35 days.
For HeatWave on OCI, see Overview of Backups in the HeatWave on OCI Service Guide.
For HeatWave on AWS, see Backups in the HeatWave on AWS Service Guide.
For HeatWave for Azure, see Provisioning Oracle HeatWave in the HeatWave for Azure Service Guide.
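The default and limits for automatic backups described above can be summarized in a few lines of Python. This is only an illustrative sketch: `BackupPolicy` and `resolve_backup_policy` are hypothetical names, not part of any HeatWave API, and the actual settings are made in the service console or API described in the service guides.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the description above.
DEFAULT_RETENTION_DAYS = 7
MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 35


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical container for the automatic backup settings."""
    enabled: bool
    retention_days: Optional[int]  # None when automatic backups are disabled


def resolve_backup_policy(enabled: bool = True,
                          retention_days: Optional[int] = None) -> BackupPolicy:
    """Apply the default retention and reject values outside 1 to 35 days."""
    if not enabled:
        return BackupPolicy(enabled=False, retention_days=None)
    if retention_days is None:
        retention_days = DEFAULT_RETENTION_DAYS
    if not MIN_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention_days must be between {MIN_RETENTION_DAYS} and "
            f"{MAX_RETENTION_DAYS}, got {retention_days}"
        )
    return BackupPolicy(enabled=True, retention_days=retention_days)
```

Calling `resolve_backup_policy()` with no arguments yields the default daily backup with 7-day retention, while `resolve_backup_policy(retention_days=36)` raises a `ValueError`.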
-
Rolling Upgrades
Updates within the same version, such as 9.0.1-u1, are applied automatically during the maintenance window that you define on the DB System. A deprecated version becomes unavailable about three months after the deprecation date. Any DB System that is running an unavailable version will be upgraded automatically during its next maintenance window.
Rolling upgrades are available for HeatWave on OCI and HeatWave on AWS.
For HeatWave on OCI, see DB System Upgrades in the HeatWave on OCI Service Guide.
For HeatWave on AWS, see Maintenance in the HeatWave on AWS Service Guide.
-
Auto Load/Unload
Auto Parallel Load optimizes load time and memory usage by predicting the optimal degree of parallelism for each table loaded into HeatWave. For more information, see Section 2.2.4, “Loading Data Using Auto Parallel Load”.
Auto Unload automates the process of unloading data from HeatWave. For more information, see Section 2.6.3, “Unloading Data Using Auto Unload”.
-
Auto Data Placement
Recommends how tables should be partitioned in memory to achieve the best query performance, and estimates the expected performance improvement. For more information, see Section 2.8.5, “Auto Data Placement”.
-
Auto Encoding
Determines how the data rows in a table are partitioned across different nodes in the HeatWave Cluster. An optimal data placement can reduce inter-node communications as well as query execution time, and increase the throughput. For more information, see Section 2.8.4, “Auto Encoding”.
-
Auto Compression
HeatWave and HeatWave Lakehouse can compress data stored in memory using different compression algorithms. To minimize memory usage while providing the best query performance, auto compression dynamically determines the compression algorithm to use for each column based on its data characteristics. Auto compression employs an adaptive sampling technique during the data loading process, and automatically selects the optimal compression algorithm without user intervention. Algorithm selection is based on the compression ratio and the compression and decompression rates, which balance the memory needed to store the data in HeatWave with query execution time. For more information, see Section 2.2.7, “Data Compression”.
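The idea of choosing a compression algorithm per column from a sample, trading compression ratio against compression and decompression speed, can be sketched as follows. This is a simplified illustration, not HeatWave's implementation: the three codecs are stand-ins from the Python standard library (the text does not name HeatWave's algorithms), the scoring formula and `speed_weight` are assumptions, and a fixed-size prefix is used in place of adaptive sampling.

```python
import bz2
import lzma
import time
import zlib

# Stand-in codecs from the Python standard library (an assumption;
# HeatWave's actual compression algorithms are not named in the text).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def pick_codec(sample: bytes, speed_weight: float = 10.0) -> str:
    """Return the codec with the best ratio-versus-speed score on a sample.

    The score (hypothetical) rewards a high compression ratio and
    penalizes the time spent compressing and decompressing the sample.
    """
    best_name, best_score = "", float("-inf")
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(sample)
        decompress(packed)
        elapsed = time.perf_counter() - start
        ratio = len(sample) / max(len(packed), 1)
        score = ratio / (1.0 + speed_weight * elapsed)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


def pick_codecs_for_columns(columns: dict, sample_size: int = 64 * 1024) -> dict:
    """Choose a codec independently for each column from a prefix sample."""
    return {name: pick_codec(data[:sample_size]) for name, data in columns.items()}
```

Because the score depends on measured timings, the chosen codec can differ between machines and runs. The point of the sketch is only that each column is evaluated separately and that both size and speed feed into the choice.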
-
Auto Indexing
Autopilot Indexing can suggest secondary indexes to improve workload performance for tables stored in the DB System. For more information, see Section 2.9.1, “Autopilot Indexing”.
-
Auto Schema Inference
Lakehouse Auto Parallel Load extends Auto Parallel Load with Auto Schema Inference that can analyze the data, infer the external table structure, and create the database and all tables. It can also use header information from the external files to define the column names. For more information, see Section 5.2.4.1, “Lakehouse Auto Parallel Load Schema Inference”.
-
Adaptive Sampling
Predicts the relevant metadata and statistics in petabytes of object storage data while only loading a small part of the data. The adaptive data sampling algorithm dynamically determines the right level of sampling to achieve the highest possible accuracy while minimizing the data that needs to be loaded.
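As a rough illustration of adaptive sampling, the sketch below estimates a simple statistic (a mean) and keeps doubling the sample until two successive estimates agree within a tolerance. The stopping rule, the starting size, and the function name `adaptive_mean` are assumptions made for this example; the text does not describe the algorithm HeatWave actually uses.

```python
import random


def adaptive_mean(values, start=100, tolerance=0.01, seed=0):
    """Estimate the mean of `values` from a growing random sample.

    Returns (estimate, sample_size). The sample doubles until two
    successive estimates differ by at most `tolerance` (relative),
    or until all values have been read.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)  # random order, so each prefix is a random sample

    size = min(start, len(values))
    previous = None
    while True:
        sample = [values[i] for i in order[:size]]
        estimate = sum(sample) / len(sample)
        if previous is not None and abs(estimate - previous) <= tolerance * abs(previous):
            return estimate, size
        if size == len(values):
            return estimate, size  # nothing left to sample
        previous = estimate
        size = min(size * 2, len(values))
```

For uniform data the loop stops after the second sample, having read only a small fraction of the input. More variable data causes it to read more before it settles.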
-
Adaptive Data Flow
Learns and coordinates network bandwidth utilization to the object store across a large cluster of nodes, dynamically adapting to the performance of the underlying object store. This results in optimal performance and availability.
-
Auto Change Propagation
Intelligently determines the optimal time when changes in a MySQL DB System should be propagated to the HeatWave storage layer. This ensures that changes are propagated at an optimal cadence.
-
Adaptive Query Execution
Adaptive query execution automatically improves query performance and memory consumption, and mitigates skew-related performance issues and out-of-memory errors. It uses various statistics to adjust data structures and system resources after query execution has started, and it optimizes query execution independently for each HeatWave node based on the actual data distribution at runtime. This helps improve the performance of ad hoc queries by up to 25%.
The HeatWave optimizer generates a physical query plan based on statistics collected by Autopilot. During query execution, each HeatWave node executes the same query plan. With adaptive query execution, each individual HeatWave node adjusts its local query plan based on statistics, such as the cardinality and distinct value counts of intermediate relations, that it collects locally in real time. This allows each HeatWave node to tailor the data structures that it needs, resulting in better query execution time, lower memory usage, and better performance on skewed data.
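To make the per-node adjustment concrete, the following sketch sizes a hash table on each node from the distinct key count that the node observes locally, and compares it with the size derived from a single global estimate. It is a toy model under stated assumptions: the power-of-two sizing, the load factor, and the names `hash_table_buckets` and `plan_per_node` are hypothetical and do not describe HeatWave internals.

```python
import math


def hash_table_buckets(distinct_keys: int, load_factor: float = 0.75) -> int:
    """Smallest power-of-two bucket count that keeps the load factor."""
    needed = max(1, math.ceil(distinct_keys / load_factor))
    return 1 << (needed - 1).bit_length()


def plan_per_node(global_estimate: int, node_keys: dict) -> dict:
    """Compare a static sizing with one adapted to each node's local data.

    `global_estimate` is the optimizer's distinct-count estimate shared by
    every node; `node_keys` maps a node name to the keys it actually sees.
    """
    plans = {}
    for node, keys in node_keys.items():
        local_distinct = len(set(keys))  # statistic collected locally
        plans[node] = {
            "static_buckets": hash_table_buckets(global_estimate),
            "adaptive_buckets": hash_table_buckets(local_distinct),
        }
    return plans
```

With skewed input, for example one node holding only two distinct keys and another holding a thousand, the static plan over-allocates on the first node and under-allocates on the second, while the adapted sizes follow the local data.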
-
Auto Query Plan Improvement
Collects previously executed queries and uses them to improve future query execution plans. For more information, see Section 2.3.4, “Auto Query Plan Improvement”.
-
Auto Scheduling
Prioritizes queries in an intelligent way to reduce overall query execution wait times. For more information, see Section 2.3.3, “Auto Scheduling”.
-
Dynamic Offload
Analyzes the query characteristics and execution engine static and dynamic characteristics to choose the best engine for the query, given the current system state. For more information, see Section 2.3.5, “Dynamic Query Offload”.
-
Auto Query Time Estimation
Estimates query execution time to determine how a query might perform without having to run the query. For more information, see Section 2.8.6, “Auto Query Time Estimation”.
-
Auto Cardinality Estimation
Cardinalities from actual query runs are cached and used for cardinality estimation of an exact query, a subset of a query, or a similar query. HeatWave uses HeatWave Autopilot to automatically collect cardinality statistics to calculate costs. Cardinality accuracy improves as more queries are run in the system, which makes cost models more accurate.
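The caching behavior described above can be pictured as a lookup table of observed row counts that takes precedence over a statistical estimate. The sketch below is a minimal illustration with a hypothetical `CardinalityCache` class. It matches whole queries or query fragments by whitespace-normalized text only, so it covers the exact-query and query-subset cases but not similar queries.

```python
class CardinalityCache:
    """Hypothetical cache of row counts observed during real query runs."""

    def __init__(self):
        self._observed = {}

    @staticmethod
    def _key(fragment: str) -> str:
        # Collapse whitespace so formatting differences do not cause misses.
        return " ".join(fragment.split())

    def record(self, fragment: str, actual_rows: int) -> None:
        """Store the row count a query or query fragment actually produced."""
        self._observed[self._key(fragment)] = actual_rows

    def estimate(self, fragment: str, fallback: int) -> int:
        """Prefer an observed cardinality; otherwise use the fallback estimate."""
        return self._observed.get(self._key(fragment), fallback)
```

After `cache.record("orders WHERE status = 'OPEN'", 1200)`, a later `cache.estimate(...)` for the same fragment returns 1200 instead of the fallback, which is how estimates become more accurate as more queries run.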
-
Auto Thread Pooling
Provides sustained throughput during high transaction concurrency. Where multiple clients are running queries concurrently, Auto Thread Pooling applies workload-aware admission control to eliminate resource contention caused by too many waiting transactions. Auto Thread Pooling automatically manages the settings for the thread pool control variables thread_pool_size, thread_pool_max_transactions_limit, and thread_pool_query_threads_per_group. For details of how the thread pool works, see Thread Pool Operation.
-
Auto Embedding Generation
With HeatWave GenAI, you can effortlessly generate embeddings within the database without the complexity of having the user select specific embedding models. This simplified approach means significantly reduced application complexity and also eliminates the need for detailed machine learning knowledge to generate the embeddings. Additionally, HeatWave GenAI seamlessly integrates with the automated in-database vector store, saving you from the hassle of transferring embeddings to a separate vector database. For more information, see HeatWave In-Database Embedding Models.
-
Auto Error Recovery
For HeatWave on OCI, when a HeatWave node becomes unresponsive due to a software or hardware failure, Auto Error Recovery recovers a failed node or provisions a new one and reloads data from the HeatWave Storage Layer, the DB System, or OCI Object Storage in case of Lakehouse tables. For more information, see HeatWave Cluster Failure and Recovery in the HeatWave on OCI Service Guide.
For HeatWave on AWS, when a HeatWave node becomes unresponsive due to a software or hardware failure, Auto Error Recovery recovers a failed node and reloads data from the HeatWave Storage Layer, the DB System, or Amazon S3 in case of Lakehouse tables. For more information, see HeatWave Cluster Data Recovery in the HeatWave on AWS Service Guide.
-
Auto Failover
High availability DB Systems are made up of three MySQL instances: a primary instance and two secondary instances. If the primary instance fails, the HeatWave Service automatically promotes one of the secondary instances to function as the primary instance. This resumes availability to client applications with no data loss.
For more information, see Failover in the HeatWave on OCI Service Guide or Failover in the HeatWave on AWS Service Guide.