3.1.4 Oracle AutoML

The HeatWave AutoML ML_TRAIN routine uses Oracle AutoML technology to automate the process of training a machine learning model. Oracle AutoML replaces the laborious and time-consuming tasks that a data analyst would otherwise perform by hand:

  1. Selecting a model from a large number of viable candidate models.

  2. For each model, tuning hyperparameters.

  3. Selecting only predictive features to speed up the pipeline and reduce over-fitting.

  4. Ensuring the model performs well on unseen data (also called generalization).
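
To make the cost of the manual workflow concrete, the following is a minimal Python sketch of those four steps on a toy dataset. It is an illustration only and does not show how Oracle AutoML or ML_TRAIN is implemented: the data generator, the two candidate models (k-nearest neighbours and nearest centroid), the hyperparameter grid, and every function name are assumptions made for this example.

```python
import itertools
import random


def make_data(n, rng):
    """Toy data (assumed): feature 0 is predictive, feature 1 is pure noise."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = rng.gauss(3.0 * label, 1.0)
        noise = rng.gauss(0.0, 5.0)
        rows.append(((signal, noise), label))
    return rows


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_predict(train, x, k):
    """Majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: squared_distance(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def centroid_predict(train, x, _param=None):
    """Assign x to the class whose mean feature vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [features for features, y in train if y == label]
        if not members:
            continue
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = squared_distance(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def project(rows, cols):
    """Keep only the selected feature columns."""
    return [(tuple(features[c] for c in cols), y) for features, y in rows]


def accuracy(predict, param, train, valid):
    hits = sum(predict(train, x, param) == y for x, y in valid)
    return hits / len(valid)


def manual_search(train, valid):
    # Step 1: candidate models. Step 2: a hyperparameter grid per model.
    candidates = {
        "knn": (knn_predict, [1, 3, 5, 7]),
        "centroid": (centroid_predict, [None]),
    }
    # Step 3: every non-empty feature subset.
    n_features = len(train[0][0])
    subsets = [
        cols
        for size in range(1, n_features + 1)
        for cols in itertools.combinations(range(n_features), size)
    ]
    best, trials = None, 0
    for name, (predict, params) in candidates.items():
        for param in params:
            for cols in subsets:
                trials += 1
                # Step 4: score on held-out rows to estimate generalization.
                score = accuracy(
                    predict, param, project(train, cols), project(valid, cols)
                )
                if best is None or score > best["score"]:
                    best = {"model": name, "param": param,
                            "features": cols, "score": score}
    return best, trials


rng = random.Random(0)
data = make_data(200, rng)
train, valid = data[:150], data[150:]
best, trials = manual_search(train, valid)
print(best, "after", trials, "trials")
```

Even with two models and two features, the exhaustive search runs one trial for every combination of model, hyperparameter value, and feature subset. The trial count grows multiplicatively as candidates are added, which is the cost an automated pipeline aims to reduce.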

Oracle AutoML automates this workflow, providing you with an optimal model given a time budget. The Oracle AutoML pipeline used by the HeatWave AutoML ML_TRAIN routine has these stages:

  • Data preprocessing

  • Algorithm selection

  • Adaptive data reduction

  • Hyperparameter optimization

  • Model and prediction explanations

Figure 3.1 Oracle AutoML Pipeline

Image showing the Oracle AutoML pipeline.

Oracle AutoML also produces high-quality models efficiently. It achieves this through a scalable design and through intelligent choices that reduce the number of trials at each stage in the pipeline.

  • Scalable design: The Oracle AutoML pipeline exploits both HeatWave internode and intranode parallelism, which improves scalability and reduces runtime.

  • Intelligent choices reduce trials in each stage: Algorithms and parameters are chosen based on dataset characteristics, so that an accurate model is selected efficiently. This is achieved using meta-learning throughout the pipeline.
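
The idea of choosing candidates from dataset characteristics can be sketched as follows. This is a simplified illustration, not the Oracle AutoML method: the algorithm names, the per-algorithm grid sizes, the thresholds, and the hand-written `prior_score` rule are all hypothetical stand-ins for what a meta-learner would learn from many previously seen datasets.

```python
# Hypothetical candidates: name -> number of hyperparameter settings to try.
CANDIDATES = {
    "logistic_regression": 4,
    "random_forest": 12,
    "gradient_boosting": 18,
    "svm": 9,
}


def meta_features(rows):
    """Summarise a dataset by a few cheap characteristics."""
    return {"n_rows": len(rows), "n_features": len(rows[0][0])}


def prior_score(name, meta):
    """Hand-written prior (assumed); a real meta-learner would learn this."""
    small = meta["n_rows"] < 1_000
    wide = meta["n_features"] > 50
    scores = {
        "logistic_regression": 2.0 if wide else 1.0,
        "svm": 2.0 if small else 0.5,
        "random_forest": 1.5,
        "gradient_boosting": 0.5 if small else 2.5,
    }
    return scores[name]


def shortlist(meta, keep=2):
    """Rank candidates by the prior and keep only the most promising ones."""
    ranked = sorted(CANDIDATES, key=lambda name: prior_score(name, meta),
                    reverse=True)
    return ranked[:keep]


def trial_count(names):
    """Total hyperparameter settings across the given algorithms."""
    return sum(CANDIDATES[name] for name in names)


# A small, narrow toy dataset: 500 rows, 2 features, binary label.
rows = [((float(i), float(i % 7)), i % 2) for i in range(500)]
meta = meta_features(rows)
chosen = shortlist(meta)
exhaustive = trial_count(CANDIDATES)
reduced = trial_count(chosen)
print(chosen, "needs", reduced, "trials instead of", exhaustive)
```

In this sketch the shortlist is computed before any model is trained, so the saving comes from trials that are never run rather than from running trials faster.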

For additional information about Oracle AutoML, refer to Yakovlev, Anatoly, et al. "Oracle AutoML: A Fast and Predictive AutoML Pipeline." Proceedings of the VLDB Endowment 13.12 (2020): 3166-3180.