HeatWave User Guide  /  HeatWave Quickstarts  /  AirportDB Analytics Quickstart

8.3 AirportDB Analytics Quickstart

The HeatWave airportdb Quickstart shows how to import data into the MySQL DB System using the MySQL Shell Dump Load utility, loading data into HeatWave using Auto Parallel Load, and running queries.

For an online workshop that demonstrates HeatWave using the airportdb sample database, see Turbocharge Business Insights with HeatWave Service and HeatWave.

This quickstart contains the following sections:

AirportDB Prerequisites

Downloading AirportDB Data

Download the airportdb sample database. The airportdb sample database is provided for download as a compressed tar or Zip archive. The download is approximately 640 MBs in size.

To run these commands, make sure you have the wget utility installed. Alternatively, you can download the files by pasting the URL in a browser window.

$> wget https://downloads.mysql.com/docs/airport-db.tar.gz

or

$> wget https://downloads.mysql.com/docs/airport-db.zip

Loading AirportDB into MySQL DB System

Depending on how you connect to your MySQL DB system, follow the appropriate steps to load the airportdb database into the MySQL DB system. To learn more about connecting to your DB system, see Networking.

To load the data, you will need to use the MySQL Shell Dump Loading utility. See MySQL Shell Dump Loading Utility.

VPN Connection

If you connect to the MySQL DB system through a VPN connection, you can load airportdb locally from your computer.

To load the airportdb database through a VPN connection:

  1. Unpack the compressed tar or Zip archive you downloaded previously. It creates a single directory named airport-db, which contains the data files.

    $> tar xvzf airport-db.tar.gz

    or

    $> unzip airport-db.zip
  2. Copy the file path to the unpacked airport-db folder. For example, on Windows it might be C:\\Temp\\airport-db, or on Mac it might be /Users/johnsmith/airport-db.

  3. From a terminal window, start MySQL Shell and connect to the MySQL DB System IP address. You can review the IP address (private IP address) and other connection details in the Connections tab for the DB system. For additional information about connecting to a MySQL DB System, see Connecting to a DB System in the HeatWave on OCI Service Guide.

    $> mysqlsh Username@DBSystem_IP_Address_or_Host_Name
  4. Enable js mode.

    MySQL> \js
  5. Load the airportdb database into the MySQL DB System using MySQL Shell Dump Loading Utility. In the command, replace airport-db with the file path you copied in the previous step.

    MySQL>JS> util.loadDump("airport-db", {threads: 16, deferTableIndexes: "all", ignoreVersion: true})

    The command uses the following options:

    • threads: Specifies the number of parallel threads to use to upload chunks of data to the target MySQL instance.

    • deferTableIndexes: This option is set to "all", so it defers the creation of secondary indexes until after the table data is loaded. This can reduce loading time. If you intend to use airportdb with only HeatWave, which does not use secondary indexes, you can avoid creating secondary indexes by specifying the loadIndexes: "FALSE" option instead of deferTableIndexes: "all". For more information about MySQL Dump Load options, see Dump Loading Utility.

    • ignoreVersion: This option is enabled, which imports the dump even if the major version number of the MySQL instance from which the data was dumped is non-consecutive to the major version number of the MySQL instance to which the data will be uploaded.

Compute Instance Connection

If you connect to the MySQL DB system through a compute instance, you will first need to upload the airportdb database to the compute instance before running the MySQL Shell Dump Loading Utility.

To load the airportdb database through a compute instance connection:

  1. From a terminal window, upload the contents of the airport-db folder to the appropriate directory of the compute instance.

    $> scp -v -i ssh-key.key airport-db.zip opc@ComputeInstancePublicIP:/home/opc/

    Replace the following:

    • ssh-key.key: The full file path to the SSH key file (.key) for the compute instance.

    • airport-db.zip: The full file path to the airport-db.zip file on your device.

    • opc@ComputeInstancePublicIP: The appropriate username and public IP for the compute instance.

    • /home/opc/: The appropriate file path to upload the airport-db folder to in the compute instance.

  2. Once the upload successfully completes, SSH into the Compute instance using the public IP address of the Compute instance.

    $> ssh -i ssh-key.key opc@computeInstancePublicIP

    Replace the following:

    • ssh-key.key: The full file path to the SSH key file (.key) for the compute instance.

    • opc@ComputeInstancePublicIP: The appropriate username and public IP for the compute instance.

  3. Unpack the compressed tar or Zip archive you downloaded previously. It creates a single directory named airport-db, which contains the data files.

    $> tar xvzf airport-db.tar.gz

    or

    $> unzip airport-db.zip
  4. From a terminal window, start MySQL Shell and connect to the MySQL DB System IP address. You can review the IP address (private IP address) and other connection details in the Connections tab for the DB system. For additional information about connecting to a MySQL DB System, see Connecting to a DB System in the HeatWave on OCI Service Guide.

    $> mysqlsh Username@DBSystem_IP_Address_or_Host_Name
  5. Enable js mode.

    MySQL> \js
  6. Load the airportdb database into the MySQL DB System using MySQL Shell Dump Loading Utility. If needed, replace airport-db to the appropriate file path in the compute instance.

    MySQL>JS> util.loadDump("airport-db", {threads: 16, deferTableIndexes: "all", ignoreVersion: true})

    The command uses the following options:

    • threads: Specifies the number of parallel threads to use to upload chunks of data to the target MySQL instance.

    • deferTableIndexes: This option is set to "all", so it defers the creation of secondary indexes until after the table data is loaded. This can reduce loading time. If you intend to use airportdb with only HeatWave, which does not use secondary indexes, you can avoid creating secondary indexes by specifying the loadIndexes: "FALSE" option instead of deferTableIndexes: "all". For more information about MySQL Dump Load options, see Dump Loading Utility.

    • ignoreVersion: This option is enabled, which imports the dump even if the major version number of the MySQL instance from which the data was dumped is non-consecutive to the major version number of the MySQL instance to which the data will be uploaded.

Loading AirportDB into HeatWave MySQL

To load the airportdb from the MySQL DB System into HeatWave:

Note

For HeatWave on AWS, load data into HeatWave using the HeatWave Console. See Manage Data in HeatWave with Workspaces in the HeatWave on AWS Service Guide.

  1. From a terminal window, start MySQL Shell and connect to the MySQL DB System IP address. You can review the IP address (private IP address) and other connection details in the Connections tab for the DB system.

    $> mysqlsh Username@DBSystem_IP_Address_or_Host_Name
  2. Run the following Auto Parallel Load command to load the airportdb tables into HeatWave.

    MySQL>SQL> CALL sys.heatwave_load(JSON_ARRAY('airportdb'), NULL);

    For information about the Auto Parallel Load utility, see Section 2.2.3, “Loading Data Using Auto Parallel Load”.

Running AirportDB Queries

After airportdb sample database tables are loaded into the HeatWave Cluster, queries that qualify are automatically offloaded to the HeatWave Cluster for accelerated processing. To run queries:

Note

For HeatWave on AWS, run queries from the Query Editor in the HeatWave Console. See Running Queries in the HeatWave on AWS Service Guide.

  1. From a terminal window, start MySQL Shell and connect to the MySQL DB System IP address. You can review the IP address (private IP address) and other connection details in the Connections tab for the DB system.

    $> mysqlsh Username@DBSystem_IP_Address_or_Host_Name
  2. Change to the airportdb database.

    MySQL>SQL> USE airportdb;
    Default schema set to `airportdb`.Fetching table and column names from `airportdb` for 
    auto-completion... Press ^C to stop.
  3. Before running a query, use EXPLAIN to verify that the query can be offloaded to the HeatWave Cluster. For example:

    MySQL>SQL> EXPLAIN SELECT booking.price, count(*)
                     FROM booking
                     WHERE booking.price > 500
                     GROUP BY booking.price
                     ORDER BY booking.price LIMIT
                     10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: booking
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 54081693
         filtered: 33.32999801635742
            Extra: Using where; Using temporary; Using filesort; Using secondary engine RAPID

    If the query can be offloaded, the Extra column in the EXPLAIN output reports Using secondary engine RAPID.

  4. After verifying that the query can be offloaded, run the query and note the execution time.

    MySQL>SQL> SELECT booking.price, count(*) 
                     FROM booking 
                     WHERE booking.price > 500
                     GROUP BY booking.price 
                     ORDER BY booking.price 
                     LIMIT 10;
    +--------+----------+
    | price  | count(*) |
    +--------+----------+
    | 500.01 |      860 |
    | 500.02 |     1207 |
    | 500.03 |     1135 |
    | 500.04 |     1010 |
    | 500.05 |     1016 |
    | 500.06 |     1039 |
    | 500.07 |     1002 |
    | 500.08 |     1095 |
    | 500.09 |     1117 |
    | 500.10 |     1106 |
    +--------+----------+
    10 rows in set (0.0537 sec)
  5. To compare the HeatWave execution time with MySQL DB System execution time, disable the use_secondary_engine variable to see how long it takes to run the same query on the MySQL DB System; for example:

    MySQL>SQL> SET SESSION use_secondary_engine=OFF;
    MySQL>SQL> SELECT booking.price, count(*) 
                     FROM booking 
                     WHERE booking.price > 500
                     GROUP BY booking.price 
                     ORDER BY booking.price 
                     LIMIT 10;
    +--------+----------+
    | price  | count(*) |
    +--------+----------+
    | 500.01 |      860 |
    | 500.02 |     1207 |
    | 500.03 |     1135 |
    | 500.04 |     1010 |
    | 500.05 |     1016 |
    | 500.06 |     1039 |
    | 500.07 |     1002 |
    | 500.08 |     1095 |
    | 500.09 |     1117 |
    | 500.10 |     1106 |
    +--------+----------+
    10 rows in set (9.3859 sec)

For other airportdb sample database queries that you can run, see Additional AirportDB Queries. For more information about running queries, see Section 2.3, “Running Queries”.

Additional AirportDB Queries

This topic provides additional airportdb queries that you can run to test the HeatWave Cluster.

  • Query 1: Number of Tickets > $500.00, Grouped By Price

    mysql> SELECT booking.price, count(*) 
          FROM booking
          WHERE booking.price > 500
          GROUP BY booking.price
          ORDER BY booking.price
          LIMIT 10;
  • Query 2: Average Age of Passengers By Country, Per Airline

    mysql> SELECT airline.airlinename, AVG(datediff(departure,birthdate)/365.25) as avg_age, count(*) as nb_people
          FROM booking, flight, airline, passengerdetails
          WHERE booking.flight_id=flight.flight_id 
          AND airline.airline_id=flight.airline_id 
          AND booking.passenger_id=passengerdetails.passenger_id 
          AND country IN ("SWITZERLAND", "FRANCE", "ITALY")
          GROUP BY airline.airlinename
          ORDER BY airline.airlinename, avg_age
          LIMIT 10;
  • Query 3: Most Tickets Sales by Airline for Departures from US Airports

    mysql> SELECT airline.airlinename, SUM(booking.price) as price_tickets, count(*) as nb_tickets
          FROM booking, flight, airline, airport_geo
          WHERE booking.flight_id=flight.flight_id 
          AND airline.airline_id=flight.airline_id 
          AND flight.from=airport_geo.airport_id 
          AND airport_geo.country = "UNITED STATES"
          GROUP BY airline.airlinename
          ORDER BY nb_tickets desc, airline.airlinename
          LIMIT 10;

Unloading AirportDB Tables

airportdb tables can be unloaded from HeatWave using the following statements:

Note

For HeatWave on AWS, unload data from HeatWave using the HeatWave Console. See Manage Data in HeatWave with Workspaces in the HeatWave on AWS Service Guide.

mysql> USE airportdb;

ALTER TABLE booking SECONDARY_UNLOAD;
ALTER TABLE flight SECONDARY_UNLOAD;
ALTER TABLE flight_log SECONDARY_UNLOAD;
ALTER TABLE airport SECONDARY_UNLOAD;
ALTER TABLE airport_reachable SECONDARY_UNLOAD;
ALTER TABLE airport_geo SECONDARY_UNLOAD;
ALTER TABLE airline SECONDARY_UNLOAD;
ALTER TABLE flightschedule SECONDARY_UNLOAD;
ALTER TABLE airplane SECONDARY_UNLOAD;
ALTER TABLE airplane_type SECONDARY_UNLOAD;
ALTER TABLE employee SECONDARY_UNLOAD;
ALTER TABLE passenger SECONDARY_UNLOAD;
ALTER TABLE passengerdetails SECONDARY_UNLOAD;
ALTER TABLE weatherdata SECONDARY_UNLOAD;