Documentation Home
HeatWave User Guide
Related Documentation Download this Manual
PDF (US Ltr) - 2.2Mb
PDF (A4) - 2.2Mb


HeatWave User Guide  /  ...  /  Anomaly Detection for Logs

3.10.3 Anomaly Detection for Logs

MySQL 9.2.2 introduces the ability to detect anomalies in log data. To perform anomaly detection on logs, log data is cleaned, segemented, and encoded before running anomaly detection. This feature leverages the log template miner Drain3.

Consider the following when running anomaly detection on logs.

  • The input table can only have the following columns:

    • The column containing the logs.

    • If including logs from different sources, a column containing the source of each log. The values in this column contain the names of the sources that each log belongs to. These values are used to group each host's logs together. If this column is not present, it is assumed that all logs originate from the same source.

    • If including labeled data, a column identifying the labeled log lines. See Semi-supervised Anomaly Detection to learn more.

  • If the input table has additional columns to the ones permitted, you must use the exclude_column_list option when running ML_TRAIN to exclude irrelevant columns.

  • The data collected for anomaly detection can be unsupervised or semi-supervised. To run semi-supervised anomaly detection, you can provide a separate column in the input table with labels for the labeled log lines. This column labels identified anomalous logs with a value of 1, non-anomalous logs with 0, and unlabeled logs with NULL. See Semi-supervised Anomaly Detection to learn more.

  • In addition to the anomaly scores included in the output table, you have the option to leverage HeatWave GenAI to provide textual log summaries.

  • By default the following parameters are masked in the input data (training or test data): IP, DATETIME, TIME, HEX, IPPORT, and OCID. You have the option to mask additional regex patterns with the additional_masking_regex option.