4.7.1.2 Model Metadata

The model_metadata column in the model catalog allows you to view detailed information on trained models. For example, you can view the algorithm used to train the model, the columns in the training table, and values for the model explanation.

When you run the ML_MODEL_IMPORT routine, the imported table has a model_metadata column that stores the metadata for the table. If you import a model from a table, model_metadata stores the name of the database and table. If you import a model object, model_metadata stores a JSON_OBJECT that contains key-value pairs of the metadata. See Section 7.1.4, “ML_MODEL_IMPORT” to learn more.

The default value for model_metadata is NULL.
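
Because the column defaults to NULL, a query that reads keys from model_metadata may need to account for rows that have no metadata. The following is a minimal sketch that counts catalog rows with and without metadata. It assumes the ML_SCHEMA_user1.MODEL_CATALOG table used in the query example in this topic; replace user1 with your own user name. The column aliases are illustrative only.

```sql
-- Count catalog rows that have metadata and rows where the column is still NULL.
-- Assumption: the catalog table is ML_SCHEMA_user1.MODEL_CATALOG.
SELECT
    SUM(model_metadata IS NOT NULL) AS with_metadata,
    SUM(model_metadata IS NULL)     AS without_metadata
FROM ML_SCHEMA_user1.MODEL_CATALOG;
```

This checks for SQL NULL in the column itself. Individual keys inside a populated model_metadata document can still hold a JSON null value.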

This topic has the following sections:

  • Model Metadata Details

  • Query Model Metadata

Model Metadata Details

model_metadata contains the following metadata as key-value pairs in JSON format:

  • task: string

    The task type specified in the ML_TRAIN query. The default is classification when used with ML_MODEL_IMPORT.

  • build_timestamp: number

    A timestamp indicating when the model was created (UNIX epoch time). A model is created when the ML_TRAIN routine finishes executing.

  • target_column_name: string

    The name of the column in the training table that was specified as the target column.

  • train_table_name: string

    The name of the input table specified in the ML_TRAIN query.

  • column_names: JSON array

    The feature columns used to train the model.

  • model_explanation: JSON object literal

    The model explanation generated during training. See Generate Model Explanations.

  • notes: string

    The notes specified in the ML_TRAIN query. It also records any error messages that occur during model training.

  • format: string

    The model can be in one of the following formats:

    • HWMLv1.0

    • HWMLv2.0

    • ONNXv1.0

    • ONNXv2.0

  • status: string

    The status of the model. The default is Ready when used with ML_MODEL_IMPORT.

    • Creating: The model is being created.

    • Ready: The model is trained and active.

    • Error: Either training was canceled or an error occurred during training. Any error message appears in both the notes column and the notes key of model_metadata.

  • model_quality: string

    The quality of the model object for classification and regression tasks. For other tasks, this value is NULL. The value is either low or high.

  • training_time: number

    The time in seconds taken to train the model.

  • algorithm_name: string

    The name of the chosen algorithm.

  • training_score: number

    The cross-validation score achieved for the model by training.

  • n_rows: number

    The number of rows in the training table.

  • n_columns: number

    The number of columns in the training table.

  • n_selected_rows: number

    The number of rows selected by adaptive sampling.

  • n_selected_columns: number

    The number of columns selected by feature selection.

  • optimization_metric: string

    The optimization metric used for training. See Section 7.1.14, “Optimization and Scoring Metrics” to review available metrics.

  • selected_column_names: JSON array

    The names of the columns selected by feature selection.

  • contamination: number

    The contamination factor for the anomaly detection task. See Anomaly Detection Options to learn more.

  • options: JSON object literal

    The options specified in the ML_TRAIN query.

  • training_params: JSON object literal

    Internal task dependent parameters used during ML_TRAIN.

  • onnx_inputs_info: JSON object literal

    Information about the format of the ONNX model inputs. This only applies to ONNX models. See Manage External ONNX Models.

    Do not provide onnx_inputs_info if the model is not in ONNX format; doing so generates an error.

    • data_types_map: JSON object literal

      This maps the data type of each column to an ONNX model data type. The default value is:

      JSON_OBJECT("tensor(int64)": "int64", "tensor(float)": "float32", "tensor(string)": "str_")

  • onnx_outputs_info: JSON object literal

    Information about the format of the ONNX model outputs. This only applies to ONNX models. See Manage External ONNX Models.

    Do not provide onnx_outputs_info if the model is not in ONNX format or if task is NULL; doing so generates an error.

    • predictions_name: string

      This name determines which of the ONNX model outputs is associated with predictions.

    • prediction_probabilities_name: string

      This name determines which of the ONNX model outputs is associated with prediction probabilities.

    • labels_map: JSON object literal

      This maps prediction probabilities to predictions, known as labels.

  • training_drift_metric: JSON object literal

    Contains data drift information about the training data. See Analyze Data Drift. This only applies to classification and regression models.

    • mean: number

      The mean of the drift metrics across all the training data. The value is greater than or equal to 0.

    • variance: number

      The variance of the drift metrics across all the training data. The value is greater than or equal to 0.

    Both mean and variance should be low.

  • chunks: number

    The total number of chunks that the model has been split into.
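
As an illustration of how these keys can be read individually, the following sketch extracts a few scalar values with the standard MySQL JSON functions. It assumes the ML_SCHEMA_user1.MODEL_CATALOG table from the query example in this topic. The column aliases (model_format, built_at, and so on) are illustrative and not part of the catalog.

```sql
-- Read selected scalar keys from model_metadata for models that finished training.
-- ->> is shorthand for JSON_UNQUOTE(JSON_EXTRACT(...)). It returns the text 'null'
-- for a JSON null value, so NULLIF converts that text to SQL NULL.
SELECT
    model_metadata->>'$.task'                            AS task,
    model_metadata->>'$.format'                          AS model_format,
    NULLIF(model_metadata->>'$.algorithm_name', 'null')  AS algorithm_name,
    NULLIF(model_metadata->>'$.model_quality', 'null')   AS model_quality,
    -- build_timestamp is UNIX epoch time; convert it to a DATETIME value.
    FROM_UNIXTIME(
        JSON_VALUE(model_metadata, '$.build_timestamp' RETURNING UNSIGNED)
    )                                                    AS built_at
FROM ML_SCHEMA_user1.MODEL_CATALOG
WHERE model_metadata->>'$.status' = 'Ready';
```

FROM_UNIXTIME returns the value in the session time zone, so the displayed build time can differ between sessions. Rows where model_metadata is NULL do not match the WHERE clause and are skipped.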

Query Model Metadata

You can query the model metadata in the model catalog with the following statement. Replace user1 with your own user name.

mysql> SELECT JSON_PRETTY(model_metadata) FROM ML_SCHEMA_user1.MODEL_CATALOG\G
*************************** 1. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 1,
  "format": "HWMLv2.0",
  "n_rows": 407284,
  "status": "Ready",
  "options": {
    "task": "regression",
    "model_explainer": "permutation_importance",
    "prediction_explainer": "permutation_importance"
  },
  "n_columns": 14,
  "column_names": [
    "VendorID",
    "store_and_fwd_flag",
    "RatecodeID",
    "PULocationID",
    "DOLocationID",
    "passenger_count",
    "extra",
    "mta_tax",
    "tolls_amount",
    "improvement_surcharge",
    "trip_type",
    "lpep_pickup_datetime_day",
    "lpep_pickup_datetime_hour",
    "lpep_pickup_datetime_minute"
  ],
  "contamination": null,
  "model_quality": "high",
  "training_time": 515.13427734375,
  "algorithm_name": "RandomForestRegressor",
  "training_score": -5.610334873199463,
  "build_timestamp": 1730395944,
  "n_selected_rows": 130931,
  "training_params": {
    "recommend": "ratings",
    "force_use_X": false,
    "recommend_k": 3,
    "remove_seen": true,
    "ranking_topk": 10,
    "lsa_components": 100,
    "ranking_threshold": 1,
    "feedback_threshold": 1
  },
  "train_table_name": "heatwaveml_bench.nyc_taxi_train",
  "model_explanation": {
    "permutation_importance": {
      "extra": 0.0,
      "mta_tax": 0.0019,
      "VendorID": 0.0048,
      "trip_type": 0.0003,
      "RatecodeID": 0.0152,
      "DOLocationID": 0.4178,
      "PULocationID": 0.2714,
      "tolls_amount": 0.0851,
      "passenger_count": 0.0,
      "store_and_fwd_flag": 0.0,
      "improvement_surcharge": 0.0015,
      "lpep_pickup_datetime_day": 0.0,
      "lpep_pickup_datetime_hour": 0.0161,
      "lpep_pickup_datetime_minute": 0.0
    }
  },
  "n_selected_columns": 9,
  "target_column_name": "tip_amount",
  "optimization_metric": "neg_mean_squared_error",
  "selected_column_names": [
    "DOLocationID",
    "PULocationID",
    "RatecodeID",
    "VendorID",
    "improvement_surcharge",
    "lpep_pickup_datetime_hour",
    "mta_tax",
    "tolls_amount",
    "trip_type"
  ],
  "training_drift_metric": {
    "mean": 0.3326,
    "variance": 3.2482
  }
}
*************************** 2. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 0,
  "format": "HWMLv2.0",
  "n_rows": null,
  "status": "Error",
  "options": {},
  "n_columns": null,
  "column_names": null,
  "contamination": null,
  "model_quality": null,
  "training_time": null,
  "algorithm_name": null,
  "training_score": null,
  "build_timestamp": 1730403865,
  "n_selected_rows": null,
  "training_params": null,
  "train_table_name": "nyc_taxi.nyc_taxi_train",
  "model_explanation": {},
  "n_selected_columns": null,
  "target_column_name": "tip_amount",
  "optimization_metric": null,
  "selected_column_names": null,
  "training_drift_metric": {
    "mean": null,
    "variance": null
  }
}
*************************** 3. row ***************************
JSON_PRETTY(model_metadata): {
  "task": "regression",
  "notes": null,
  "chunks": 0,
  "format": "HWMLv2.0",
  "n_rows": null,
  "status": "Creating",
  "options": {},
  "n_columns": null,
  "column_names": null,
  "contamination": null,
  "model_quality": null,
  "training_time": null,
  "algorithm_name": null,
  "training_score": null,
  "build_timestamp": 1730404027,
  "n_selected_rows": null,
  "training_params": null,
  "train_table_name": "nyc_taxi.nyc_taxi_train",
  "model_explanation": {},
  "n_selected_columns": null,
  "target_column_name": "tip_amount",
  "optimization_metric": null,
  "selected_column_names": null,
  "training_drift_metric": {
    "mean": null,
    "variance": null
  }
}
3 rows in set (0.0859 sec)
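
The full JSON output can be long when the catalog holds several models. As a hedged sketch under the same assumptions as the query above (the ML_SCHEMA_user1.MODEL_CATALOG table, with illustrative aliases), the following statements narrow the result. The first lists models that did not reach the Ready status. The second uses JSON_TABLE to expand the selected_column_names array into one row per column. It is restricted to Ready models, which is where the sample output shows the array populated.

```sql
-- 1. List models whose training failed or is still in progress.
SELECT
    model_metadata->>'$.train_table_name'       AS train_table_name,
    model_metadata->>'$.status'                 AS model_status,
    NULLIF(model_metadata->>'$.notes', 'null')  AS notes
FROM ML_SCHEMA_user1.MODEL_CATALOG
WHERE model_metadata->>'$.status' IN ('Creating', 'Error');

-- 2. Expand selected_column_names into one row per selected feature column.
SELECT
    mc.model_metadata->>'$.train_table_name' AS train_table_name,
    jt.column_name
FROM ML_SCHEMA_user1.MODEL_CATALOG AS mc,
     JSON_TABLE(
         mc.model_metadata,
         '$.selected_column_names[*]'
         COLUMNS (column_name VARCHAR(64) PATH '$')
     ) AS jt
WHERE mc.model_metadata->>'$.status' = 'Ready';
```

For the first sample row shown above, the second statement would return nine rows, one for each entry in selected_column_names.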