WL#9637: SQL Function to return DIGEST of given SQL

Affects: Server-8.0   —   Status: Complete

This feature has been requested by many users. They want to ask the server for a digest for a query so they can correlate it with their application-side query capturing.

The feature consists of two new functions for which we see examples below:

mysql> SELECT statement_digest('SELECT * FROM foo') ->fc86ab9f4e4275ef861733844e3b2710313376a3a632204f99eecf8ce0855f4d

mysql> SELECT statement_digest_text('SELECT * FROM foo') -> 'SELECT * FROM `foo`'

  • F-1 statement_digest will calculate a statement digest from the tokens in its argument string.
  • F-2 statement_digest_text will calculate a normalized statement, a.k.a. digest text from the tokens in its argument string.
  • F-3 statement_digest and statement_digest_text will return NULL for NULL arguments.

  • NF-1 statement_digest shall calculate the same digest for a given syntactically correct SQL string as performance_schema does.

  • NF-2 statement_digest_text shall calculate the same normalized statement for a given syntactically correct SQL string as performance_schema does.
  • NF-3 statement_digest and statement_digest_text raise an error for incorrect SQL strings during execution.
  • NF-4 statement_digest and statement_digest_text only return NULL for NULL arguments.
  • NF-5 If the argument to either function is a string literal, the parse error from the parser will be quoted in the error message.
  • NF-6 If the argument to either function not a string literal, the parse error from the parser will not be quoted in the error message.

Overview

This work adds the new native functions STATEMENT_DIGEST() and STATEMENT_DIGEST_TEXT(). Both functions will parse the string using the normal SQL parser and hence may throw parse errors, but return NULL only when the argument is NULL. For strings containing valid SQL statements, STATEMENT_DIGEST() will return the digest that performance_schema would assign to the statement, and STATEMENT_DIGEST_TEXT() will return the normalized statement that performance_schema would assign.

Security

The argument will not be expanded in the case of syntax errors unless it was passed as a string literal.

Because the functions are so similar, they will have a common implementation called parse() in an anonymous namespace. The only difference between the functions is that Item_func_statement_digest::val_str_ascii() will call compute_digest_hash() whereas Item_func_statement_digest_text::val_str() will call compute_digest_text() after parsing.

The function parse() calls Bison's parser symbol MYSQLparse() directly, as all of the work done in parse_sql() deals either with cleanup up a previous state or preparing for execution.