WL#5883: make error messages robust to wrong parameters

Affects: Server-9.x   —   Status: Un-Assigned

Created after bug#11755168.

Examples of problems observed in code
=====================================

1)
 sql_print_error(ER(ER_OUTOFMEMORY), info.len*sizeof(XID));
ER(ER_OUTOFMEMORY), defined in errmsg-utf8.txt, is a format string containing
%d, but the argument is of type size_t, which is a mismatch.
2)
 push_warning_printf(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
                     WARN_DATA_TRUNCATED, ER(WARN_DATA_TRUNCATED),
                     item->full_name(), row_count);
where ER(WARN_DATA_TRUNCATED) ends with %ld, but row_count is longlong, mismatch
again, causing a wrong error message.
3)
 my_error(ER_HANDSHAKE_ERROR, MYF(0), thd->main_security_ctx.host_or_ip);
where ER(ER_HANDSHAKE_ERROR) actually has no % specifier.
4)
 my_error(ER_TOO_BIG_PRECISION, MYF(0), c_len, a->name,
          DECIMAL_MAX_PRECISION);
where ER(ER_TOO_BIG_PRECISION) has a %d specifier which is passed c_len (a
char*): nonsense.

Diagnosis
=========
We have too many such errors in code (62 have already been found when fixing the
above bug). There's no reason new ones won't appear if we don't change our design.

Points to note when searching for a solution
============================================

One easy way to catch such mistakes is to use
ATTRIBUTE_FORMAT (from my_attribute.h), a wrapper around gcc's
__attribute__(format).
But this works only when the site where the printf-like function is called, has
both:
a) the format as a string literal
b) the types of arguments.
So:
  func(ER(ER_BLAH), 10);
will silently not be checked, because ER(ER_BLAH) is not known at
compile time (it is known at run-time, and depends on the chosen
language).
And
  func("%s", a va_list argument);
has the same problem, as the *real* type of arguments is not
known at this site at compile time (it's known in some caller).
Moreover,
  func(ER(ER_BLAH));
though possibly correct (if ER(ER_BLAH) has no '%' markers), will not
compile (gcc says "error: format not a string literal and no format
arguments").

WL#751 (section "New formatting") added the possibility that positional format
specifiers are used in messages, so that the English message will print argument
1 before argument 2, but the German message will print argument 2 before
argument 1. Thus, the format specifiers differ by language. So they
cannot be literals in code.

WL#751 (section "New formatting") adds support for %#s , which isn't in the
standard printf(), so may cause problems with ATTRIBUTE_FORMAT (to be tested).

For reformatting of errmsg-utf8.txt, see also WL#5086.

Compile-time verifications are better than run-time ones, because not all error
paths are covered in the testsuite (probably).

There are some types of usage which make verifications difficult. A function
(for example a storage engine's function) returns an error code, this error code
is propagated through callers, finally some caller prints the error message; at
that moment the error code is in a variable.
Some functions also deliberately pass more arguments than needed because they
know that excess arguments don't hurt; example: handler::print_error(): some of
its error codes don't need two arguments...

Additional considerations (Marc Alff, 2011-11-28)
-------------------------------------------------

Because of WL#2111 GET DIAGNOSTICS, the number of attributes to provide when
raising a SQL condition (a warning, an error) needs to improve significantly.

The root cause of the problem here seems related to the wide spread use of the
"..." prototype in my_error, causing all kind of trouble.

When a given error (say, ER_NO_SUCH_TABLE) is raised, all the places in the code
raising this error *should* all provide the same parameters.

Having several places in the server calling my_error( ER_NO_SUCH_TABLE, ...) is
a source of potential bugs.

A possible solution is to provide a strongly typed helper to have only one call
to my_error, such as:

void THD::raise_ER_NO_SUCH_TABLE(const char *schema_name, const char *table_name)

This helper can not only use a printf to generate the correct MESSAGE_TEXT, but
it can then also populate correctly the SCHEMA_NAME and TABLE_NAME conditions
items in the condition area, which are missing today.

With this design:
- every place in the code that raises a given error is required to provide all
the needed parameters, with the correct type
- there is only 1 place to review carefully, the implementation of raise_ER_XXX
for each error

Potentially, every existing call to my_error() and push_warning() today needs to
be changed today, to provide new condition area attributes.

-- Marc

See also wl#6662 .