WL#9344: Logging services: error messages

Affects: Server-8.0   —   Status: Complete

LOG MESSAGES

Update the calls to error-logging in the server
code to leverage the new features introduced by 
WL#9323 and friends, "structured logging."


Review sql_print_*() calls in mysqld:

- At minimum, add a SQL error code

- Identify calls that appear repeatedly with same or similar
  messages, and collapse them into the same message (where possible)
  and the same error code.  For these, the error code can
  replace the string literal (the convenience wrapper will then
  fill in the well-known string for that error code).

- add additional log-items of interest 
  (where applicable -- OS error code, etc.)

- messages shall not be localized at this point



Due to the volume of this task and the design of the
new logging infrastructure, these calls can be switched
over incrementally, id est, in several batches.


RELATED CONCERNS

We currently derive our server error messages from a file, errmsg-utf8.txt, that 
contains messages in various languages. From this input, we create one 
errmsg.sys file per language, to be read by the server at runtime; we also 
create header files, include/mysqld_error.h and include/mysqld_ername.h, which 
contain the default C/English messages. These headers are already compiled into 
the server (as per trunk). Rather than failing in the absence of an errmsg.sys, 
it is suggested that we fall back on the compiled-in messages. The benefits are 
obvious:
- simpler start-up (simpler configuration)
- more robustness (can still start up when errmsg.sys is missing)
- less overhead (no need to read errmsg.sys when we already have the information 
  compiled-in)
- option to later deprecate the l10n mechanism (errmsg.sys files and their 
  loading and handling) for simpler, more streamlined code
- opportunity to keep on using the proven mechanism for the time being, without 
  needing to inflate this WL

I.e. this will give us backward compatibility and "the best of both 
worlds" for now (the ability for a more straightforward startup on one 
side, and the option to overload the default error messages with rewritten or 
translated ones -- as far as "bulk" goes; individual messages could also be 
transformed in a log-filter component).  We'll retain the option to deprecate 
the message file at a later date, in favor of the simplicity of just the 
compiled-in English set, or for other solutions (e.g. reading the information 
from a table, rather than having a separate mechanism to access a custom file -- 
this would present the data in a way more familiar to the DBA).
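
A minimal sketch of the suggested fallback, under simplifying assumptions: 
MessageCatalog, the ER_DEMO_* symbols, the two message strings, and the 
load()/unload() calls are hypothetical stand-ins for illustration, not the 
server's actual loader. An empty vector models a missing errmsg.sys.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical symbols; the real ones live in include/mysqld_error.h.
enum DemoErrCode {
  ER_DEMO_FIRST = 1000,
  ER_DEMO_STARTUP = ER_DEMO_FIRST,
  ER_DEMO_SHUTDOWN,
  ER_DEMO_LAST  // one past the last valid code
};

// Stand-in for the default C/English messages compiled into the server.
static const char *const demo_compiled_in[] = {
    "%s: ready for connections.",
    "%s: shutdown complete.",
};

class MessageCatalog {
 public:
  // Models a successful read of a (possibly translated) errmsg.sys.
  void load(const std::vector<std::string> &messages) { loaded_ = messages; }

  // Models a missing or unreadable errmsg.sys.
  void unload() { loaded_.clear(); }

  // Prefer the loaded message; otherwise fall back on the built-in one.
  const char *lookup(int code) const {
    assert(code >= ER_DEMO_FIRST && code < ER_DEMO_LAST);
    const std::size_t idx = static_cast<std::size_t>(code - ER_DEMO_FIRST);
    if (idx < loaded_.size()) return loaded_[idx].c_str();
    return demo_compiled_in[idx];
  }

 private:
  std::vector<std::string> loaded_;
};
```

With nothing loaded, lookup(ER_DEMO_STARTUP) yields the compiled-in English 
text; after load(), the same call yields the text from the loaded set.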

The new messages are philosophically in the class of "messages emitted by the 
server", and as an implementation detail, sourced from errmsg-utf8.txt for the 
time being. For both these reasons, their indices will be in the range of the 
server messages. As a related concern, while the numeric indices are of obvious 
usefulness within the server, the structured logging stack aims to also provide 
the symbolic error-code ("ER_FOO_MISSING" instead of 4711 etc.) which should 
serve to both enhance readability and to somewhat shield the end user from the 
numeric values which are considered rather an implementation detail.

Non-Func-Req 1   Replacement of calls to logger

Review sql_print_*() calls in mysqld, as per the Exec Summary above.


Func-Req 1   Fallback for running without errmsg.sys

If the server is unable to read its error messages from an errmsg.sys file at 
start-up, it should default to the built-in English messages. This will affect 
all messages normally read from a localized errmsg.sys, i.e. both those logged by 
the server to the error log, and those sent by the server to its clients.

I-1  SQL syntax

NO CHANGE

I-2  Instrumentation

NO CHANGE

I-3  Error and Warnings

Change calls to use a standardized, well-known error-code instead of a string  
literal wherever possible. Provide additional information to structured logging  
where sensible. When started without an errmsg.sys, all messages from the server 
-- both written to the error log and sent to clients -- will be in the default 
language, English.

I-4  Install/Upgrade

NO CHANGE

I-5  Commercial Plugins

NO CHANGE

I-6  Replication

NO CHANGE

I-7  Protocols

NO CHANGE

I-8  Security

NO CHANGE

I-9  Log files

Error log messages will change where several similar string literals are 
collapsed into a single uniform message, where customized rate-limiting messages 
were used, or where messages contained newlines.
When started without an errmsg.sys, all messages from the server -- both written 
to the error log and sent to clients -- will be in the default language, 
English.

I-10 MySQL clients

NO CHANGE

I-11 Auth plugins

NO CHANGE

I-12 Globalization

NO CHANGE

I-13 Configuration

NO CHANGE

I-14 File formats

NO CHANGE (except log files, see I-9)

I-15 Plugins/APIs

NO CHANGE

I-16 Other programs

NO CHANGE

I-17 Storage engines

Update calls in the "classic" MySQL engines. Interface with Team Inno about 
updates there.

I  C++ fluent API

This new API keeps error submission a one-liner, while offering a vast increase 
in flexibility over the old format.  Particles enriching the default information 
can be appended as needed/desired.

Example:

  LogErr(INFORMATION_LEVEL, ER_FOO, "er_foo arg1");

  LogErr(ERROR_LEVEL, ER_FOO, myArg).tableName(t->alias)
                                    .stringValue("myKey", myVal);

Specification:

  LogErr(prio, errcode [, args]){.}

See WL#9323 for an exhaustive list of particles.

The C++ API creates an array of key/value pairs, then calls the low level API 
with it.
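
The following is a minimal sketch of how such a fluent wrapper could collect 
key/value pairs and hand them to a lower level, not the actual implementation. 
DemoLogEvent, DemoItem, demo_submit(), demo_sink, and the key names "prio", 
"err_symbol" and "table_name" are all hypothetical; WL#9323 defines the real 
item types and particles.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real priority levels and log items.
enum DemoLevel { DEMO_ERROR = 1, DEMO_WARNING = 2, DEMO_INFORMATION = 3 };
typedef std::pair<std::string, std::string> DemoItem;

// Stand-in for the low-level API: receives the finished key/value array.
static std::vector<std::vector<DemoItem> > demo_sink;
static void demo_submit(const std::vector<DemoItem> &items) {
  demo_sink.push_back(items);
}

class DemoLogEvent {
 public:
  DemoLogEvent(DemoLevel level, const std::string &symbol) {
    assert(!symbol.empty());
    items_.push_back(DemoItem("prio", std::to_string(level)));
    items_.push_back(DemoItem("err_symbol", symbol));
  }
  // Copying is disabled so that each event is submitted exactly once.
  DemoLogEvent(const DemoLogEvent &) = delete;
  DemoLogEvent &operator=(const DemoLogEvent &) = delete;

  // Each particle appends one pair and returns *this to allow chaining.
  DemoLogEvent &tableName(const std::string &name) {
    items_.push_back(DemoItem("table_name", name));
    return *this;
  }
  DemoLogEvent &stringValue(const std::string &key, const std::string &val) {
    items_.push_back(DemoItem(key, val));
    return *this;
  }

  // The temporary dies at the end of the statement; submit the array then.
  ~DemoLogEvent() { demo_submit(items_); }

 private:
  std::vector<DemoItem> items_;
};
```

A call then stays a one-liner, e.g. 
DemoLogEvent(DEMO_ERROR, "ER_FOO").tableName("t1").stringValue("myKey", "myVal"); 
submitting from the destructor is one possible design choice that avoids an 
explicit "send" call at the end of the chain.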

Calls to error logging should use the new log_message() variadic interface or, 
wherever possible, the fluent C++ style LogErr() interface.



II Compatibility with "legacy" calls

During development, we define the legacy calls sql_print_*() to use the new 
error logging stack, like so:

#define sql_print_information(...) \
  log_errlog_formatted(INFORMATION_LEVEL, ## __VA_ARGS__)

#define sql_print_warning(...) \
  log_errlog_formatted(WARNING_LEVEL, ## __VA_ARGS__)

#define sql_print_error(...) \
  log_errlog_formatted(ERROR_LEVEL, ## __VA_ARGS__)

This provides backward compatibility (the minimal call signature is maintained), 
helps us identify uses of function pointers to any of the sql_print_*() 
functions, and serves as a cheap source of rich items to test with.



"log_errlog_formatted()" in turn is a macro that automatically enriches the call 
with additional information that is cheap to synthesize, e.g.

#define log_errlog_formatted(level, ...)               \
  log_message(LOG_TYPE_ERROR,                          \
              LOG_ITEM_LOG_PRIO, level,                \
              LOG_ITEM_MSC_SUBSYS, LOG_SUBSYSTEM_TAG,  \
              LOG_ITEM_SRC_LINE, __LINE__,             \
              LOG_ITEM_SRC_FILE, MY_BASENAME,          \
              LOG_ITEM_LOG_MESSAGE, ## __VA_ARGS__)

This in turn calls the variadic C interface.  Like the C++ interface, it is 
ultimately a convenience wrapper around the new error logging stack. New code 
shall not use this interface, except where C (rather than C++) is used, or 
where external services cannot utilize the C++ interface.

  log_message(LOG_TYPE_ERROR, 
              LOG_ITEM_LOG_PRIO,    INFORMATION_LEVEL,
              LOG_ITEM_LOG_LOOKUP,  ER_STARTUP, my_progname, server_version,
                                                mysqld_unix_port, mysqld_port,
                                                MYSQL_COMPILATION_COMMENT);
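
The macro layering described in this section (legacy wrapper, enriching macro, 
variadic call) can be sketched as follows. This is an illustration under 
assumptions, not the server code: demo_log_message(), demo_last_line, the 
demo_* macros and the DEMO_* levels are hypothetical, and the sketch uses plain 
__VA_ARGS__, which requires at least a format string, instead of the 
"## __VA_ARGS__" form shown above (a GNU extension that drops the preceding 
comma when no arguments are given).

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical priority levels standing in for the real ones.
enum DemoPrio {
  DEMO_ERROR_LEVEL = 1,
  DEMO_WARNING_LEVEL = 2,
  DEMO_INFORMATION_LEVEL = 3
};

// Hypothetical sink: keeps the last formatted line so it can be inspected.
static std::string demo_last_line;

// Variadic C-style entry point: formats the message, then prefixes the
// items that the macro layer synthesized (priority, source file, line).
static void demo_log_message(DemoPrio prio, const char *file, int line,
                             const char *fmt, ...) {
  assert(fmt != nullptr);
  char msg[256];
  va_list ap;
  va_start(ap, fmt);
  std::vsnprintf(msg, sizeof(msg), fmt, ap);
  va_end(ap);

  char out[1024];
  std::snprintf(out, sizeof(out), "prio=%d file=%s line=%d msg=%s",
                static_cast<int>(prio), file, line, msg);
  demo_last_line = out;
}

// Enriching macro: adds information that is cheap to synthesize.
#define demo_errlog_formatted(level, ...) \
  demo_log_message(level, __FILE__, __LINE__, __VA_ARGS__)

// Legacy-style wrapper: keeps the minimal printf-like call signature.
#define demo_print_warning(...) \
  demo_errlog_formatted(DEMO_WARNING_LEVEL, __VA_ARGS__)
```

A legacy-style call such as demo_print_warning("disk %s is %d%% full", "sda", 91) 
keeps its original signature, while the resulting line additionally carries the 
priority and the source location of the call site.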

Once all calls have been converted to the new C++ API, the #defines listed 
herein may be deprecated.



III  Message replacements

The table holding these replacements shall have the columns Service, Language, 
Symbol, Message. Replacements are identified by a unique key of the first three 
fields. Within a given service and language-set, messages are identified by 
error-symbol ("ER_STARTUP") as opposed to error-code (1234).
Conceptually, the dataset contained in the table is currently a changeset or 
"diff" to the data provided by Oracle: the expectation is that the user will 
largely rely on the provided messages, but elect to override a small subset of 
them.
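
The "diff" semantics can be sketched as an overlay lookup: consult the 
replacements first, keyed by (Service, Language, Symbol), then fall back on the 
provided message for that symbol. This is a minimal in-memory illustration, not 
the table implementation; DemoMessageSource and its member functions are 
hypothetical, as are any service names, languages, and message texts passed to 
it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Mirrors the unique key from the text: (Service, Language, Symbol).
typedef std::tuple<std::string, std::string, std::string> ReplacementKey;

class DemoMessageSource {
 public:
  // Registers a provided (default) message for an error symbol.
  void add_default(const std::string &symbol, const std::string &message) {
    assert(!symbol.empty());
    defaults_[symbol] = message;
  }

  // Registers a user override; a second call with the same key replaces
  // the earlier entry, which keeps the key unique.
  void add_replacement(const std::string &service,
                       const std::string &language,
                       const std::string &symbol,
                       const std::string &message) {
    replacements_[std::make_tuple(service, language, symbol)] = message;
  }

  // Overlay lookup: replacement if present, else the provided message,
  // else an empty string for an unknown symbol.
  std::string resolve(const std::string &service,
                      const std::string &language,
                      const std::string &symbol) const {
    std::map<ReplacementKey, std::string>::const_iterator r =
        replacements_.find(std::make_tuple(service, language, symbol));
    if (r != replacements_.end()) return r->second;
    std::map<std::string, std::string>::const_iterator d =
        defaults_.find(symbol);
    return d != defaults_.end() ? d->second : std::string();
  }

 private:
  std::map<std::string, std::string> defaults_;
  std::map<ReplacementKey, std::string> replacements_;
};
```

Only overridden symbols need a row in the replacement set; every other symbol 
resolves to the provided message, which matches the expectation that users 
override a small subset.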