WL#9343: Logging services: log writers
Affects: Server-8.0
—
Status: Complete
LOG WRITERS Log writers are services that accept a log event (a collection of key/value pairs describing a log event; at minimum a message and a priority) and write them to a log (a file, syslog, EventLog, etc.). Re-implement logging (basic logging to file, logging to syslog/eventlog) as services using structured logging as per WL#9323 et al. on top of the new services model (as per WL#4102 et al.). Additionally, implement at least one writer that can write all the structured data and item types supported by the new model, so that we may test. EXAMPLE I "Traditional" MySQL error log Fixed selection of infobits (timestamp, connection-ID, severity-label (Note, Warning, Error), error message) in fixed order: 2016-10-11T08:38:55.495001Z 25 [Note] mysqld (mysqld 8.0.0-dmr-debug) starting as process 7091 ... II JSON writer Arbitrary key/value pairs can be logged; keys become tags, values their content. Order is irrelevant as items are not identified by order, but by key. Entities are escaped according to standard (in this example, ", others are supported, but not shown here). "One tag per line" organization facilitates easier debugging through more meaningful diffs. Writer is future-proof as it writes items according to their storage class (string, integer, or floating point number), not according to their semantics (error number, error message, etc.). Being able to handle new types of field therefore adds no complexity and requires no change in the source. Therefore, what is logged depends not on the capabilities of the writer, but on what information is submitted when the error event is created (give or take any filtering along the way). The richer the created event, the richer the output in the JSON log. This example shows the output of a debug helper that echos every line the parser sees to the error log. { "log_priority": 2, "subsystem": "parser", "SQL_state": "XX999", "source_file": "sql_parse", "time": "2016-10-11T08:38:55.495001Z", "thread": 25, "MySQL_err_symbol": "ER_PARSER_TRACE", "log_message": "Parser saw: INSTALL COMPONENT \"file://component_log_sink_json\"", "log_label": "Note" } * Note that in this example, all keys are rather verbose for easier reading. An actual implementation might choose more concise keys to cut down on log file size.
Non-Func-Req 1 Configuration
Granular configuration of outputs is deferred until per-service system variables
are available.
Func-Req 1 "Classic" log
Reimplement "classic log to file" (timestamp, thread, message) as an internal
default for the new structure logging framework.
Func-Req 1.1 "Classic" log - item types
Ignore all log-items but a subset of
- priority (integer, aka "severity" or "log-level")
- message (string)
- label (optional string, to overwrite the default suggested by priority)
- timestamp (a string containing an ISO8601 timestamp)
- thread (postive integer identifying the connection-ID, or 0)
so that the file-format remains unchanged.
Func-Req 2 "Classic syslog"
Reimplement "classic syslog"/"classic EventLog" as loadable service.
Func-Req 2.1 "Classic syslog"/"classic EventLog" -- item types
Ignore all log-items but a subset of
- priority (integer)
- message (string -- owing to the semantics of the syslog
subsystem, this will cut off messages at \0)
The format therefore remains unchanged.
Syslog vasriants supporting richer data, such as systemd's
journal, may be supported at a later date in separate services.
Func-Req 3 JSON
Implement a JSON log writer. This should be able to print all items submitted,
and serves as a reference for rich logging. It will form the basis of most "new
error log"-related mtr test cases.
Func-Req 3.1 JSON - item types
Support all item classes (string, float, integer).
I-1 SQL syntax
NO CHANGE
I-2 Instrumentation
NO CHANGE
I-3 Error and Warnings
NO CHANGE (but see log files)
I-4 Install/Upgrade
NO CHANGE
I-5 Commercial Plugins
NO CHANGE (but it will open the door for possible commercial log writer plugins,
cf. APIs)
I-6 Replication
NO CHANGE
I-7 Protocols
NO CHANGE
I-8 Security
NO CHANGE
I-9 Log files
Log files (and, potentially, sinks other than files) will depend on the log
writer employed. The drop-in replacement should be just that ("NO CHANGE"), but
others (such as the JSON writer) will differ from the default both in format and
in content (primarily: number of fields written).
I-10 MySQL clients
NO CHANGE
I-11 Auth plugins
NO CHANGE
I-12 Globalization
NO CHANGE
I-13 Configuration
NO CHANGE -- perhaps unexpectedly, configuration is intended to be as close as
possible to the current system ("drop-in replacement"). This may be amended
later as per-component system variables become available.
I-14 File formats
NO CHANGE (except log files, see there)
I-15 Plugins/APIs
Offer service API for log-writers.
I-16 Other programs
NO CHANGE
I-17 Storage engines
NO CHANGE
1 ITEM "MESSAGE"
1.1 Substitutions
At the time the log sink is called, the "message" item (a format string, or an
error-code implying a format string) will have been filled in, i.e., all %
substitutions have been made. The log sink should consider all % as literals,
not substitutions.
1.2 NUL ('\0')
Some log sinks (such as syslog or Windows EventLog) have an API as the final
destination that uses C-strings ('\0' terminated const char *). While well-known
error messages (i.e. registered format strings) themselves would be compatible
with this, and even ad hoc messages (but not necessarily other string-type log
items) SHALL follow this standard, the same is not guaranteed for the results of
all substitutions, or across all character sets. Such log writers as cannot
guarantee an unbroken chain of (const char *, size_t) style interfaces should
therefore call a built-in (to the server proper) convenience service that will
escape \0 bytes and make sure the string as \0 terminated.
static DEFINE_METHOD(int, sanitize, (log_item *li));
2 LOCKING
If a writer writes to something other than file (or does not use the current
facilities for that), it will have to arrange for its own locking where needed.
3 DEFAULT ERROR LOG FILE / DEFAULT ERROR LOG STREAM
During prototyping, if a writer wishes to write to the default error log stream
(stderr, often file), it may use built-in primitives of the log service to
achieve this. This behavior will be discouraged once service configuration
variables become available. As this is contingent on WL#9424, it is beyond the
scope of this initial WL draft.
static DEFINE_METHOD(void, write_errstream, (const char *buffer,
size_t length));
All users of the default error log stream enjoy locking (vs the same writer in
other threads, and other writers targeting the same stream). That said,
multiplexing several formats into the same output stream is of interest for
debugging only; production use will inherit the current trunk's limitations
(logging to stderr/file and/or logging to syslog (UNIX) or EventLog (Windows)).
This limitation can be lifted on WL#9424 becomes available and lets us configure
writers (and their targets) individually.
4 CONFIGURATION
Some writers require configuration. To maintain compatibility with current
behaviour, the syslog/EventLog writer must heed the settings for log_syslog_tag
and log_syslog_facility in the server. These variables can be deprecated once
per-component system variables (WL#9424) become available, and be replaced with
per-component ones at that time.
5 LOG SINKS: LOG WRITERS
5.1 TYPES OF LOG WRITERS
There are only two types of log writers. "Modern sinks" like the JSON
writer don't care whether a key/value pair is a table name or
an error message, or what -- a string is a string is a string. These
sinks have to cover just three cases: format as string, format as
integer, format as float. Trivial.
The other type is sinks that write to a fixed format, like the MySQL
error log. The error log as you'll remember has a timestamp,
connection ID, severity label, and an error message, in that exact order.
The traditional error log sink ("log_sink_trad") therefore doesn't
attempt to process all the information we might send its way; it's
all about knowing how to retrieve the few items it knows it's interested
in, and printing them (in the prescribed order). All other information
is ignored. Still simple, just a different kind of simple.
5.2 INPUT TO SINKS ("What to expect from a log event")
So as we've seen, it's either "handle 3 storage classes" (which is
simple) or "handle as many specific item types are you're interested
in". Both kinds of writers are in fact very basic. The first group
(which handles any and all items) is actually the simplest, and perhaps
easier to read and understand than the second group, as they're little
more than a glorified
switch (item->item_class) {
case STRING: sprintf(buf, "%s", item->value.data_string);
break;
case INTEGER: sprintf(buf, "%d", item->value.data_integer);
break;
case FLOAT: sprintf(buf, "%f", item->value.data_float);
break;
}
Anything beyond that in a given source file is usually the > 100 lines
of boilerplate needed by the component framework (but those are going
to be there no matter how simple or complex our actual writer is), and
code to handle configuration variables (if any).
The code for a baseline "modern" writer amounts to about 35 lines of code,
plus boilerplate. The JSON sink is slightly longer, as it needs to
get the formatting and escaping and tags and entities right. Still,
it is still a single function of only about 100 lines.
5.3 ERROR MESSAGES THAT AREN'T STRINGS, AND OTHER INSANITY
So what if a service is looking for the error message, but it turns
out that message is not a string? Won't that add complexity?
Well, first off, the well-know items (log message, error number, ...)
all also have a well-known storage class, so an error message for example
shall always be of string class. Our code (submission API, filter, ...)
ensures this. Additionally, there is code in the server to help guard
against broken modules that violate these guarantees, so the sinks won't
have to check that themselves, either.
In summary, if a case like that happens, it's a bug, and it's a bug that
we have tools to detect. The expectation is NOT that services have to
convert data-types on the fly and be able to handle error messages that
are numbers, or integers that come wrapped in a string, or any such
nonsense. It is true that modern writers like the JSON one gracefully
handle this case at zero overhead (as they only look at storage class,
not type or semantics), but this is very specifically NOT required of
any service.
5.4 OPTIMIZATIONS
Log-writers (in contrast to log sources and log filters) are not expected to
change the log event. A writer may elect to skip a field, or to synthesize or
transform data, but this view on the data must remain private to the writer. As
a result of this, a future iteration of the framework could elect to execute the
writers in concurrent threads. (As correctly pointed out by Praveen. :)
5.5 FLUSH ERROR LOGS
Log-services MAY implement a function flush() that, if present, is called on
FLUSH ERROR LOGS. An obvious use here would be to flush, close, and re-open a
log file to aid in log rotation.
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.