WL#9343: Logging services: log writers
Affects: Server-8.0
—
Status: Complete
LOG WRITERS Log writers are services that accept a log event (a collection of key/value pairs describing a log event; at minimum a message and a priority) and write them to a log (a file, syslog, EventLog, etc.). Re-implement logging (basic logging to file, logging to syslog/eventlog) as services using structured logging as per WL#9323 et al. on top of the new services model (as per WL#4102 et al.). Additionally, implement at least one writer that can write all the structured data and item types supported by the new model, so that we may test. EXAMPLE I "Traditional" MySQL error log Fixed selection of infobits (timestamp, connection-ID, severity-label (Note, Warning, Error), error message) in fixed order: 2016-10-11T08:38:55.495001Z 25 [Note] mysqld (mysqld 8.0.0-dmr-debug) starting as process 7091 ... II JSON writer Arbitrary key/value pairs can be logged; keys become tags, values their content. Order is irrelevant as items are not identified by order, but by key. Entities are escaped according to standard (in this example, ", others are supported, but not shown here). "One tag per line" organization facilitates easier debugging through more meaningful diffs. Writer is future-proof as it writes items according to their storage class (string, integer, or floating point number), not according to their semantics (error number, error message, etc.). Being able to handle new types of field therefore adds no complexity and requires no change in the source. Therefore, what is logged depends not on the capabilities of the writer, but on what information is submitted when the error event is created (give or take any filtering along the way). The richer the created event, the richer the output in the JSON log. This example shows the output of a debug helper that echos every line the parser sees to the error log. { "log_priority": 2, "subsystem": "parser", "SQL_state": "XX999", "source_file": "sql_parse", "time": "2016-10-11T08:38:55.495001Z", "thread": 25, "MySQL_err_symbol": "ER_PARSER_TRACE", "log_message": "Parser saw: INSTALL COMPONENT \"file://component_log_sink_json\"", "log_label": "Note" } * Note that in this example, all keys are rather verbose for easier reading. An actual implementation might choose more concise keys to cut down on log file size.
Non-Func-Req 1 Configuration Granular configuration of outputs is deferred until per-service system variables are available. Func-Req 1 "Classic" log Reimplement "classic log to file" (timestamp, thread, message) as an internal default for the new structure logging framework. Func-Req 1.1 "Classic" log - item types Ignore all log-items but a subset of - priority (integer, aka "severity" or "log-level") - message (string) - label (optional string, to overwrite the default suggested by priority) - timestamp (a string containing an ISO8601 timestamp) - thread (postive integer identifying the connection-ID, or 0) so that the file-format remains unchanged. Func-Req 2 "Classic syslog" Reimplement "classic syslog"/"classic EventLog" as loadable service. Func-Req 2.1 "Classic syslog"/"classic EventLog" -- item types Ignore all log-items but a subset of - priority (integer) - message (string -- owing to the semantics of the syslog subsystem, this will cut off messages at \0) The format therefore remains unchanged. Syslog vasriants supporting richer data, such as systemd's journal, may be supported at a later date in separate services. Func-Req 3 JSON Implement a JSON log writer. This should be able to print all items submitted, and serves as a reference for rich logging. It will form the basis of most "new error log"-related mtr test cases. Func-Req 3.1 JSON - item types Support all item classes (string, float, integer).
I-1 SQL syntax NO CHANGE I-2 Instrumentation NO CHANGE I-3 Error and Warnings NO CHANGE (but see log files) I-4 Install/Upgrade NO CHANGE I-5 Commercial Plugins NO CHANGE (but it will open the door for possible commercial log writer plugins, cf. APIs) I-6 Replication NO CHANGE I-7 Protocols NO CHANGE I-8 Security NO CHANGE I-9 Log files Log files (and, potentially, sinks other than files) will depend on the log writer employed. The drop-in replacement should be just that ("NO CHANGE"), but others (such as the JSON writer) will differ from the default both in format and in content (primarily: number of fields written). I-10 MySQL clients NO CHANGE I-11 Auth plugins NO CHANGE I-12 Globalization NO CHANGE I-13 Configuration NO CHANGE -- perhaps unexpectedly, configuration is intended to be as close as possible to the current system ("drop-in replacement"). This may be amended later as per-component system variables become available. I-14 File formats NO CHANGE (except log files, see there) I-15 Plugins/APIs Offer service API for log-writers. I-16 Other programs NO CHANGE I-17 Storage engines NO CHANGE
1 ITEM "MESSAGE" 1.1 Substitutions At the time the log sink is called, the "message" item (a format string, or an error-code implying a format string) will have been filled in, i.e., all % substitutions have been made. The log sink should consider all % as literals, not substitutions. 1.2 NUL ('\0') Some log sinks (such as syslog or Windows EventLog) have an API as the final destination that uses C-strings ('\0' terminated const char *). While well-known error messages (i.e. registered format strings) themselves would be compatible with this, and even ad hoc messages (but not necessarily other string-type log items) SHALL follow this standard, the same is not guaranteed for the results of all substitutions, or across all character sets. Such log writers as cannot guarantee an unbroken chain of (const char *, size_t) style interfaces should therefore call a built-in (to the server proper) convenience service that will escape \0 bytes and make sure the string as \0 terminated. static DEFINE_METHOD(int, sanitize, (log_item *li)); 2 LOCKING If a writer writes to something other than file (or does not use the current facilities for that), it will have to arrange for its own locking where needed. 3 DEFAULT ERROR LOG FILE / DEFAULT ERROR LOG STREAM During prototyping, if a writer wishes to write to the default error log stream (stderr, often file), it may use built-in primitives of the log service to achieve this. This behavior will be discouraged once service configuration variables become available. As this is contingent on WL#9424, it is beyond the scope of this initial WL draft. static DEFINE_METHOD(void, write_errstream, (const char *buffer, size_t length)); All users of the default error log stream enjoy locking (vs the same writer in other threads, and other writers targeting the same stream). That said, multiplexing several formats into the same output stream is of interest for debugging only; production use will inherit the current trunk's limitations (logging to stderr/file and/or logging to syslog (UNIX) or EventLog (Windows)). This limitation can be lifted on WL#9424 becomes available and lets us configure writers (and their targets) individually. 4 CONFIGURATION Some writers require configuration. To maintain compatibility with current behaviour, the syslog/EventLog writer must heed the settings for log_syslog_tag and log_syslog_facility in the server. These variables can be deprecated once per-component system variables (WL#9424) become available, and be replaced with per-component ones at that time. 5 LOG SINKS: LOG WRITERS 5.1 TYPES OF LOG WRITERS There are only two types of log writers. "Modern sinks" like the JSON writer don't care whether a key/value pair is a table name or an error message, or what -- a string is a string is a string. These sinks have to cover just three cases: format as string, format as integer, format as float. Trivial. The other type is sinks that write to a fixed format, like the MySQL error log. The error log as you'll remember has a timestamp, connection ID, severity label, and an error message, in that exact order. The traditional error log sink ("log_sink_trad") therefore doesn't attempt to process all the information we might send its way; it's all about knowing how to retrieve the few items it knows it's interested in, and printing them (in the prescribed order). All other information is ignored. Still simple, just a different kind of simple. 5.2 INPUT TO SINKS ("What to expect from a log event") So as we've seen, it's either "handle 3 storage classes" (which is simple) or "handle as many specific item types are you're interested in". Both kinds of writers are in fact very basic. The first group (which handles any and all items) is actually the simplest, and perhaps easier to read and understand than the second group, as they're little more than a glorified switch (item->item_class) { case STRING: sprintf(buf, "%s", item->value.data_string); break; case INTEGER: sprintf(buf, "%d", item->value.data_integer); break; case FLOAT: sprintf(buf, "%f", item->value.data_float); break; } Anything beyond that in a given source file is usually the > 100 lines of boilerplate needed by the component framework (but those are going to be there no matter how simple or complex our actual writer is), and code to handle configuration variables (if any). The code for a baseline "modern" writer amounts to about 35 lines of code, plus boilerplate. The JSON sink is slightly longer, as it needs to get the formatting and escaping and tags and entities right. Still, it is still a single function of only about 100 lines. 5.3 ERROR MESSAGES THAT AREN'T STRINGS, AND OTHER INSANITY So what if a service is looking for the error message, but it turns out that message is not a string? Won't that add complexity? Well, first off, the well-know items (log message, error number, ...) all also have a well-known storage class, so an error message for example shall always be of string class. Our code (submission API, filter, ...) ensures this. Additionally, there is code in the server to help guard against broken modules that violate these guarantees, so the sinks won't have to check that themselves, either. In summary, if a case like that happens, it's a bug, and it's a bug that we have tools to detect. The expectation is NOT that services have to convert data-types on the fly and be able to handle error messages that are numbers, or integers that come wrapped in a string, or any such nonsense. It is true that modern writers like the JSON one gracefully handle this case at zero overhead (as they only look at storage class, not type or semantics), but this is very specifically NOT required of any service. 5.4 OPTIMIZATIONS Log-writers (in contrast to log sources and log filters) are not expected to change the log event. A writer may elect to skip a field, or to synthesize or transform data, but this view on the data must remain private to the writer. As a result of this, a future iteration of the framework could elect to execute the writers in concurrent threads. (As correctly pointed out by Praveen. :) 5.5 FLUSH ERROR LOGS Log-services MAY implement a function flush() that, if present, is called on FLUSH ERROR LOGS. An obvious use here would be to flush, close, and re-open a log file to aid in log rotation.
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.