WL#11009: Logging services: log writers: add error ID to traditional error log
Affects: Server-8.0
—
Status: Complete
WL#9344 moved the supermajority of the server's messages to the error log out of the code proper and into the messages file, sql/share/errmsg-utf8.txt. Beside auxiliary advantages such as deduplication, facilitation of translation, and providing a stepping stone for a possible moving of the messages to a table later, the main advantage was that all messages concerned would now have a stable error symbol (e.g. ER_STARTUP) and an error code (e.g. 1451). While the JSON log writer (one of the sinks defined in WL#9343) is capable of writing either, both, or none of these items (depending on the configuration set in the filtering engine, WL#9342), it was decided that even the "traditional" error-log would benefit from having each message's ID added to it.
Func-Req 1 All error log messages output from mysqld must have anFunc-Req 2 Log format of "traditional" error log (stderr/file) Entry in the error log shall be extended so that the actual error message is preceded by ' [MY-' '] '. This will maintain human readability, and is designed so that new tools can parse the error_ID as an indvidual field if they wish, while legacy ones will consider it part of the error message (with the intention of keeping the format nominally intact, and the impact on aggregation etc. as small as possible).
I-1 SQL syntax NO CHANGE I-2 Instrumentation NO CHANGE I-3 Error and Warnings NO CHANGE (except default log format, see I-9, log files) I-4 Install/Upgrade NO CHANGE I-5 Commercial Plugins NO CHANGE (but designed to be compatible with any future changes further empowering loadable plugins/components, allowing the latter to register namespaced error messages) I-6 Replication NO CHANGE I-7 Protocols NO CHANGE I-8 Security NO CHANGE I-9 Log files The "traditional" error log (file/stderr) will be extended with an error message ID. This new field be precede the actual human-readable message string, with the intention of both keeping the logs human-readable, and of remaining compatible with legacy log analyzers / aggregators etc. See below for full specification, analysis, and rationale. I-10 MySQL clients NO CHANGE I-11 Auth plugins NO CHANGE (but designed to be compatible with any future changes further empowering loadable plugins/components, allowing the latter to register namespaced error messages) I-12 Globalization NO CHANGE I-13 Configuration NO CHANGE I-14 File formats NO CHANGE (except log files, see there) I-15 Plugins/APIs NO CHANGE (but designed to be compatible with any future changes further empowering loadable plugins/components, allowing the latter to register namespaced error messages) I-16 Other programs Designed for compatibility with external log analyzers etc. where possible. No change for programs that do not read error logs. I-17 Storage engines NO CHANGE (but designed to be compatible with any future changes further empowering loadable plugins/components, allowing the latter to register namespaced error messages)
A few considerations that will not result in extra code or work right now, but will hopefully keep this spec future-proof, and will prevent us from painting ourselves into a corner. ERROR NUMBERING: I - STATUS QUO The error message subsystem supports a (somewhat) sparse numberspace in that understands ranges, so that e.g. each version of the server can have its own range, allowing us to add messages to older servers without breaking the numbering of the newer ones by "pushing the newer messages up to higher indices in the array." This is mostly a convenience for us however, with ranges having little meaning to the DBA: we don't have ranges per sub-system, and unrelated messages are simple appended (to a given server-version's message list, within that server version's number range) in the order they come in. As a result, the numbers are for all practical purposes opaque; you can't look at an error-code and know, "it's between 1000 and 2000, that's from replication, then." ERROR NUMBERING: II - CHALLENGES IN THE FUTURE If we enrich each log-line by an error identifier, it would be nice if not just the server proper, but loadable plug-ins and components as well could supply this information with relative ease. We could perhaps assign a range to each of our own plug-ins, but that would not solve the issue for 3rd party offerings. (In fact, even within ORCL the idea of all subsystems using errmsg-utf8.txt is less than universally popular.) Since we probably wouldn't want to be a central registry for all plugins/components and their error ranges, we might need a system later that will allow loadable objects to maintain their own error lists, and register them on load. We could then assign ranges dynamically on load, but that would mean that codes might change every time the DBA changes their server's configuration, which would make these codes inconsistent across logs. Hence, at such a time in the future as this issue arises, the correct way of handling this will be to namespace the error codes, thus enabling each plug-in to start their range at 1, if they wish, as "FooPlug:1" will be disjunct from the server's "1". ON DISK FORMAT / PARSING BY THIRD PARTY TOOLS: I - FIELDS The traditional log is currently printed as len= snprintf(buff, sizeof(buff), "%.*s %u [%.*s] %.*s", (int) ts_len, iso_timestamp, thread_id, (int) label_len, label, (int) msg_len, msg); that is, fields are separated by spaces (rather than tabs etc.). The last field is the message, which may itself contain white space; therefore any tools operating on this log are likely to parse it as "1st field, timestamp; 2nd field, thread-ID; 3rd field, severity-label; all the rest, message." As the error-ID semantically belongs with the message anyway, the correct place for it is therefore right in front of it, separated by a space:This is not only the format that would make most immediate sense to a human reader, it also means that a tool that is not aware of this new, extended format would consider the error_id part of the message, i.e. parsing and aggregation should still work as expected, keeping the format backwards-compatible. At the same time, more current tools could consider the error_id a separate field where this makes sense. It is suggested to use square brackets for the error_id (as we do for the severity) to enable such tools to automatically decide which format a given log is in: 2017-06-30T08:33:37.659435Z 1 [Note] [MY-003629] Found data dictionary with version 1 ------------1-------------- 2 ---3-- -------------4------------- ------------1-------------- 2 ---3-- -----4----- ------5-------- In this example, the first line shows an arbitrary log entry, the second line shows how this entry would be parsed by old tools, and the third line shows how it would be parsed by new tools. For comparison, an older server's log would not show the error-ID, like so: 2017-06-30T08:33:37.659435Z 1 [Note] Found data dictionary with version 1 ------------1-------------- 2 ---3-- -----------------4------------------ ------------1-------------- 2 ---3-- -----------------5------------------ Here, old tools would work as they always have (line 2), while new tools could detect the old log format (id est, the absence of the error-ID, or what they'd consider "field 4") by the fact that the field after field 3 does not start with an opening square bracket. ON DISK FORMAT / PARSING BY THIRD PARTY TOOLS: II - ERROR-ID FIELD As noted above, the actual error-ID is for all intents and purposes an opaque identifier (preceded by a '[', succeeded by a ']'), and it is of paramount importance that this be reflected in the public documentation. This not only helps with keeping the tools compatible as outlined above; it also allows us to extend and modify the format at a later date without violating these specs. At this point in time, a numeric format prefixed with 'MY-' is chosen for the identifier (6 digits with leading zeros, for better alignment and therefore enhanced readability of the logs), but as identifiers are opaque, tools should treat this field as a string-literal all the same. The rationale to print "MY" in front of error log numbers is that users can search for the same to find appropriate MySQL documentation. Therefore, the error-ID is now implemented as, error_code ::= '[' {6} ] prefix ::= 'MY-' digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 but will be 100 % upwards compatible with a namespaced format if and when the need arises in the future: error_id ::= '[' [ ':' ] error_code_in_subsystem ']' subsystem ::= [ | | '_' | '.' ] error_code_in_subsytem ::= [ | | '_' | '.' ] ACTUAL CODE After all these considerations, the actual code-change is very minor, and expected to look similar to len= snprintf(buff_line, sizeof(buff_line), "%.*s %u [%.*s] [MY-%06u] %.*s", (int) ts_len, iso_timestamp, thread_id, (int) label_len, label, errcode, (int) msg_len, msg); for the time being.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.