WL#9342: Logging services: log filter (filtering engine)
Affects: Server-8.0
—
Status: Complete
LOG FILTER/S
The umbrella WL#9323 defines "logging TNG", which has as one of its main goals
"structured logging" (i.e. log entries that have data beyond a single plaintext
log message, such as a separate error number, etc.). It stands to reason that if
we have these rich data, it becomes easy and desirable to filter them. This
implies the following points of order:
- implement a new filtering engine that can handle the structured log events
  defined in WL#9323. The first implementation aims to maintain compatibility
  with the features it replaces; its primary goal is to change the code over to
  the new model.
- that filtering engine will be the built-in default, so the user cannot
  accidentally misconfigure their system to have no filters at all.
- by default, the engine tries to emulate 5.7 behavior (heeding the current
  configuration variable --log-error-verbosity, setting up the same
  rate-limiting for selected "spammy" messages, and so on)
- throttles for binlog.cc, connection_handler_per_thread.cc, log_event.cc
- the logging framework shall expose the necessary calls for plug-in services
  to implement a filter. It should be possible to run such a filter instead of
  the built-in default. This will allow users to offer alternative, more
  powerful filtering, allow power-users to create custom filter plug-in
  services, and so on.
Preamble
A filter service has two major functional parts; it has to identify log messages
it needs to act on (e.g. select "informational" type messages when a low
verbosity is configured), and then apply an action to them (in this case,
suppress the line entirely).
In a manner of speaking, we support selection and projection in the baseline
implementation.
Selection: The primary concern of a filter is the log message (aka "log line"). The
filter will usually use certain fields (such as "error code" or "priority") to
make its decisions on whether or not an entire line should appear (or be gagged,
throttled, etc.). This functionality can be used to emulate previous behaviour,
and more.
Projection: Now that a log entry may have multiple fields (such as error-code,
message, etc.), the filter service may discard individual fields at its
choosing. Operating at field level constitutes new behaviour (as previously,
there was only one data-item, a plain-text error message).
Finally, an "action" may also generate synthetic new fields.
The filter service is provided with key-value pairs describing the fields, a
count of items in the collection, and a bit vector of seen types in
this collection. It may modify these according to the following rules:
Non-Func-Req 1 Actions
Actions are applied to entire log lines (e.g. "suppress"), or to individual
key/value pairs within a log line (e.g. "delete field"). Filter services may
implement different types of actions, for instance, a service may offer to
rewrite messages of the string type. The "stock" filter service specified in
this WL MUST implement the actions needed to replicate current behaviour
("suppress line" for rate-limiting and verbosity-filtering based on error
severity). A service MAY implement actions beyond that (e.g. for string editing,
such as to apply a 'basename' type operation to a string containing a file path
and name etc.).
Non-Func-Req 1.1 Suppression of log lines
A filter MAY elect to discard a prospective log line in its entirety.
Func-Req 1.1 Suppression of log lines
Suppression of entire lines SHALL be available to the user as emulation of the
previous log_error_verbosity and error rate throttling functionalities.
Non-Func-Req 1.2 Suppression of key/value pairs
A filter MAY remove individual key/value pairs from the collection.
(It MUST adjust the item count if it does so. It SHOULD adjust the bit vector of
seen types if it removes the last item of a given type.)
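The bookkeeping described above can be illustrated with a minimal sketch. It is
not the server's actual data structure: the names ItemType, Item, LogLine and
remove_at are hypothetical, values are simplified to strings, and the vector's
size stands in for the explicit item count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical item types, one bit each so they can be OR-ed into `seen`.
enum ItemType : std::uint32_t {
  ITEM_PRIO = 1u << 0,
  ITEM_ERRCODE = 1u << 1,
  ITEM_MESSAGE = 1u << 2
};

struct Item {
  ItemType type;
  std::string key;
  std::string value;
};

struct LogLine {
  std::vector<Item> items;  // items.size() stands in for the item count
  std::uint32_t seen = 0;   // bit vector of the types present in `items`

  void add(Item item) {
    seen |= item.type;
    items.push_back(std::move(item));
  }

  // Removes one key/value pair; returns false if the index is out of range.
  bool remove_at(std::size_t index) {
    if (index >= items.size()) return false;
    const ItemType type = items[index].type;
    items.erase(items.begin() + static_cast<std::ptrdiff_t>(index));
    // Clear the type bit only if this was the last item of that type.
    const bool type_remains =
        std::any_of(items.begin(), items.end(),
                    [type](const Item &i) { return i.type == type; });
    if (!type_remains) seen &= ~static_cast<std::uint32_t>(type);
    return true;
  }
};

// Usage: two message-type items; the bit survives until both are gone.
void demo_remove() {
  LogLine line;
  line.add({ITEM_MESSAGE, "msg", "first"});
  line.add({ITEM_MESSAGE, "note", "second"});
  line.add({ITEM_ERRCODE, "err_code", "15"});
  line.remove_at(0);
  assert(line.items.size() == 2 && (line.seen & ITEM_MESSAGE) != 0);
  line.remove_at(0);
  assert(line.items.size() == 1 && (line.seen & ITEM_MESSAGE) == 0);
}
```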
Func-Req 1.2 Suppression of key/value pairs
Where 1.1 deals with suppression of entire "log lines", an implementation MAY
also suppress individual key/value pairs within those lines, i.e. it helps us
gag individual fields within a line.
Non-Func-Req 1.3 Creation of key/value pairs
A filter MAY add key/value pairs to a log line. Since "newer" items override
older ones, this lets us override defaults generated by log item sources.
Func-Req 1.3 Creation of key/value pairs
A simple example for this would be to select messages with a specific error
code, and to override their label: one issue with the previous logic was that
very important informational messages, such as "server ready on port ...", would
either get suppressed by commonly used verbosity settings because of their
"informational" classification, or would have to be tagged with ERROR_LEVEL,
guaranteeing their appearance in the log, but forcing the misleading label of
"ERROR" when none had occurred.
NB Not all log writer plugins may support separate labels and severities.
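As a rough illustration of "newer items override older ones", the sketch below
appends a new label item when a line carries a given error code, and resolves
lookups from the newest item backwards. Everything in it is an assumption made
for the example: the helper names, the string-only values, the error code 1234
and the label "System" do not come from this WL.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A log line as an ordered list of key/value pairs (strings only, for brevity).
using Items = std::vector<std::pair<std::string, std::string>>;

// Newest item wins: search from the back of the list.
const std::string *lookup(const Items &items, const std::string &key) {
  for (auto it = items.rbegin(); it != items.rend(); ++it)
    if (it->first == key) return &it->second;
  return nullptr;
}

// Hypothetical rule: if the line has the given error code, add a label item
// that overrides whatever label the log item source generated.
void relabel_if_code(Items &items, const std::string &code,
                     const std::string &label) {
  const std::string *found = lookup(items, "err_code");
  if (found != nullptr && *found == code) items.emplace_back("label", label);
}

// Usage: the original label stays in the list but is shadowed by the new one.
void demo_relabel() {
  Items line{{"label", "Error"}, {"err_code", "1234"}, {"msg", "server ready"}};
  relabel_if_code(line, "1234", "System");
  assert(*lookup(line, "label") == "System");
}
```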
Non-Func-Req 1.4 Throttling/rate-limiting
The default filter should offer rate-limiting similar to the existing
Log_throttle class's as one of its functionalities. Rate-limiting should be
available as an ACTION-VERB for any CONDITION. That is to say, the CONDITION may
create various equivalence classes, such as:
"allow only a limited number of messages with error code 15 per minute (but
leave messages with other error codes unaffected)"
"allow only a limited number of messages of the information-type per minute (but
leave messages with higher priorities, such as errors, unaffected)"
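The first example can be sketched as a per-error-code throttle. This is not the
Log_throttle class and not the filter's real rule format; Throttle,
ThrottleByCode and limit_code are hypothetical names, and the caller passes the
current time in seconds so the behaviour is easy to follow.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Allows at most `limit` lines per window of `window_len` seconds.
class Throttle {
 public:
  Throttle(std::uint32_t limit, std::uint64_t window_len)
      : limit_(limit), window_len_(window_len) {}

  bool allow(std::uint64_t now) {
    if (now >= window_end_) {  // window expired: open a new one
      window_end_ = now + window_len_;
      seen_ = 0;
    }
    return ++seen_ <= limit_;
  }

 private:
  std::uint32_t limit_;
  std::uint64_t window_len_;
  std::uint64_t window_end_ = 0;
  std::uint32_t seen_ = 0;  // lines seen in the current window
};

// The condition ("error code equals N") selects the equivalence class;
// lines with other error codes are never throttled.
class ThrottleByCode {
 public:
  void limit_code(int err_code, std::uint32_t limit, std::uint64_t window_len) {
    rules_.emplace(err_code, Throttle(limit, window_len));
  }
  bool allow(int err_code, std::uint64_t now) {
    auto it = rules_.find(err_code);
    return it == rules_.end() ? true : it->second.allow(now);
  }

 private:
  std::map<int, Throttle> rules_;
};

// Usage: two lines with error code 15 per 60 seconds.
void demo_throttle() {
  ThrottleByCode filter;
  filter.limit_code(15, 2, 60);
  assert(filter.allow(15, 0) && filter.allow(15, 1));
  assert(!filter.allow(15, 2));  // third line in the window is suppressed
  assert(filter.allow(16, 2));   // other error codes are unaffected
}
```

A real implementation would also count the suppressed lines so that a summary
message can be written when the window closes.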
Func-Req 1.4 Throttling/rate-limiting
In the initial implementation, the default filter's rate-limiting shall
replace/emulate the previous use of the Log_throttle class for error messages.
Summary messages will be standardized; this may require updates to the .result
files of some tests.
Section 2 Filters
Non-Func-Req 2.1 Conditions
A condition consists of a comparator ("equals" etc.) and a reference
item to compare it with:
A condition with a comparator of "greater than" and a reference item that has a
type of LOG_PRIORITY and a value of 0 will match all log-lines containing a
LOG_PRIORITY item with a value of 1 or 2.
When testing just for presence or absence of an item with a given key, a value
need not be set on the reference item.
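A condition of this kind might look as follows. This is a sketch for integer
values only, written against C++17; the names Cmp, Condition and matches are
hypothetical, and an empty std::optional models "the log line has no item with
this key".

```cpp
#include <cassert>
#include <optional>

enum class Cmp { EQ, NE, LT, LE, GT, GE, PRESENT, ABSENT };

struct Condition {
  Cmp cmp;
  long long reference = 0;  // not consulted for PRESENT / ABSENT
};

// `value` is empty when the log line has no item of the requested type/key.
bool matches(const Condition &c, const std::optional<long long> &value) {
  if (c.cmp == Cmp::PRESENT) return value.has_value();
  if (c.cmp == Cmp::ABSENT) return !value.has_value();
  if (!value) return false;  // nothing to compare against
  switch (c.cmp) {
    case Cmp::EQ: return *value == c.reference;
    case Cmp::NE: return *value != c.reference;
    case Cmp::LT: return *value < c.reference;
    case Cmp::LE: return *value <= c.reference;
    case Cmp::GT: return *value > c.reference;
    case Cmp::GE: return *value >= c.reference;
    default: return false;
  }
}

// Usage: "priority greater than 0" matches priorities 1 and 2, but not 0.
void demo_condition() {
  const Condition gt0{Cmp::GT, 0};
  assert(!matches(gt0, 0LL));
  assert(matches(gt0, 1LL) && matches(gt0, 2LL));
}
```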
Non-Func-Req 2.1.1 Data types in conditions
The default plugin must at minimum support comparison of integer-form data as is
required to filter based on log line severity/priority, and to throttle based on
error number. A log filter service may implement further comparisons, e.g. for
string-type data.
Func-Req 2.2 Conditions
The default filter service MUST be able to model at least the cases required to
emulate pre-patch behaviour, i.e. it must be able to select lines for throttling
(based on error code) and for suppression (based on log_error_verbosity). It MAY
implement further comparators.
I-1 Semantics
NO CHANGE, until service configuration is more clearly defined. The first
implementation of the filter will initially support the current UI
(log_error_verbosity system variable) for configuration, making it a drop-in
replacement for the current technology. The server will set up rules to
replicate the current throttling behaviour etc.
I-2 Instrumentation
NO CHANGE
I-3 Error and Warnings
YES. Summaries of rate-limiting should be uniform across throttled error
messages.
I-4 Install/Upgrade
NO CHANGE
I-5 Commercial plugins
NO CHANGE. Changing filtering to be a service opens the way for commercial
plugins, but creating one is beyond the scope of this WL entry.
I-6 Replication
NO CHANGE
I-7 Xprotocol
NO CHANGE
I-8 Protocols
NO CHANGE
I-9 Security
NO CHANGE. Future filters may, however, elect to obfuscate parts of plain-text
messages, etc.
I-10 Log Files
YES, unsurprisingly. :) See I-3
I-11 MySQL clients
NO CHANGE
I-12 Auth plugins
NO CHANGE
I-13 Globalization
NO CHANGE
I-14 Configuration
NO CHANGE. Compatibility with the current method of configuration is a goal.
While the filter engine described in this WL SHALL heed the 5.7 variables, other
filter services NEED NOT.
I-15 File formats
NO CHANGE (see log writers for that)
I-16 Plugins/APIs
NO CHANGE. (Will use the new APIs introduced by the "services -- the next
generation" and "logging -- the next generation" WLs however.)
I-17 Other programs
NO CHANGE
I-18 Storage engines
NO CHANGE
I-19 Data types
NO CHANGE
1 CONFIGURATION (compatibility with --log_error_verbosity)
Some 5.7 behaviour is to be emulated in the filtering component. This can be
done with relative ease by setting up the appropriate filter rules.
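For instance, the verbosity setting could be turned into a single "suppress
line" rule. The sketch below assumes the 5.7 meaning of the variable (1 = errors
only, 2 = errors and warnings, 3 = errors, warnings and notes) and the priority
values 0, 1 and 2 used elsewhere in this WL; VerbosityRule and
rule_from_verbosity are hypothetical names, not part of the implementation.

```cpp
#include <cassert>

// Priorities, lower value = more severe (names are illustrative only).
enum Priority { PRIO_ERROR = 0, PRIO_WARNING = 1, PRIO_INFORMATION = 2 };

// Hypothetical rule: "suppress the line if its priority is greater than
// max_priority".
struct VerbosityRule {
  int max_priority;
  bool drops(int priority) const { return priority > max_priority; }
};

// Assumed mapping from log_error_verbosity (1..3) to the rule's threshold.
VerbosityRule rule_from_verbosity(int log_error_verbosity) {
  return VerbosityRule{log_error_verbosity - 1};
}

// Usage: at verbosity 2, warnings are kept and informational lines dropped.
void demo_verbosity() {
  const VerbosityRule rule = rule_from_verbosity(2);
  assert(!rule.drops(PRIO_ERROR));
  assert(!rule.drops(PRIO_WARNING));
  assert(rule.drops(PRIO_INFORMATION));
}
```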
2 CONCURRENCY
While each error has its own grab bag of fields and string buffer (and therefore
requires no locking), the built-in filter's rule-set will be shared among
concurrent calls. Therefore, the following cases are expected:
- change (clear/append/modify) rule-set
=> exclusive lock
- apply filters (i.e. check conditions; apply action/verb on match)
=> shared lock
- if a rule has internal state (e.g. throttling -- how many of the same
message have we seen in this window? when will the window end?) and
an update of this state is required, we'll need to upgrade to an
=> exclusive lock
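The three cases can be sketched with a C++17 std::shared_mutex. This is an
illustration under assumptions, not the server's locking code: Rule and RuleSet
are hypothetical, and the time window of a real throttle is left out. Note that
std::shared_mutex has no in-place upgrade, so the sketch releases the shared
lock, takes the exclusive lock, and looks the rule up again in case the rule-set
changed in between.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

struct Rule {
  int err_code;
  std::uint32_t limit;
  std::uint32_t seen = 0;  // internal state, only modified under exclusive lock
};

class RuleSet {
 public:
  // Change the rule-set: exclusive lock.
  void append(int err_code, std::uint32_t limit) {
    std::unique_lock<std::shared_mutex> guard(lock_);
    rules_.push_back(Rule{err_code, limit});
  }

  // Apply the rules: shared lock while matching, exclusive lock only when a
  // matching rule needs its state updated.
  bool allow(int err_code) {
    {
      std::shared_lock<std::shared_mutex> guard(lock_);
      if (find(err_code) == nullptr) return true;  // no rule matched
    }
    // "Upgrade": the shared lock is gone, so re-check under the exclusive one.
    std::unique_lock<std::shared_mutex> guard(lock_);
    Rule *rule = find(err_code);
    if (rule == nullptr) return true;
    return ++rule->seen <= rule->limit;
  }

 private:
  Rule *find(int err_code) {
    for (Rule &r : rules_)
      if (r.err_code == err_code) return &r;
    return nullptr;
  }

  std::shared_mutex lock_;
  std::vector<Rule> rules_;
};

// Usage (single-threaded here; the locks matter once callers run concurrently).
void demo_locking() {
  RuleSet rules;
  rules.append(15, 2);
  assert(rules.allow(15) && rules.allow(15));
  assert(!rules.allow(15));  // limit reached
  assert(rules.allow(16));   // no rule for this code
}
```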
3 FILTERING STAGE: MATCHING STAGE
At run time, the filter iterates over its rule-set. For each rule, if the
condition contains a well-known item, it looks for an item of that type in the
event. If the condition contains an ad hoc item, it looks for an item of any ad
hoc type with the given key in the event.
If there is a match, the filter will verify whether the storage class of the
value in the event and that in the condition are either both strings, or both
not. If that's not the case, it flags an error.
Otherwise, it now compares both values using the requested comparator, and
reports the result.
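The matching steps above can be sketched as follows, using C++17 std::variant to
model the two storage classes (string and non-string). The sketch is simplified
on purpose: it looks items up by key only (so it does not distinguish well-known
from ad hoc items), it implements only the "equals" comparator, and all names
(Value, Item, Match, match_equals) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Two storage classes: integer (non-string) and string.
using Value = std::variant<long long, std::string>;

struct Item {
  std::string key;
  Value value;
};

enum class Match { NOT_FOUND, CLASS_MISMATCH, NO_MATCH, MATCH };

// Looks for the reference item's key in the event, checks that both values
// are strings or both are not, then compares them for equality.
Match match_equals(const std::vector<Item> &event, const Item &reference) {
  for (const Item &item : event) {
    if (item.key != reference.key) continue;
    const bool item_is_string = std::holds_alternative<std::string>(item.value);
    const bool ref_is_string =
        std::holds_alternative<std::string>(reference.value);
    if (item_is_string != ref_is_string) return Match::CLASS_MISMATCH;
    return item.value == reference.value ? Match::MATCH : Match::NO_MATCH;
  }
  return Match::NOT_FOUND;
}

// Usage: comparing an integer item against a string reference is an error.
void demo_match() {
  const std::vector<Item> event{{"err_code", 15LL},
                                {"msg", std::string("disk full")}};
  assert(match_equals(event, Item{"err_code", 15LL}) == Match::MATCH);
  assert(match_equals(event, Item{"err_code", std::string("15")}) ==
         Match::CLASS_MISMATCH);
}
```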