WL#12653: A reference caching service
Affects: Server-8.0
—
Status: Complete
To call a service implementation one needs to:
1. query the registry to get a reference to the service needed
2. call the service via the reference
3. call the registry to release the reference

While #2 is very fast (just a function pointer call), #1 and #3 can be expensive since they need to interact with the registry's global structure in a read/write fashion. Hence, if the above sequence is to be repeated in quick succession, it is beneficial to do steps #1 and #3 just once and aggregate as many #2 steps as possible in between. This usually means caching the service reference received in #1 and delaying #3 for as long as possible.

But since an active reference to the service implementation is held until #3 is done, special handling is needed to make sure that:
1. The references are released at regular intervals so changes in the registry can become effective.
2. There is a way to mark a service implementation as "inactive" ("dying") so that no new references to it can be taken while the existing active references are being released.

All of the above is part of the current audit API machinery, but needs to be isolated into a separate service suite and made generally available to all services. This is what this worklog aims to implement.
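The cost argument above can be illustrated with a small, self-contained sketch. This is not the MySQL registry API: `ToyRegistry`, `ToyService`, `emit_naive` and `emit_batched` are hypothetical names invented for this example, and the "registry" is just a mutex-protected map that counts how often its global state is touched.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for a service implementation: one function pointer.
struct ToyService {
  int (*notify)(int event);
};

// Hypothetical stand-in for the registry: a lock-protected map with
// reference counts. Every acquire/release touches the global structure.
class ToyRegistry {
 public:
  void add(const std::string &name, ToyService *svc) {
    std::lock_guard<std::mutex> guard(lock_);
    entries_[name] = Entry{svc, 0};
  }
  // Step 1: look the service up and take a reference.
  ToyService *acquire(const std::string &name) {
    std::lock_guard<std::mutex> guard(lock_);
    ++global_ops_;
    auto it = entries_.find(name);
    if (it == entries_.end()) return nullptr;
    ++it->second.refs;
    return it->second.svc;
  }
  // Step 3: give the reference back.
  void release(const std::string &name) {
    std::lock_guard<std::mutex> guard(lock_);
    ++global_ops_;
    auto it = entries_.find(name);
    if (it != entries_.end() && it->second.refs > 0) --it->second.refs;
  }
  // Number of times the global structure was locked for acquire/release.
  std::size_t global_ops() {
    std::lock_guard<std::mutex> guard(lock_);
    return global_ops_;
  }

 private:
  struct Entry {
    ToyService *svc;
    std::size_t refs;
  };
  std::mutex lock_;
  std::map<std::string, Entry> entries_;
  std::size_t global_ops_ = 0;
};

static int handle_event(int event) { return event + 1; }

// Naive pattern: steps 1, 2 and 3 are repeated for every single event.
std::size_t emit_naive(int n) {
  ToyService svc{handle_event};
  ToyRegistry registry;
  registry.add("toy_event", &svc);
  for (int i = 0; i < n; ++i) {
    ToyService *ref = registry.acquire("toy_event");
    if (ref != nullptr) {
      ref->notify(i);  // step 2: plain function pointer call
      registry.release("toy_event");
    }
  }
  return registry.global_ops();
}

// Cached pattern: step 1 once, many step 2 calls, step 3 once.
std::size_t emit_batched(int n) {
  ToyService svc{handle_event};
  ToyRegistry registry;
  registry.add("toy_event", &svc);
  ToyService *ref = registry.acquire("toy_event");
  if (ref != nullptr) {
    for (int i = 0; i < n; ++i) ref->notify(i);
    registry.release("toy_event");
  }
  return registry.global_ops();
}
```

In this toy model the naive loop touches the global structure twice per event, while the batched loop touches it twice in total, regardless of the number of events. The price is that the reference is held for the whole batch, which is exactly why the two special-handling requirements listed above are needed.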
FR1: If there are no consumers assigned, the performance impact of trying to emit an event should be negligible.
FR2: No visible SQL layer changes will result from this worklog. The server should behave as it did before.
FR3: Calls to these services will be added to the server by one of the subsequent worklogs.
FR4: The implementation will be self-contained and can be done in a separate component.
FR5: This component will be loaded at the server startup time.
FR6: This component will be unloaded at the server shutdown time.
What I'm trying to replace
-----------------------------------
The "crown jewel" of the audit plugin API is the reference cache kept in the THD. It puts a cache that depends on the "channel" (called an event class) into each THD and consults that cache when it needs to emit an event, instead of going to the global plugin list every time.

When there are no consumers the check is lightning fast, since all it needs to do is check a bitmask of event classes. And even if there are consumers, it need not repeatedly fetch references to plugins for events of the same class.

And, since plugins can be in a "zombie state" (i.e. registered in the list, but dying), it allows one to gracefully retract a plugin from the whole caching machinery. Plugins that are unloading can mark themselves as zombies, and the release of the last reference to them from the cache(s) will unload them.

What I want to replace it with
----------------------------------------
I allow creating a bunch of reference caching channels. One caching channel relates to many reference caches. The channel is basically a device for somebody to announce that they want all the caches on that channel invalidated. A channel is usually coupled with a component service.

These channels are created at startup and are to replace the global static bitmask lists. Now if one wants to subscribe to events, all they need to do is implement the service. The caches will pick it up when they refresh and start calling it.

Note that the signature of the parameters is not fixed in the caching API (as it is in the audit plugin class). It is determined by the way the service is defined. Also, the number of channels is not static (as it is in the audit plugin class).

Once the channel is created, each thread wanting to emit events fast will create one service cache instance at thread startup and will keep the handle to it for as long as it wants to emit events.
When an event is to be emitted the thread will invoke a method of the reference cache that will:
- fill in the cache if invalid or empty
- return the contents of the cache (service references to the consumer implementations) to the thread to call.

Now this call will be relatively fast too: consult the cache validity flag (thread local) and the channel validity flag (global, I'm thinking atomic). So it should be on par with the audit plugin event emit code.

Many more details are in the header below: how to gracefully shut down an event consumer, etc.

-------------------- reference_caching.h
/* Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License, version 2.0,
as published by the Free Software Foundation.

This program is also distributed with certain software (including
but not limited to OpenSSL) that is licensed under separate terms,
as designated in a particular file or component or in included license
documentation. The authors of MySQL hereby grant you an additional
permission to link the program and your derivative works with the
separately licensed software that they have included with MySQL.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License, version 2.0, for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */

#ifndef REFERENCE_CACHING_H
#define REFERENCE_CACHING_H

#include
#include

/**
  Handle for "producer channels".
  A producer channel is the singleton used to emit events that are subject
  to reference caching.
  This is needed to hold a flag that will trigger forced invalidation
  of all caches in a particular channel on their next access.
  Multithreaded access is expected to be safe and lock-less to this handle
  once it's instantiated.
*/
DEFINE_SERVICE_HANDLE(reference_caching_channel);

/**
  A handle of a "reference cache". This is created by all events producer
  threads when they start and is maintained throughout their lifetimes.
  Operations on that are not thread safe and are assumed to be done by a
  single thread.
*/
DEFINE_SERVICE_HANDLE(reference_caching_cache);

/**
  @ingroup group_components_services_inventory
*/
/**
  A reference caching channel service.

  The goal of this service is to ensure that event producers spend a MINIMAL
  amount of time in the emitting code when there are no consumers of the
  produced events.

  Terminology
  -----------
  An event consumer is an implementation of a component service.
  An event producer is a piece of code that wants to inform all currently
  registered event consumers about an event.
  A channel is a device that serves as a singleton of all reference caches
  for all threads that are to produce events.
  A reference cache is something each thread producing events must maintain
  for its lifetime and use it to find the references to event consumers it
  needs to call.

  Typical lifetime of the event consumers
  ---------------------------------------
  At init time the event consumer will register implementations of any of
  the services it's interested in receiving notifications for.

  Then optionally it might force a channel invalidation to make sure all
  existing event producers will start sending notifications to it
  immediately. Or it can just sit and wait for the natural producer's
  reference cache refresh cycles to kick in.

  Now it is receiving notifications from all event producers as they come.

  When it wishes to no longer receive notifications it needs to mark itself
  as invisible for reference cache fill-ins.
  In this way all reference cache flushes will not pick this implementation
  up even if it's still registered in the registry.
  Eventually all active references will be removed by the natural producers'
  flush cycles. The consumer may choose to expedite this by triggering a
  channel invalidation.

  As with all service implementations, when all references to the services
  are released it can unload.

  Typical lifetime of the event producers
  ---------------------------------------
  An event producer will usually start at init time by creating channel(s).

  Then, for each thread wishing to produce events, a reference cache must be
  created and maintained until the thread will no longer be producing events.

  Now the thread can produce events using the reference cache.
  This is done by calling the get method and then iterating over the
  resulting set of references and calling each one in turn as one would
  normally do for registry service references.
  It is assumed that the references are owned by the cache and thus they
  should not be released.

  With some cyclicity (e.g. at the end of each statement or something) the
  event producing thread needs to flush the cache. This is to ensure that
  references to event consumers are not held for very long and that new
  event consumers are picked up. However flushing the cache is a relatively
  expensive operation and thus a balance between the number of events
  produced and the cache being flushed must be achieved.

  General remarks
  ---------------
  Channels are something each event producer must have to produce events.
  Channels are to be created by a single thread before the first event is
  ever produced. And, once created they are to be kept until after the last
  event is produced.

  Channels serve as singletons for caches and you only need one channel
  instance per event producer component. There usually will be multiple
  caches (one per event producing thread) per channel.
  Creating and destroying a channel is a relatively "expensive" operation
  that might involve some synchronization and should not be done frequently.

  Channels exist to allow a non-related thread to trigger invalidation of
  all currently active caches on that channel. This is necessary when, for
  example, event consumers are registered or are about to be removed.

  Invalidating a channel is a thread-safe operation that can be invoked
  without synchronization at any time.

  Each channel is associated with a specific service name.
  @note It is a service name, not an implementation name!
  The name is stored and used in event caches to handle all implementations
  of that particular service.
*/
BEGIN_SERVICE_DEFINITION(reference_caching_channel)
  /**
    Creates a channel and returns a handle for it.

    The handle created by this method must be destroyed by the destroy
    method otherwise there might be leaks.
    The channel should be created before any events are to be produced on it.

    @param service_name a service name that this channel will operate on.
    @param[out] out_channel placeholder for the newly created handle.
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(create, (const char *service_name,
                               reference_caching_channel *out_channel));

  /**
    Destroys a channel handle

    Should make sure no other thread is using the handle and no caches are
    allocated on it.

    @param channel the handle to destroy
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(destroy, (reference_caching_channel channel));

  /**
    Invalidate a channel

    This is thread safe to call without synchronization and relatively fast.
    Forces an asynchronous flush on all caches that are allocated on that
    channel when they're next accessed.

    @param channel the handle to invalidate
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(invalidate, (reference_caching_channel channel));

  /**
    Validate a channel

    This is thread safe to call without synchronization and relatively fast.
    This function is used to validate the channel, which helps in getting
    the cached service references on that channel when they're next accessed.

    @param channel the handle to validate
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(validate, (reference_caching_channel channel));

  /**
    Fetches a reference caching channel by name.

    Usually consumers wishing to force reference cache flush would fetch the
    channel handle so they can then call invalidate on it.
    This is a relatively expensive operation as it might involve some
    synchronization.

    @param service_name a service name that this channel will operate on.
    @param[out] out_channel placeholder or NULL if not found.
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(fetch, (const char *service_name,
                              reference_caching_channel *out_channel));
END_SERVICE_DEFINITION(reference_caching_channel)

/**
  A service to maintain an "ignore list" for reference caches.

  When a service implementation is on that ignore list it will never be
  added to any reference caches when they're filled in, even if the service
  implementation is in the registry and implements the service the reference
  caching channel is operating on.

  This is just a list of "implementations", i.e. the part of the service
  implementation name after the dot. The channel already has the name of the
  service so a full implementation name can be constructed if needed.
*/
BEGIN_SERVICE_DEFINITION(reference_caching_channel_ignore_list)
  /**
    Adds an implementation name to the ignore list.

    @param channel the channel to add the ignored implementation to.
    @param implementation_name the second part of the service implementation
      name (the part after the dot).
    @retval false successfully added
    @retval true error adding (e.g. the entry is already present)
  */
  DECLARE_BOOL_METHOD(add, (reference_caching_channel channel,
                            const char *implementation_name));

  /**
    Remove an implementation name from the ignore list.
    @param channel the channel to remove the ignored implementation from.
    @param implementation_name the second part of the service implementation
      name (the part after the dot).
    @retval false successfully removed
    @retval true error removing or not present
  */
  DECLARE_BOOL_METHOD(remove, (reference_caching_channel channel,
                               const char *implementation_name));

  /**
    Empty the ignore list.

    @param channel the channel to remove the ignored implementations from.
    @retval false successfully removed all ignores
    @retval true error removing the ignores. State unknown.
  */
  DECLARE_BOOL_METHOD(clear, (reference_caching_channel channel));
END_SERVICE_DEFINITION(reference_caching_channel_ignore_list);

/**
  Reference cache service.

  See the reference_caching_channel service for details on how to operate
  this. Manages thread caches for event producer threads.
*/
BEGIN_SERVICE_DEFINITION(reference_caching_cache)
  /**
    Create a reference cache.

    Needs to be called before the first get() or flush() is called.
    Each call to create must be paired with a call to destroy() or there
    will be leaks.

    @param channel the reference cache channel to operate on.
    @param[out] out_cache handle of the newly allocated cache.
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(create, (reference_caching_channel channel,
                               reference_caching_cache *out_cache));

  /**
    Destroy a reference cache.

    Needs to be called to dispose of each cache allocated by create().
    Needs to be called after all possible calls to get() and flush().
    Once called the cache handle is invalid and should not be used anymore.

    @param cache the handle of the cache to destroy.
    @retval false success
    @retval true failure
  */
  DECLARE_BOOL_METHOD(destroy, (reference_caching_cache cache));

  /**
    Gets a set of service references for an event producer to call.

    This is the main event producer function that will be called when the
    event producer needs to produce an event.
    The cache must be a valid one.
    And the channel too.

    If the cache is empty or invalidated (either via the channel or via a
    call to flush) it will try to fill it by consulting the registry and
    acquiring references to all implementations of the service the channel
    is created for.
    Once that is done the cache will be marked as full.
    This is a cache "miss": a relatively expensive operation, and care must
    be taken so it doesn't happen very often.

    If the cache is full (not flushed) this call will return all references
    stored into the cache (might be zero too). This is a very fast operation
    since the cache is single-thread-use and thus no synchronization will be
    done (except for checking the channel's state of course).
    This is a cache "hit" and should be the normal operation of the cache.

    @param cache the cache to use
    @param registry a handle to the registry so that no time is spent taking it
    @param[out] refs an array of my_h_service terminated with an empty
      service (0).
    @retval true failure
    @retval false success
  */
  DECLARE_BOOL_METHOD(get, (reference_caching_cache cache,
                            my_h_service registry, my_h_service *refs));

  /**
    Flush a reference cache

    This causes the reference cache supplied to be flushed. When in this
    state the next call to get() will be a guaranteed cache miss and will
    fill in the cache.

    @param cache the cache to flush
    @retval true failure
    @retval false success
  */
  DECLARE_BOOL_METHOD(flush, (reference_caching_cache cache));
END_SERVICE_DEFINITION(reference_caching_cache);

#endif /* REFERENCE_CACHING_H */
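The hit/miss behaviour of get(), channel invalidation, the ignore list and flush() can be modelled with a small single-threaded sketch. It does not use the component service macros or the real registry: `ToyChannel`, `ToyCache`, `Consumer`, `DemoResult` and `run_demo` are hypothetical names, the version counter is only one assumed way to implement the channel validity flag, and the `registered` vector merely stands in for the registry (it is not thread safe and is only modified by one thread here).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Consumer = std::function<void(int)>;  // hypothetical consumer signature

// Toy channel: shared by all caches for one service. The atomic version
// counter plays the role of the global channel validity flag.
struct ToyChannel {
  std::atomic<unsigned> version{0};
  std::vector<std::pair<std::string, Consumer>> registered;  // "registry"
  std::set<std::string> ignore_list;  // implementation names to skip

  // Cheap and lock-free: caches notice the change on their next get().
  void invalidate() { version.fetch_add(1, std::memory_order_release); }
};

// Toy reference cache: owned and used by exactly one producer thread.
class ToyCache {
 public:
  explicit ToyCache(ToyChannel &channel) : channel_(channel) {}

  // Hit: return the stored references. Miss: refill from the "registry".
  const std::vector<Consumer> &get() {
    const unsigned current = channel_.version.load(std::memory_order_acquire);
    if (!filled_ || seen_version_ != current) {
      refs_.clear();
      for (const auto &entry : channel_.registered) {
        if (channel_.ignore_list.count(entry.first) == 0)
          refs_.push_back(entry.second);
      }
      seen_version_ = current;
      filled_ = true;
      ++misses_;
    }
    return refs_;
  }

  // Drop the references; the next get() is a guaranteed miss.
  void flush() {
    refs_.clear();
    filled_ = false;
  }

  std::size_t misses() const { return misses_; }

 private:
  ToyChannel &channel_;
  std::vector<Consumer> refs_;
  unsigned seen_version_ = 0;
  bool filled_ = false;
  std::size_t misses_ = 0;
};

struct DemoResult {
  int delivered;
  std::size_t misses;
};

DemoResult run_demo() {
  ToyChannel channel;
  ToyCache cache(channel);
  int delivered = 0;
  auto emit = [&](int event) {
    for (const auto &consumer : cache.get()) consumer(event);
  };

  emit(1);  // miss: cache filled with zero consumers
  emit(2);  // hit: nothing to call
  channel.registered.emplace_back("logger", [&](int) { ++delivered; });
  emit(3);  // hit: the new consumer is not visible yet
  channel.invalidate();
  emit(4);  // miss: consumer picked up and called
  emit(5);  // hit: consumer called again
  channel.ignore_list.insert("logger");
  channel.invalidate();
  emit(6);  // miss: consumer is on the ignore list and is skipped
  cache.flush();
  emit(7);  // miss caused by the explicit flush
  return DemoResult{delivered, cache.misses()};
}
```

In this model a consumer that registers without invalidating the channel is only seen after the next flush or invalidation, and a consumer placed on the ignore list stops receiving events at the next refill, which mirrors the consumer lifetime described in the header comments. A producer with no consumers pays only for one atomic load and two comparisons per event.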