WL#4102: Service registry and component infrastructure

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

This worklog will provide new infrastructure needed for improving the extensibility of the MySQL Server, and address issues with the current plugin mechanisms.

It allows the MySQL server to be divided into a number of logical components. Additional components may be added to a running server to extend its functionality.

Components can be linked either dynamically or statically.

Each component will provide an extensive set of named APIs, services, that other components can consume. To facilitate this there will be a registry of all services available to all components.

Each component will communicate with other components only through services and will explicitly state the services it provides and consumes.

The infrastructure will enable components to override and complement other components through re-implementing the relevant service APIs.

Once there's a critical mass of such components exposing their functionality and consuming the services they need through named service APIs, additional loadable components will be able to take advantage of all the functionality they need without having to carve in specialized APIs for each new need.

The work of providing the infrastructure described in this WL is mostly orthogonal to the ongoing activity on modularizing the MySQL Server. However, properly defined modules and interfaces in the Server allow server functionality to be exposed easily as services where it makes sense.

Functional Requirements List

  1. It should be possible to extend the server through both statically linked and dynamically linked Components.
  2. Every Component should be able to access and use all service implementations provided by all other Components. No Component should be in a privileged position to use certain service implementations just because it happens to contain them.
  3. It should be possible to load and unload Components through a dynamic loader service, whose initial implementation will be provided by the server container.
  4. Components should interact between themselves only through published named APIs called services.
  5. Service names are a unique set of UTF-8 strings and can't contain a dot (.).
  6. Service APIs are stateless. State should be handled through factory interfaces and handles by the services themselves.
  7. Each service can have multiple named implementations that exactly implement the service API.
  8. Each service implementation will have a unique name (again in UTF-8, no dots allowed) that starts with the service name, then a dot and then the implementation name.
  9. There will be a service to keep a non-persistent list of all service implementations (and hence services) called the registry service.
  10. No service can exist without at least one service implementation.
  11. Each service will always have 1 default implementation.
  12. When you request a service implementation using just the service name you will get the default implementation.
  13. It will be possible to change the default implementation to another service implementation explicitly.
  14. Each Component will have a list of service implementations it provides
  15. Each Component will have a list of services or service implementations it consumes
  16. The dynamic loader service will allow loading multiple Components in a single go (to satisfy circular dependencies).
  17. The dynamic loader service will keep in a system table a persistent ordered list of the Component sets it needs to load on startup.
  18. The dynamic loader, when loading a Component will register into the service registry all the service implementations the Component provides
  19. Each Component can have an initialization function that the dynamic loader will call after loading the container
  20. Each Component can have a de-initialization function that the dynamic loader will call before unloading the container
  21. It should be possible to enumerate all the currently loaded Components
  22. It should be possible for more than one reader to access the list of the Components at the same time through the dynamic loader
  23. A writer to the list of Components through the dynamic loader service will block all read access through the dynamic loader service until it's done.
  24. The service registry will keep a reference count for all of the service implementations registered in it.
  25. The dynamic loader will refuse to unload a Component that provides services that are still referenced.
  26. It should be possible to get another service implementation from the same Component (if available) by referencing it by service name only.
  27. Object handles returned from one service implementation should be valid for "related" services from the same Component (related means logically operating on the same data instances).
  28. Service implementations should expose the same functionality to everybody that request them, no matter whether they're in the same Component as the service implementation or not.
  29. SQL Syntax "INSTALL COMPONENT 'component_urn' [, component_urn...]" is available for loading the group of components.
  30. SQL Syntax "UNINSTALL COMPONENT 'component_urn' [, component_urn...]" is available for unloading the group of components. The URNs used must be exactly the same as the ones used in INSTALL COMPONENT.
  31. It is not necessary to unload the whole group that was used in INSTALL COMPONENT. The group may consist of any subset of loaded components, as long as they will not break any dependencies of all components that will remain loaded after unload operation.
  32. If Components have circular dependencies, they all need to be loaded in the same group.
  33. "file://" scheme will be available for use in Component URN. It will load the Component with the supplied name from a file located in the MySQL Server plugin_dir, adding the OS-dependent default extension. The name cannot contain a subdirectory or any kind of relative link such as "..". The name cannot have an extension; the system default one will always be added.
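
The "file://" rules in requirement 33 can be sketched as a small validation routine. This is only an illustration under stated assumptions: the function name component_path_from_urn and its parameters are hypothetical, and rejecting every dot in the name is one possible way to enforce both the "no extension" and the "no '..'" rules.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not defined by this worklog): maps a "file://" URN to a
// path inside plugin_dir, or returns std::nullopt when the URN is rejected.
std::optional<std::string> component_path_from_urn(
    const std::string &urn, const std::string &plugin_dir,
    const std::string &os_extension) {
  const std::string scheme = "file://";
  if (urn.compare(0, scheme.size(), scheme) != 0) return std::nullopt;

  const std::string name = urn.substr(scheme.size());
  if (name.empty()) return std::nullopt;

  // Assumption: rejecting '/', '\\' and '.' covers "no subdirectory",
  // "no '..' link" and "no extension" in a single check.
  if (name.find_first_of("/\\.") != std::string::npos) return std::nullopt;

  // The OS-dependent default extension is always appended.
  return plugin_dir + "/" + name + os_extension;
}
```

For instance, under these assumptions "file://component_example" with plugin_dir "/usr/lib/mysql/plugin" and extension ".so" maps to "/usr/lib/mysql/plugin/component_example.so", while "file://../evil" is rejected.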


Definitions

Service

A service is a named, well-defined stateless interface to one functionality. The service name will consist of UTF-8 symbols excluding the dot ('.') symbol and NULL character, which is used only as end of string.

Each service will be accompanied with a (part of a) header file that defines the service allowing other components to consume it.

Services are a C struct consisting of C style function pointers.

Services are unconditionally binary compatible.

Each Service should work only on basic C types and structures of these types to prevent problems with not fully-defined C++ objects ABI.

A Service is the default way to operate inside the Components subsystem, as a means to show that one is interested only in the interface to a functionality, not in its exact implementation.

Services are not versioned: any change to an interface requires the Service to be copied to one with a different name before the changes are applied.

Services by themselves do not carry any state; all methods are stateless. This does not prevent a service from creating state-carrying objects and returning handles to them. This concept is shown, for example, in the create(), get() and release() methods of the registry_query Service.
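
To make the "struct of function pointers plus handles" idea concrete, here is a minimal sketch. The service name example_counter_service, the handle type and all function names are hypothetical, and plain bool stands in for my_bool; the only conventions taken from this worklog are that the service itself is stateless, that state lives behind an opaque handle, and that false means success.

```cpp
#include <cassert>

// Opaque handle: consumers never see the layout of the state object.
typedef struct counter_handle_imp *counter_handle;

// Hypothetical service definition: a struct of C-style function pointers.
struct example_counter_service {
  bool (*create)(counter_handle *out);                  // factory method
  bool (*increment)(counter_handle h, int *out_value);  // works on a handle
  bool (*release)(counter_handle h);                    // destroys the state
};

// --- one possible implementation, private to the providing component ---
struct counter_handle_imp {
  int value;
};

static bool counter_create(counter_handle *out) {
  if (out == nullptr) return true;
  *out = new counter_handle_imp{0};
  return false;
}

static bool counter_increment(counter_handle h, int *out_value) {
  if (h == nullptr || out_value == nullptr) return true;
  *out_value = ++h->value;
  return false;
}

static bool counter_release(counter_handle h) {
  if (h == nullptr) return true;
  delete h;
  return false;
}

// The object a component would hand to the registry for this implementation.
static const example_counter_service counter_service_imp = {
    counter_create, counter_increment, counter_release};
```

A consumer would only include the struct definition and the handle typedef; everything below the "one possible implementation" comment stays inside the providing component.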

Service Implementation

A service implementation is a named stateless implementation of a named service. The service implementation specific name consists of UTF-8 symbols excluding the dot ('.') symbol and NULL character, which is used only as end of string. The fully qualified service implementation name is formed by the service name and the service implementation specific name as follows: <service name>.<service implementation specific name>
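
The naming rule can be illustrated with a small parser. This is a sketch only: the ServiceName type and parse_service_name function are hypothetical names introduced here, and the code treats names as plain byte strings without validating the UTF-8 encoding.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the two parts of a name passed to the registry.
struct ServiceName {
  std::string service;         // e.g. "GreatFactoryService"
  std::string implementation;  // empty when only the service name was given
  bool valid;
};

ServiceName parse_service_name(const std::string &full_name) {
  const std::size_t dot = full_name.find('.');
  if (dot == std::string::npos) {
    // Service name only: refers to the default implementation.
    return {full_name, "", !full_name.empty()};
  }
  ServiceName n{full_name.substr(0, dot), full_name.substr(dot + 1), true};
  // Exactly one dot is allowed, and both parts must be non-empty.
  n.valid = !n.service.empty() && !n.implementation.empty() &&
            n.implementation.find('.') == std::string::npos;
  return n;
}
```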

In most cases the implementation related name should be the Component name in which it is being defined.

Each Service can have several Service Implementations.

Services cannot exist without at least one implementation.

Each service implementation will have a list of name/value metadata pairs.

Service implementations are not versioned. If you need to provide a new version of the service implementation implementing a new service version you will use the new Service implementation specific name to name your implementation. Nothing prevents a component from keeping the older service implementations together with the most recent one where it makes sense.

Component

A component is a named container that contains one or more service implementations. Components can be internal (part of the mysql server binary) or external (hosted in an OS binary file different from the one of the mysql server).

Each component will have:

  • Name
  • List of service implementations it provides.
  • List of services it needs. By default every component will require a Registry Service to be able to communicate with other components.
  • Initialization function that's called when a component is loaded.
  • De-initialization function that's called when a component unload is requested.

Metadata

Both the service implementations and the components will have metadata associated with them. The metadata of an object are a set of name/value string pairs. The names are unique for one object. Strings will be encoded in UTF-8.

Metadata will be used to store attributes for the component and the service implementation. These attributes will be used for reflection-like queries over the available components and service implementations.

Examples of typical metadata elements for a component would be elements like e.g. "author", "license", "version", "revision", "description", "source URL", "signature key" etc.

Examples of typical metadata elements for a service implementation would be elements like e.g. "description", "help URL", "header file name", "implementation language" etc.

It will of course be up to the service and component authors to fill these values in. Future metadata presentation services that can be built on top of the metadata will probably expect certain name uniformity. Nothing prevents implementations from providing and using their own component metadata names too, e.g. BigPresentationService.description, or even "com.companydomain.presentation_service.description".

Names starting with "mysql" and "com.mysql" are reserved for future use by the mysql development team.

Service Handle

A service handle is a handle that allows consumers to consume services. It's typically coupled with extra external files (e.g. C/C++ headers) that define the interface. The exact format and type of it is implementation dependent.

What is defined below is what a service handle (h_service) will look like in the initial implementation.

Here's what Registry->acquire() looks like:

my_bool acquire(const char *service_name, h_service *out_service);

The question is: what is h_service?

typedef void *h_service;

One approach would be to call it a void * and use whatever implementation pointer was provided to RegistryRegistration->register_service(service_implementation_name,service_implementation_ptr).

This would mean that the following code(s) will work without a type cast warning:

#include "service1_def.h"

service1 *ptr;

Registry->acquire("service1", &ptr);
ptr->method1(..);

This approach is really simple and agnostic to the actual type of service_implementation_ptr. But it leaves very little choice but making extra lookups in implementing functions like e.g. Registry->acquire_related(const char *service_name, h_service service, h_service *out_service) and metadata.

Let's use metadata to demonstrate the problem: metadata are defined to be name/value pairs linked to a service implementation or a component. So a fitting implementation will have them in some sort of an internal structure roughly as follows:

struct h_service_implementation_data {
   const char *name;
   h_service implementation_pointer;
   struct metadata_s {
      const char *name;
      const char *value;
   } *metadata_list;
   ...
   h_container container;
   ...
};

Obviously it'll be a pretty common task to fetch metadata either through RegistryMetaDataQuery->get_value(service, meta_name, out_meta_value) or through RegistryMetaDataEnumerate->create_iterator(service, out_iterator).

And if we assume that h_service is void *, then the only viable alternative is to either:

  • make a reverse lookup on the h_service pointer to fetch the corresponding h_service_implementation_data and then do another lookup in metadata_list to find the metadata value
  • pass the service as a name so that the implementation can do the same lookup as Registry->acquire() does and then do another lookup in the metadata_list to find the metadata value

So with this approach effectively RegistryMetadataQuery->get_value() is a combination of either Registry->acquire() or Registry->acquire_related() and a lookup inside h_service_implementation_data->metadata_list.

This obviously is a bit heavy, provided that the application most probably already did the lookup from name to h_service_implementation_data.

h_service is a composite

The only alternative to the extra lookups I'm seeing is to make h_service point to some sort of a "ticket" structure that has both the pointer supplied to RegistryRegistration->register_service() and a link to the actual h_service_implementation_data structure for it.

In fact this is roughly how COM operates : All interfaces returned from it inherit from IUnknown. And for C programs it does preprocessor trickery to mimic the same.

I'll use C++ to illustrate it :

class mysql_service {
   h_service_implementation_data *ptr;
public:
   // "this" is a reserved word and cannot name a parameter, hence "self"
   static h_service_implementation_data *get_service_implementation (mysql_service *self) {
     return self->ptr;
   }
};

Now any service implementation will have to inherit from this one, e.g.

class Service1Implementation : public mysql_service
{
public:
   my_bool api1(...);
   ...
};

If this is the case then the implementation may rely on always getting a mysql_service pointer, so that it can save the extra reverse or direct lookup in the above cases. Obviously we can't just inherit the h_service_implementation_data as it will create serious strain on binary compatibility.

This will come at the added inconvenience that we'll need to handle C++ virtual tables in the way the COM does : with complicated preprocessor trickery (http://www.codeproject.com/Articles/13601/COM-in-plain-C) or external tools that translate from interface definition language into relevant headers and implementation boilerplates, similarly to what MIDL does (http://msdn.microsoft.com/en-us/library/windows/desktop/aa367091(v=vs.85).aspx).

But it will have the added benefit of saving that extra lookup to find the service internal structure.

What to choose ?

Decision made : my_h_service will be a void *.
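
With the handle being a plain void *, the metadata query pays for the reverse lookup discussed above. A minimal sketch of that cost follows; the toy_registry class, its add() method and the simplified implementation_data struct are stand-ins invented for this illustration, not the actual internal structures.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef void *h_service;

// Simplified stand-in for h_service_implementation_data.
struct implementation_data {
  std::string name;
  h_service implementation_pointer;
  std::map<std::string, std::string> metadata;
};

class toy_registry {
  std::map<std::string, implementation_data> by_name_;
  // Reverse index: needed only because the handle is an untyped pointer.
  std::map<h_service, implementation_data *> by_pointer_;

 public:
  void add(const implementation_data &d) {
    implementation_data &stored = by_name_[d.name];
    stored = d;
    by_pointer_[d.implementation_pointer] = &stored;
  }

  // Returns false on success, following the convention used in this worklog.
  bool get_value(h_service service, const std::string &meta_name,
                 std::string *out_value) {
    auto rev = by_pointer_.find(service);  // lookup 1: pointer -> data
    if (rev == by_pointer_.end()) return true;
    auto meta = rev->second->metadata.find(meta_name);  // lookup 2: name -> value
    if (meta == rev->second->metadata.end()) return true;
    *out_value = meta->second;
    return false;
  }
};
```

The composite "ticket" alternative would remove lookup 1 at the price of constraining how every service implementation is laid out.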

Service Registry

A service registry is a service that allows registration and retrieval of service implementations. The service registry keeps a reference count for each service implementation.

The registry stores and retrieves service handles. See above for definition of what a service handle is. It also keeps track of the services registered. Users of the registry can register, unregister and find service implementations from it through the supplied APIs.

The Registry Service

The registry service has the following interface declaration:

acquire()
my_bool acquire(const char *service_name, my_h_service *out_service);

Finds and acquires a Service by name. Either the name of a Service or the name of a Service Implementation can be specified. If a Service name is given, the default Service Implementation for that Service is returned. Returns false if successful, true if none is found.

The reference count of the implementation returned is increased.

acquire_related()
my_bool acquire_related(const char *service_name, my_h_service service, 
                        my_h_service *out_service);

Retrieves a service implementation handle that matches "service_name" into out_service. It first tries acquiring a service "related" to the service passed via the "service" argument. "Related" in terms of service implementation means "with the same service implementation name". If none such is available the function will default to a regular acquire() call.

If a service name is specified, it returns a handle for the default related implementation for that service and returns false. If a service implementation name is specified, the call is equivalent to calling acquire(). Returns true if none is found.

The reference count of the implementation returned is increased.

The idea of this function is to allow writing "portable" service usage code. Imagine you have a host of services that operate on the same objects. Since the service interfaces are stateless by definition they'd use handles to reference the object instances. And handles do have a meaning only when passed to services in the same component. So once a program gets a handle from a particular service implementation it'll need to ensure it's sticking to that same container this implementation came from. This can't be safely done through acquire():

h_factory_service fs;
h_manipulation_service ms;
registry->acquire("GreatFactoryService", &fs);

// now what ? 
// registry->acquire("GreatManipulationService", &ms); 
// can return another implementation.

The only choice left is to program against a particular implementation :

h_factory_service fs;
h_manipulation_service ms;
registry->acquire("GreatFactoryService.MyImp", &fs);
registry->acquire("GreatManipulationService.MyImp", &ms); 
 

But this would prevent the code from benefiting from seamless overriding in the future.

This is where acquire_related() can help :

h_factory_service fs;
h_manipulation_service ms;
registry->acquire("GreatFactoryService", &fs);
registry->acquire_related("GreatManipulationService", fs, &ms); 

Note though that acquire_related() shouldn't be used by default, as depending on the implementation it can be slower than acquire().

release()
my_bool release(my_h_service service);

Releases the Service Implementation previously acquired. After the call to this method, any use of the Service Implementation handle leads to undefined results. Returns false if successful.
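
The three Registry methods can be sketched as a toy in-memory registry. This is an assumption-laden illustration, not the server implementation: toy_registry, register_service() in this simplified form and ref_count() are hypothetical, plain bool stands in for my_bool, no locking is done, and every implementation is assumed to be registered with a distinct pointer so that the handle can be mapped back to its name.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef void *my_h_service;

// Toy registry; all methods return false on success.
class toy_registry {
  struct entry {
    my_h_service ptr;
    int ref_count;
  };
  std::map<std::string, entry> impls_;           // "service.imp" -> entry
  std::map<std::string, std::string> defaults_;  // "service" -> "service.imp"
  std::map<my_h_service, std::string> names_;    // handle -> "service.imp"

 public:
  bool register_service(const std::string &full_name, my_h_service ptr) {
    const std::size_t dot = full_name.find('.');
    if (dot == std::string::npos || impls_.count(full_name)) return true;
    impls_[full_name] = entry{ptr, 0};
    names_[ptr] = full_name;
    // The first implementation of a service becomes its default.
    defaults_.emplace(full_name.substr(0, dot), full_name);
    return false;
  }

  bool acquire(const std::string &name, my_h_service *out) {
    std::string full = name;
    if (name.find('.') == std::string::npos) {  // service name only
      auto d = defaults_.find(name);
      if (d == defaults_.end()) return true;
      full = d->second;
    }
    auto it = impls_.find(full);
    if (it == impls_.end()) return true;
    ++it->second.ref_count;
    *out = it->second.ptr;
    return false;
  }

  bool acquire_related(const std::string &service_name, my_h_service service,
                       my_h_service *out) {
    auto known = names_.find(service);
    if (known != names_.end() && service_name.find('.') == std::string::npos) {
      // Reuse the implementation part (e.g. ".MyImp") of the service passed in.
      const std::string imp = known->second.substr(known->second.find('.'));
      if (!acquire(service_name + imp, out)) return false;
    }
    return acquire(service_name, out);  // fall back to a regular acquire()
  }

  bool release(my_h_service service) {
    auto known = names_.find(service);
    if (known == names_.end()) return true;
    entry &e = impls_[known->second];
    if (e.ref_count == 0) return true;
    --e.ref_count;
    return false;
  }

  int ref_count(const std::string &full_name) const {
    auto it = impls_.find(full_name);
    return it == impls_.end() ? -1 : it->second.ref_count;
  }
};
```

In this sketch, acquiring "GreatManipulationService" related to a handle obtained from "GreatFactoryService.MyImp" yields "GreatManipulationService.MyImp" even when another implementation is the default.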

The RegistryQuery Service

Used to look for services by partial name and enumerate interfaces.

create()
my_bool create(const char *service_name_pattern, 
                        my_h_service_iterator *out_iterator);

Creates an iterator that iterates through all registered Service Implementations. If successful, it leaves a read lock on the Registry until the iterator is released. The starting point of the iteration may be specified to be one particular Service Implementation. The iterator moves through all Service Implementations and, additionally, through the default entry of each Service, i.e. each default Service Implementation is returned twice. If no name is specified for the search, the iterator is positioned on the first Service Implementation.

get()
my_bool get(my_h_service_iterator iter, const char **out_name);

Gets the name of the service pointed to by the iterator. The pointer returned remains valid at least until release() is called on the iterator. Returns false if successful, true otherwise.

Maintains the read lock on the registry.

next()
my_bool next(my_h_service_iterator iter);

Advances specified iterator to next element. Will succeed but return true if it reaches one-past-last element.

Maintains the read lock on the registry.

is_valid()
my_bool is_valid(my_h_service_iterator iter);

Checks if the specified iterator is valid, i.e. has not reached the one-past-last element.

Maintains the read lock on the registry.

release()
void release_iterator(my_h_service_iterator iter);

Releases the Service Implementations iterator. Releases read lock on the Registry.
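
A typical consumer loop over this interface might look like the sketch below. Several things here are assumptions rather than facts from this worklog: the iterator layout, the toy_services map standing in for the registry contents, the helper list_service_names(), and in particular the reading that is_valid() returns false while the iterator is valid (mirroring the false-on-success convention). The read lock and the duplicate default entries are left out for brevity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, void *> service_map;  // ordered by name

// Stand-in for the registry contents; the read lock is omitted in this sketch.
static service_map toy_services;

struct my_h_service_iterator_imp {
  service_map::const_iterator pos;
};
typedef my_h_service_iterator_imp *my_h_service_iterator;

bool create(const char *service_name_pattern, my_h_service_iterator *out) {
  service_map::const_iterator start =
      service_name_pattern ? toy_services.lower_bound(service_name_pattern)
                           : toy_services.begin();
  *out = new my_h_service_iterator_imp{start};
  return false;
}

// Assumption: false means "still valid", true means one-past-last.
bool is_valid(my_h_service_iterator it) {
  return it->pos == toy_services.end();
}

bool get(my_h_service_iterator it, const char **out_name) {
  if (is_valid(it)) return true;
  *out_name = it->pos->first.c_str();
  return false;
}

bool next(my_h_service_iterator it) {
  if (is_valid(it)) return true;
  ++it->pos;
  return is_valid(it);  // true once one-past-last is reached
}

void release_iterator(my_h_service_iterator it) { delete it; }

// Hypothetical consumer: collects all names starting at the given position.
std::vector<std::string> list_service_names(const char *start_from) {
  std::vector<std::string> names;
  my_h_service_iterator it;
  if (create(start_from, &it)) return names;
  for (; !is_valid(it); next(it)) {
    const char *name;
    if (!get(it, &name)) names.push_back(name);
  }
  release_iterator(it);  // in the real service this also drops the read lock
  return names;
}
```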

The RegistryRegistration service

Service for managing list of registered Service Implementations.

register_service()
my_bool register_service(const char *service_implementation_name, my_h_service ptr);

Registers a new Service Implementation. If it is the first Service Implementation for the specified Service then it is made a default one. Returns false if successful.

unregister()
my_bool unregister(const char *service_implementation_name);

Removes previously registered Service implementation from registry. If it is the default one for specified Service then any one still registered is made default. If there is no other, the default entry is removed from the Registry too.

set_default()
my_bool set_default(const char *service_implementation_name);

Sets new default Service Implementation for corresponding Service name. Returns false if successful.
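
The default-handling rules of the three methods above can be sketched as follows. The class toy_registration and the default_of() accessor are hypothetical, implementation pointers are left out to keep the focus on the bookkeeping, and "any one still registered" is resolved here by picking the first remaining name in sort order, which is merely one possible choice.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

class toy_registration {
  std::set<std::string> impls_;                  // "service.imp" names
  std::map<std::string, std::string> defaults_;  // "service" -> "service.imp"

  static std::string service_of(const std::string &full_name) {
    return full_name.substr(0, full_name.find('.'));
  }

 public:
  bool register_service(const std::string &full_name) {
    if (full_name.find('.') == std::string::npos) return true;
    if (!impls_.insert(full_name).second) return true;  // already registered
    // Becomes the default only if the service has none yet.
    defaults_.emplace(service_of(full_name), full_name);
    return false;
  }

  bool unregister(const std::string &full_name) {
    if (impls_.erase(full_name) == 0) return true;
    const std::string service = service_of(full_name);
    if (defaults_[service] != full_name) return false;  // default unaffected
    // Pick any remaining implementation of the same service as new default.
    auto next = impls_.lower_bound(service + ".");
    if (next != impls_.end() && service_of(*next) == service)
      defaults_[service] = *next;
    else
      defaults_.erase(service);  // last implementation gone: service is gone
    return false;
  }

  bool set_default(const std::string &full_name) {
    if (impls_.count(full_name) == 0) return true;
    defaults_[service_of(full_name)] = full_name;
    return false;
  }

  // Returns "" when the service has no implementation left.
  std::string default_of(const std::string &service) const {
    auto it = defaults_.find(service);
    return it == defaults_.end() ? "" : it->second;
  }
};
```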

The RegistryMetadataEnumerate Service

Service for listing all metadata for a Service Implementation specified by the given iterator. Service Implementation metadata are name/value pairs. Names are unique for single Service Implementation.

create()
my_bool create_iterator(my_h_service_iterator iterator, my_h_service_metadata_iterator *out_iterator);

Creates an iterator that iterates through all metadata for the object pointed to by the specified iterator. If successful, it leaves a read lock on the registry until the iterator is released.

get()
my_bool get(my_h_service_metadata_iterator iter, const char** name, const char** value);

Gets the key and value of the metadata pointed to by the specified iterator. The pointers returned will last at least up to the moment of call to release() on the iterator. Returns false if successful.

next()
my_bool next(my_h_service_metadata_iterator iter);

Advances specified iterator to next element. Will fail if it reaches one-past-last element.

is_valid()
my_bool is_valid(my_h_service_metadata_iterator iter);

Checks if the specified iterator is valid, i.e. has not reached the one-past-last element.

release()
my_bool release_iterator(my_h_service_metadata_iterator iter);

Releases the specified iterator. Releases read lock on the registry. Returns false if successful.

RegistryMetadataQuery service

Service to query specified metadata key directly for the specified Service Implementation by iterator to it.

get_value()
my_bool get_value(my_h_service_iterator service, const char *name, const char **out_value);

Gets the value of the metadata entry with the specified name for the object pointed to by the specified iterator. The pointer returned remains valid at least until the release() method is called on the iterator. Returns false if successful.

Concurrency control of the registry

RW locks will be used to protect the registry. The operation of loading and unloading plugins should not be on any performance critical path anyway. Each change to the registry happens in its own "transaction" and, when unlocked, the registry will be in a consistent form again (i.e. no incomplete changes). The RegistryRegistration service APIs are the "write" operations. The rest is "read" operations. The reference counting can be atomic so it doesn't require a write lock.
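
The split between write operations, read operations and atomic reference counting could be sketched with standard C++17 primitives. The class locked_registry is a hypothetical illustration; the server would presumably use its own lock and atomic wrappers rather than std::shared_mutex and std::atomic.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>

class locked_registry {
  struct entry {
    void *ptr = nullptr;
    std::atomic<int> ref_count{0};
  };
  std::map<std::string, std::unique_ptr<entry>> impls_;
  mutable std::shared_mutex lock_;

 public:
  // "Write" operation: changes the set of registered implementations.
  bool register_service(const std::string &name, void *ptr) {
    std::unique_lock<std::shared_mutex> write(lock_);
    if (impls_.count(name)) return true;
    auto e = std::make_unique<entry>();
    e->ptr = ptr;
    impls_[name] = std::move(e);
    return false;
  }

  // "Read" operation: the map itself is not modified, and the reference
  // count is atomic, so a shared (read) lock is sufficient.
  bool acquire(const std::string &name, void **out) {
    std::shared_lock<std::shared_mutex> read(lock_);
    auto it = impls_.find(name);
    if (it == impls_.end()) return true;
    it->second->ref_count.fetch_add(1);
    *out = it->second->ptr;
    return false;
  }

  int ref_count(const std::string &name) const {
    std::shared_lock<std::shared_mutex> read(lock_);
    auto it = impls_.find(name);
    return it == impls_.end() ? -1 : it->second->ref_count.load();
  }
};
```

Because acquire() only takes the shared lock, several consumers can acquire services concurrently, while register_service() waits for all of them and then blocks new readers until it is done.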

Persistence of the registry

The registry will not have any sort of persistence and will be an entirely in-memory structure. Thus, if components alter the registry, it will be important to handle the proper bootstrapping order when loading them, so that the registry ends up with identical content each time.

Dynamic Loader Service

The dynamic loader service implements loading and unloading of components and keeps a list of them. It consumes the service registry service to register and unregister the services the component implements and to find the services the component consumes.

The DynamicLoader service

It has the following methods :

load()
bool load(const char *urns[], int component_count, bool force);

Loads the specified group of components by URN, initializes them and registers all Service Implementations present in these components. Assures all dependencies will be met after loading the specified components. The dependencies may be circular; in that case all components on the cycle must be specified in one batch. From each URN specified, the scheme part (the part before "://") is extracted and used to acquire the Service Implementation of the scheme component loader Service for that scheme.

unload()
my_bool unload(const char *urns[], int component_count);

Unloads the specified group of components by URN, deinitializes them and unregisters all Service Implementations present in these components. Assumes, though does not check, that all dependencies of the components that are not unloaded will still be met after unloading the specified components. The dependencies may be circular; in that case all components on the cycle must be specified in one batch. From each URN specified, the scheme part (the part before "://") is extracted and used to acquire the Service Implementation of the scheme component loader Service for that scheme. Returns false if successful.
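
The scheme dispatch performed by load() can be sketched as below. Everything named here is hypothetical: urn_scheme(), toy_dynamic_loader, add_scheme() and is_loaded() are introduced for illustration, scheme loaders are modeled as callbacks instead of registry services, and treating the group as all-or-nothing is an assumption of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Returns the scheme part of a URN (the text before "://"), or "" if none.
std::string urn_scheme(const std::string &urn) {
  const std::size_t pos = urn.find("://");
  return pos == std::string::npos ? "" : urn.substr(0, pos);
}

class toy_dynamic_loader {
  // scheme -> loader callback; callbacks return false on success
  std::map<std::string, std::function<bool(const std::string &)>> schemes_;
  std::set<std::string> loaded_;

 public:
  void add_scheme(const std::string &scheme,
                  std::function<bool(const std::string &)> loader) {
    schemes_[scheme] = std::move(loader);
  }

  // Loads a whole group; returns false on success. Assumption: if any URN
  // fails, the components loaded so far from this group are dropped again.
  bool load(const std::vector<std::string> &urns) {
    std::vector<std::string> done;
    for (const std::string &urn : urns) {
      auto it = schemes_.find(urn_scheme(urn));
      const bool failed = it == schemes_.end() || loaded_.count(urn) ||
                          it->second(urn);
      if (failed) {
        for (const std::string &u : done) loaded_.erase(u);  // roll back
        return true;
      }
      loaded_.insert(urn);
      done.push_back(urn);
    }
    return false;
  }

  bool is_loaded(const std::string &urn) const {
    return loaded_.count(urn) != 0;
  }
};
```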

The DynamicLoaderQuery service

Service for listing all Components by iterator. Symmetrical to the Registry one.

DynamicLoaderMetadataEnumerate service

Service for listing all metadata for a Component specified by the iterator. Symmetrical to the Registry one.

DynamicLoaderMetadataQuery service

Service to query specified metadata key directly for the specified Component by iterator to it. Symmetrical to the Registry one.

Concurrency control of the dynamic loader

RW locks will be used to protect the dynamic loader data. The operation of loading and unloading components should not be on any performance critical path anyway. Each change to the dynamic loader list happens in its own "transaction" and, when unlocked, the dynamic loader information will be in a consistent form again (i.e. no incomplete changes). Write operations are the DynamicLoader methods. The rest are read operations.

Persistence of the dynamic loader

The dynamic loader has two implementations. The first is an in-memory structure. The second, a wrapper around the first, maintains and uses a persistent ordered list, kept in a system table, of the component sets passed in calls to the load() method. The API should be designed in such a way as to allow moving this information into a proper configuration service, e.g. the one in WL#6801, when it becomes available.

Implementation plans

Bootstrapping

The server binary will provide an implementation of the core services, that is the registry, and the dynamic loader. They will be used to seed the service registry service. The dynamic loader will from then on use its reference to the service registry service to lookup the default service registry service implementation and provide it to the newly loaded components so that they can register into it and keep a reference to it.

This worklog assumes that the registry and the dynamic loader services will be bootstrapped very early in initialization process of the MySQL Server, and only PSI and logging facilities are required for them to work. The Persistent Dynamic Loader on the other hand assumes operation with a completely initialized server (i.e. one that can execute queries), and therefore is initialized later.

In this case the bootstrap of the two service implementations will look as follows :

  1. The server container creates the built-in registry service
  2. The server container initializes the built-in registry service by registering the registry service itself into it.
  3. The server container creates the built-in dynamic loader service
  4. The server container initializes the built-in dynamic loader service by :
    1. Registering the server container into it.
    2. Registering the dynamic loader service into the service registry.
    3. Loading and initializing an ordered persistent set of components from a table with their respective services using the currently active server configuration values (e.g. locations, search paths etc).
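
The ordering of these steps can be sketched with toy types. The structures below and the implementation names "registry.mysql_server" and "dynamic_loader.mysql_server" are assumptions made for this illustration only; actual component loading is reduced to recording the URN.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct toy_registry {
  std::map<std::string, void *> services;  // implementation name -> pointer
};

struct toy_dynamic_loader {
  toy_registry *registry = nullptr;
  std::vector<std::string> components;  // in load order
};

struct toy_server {
  toy_registry registry;      // step 1: built-in registry is created
  toy_dynamic_loader loader;  // step 3: built-in dynamic loader is created

  void bootstrap(const std::vector<std::string> &persisted_urns) {
    // Step 2: the registry service registers itself.
    registry.services["registry.mysql_server"] = &registry;

    // Step 4.1: the server container is the first component the loader knows.
    loader.registry = &registry;
    loader.components.push_back("mysql_server");

    // Step 4.2: the dynamic loader service is registered into the registry.
    registry.services["dynamic_loader.mysql_server"] = &loader;

    // Step 4.3: the persisted, ordered component sets are loaded last,
    // because reading them requires a server that can execute queries.
    for (const std::string &urn : persisted_urns)
      loader.components.push_back(urn);
  }
};
```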

To lift the requirement of a fully initialized server in a future version (M2) the bootstrap process will need to be altered approximately as follows :

  1. The server container creates the built-in registry service
  2. The server container initializes the built-in registry service by
    registering the registry service itself into it.
  3. The server container creates the built-in dynamic loader service
    and registers it into the registry.
  4. The server container initializes the built-in dynamic loader service by :
    1. Registering the server container into it.
    2. load and initialize the built-in dependent persistent configuration service.
    3. Registering the dynamic loader service into the service registry.
    4. Read from the configuration service the ordered persistent list of components to load

Note that in M2 there's no mention of a data dictionary, or any other server-y stuff, e.g. a security model, a language compiler, or even a client-server protocol. They will be optional (but frequently sought after) components like anything else.

Migrating the current plugin architecture to the component/service model

We will migrate all plugin types and all existing services into service definitions and service implementations. This migration will be done verbatim. Then we will look into providing better service APIs for each of those as a second version.

What is the current (5.6) state of plugin architecture ?

On an architectural level the current (5.6) plugin architecture consists of a single monolithic server that calls out to a set of named plugins whenever it needs to perform a special task for which a registered plugin interface exists. Plugin interfaces are versioned, but there are no clever backward compatibility checks. Both the (explicit) plugin name and the (implicit and not so obvious) version number must match completely before the plugin is loaded.

Plugins can call back into the server in two ways:

  • "cleanly" through the usage of "plugin services" : a named compile-time set of interfaces that the server exposes to plugins.
  • in a "dirty" way, exploiting the fact that plugins can only run inside the server's namespace, so they can just leave undefined symbols and rely on the OS dynamic linker not using any namespace protection technique to resolve these symbols and map them back to server internals. Note that different OSes, even in this unprotected mode, apply different degrees of additional protection, e.g. not allowing plugin code to modify data in another OS module, thus provoking a number of various "wormhole" functions to do that. Since it's really difficult to establish the proper coverage of these wormholes, some of them are just rotting away and turning into dead non-functional code.

But of course there's not even a clean way for plugins to call other plugins. So nobody even considered this.

There's the limitation that a plugin equals 1 named API. This (and the above limitations) encourages monstrous composite APIs (see the storage engine API as the best example of this) that are impossible to implement and hard to keep consistent (hard, both because they limit how internal structures can change and because the APIs are so complex that nobody can really implement all combinations).

Add to all this the fact that plugin APIs are considered to be "public" and thus need to be "static" and "backward compatible" and the conflict between the requirements and the implementation becomes evident.

Why can't we just extend the current plugin implementation ?

In a way this is exactly what this worklog tries to do. But it needs to do it on core conceptual level.

The current plugin handling code assumes things that are simply not sustainable and don't cater for good modularization, namely:

  • It assumes that only the server will need to call out to plugins. No plugin can reasonably expect to call into another plugin.
  • It assumes that plugins will only use the server resources and code.
  • It assumes that the server will need to know (and call into) all possible extension scenarios that plugins can ever provide.
  • No new plugin functionality can be added without carving out an API into the server to support it.
  • It doesn't support overriding existing functionality with alternative implementations.

Security implications

All the alteration operations will require DBA privilege. No fine grained ACLs are planned for any near future, as it's native code and it's pointless to try imposing any restrictions on top of it. Later, when (and if) we have proper managed code components we will probably extend the API with some ACL checks.

Normal operation patterns

Broadcast

A component may need to broadcast a message to all implementations of a service. E.g. if we have a number of logging service implementations and we want to notify them all that a certain event has occurred, we iterate over them and call a service function with the event data.

A variant of this is when we have a number of work unit processor service implementations that we want to use to process an incoming unit of work. We iterate over all implementations, we call a service function and we exit the loop when any of the service implementations accepted the work unit.
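
The two iteration patterns above can be sketched as follows. This is only an illustration: handler_fn, broadcast() and dispatch_first() are hypothetical names, and a plain vector of callables stands in for iterating over the service implementations found in the registry. Here a handler returns true when it accepted the item, which is an assumption made for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one service implementation's entry point.
// Returns true when the implementation accepted the item passed to it.
using handler_fn = std::function<bool(const std::string &)>;

// Broadcast: every implementation is called with the event data.
// Returns the number of implementations that were notified.
std::size_t broadcast(const std::vector<handler_fn> &impls,
                      const std::string &event) {
  std::size_t notified = 0;
  for (const auto &impl : impls) {
    impl(event);  // the result is ignored: everybody gets the event
    ++notified;
  }
  return notified;
}

// Work unit variant: stop at the first implementation that accepts.
// Returns the index of the accepting implementation or -1 if none did.
int dispatch_first(const std::vector<handler_fn> &impls,
                   const std::string &unit) {
  for (std::size_t i = 0; i < impls.size(); ++i)
    if (impls[i](unit)) return static_cast<int>(i);
  return -1;
}
```

The only difference between the two patterns is whether the loop looks at the result of the call.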

Overriding service implementations

We have an authentication service. And a built-in component that implements native authentication set as a default. When we need to authenticate users we query the service registry using the service name alone (no implementation) to get the default service. Now a new authentication service implementation loaded through an external component can be set as a default authentication method and hence all future generic authentication requests will go to the external component instead.

Sub-services

Imagine we have an HTTP server service implementation. It can consume a number of different implementations of the page rendering service through a number of sub-services that all implement a certain "base" service. E.g.

PageServiceStoredProcedure {
 bool render(const char *requestText, char **outResponseText);
}
PageServiceIndexLookup {
 bool render(const char *requestText, char **outResponseText);
}
PageServiceSQLQueryTOJSON {
 bool render(const char *requestText, char **outResponseText);
}
...

And we have a number of service implementations of the above services. Now the HTTP server service when receiving a specific request will map it to one of the above sub-services and will call the resulting function pointer to produce the desired result.

Stateful services

If a service needs to work with state it must use handles. No separate factory service will exist by default. It's up to the service authors to provide the factory methods and instance methods services.

Here's an example of a factory/instance service separation vs. using a single service:

Single service
{
  bool getIterator(h_iterator *outIterator);
  bool next(h_iterator iterator, const char **outObject);
  bool releaseIterator(h_iterator iterator);
}
Factory service + instance service
Factory
{
  bool createIterator(h_iterator *outIterator);
}
Instance
{
  bool next(h_iterator iterator, const char **outObject);
  bool releaseIterator(h_iterator iterator);
}
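
A minimal sketch of how such a handle-based iterator could be backed, assuming an opaque handle that points to a heap-allocated state object. The iterator_imp structure and the two fixed items are hypothetical and serve the illustration only. All functions return false on success and true on failure.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Opaque handle: the consumer never sees the layout of iterator_imp.
typedef struct iterator_imp *h_iterator;

// Hypothetical state kept behind the handle.
struct iterator_imp {
  std::vector<std::string> items;
  std::size_t pos;
};

// Factory method: allocates the state and hands out the handle.
bool createIterator(h_iterator *outIterator) {
  if (!outIterator) return true;
  iterator_imp *it = new iterator_imp();
  it->items.push_back("first");
  it->items.push_back("second");
  it->pos = 0;
  *outIterator = it;
  return false;
}

// Instance method: returns the current object and advances.
// The returned pointer is owned by the iterator state.
bool next(h_iterator iterator, const char **outObject) {
  if (!iterator || !outObject || iterator->pos >= iterator->items.size())
    return true;  // invalid arguments or no more objects
  *outObject = iterator->items[iterator->pos++].c_str();
  return false;
}

// Instance method: destroys the state behind the handle.
bool releaseIterator(h_iterator iterator) {
  if (!iterator) return true;
  delete iterator;
  return false;
}
```

Whether the three functions live in one service or are split into a factory and an instance service does not change this code, only the service structs they are referenced from.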

Service Definition Examples

Used to illustrate the idea only. Actual code may vary.

A Pure C Service

SimpleAuthenticationService.h

/* Define the service */
BEGIN_SERVICE_DEFINITION(SimpleAuthenticationService)

  my_bool (*authenticate)(const char *user_name, my_bool *out_result);

END_SERVICE_DEFINITION(SimpleAuthenticationService);

SimpleAuthenticationService.c
static my_bool
authenticate_imp(const char *user_name, my_bool *out_result)
{
  /* accept any non-empty user name */
  if (user_name[0] != 0)
    *out_result= TRUE;
  else
    *out_result= FALSE;
  return FALSE; /* no error */
}
/* Define the service implementation */
BEGIN_SERVICE_IMPLEMENTATION(SimpleAuthenticationService, 
                             "SimpleAuthenticationService.C",
                             NULL /* no metadata */)
  authenticate_imp
END_SERVICE_IMPLEMENTATION(SimpleAuthenticationService);
static my_bool 
init_imp(h_registry_service registry)
{
  /* a dummy initializer */
  static int initialized;
  initialized= 1;
  return FALSE; /* no error */
}   
/* define the component */
BEGIN_COMPONENT_PROVIDES(AuthComponent)
 SERVICE_REFERENCE(SimpleAuthenticationService)
END_COMPONENT_PROVIDES(AuthComponent);
BEGIN_COMPONENT_REQUIRES(AuthComponent)
/* no explicit requirements */
END_COMPONENT_REQUIRES(AuthComponent);
DECLARE_COMPONENT(AuthComponent,
  "AuthCComponent",
  init_imp,
  NULL, /* no deinit */
  NULL /* no metadata */
);
BEGIN_C_CONTAINER_LIST
  COMPONENT_REF(AuthComponent)
END_C_CONTAINER_LIST

A C++ Service

SimpleAuthenticationService.h
class SimpleAuthenticationService : public MySqlService
{
 /* override the init() static function */
 static my_bool init(h_registry_service registry);
 
 public:
  static my_bool authenticate(const char *user_name, my_bool &out_result);
};
SimpleAuthenticationService.cpp
my_bool SimpleAuthenticationService::authenticate(
  const char *user_name, my_bool &out_result)
{
  /* accept any non-empty user name */
  if (user_name[0] != 0)
    out_result= TRUE;
  else
    out_result= FALSE;
  return FALSE; /* no error */
}  
my_bool SimpleAuthenticationService::init(h_registry_service registry)
{
  /* a dummy initializer */
  static int initialized;
  initialized= 1;
  return FALSE; /* no error */
}
SimpleAuthenticationService inst("SimpleAuthenticationService.SampleCpp");
MySqlContainer container("AuthCPPContainer", &inst); 
BEGIN_CPP_CONTAINER_LIST
  container
END_CPP_CONTAINER_LIST

Design Patterns

This isn't really part of the specification itself, because it's not "official" or "mandatory". Nor does it outline functionality that will actually be implemented by this worklog. It's here to promote good coding practices and to demonstrate some of the expected uses of this infrastructure. Notes here are in no particular order. They come in two sections.

Recommended Practices

A.k.a. "DOs"

  • Think of possible future uses.
    Try your best to design your interfaces for tomorrow and then reach the subset you can deliver today. This will aid your interfaces to be relatively constant and seldom in need of redesign. One example of this is the fact that all of the services defined by this worklog have a certain signature (returning a boolean and only using in-out parameters). This is because it's easier to generate such functions out of interface definition language compilers and makes it more natural to serialize the calls to such interfaces in and out of process/computer boundaries.

Practices to be discouraged

A.k.a. "DONTs"

  • Don't go too granular with the service interfaces because of:
    • Performance. While it's possible to create a service around the OS mutexes it's not really practical to use such service because the amount of time needed for the service API indirection will be comparable with the time needed to execute the payload code itself.
    • Polluting the service namespace. If your designs expose *all* the objects and classes that you have, chances are that not many other parts of the code will need to interact with your code in such a way. This will result in added overhead to the service registry with no value whatsoever.
    • Introducing compatibility points when there's little use in having them. Remember that you'll need to take some extra care to maintain the service interfaces you're exposing. This may put an additional burden on the development of your designs.
  • Don't expose too little of the functionality you're developing.
    Nobody will prefer an implementation they can do little extra with. Intended use of big functionality blocks quickly wears off with time. So to future-proof your code you need to turn it into a tool-set that your users can use to achieve their new needs or automate old ones. Thus strive to expose all the interfaces that make any logical sense and that you yourself are using. Yes, it contradicts the previous "Don't", because it's a balance that you need to strike.
  • Don't use composite APIs
    APIs are composite when you lump several logically distinct interfaces into a "larger" single interface. One example of a composite interface would be creating a single interface out of e.g. the Registry, RegistryQuery and RegistryRegistration services. While they all apply to the same data structure and instance it's not the implementation instance that should lead here. Having the 3 combined will mean that whenever you need to extend or re-think one of them you'll have to re-visit all clients of all the 3 of them.
    So the rule of thumb is that if you anticipate that people can get by with only a subset of your interface's functionality you're probably designing a composite interface.
  • Don't try to mix the use of the infrastructure with any other kinds of extra-Component code reuse.
    It's still probably technically possible to call code in other containers in "rogue" ways. But the fact that the infrastructure has no known deficiencies and can check for most cases of "rogue" use will highlight your Component as "rogue". This is a signal to the users of the functionality you provide and will hopefully cause them to give a good hard second look at whether they can afford to rely on plugins with decreased maintainability. A logical extension to the infrastructure would be e.g. an option to only take "signed" and "clean" Components in. Another logical extension that's incompatible with "rogue" plugins is proxying the calls to a certain service on another process or computer.

This is the LLD of WL#4102.

Milestones

M1

Due by 31 Jan 2014

  • The server "component" properly initialized as described in HLS, Bootstraping.
  • The registry and the dynamic loader services operational. Load/unload not implemented.
  • Unit test using simulated load/unload to test the rest of the code.
    The main goal of these tests is coverage. Coverage should be enough per the QA standards.
  • force options added as defined in forcing component load

M2

Due by 28 Feb 2014

  • The mechanism for loading internal compiled-in extra components operational
  • CMAKE provisions for defining extra compiled-in components.
  • unit tests extended to cover the full dynamic loader's code. Coverage + some basic real life usage sequences.
  • concurrency and performance tests to compare the current implementation to a baseline. for that :
    • we create an artificial test component, copy of sql/udf_example.cc defining UDFs. Probably a special "call component" UDF Item_func will be needed for this.
    • run the component through the component system on a multi-core lab machine for 1,2,4,8,16,32 threads.
    • run the plugin through the current plugin system on the same box and the same number of threads and compare.

M3

Due by 31 March 2014

  • SQL commands to dynamically load/unload additional external containers operational. At execution of LOAD/UNLOAD COMPONENT we first acquire the dynamic loader service from the registry, call load()/unload() in it and then release the dynamic loader instance. Other approaches were considered (involving a global) but ruled out because they don't contribute to the speed of the LOAD/UNLOAD COMPONENT commands. COMPONENT is a non-reserved word.
  • fully functional Dynamic loader without the load/unload persistency. This means implementing the DynamicLoader_file sub-service to load/unload components from OS shared objects.
  • Extra integral tests to increase coverage. Coverage of the new code increased up to the QA standards.
  • force options added as defined in "forcing component load"

M4

Due by 30 Apr 2014

  • Persistency implemented for load() and unload().
  • Extra integral tests to cover for the full stack.
  • force options added as defined in "forcing component load"

Design decisions

  • Components will need to support configuration variables defined with them. Due to the amount of work in this worklog we will have a separate worklog for support of configuration variables with components.
  • forcing components load. The chassis will need to distinguish between essential (sine qua non) components and optional ones. This will be implemented through the following elements:
    • (M1) A "force" boolean parameter to DynamicLoader::load(). When on it will cause the DynamicLoader::load() to return a failure code when the component fails to load. When off it will return a success and log a warning.
    • (M4) A column in the dynamic loader persistent state to store the value of the DynamicLoader::load() force parameter. This column will be used in the dynamic loader initialization phase to make it fail if a required component cannot be loaded.
    • (M3) An extra optional parameter for the LOAD COMPONENT COMMAND, e.g. LOAD COMPONENT ... [ FORCE=(ON|OFF) ]
    • (M1) A "force" metadata item definable by the component author to supply the default for the force parameter if not present in LOAD COMPONENT.
    • (M1) A global boolean variable "components_optional" that will force DynamicLoader::load() to disregard the "force" parameter and behave as if it's off. "components_optional" is settable on the command line and available as a read-only global system variable.
    • (M4) The component list persistence table
      The component list persistence is a non-trivial topic. To preserve the list so that it can be reproduced one can take 2 approaches: #1 is to save a snapshot of the current list and hope that it will be reproducible after a server restart; #2 is to save a changelog of changes to the component list. Obviously, due to dependencies of the components on services provided by other components, there's no guarantee that a snapshot of the current list will always work. One example of it not working well is "kamikaze components": components that never initialize properly, but cause some side effects in their component_init functions. And even if it does work, error handling is a bit more limited this way. The changelog approach, on the other hand, has the disadvantage that it can produce relatively large "changelogs" that will need to be replayed on server startup and thus slow it down. To reduce the complexity and avoid slowing down the server we will implement the snapshot approach as a first iteration.
      • Naive (snapshot) approach:
        • CREATE TABLE COMPONENTS(component_urn varchar(?), component_order integer auto increment, force boolean, primary key(component_urn), unique(component_order));
        • on dynamic loader initialization load all of the components defined in the table in the persisted order as a single batch. If a component fails to load, skip it and continue when it is a non-force one; stop the server otherwise.
        • on DynamicLoader::Unload() remove the rows for all the components that are unloaded.
          In case of number depletion consider compressing the space by renumbering the existing components in the table. Absolute numbers don't matter as long as the renumbering preserves the order.
        • on DynamicLoader::Load() get the next component_order number and add a row to the table.
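
The interaction between the "force" parameter and the "components_optional" variable described in the design decisions above can be sketched as a small decision function. The load_outcome enum and resolve_load() are hypothetical names used only for this illustration; the real DynamicLoader::load() returns a status code and writes to the error log.

```cpp
// Hypothetical outcome of one load attempt, as seen by the caller.
enum class load_outcome {
  ok,              // the component loaded
  warning_logged,  // load failed, success is returned and a warning logged
  failed           // load failed and a failure code is returned
};

// component_loaded: whether the component itself managed to load.
// force: the "force" parameter of the load call.
// components_optional: the global override that makes force behave as off.
load_outcome resolve_load(bool component_loaded, bool force,
                          bool components_optional) {
  if (component_loaded) return load_outcome::ok;
  const bool effective_force = force && !components_optional;
  return effective_force ? load_outcome::failed
                         : load_outcome::warning_logged;
}
```

With components_optional set, a failing component never produces a failure code, regardless of the force value stored with it.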

Detailed bootstrap sequence

This is the sequence according to phase 1, as defined in HLS, Bootstraping. Phase 2 (separation of the registry and the dynamic loader from the server component into a designated bootstrap component) will be done later.

Initial State

The server initialized, and operational up to the point where system tables can be read, e.g. in mysqld_init(), right before acl_init()

Prerequisites

  • The server component, containing the registry and the dynamic loader service suites
  • Server component implements mysql_component_init(SERVICE_TYPE(registry) *current) as defined in HLS, Component
  • Server component implements mysql_component_deinit(SERVICE_TYPE(registry) *current) as defined in HLS, Component

Initialization sequence

  • The server code, e.g. mysqld_init(), calls the function services_init().
    services_init() is very similar to DynamicLoader::load() in the sense that it will load the server component. But since this is the first component and it needs to be bootstrapped specially, it is a somewhat mutated variant of DynamicLoader::load().
  • (M1) services_init() will have a compiled-in call to mysql_component_init() for the server component, passing NULL as a registry handle.
  • (M1) mysql_component_init() will:
    • ignore the registry implementation handle passed to it.
    • create the registry service suite internal structure mysql_registry and related.
    • seed the registry through registering the registry implementation into it (with a suffix of ".server"). This will also set the registry implementation as a default implementation and will thus cause the registry static to be updated to it.
    • create the dynamic loader service suite internal structures.
    • seed the dynamic loader through:
      • self-register the server component into it
      • register the dynamic loader service suite implementation into the registry
    • (M2) if the persistent state is empty (as in clean start) load all the compiled-in component sets from the CMAKE generated list using a special distinguishable URN for this (e.g. ASCII 0, followed by the pointer or some text encoded version of the pointer etc).
    • (M4) go through the persistent storage and load all the component sets from it.

Relationship with the LOAD/UNLOAD plugin code

The component/services functionality will be completely separate from the LOAD/UNLOAD plugin functionality currently existing in MySQL. This means a new SQL level interface, new tables etc. The component code will probably reuse some useful ideas from the plugin code, but the plugin code is just too cluttered and built under different architectural assumptions to be reused at any scale. We will aim to deprecate the plugin code once the component code matures.

File Structure

include/mysql/service_c_helper.h

Contains utility macros to define C service headers

  • DEFINE_SERVICE_HANDLE(service_handle_name) typedefs a named void * service handle type.
  • BEGIN_SERVICE_DEFINITION(service name) opens C service typedef definition : a struct of function pointers. Used in headers that declare the C service API. Should be followed by 1 or more DEFINE_METHOD calls and a matching END_SERVICE_DEFINITION call
  • DEFINE_METHOD(return value type, method_name, argument list with braces) defines a C service method : a function pointer. Must be inside a BEGIN_SERVICE_DEFINITION/END_SERVICE_DEFINITION pair.
  • DEFINE_BOOL_METHOD(method_name, argument list with braces) A specialization of the DEFINE_METHOD with a boolean return type. The return type is to be used for error return.
  • END_SERVICE_DEFINITION(service name) closes the C service typedef definition that was opened via BEGIN_SERVICE_DEFINITION
  • BEGIN_SERVICE_IMPLEMENTATION(service name) defines a variable of the service definition type that implements a C service. Must be followed by 1 or more comma-separated references to C functions or static class methods and a matching END_SERVICE_IMPLEMENTATION() call. Must match the similarly named service definition defined through the BEGIN_SERVICE_DEFINITION/DEFINE_METHOD/END_SERVICE_DEFINITION sequence.
  • END_SERVICE_IMPLEMENTATION(service_name) closes the definition of a C service instance struct. See BEGIN_SERVICE_IMPLEMENTATION for details.
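
To make the macro descriptions above more concrete, here is one possible way such macros could expand. The macro bodies, the generated type and variable names (s_mysql_..., mysql_service_..._t, imp_...) and the simple_auth service are all assumptions made for this sketch, not the actual contents of service_c_helper.h. Plain bool is used instead of my_bool to keep the example self-contained.

```cpp
// Hypothetical expansions, for illustration only.
#define DEFINE_SERVICE_HANDLE(name) typedef void *name
#define BEGIN_SERVICE_DEFINITION(name) typedef struct s_mysql_##name {
#define DEFINE_METHOD(retval, name, args) retval(*name) args
#define DEFINE_BOOL_METHOD(name, args) DEFINE_METHOD(bool, name, args)
#define END_SERVICE_DEFINITION(name) } mysql_service_##name##_t
#define BEGIN_SERVICE_IMPLEMENTATION(name) \
  mysql_service_##name##_t imp_##name = {
#define END_SERVICE_IMPLEMENTATION(name) }

/* The service header: a struct with one function pointer. */
BEGIN_SERVICE_DEFINITION(simple_auth)
  DEFINE_BOOL_METHOD(authenticate,
                     (const char *user_name, bool *out_result));
END_SERVICE_DEFINITION(simple_auth);

/* The implementing function: false means no error. */
static bool authenticate_imp(const char *user_name, bool *out_result) {
  *out_result = (user_name[0] != 0);
  return false;
}

/* The service implementation: a variable of the struct type,
   initialized with comma-separated function references. */
BEGIN_SERVICE_IMPLEMENTATION(simple_auth)
  authenticate_imp
END_SERVICE_IMPLEMENTATION(simple_auth);
```

A consumer would then call through the struct, e.g. imp_simple_auth.authenticate("root", &result), after obtaining the pointer to it from the registry.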

libservices/my_allocator.h

Keeps the my_allocator class definition.

include/mysql/registry.h

Keeps the handle declarations for :

  • my_h_service : a service handle
  • my_h_service_iterator : a service iterator handle
  • my_h_service_metadata_iterator : a service metadata iterator handle

And declarations for the following services:

  • registry
  • registry_query
  • registry_registration
  • registry_metadata_enumerate
  • registry_metadata_query
  • registry_metadata_update

libservices/registry.cpp

Implementation for the services defined in registry.h and a general registry code container handling functions (init(), deinit(), performance_schema interaction etc).

General Classes

my_allocator

Defined in libservices/my_allocator.h (for now).

Specializes the STL allocator template to use my_alloc() and my_free() instead of malloc() and free(). Used by the rest of the STL classes.

class my_string

Defined in registry.cpp (for now).

Utility class. A functional equivalent of std::string that uses the specialized my_alloc/my_free STL allocator.
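
A minimal sketch of what the allocator and the string type could look like, assuming a C++11 style allocator. my_alloc_stub() and my_free_stub() are hypothetical stand-ins for the server's my_alloc()/my_free() (whose real signatures are not shown here); they wrap malloc()/free() and count live allocations so the effect is observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Stand-ins for the server allocation functions (assumption).
static std::size_t g_live_allocations = 0;

static void *my_alloc_stub(std::size_t size) {
  void *p = std::malloc(size);
  if (p) ++g_live_allocations;
  return p;
}

static void my_free_stub(void *ptr) {
  if (ptr) --g_live_allocations;
  std::free(ptr);
}

// Minimal allocator routing all requests to the stubs above.
template <typename T>
struct my_allocator {
  typedef T value_type;

  my_allocator() noexcept {}
  template <typename U>
  my_allocator(const my_allocator<U> &) noexcept {}

  T *allocate(std::size_t n) {
    void *p = my_alloc_stub(n * sizeof(T));
    if (!p) throw std::bad_alloc();
    return static_cast<T *>(p);
  }
  void deallocate(T *p, std::size_t) noexcept { my_free_stub(p); }
};

// All instances are interchangeable: they carry no state.
template <typename T, typename U>
bool operator==(const my_allocator<T> &, const my_allocator<U> &) noexcept {
  return true;
}
template <typename T, typename U>
bool operator!=(const my_allocator<T> &, const my_allocator<U> &) noexcept {
  return false;
}

// The functional equivalent of std::string on top of the allocator.
typedef std::basic_string<char, std::char_traits<char>, my_allocator<char> >
    my_string;
```

Note that allocate() throws std::bad_alloc on failure, so callers that must not throw (see the mysql_registry notes) need to catch it.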

The Registry Module

class my_metadata

Helper class that handles service implementation metadata. Stores the metadata into an STL hash table of STL strings:

std::unordered_map

Has a specialized subclass of my_unordered_string_map::const_iterator (my_metadata::const_iterator) that also keeps a pointer to the structure it iterates on (so that the updates by iterator can be performed) and exposes it through a get() method (get_data()).

Implements basic functionality over the metadata that's used by the registry methods later.

my_ref_counted

Implements reference counting. The current implementation uses mysys atomics protecting an int64 using a static atomic rwlock member. Thus it needs to be inited/deinited via the static init()/deinit() methods. Implements add_ref(), release() and get_count().
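
The interface of such a class can be sketched as follows. Note that this sketch swaps in std::atomic for the mysys atomics and the static rwlock mentioned above, so it needs no init()/deinit(); the class name and the choice to return the new count from add_ref() and release() are assumptions.

```cpp
#include <atomic>
#include <cstdint>

class ref_counted_sketch {
 public:
  ref_counted_sketch() : m_count(0) {}

  // Increases the count and returns the new value.
  std::int64_t add_ref() { return m_count.fetch_add(1) + 1; }

  // Decreases the count unless it is already zero.
  // Returns the new value.
  std::int64_t release() {
    std::int64_t cur = m_count.load();
    // On failure compare_exchange_weak reloads cur, so the loop retries
    // with the fresh value until it succeeds or the count is zero.
    while (cur > 0 && !m_count.compare_exchange_weak(cur, cur - 1)) {
    }
    return cur > 0 ? cur - 1 : 0;
  }

  std::int64_t get_count() const { return m_count.load(); }

 private:
  std::atomic<std::int64_t> m_count;
};
```

The compare-and-swap loop in release() keeps the count from going negative when release() is called more often than add_ref().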

mysql_service_implementation

Subclasses my_ref_counted (for the reference counting) and my_metadata (for the metadata). Stores an h_interface, the service name (in my_string m_service) and the service implementation name (in my_string m_full_name). Has a copy constructor (to allow handling of the allocations) and a constructor taking h_service. set_name is a separate overloaded method taking either a composite name (single string) or the two parts of the name. Exposes public accessor methods for the members and const char * variants for the string names.

class mysql_registry

None of the members should throw C++ exceptions. try {} catch (...) as needed to deal with the STL throwing stuff.

current mysql registry default implementation global

"mysql_registry" is a static in the mysql_registry class that points to the current default registry implementation. This may be changed through calling RegistryRegistration::set_default() for the "Registry" service. The dynamic loader will have a read access to the static to implement its load/unload methods and pass it on to component_init()/component_deinit().

registry

static std::map registry;

This is the ordered map of const char * service and service implementation names to mysql_service_implementation *.

It will have 1 entry for each service implementation: e.g. "foo.bar" -> mysql_service_implementation *.

It will also have 1 entry for the service name (denoting the default service implementation) : e.g. "foo" -> mysql_service_implementation *.

mysql_service_implementation objects are not automatically allocated and de-allocated, since two entries will point towards the same mysql_service_implementation instance if the implementation is a default.

Sorting is done in ascending order, thus guaranteeing that all service implementations will appear right after the default service entry :

  • foo1
  • foo1.bar1
  • foo1.bar2
  • foo2
  • foo2.bar1
  • foo2.bar2

The access to the registry is controlled by the mysql_rwlock_t LOCK_registry lock.
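
The layout of the map can be illustrated with a small sketch. It uses std::string keys instead of const char * (which would need a custom comparator) and leaves out the LOCK_registry locking; service_impl_sketch, lookup() and implementations_of() are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for mysql_service_implementation.
struct service_impl_sketch {
  std::string full_name;
};

typedef std::map<std::string, service_impl_sketch *> registry_map;

// A single lookup serves both kinds of names: "foo" finds the default
// implementation, "foo.bar" finds one specific implementation.
service_impl_sketch *lookup(const registry_map &reg,
                            const std::string &name) {
  registry_map::const_iterator it = reg.find(name);
  return it == reg.end() ? nullptr : it->second;
}

// Lists the implementation names of one service by walking the sorted
// keys that start with "service.".
std::vector<std::string> implementations_of(const registry_map &reg,
                                            const std::string &service) {
  std::vector<std::string> out;
  const std::string prefix = service + ".";
  for (registry_map::const_iterator it = reg.lower_bound(prefix);
       it != reg.end() &&
       it->first.compare(0, prefix.size(), prefix) == 0;
       ++it)
    out.push_back(it->first);
  return out;
}
```

Starting the walk at lower_bound("foo.") rather than at the entry following "foo" keeps the iteration correct even if some other key happens to sort between the default entry and its implementations.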

static init()/deinit()

The constructor does nothing. init()/deinit() must be called before/after operation to set up the lock. This complies with the dynamic loader.

registry service

acquire()

my_bool acquire(const char *service_name, my_h_service *out_service)

Takes a service name. Finds the service (if a service name) or the implementation (if an implementation name) inside the registry map. Uses a single lookup since both kinds are in the registry map. Increases the reference count of the service implementation found. Returns a pointer to the interface. If none is found the out parameter should be unaltered. Keeps a read lock while doing the lookup.

acquire_related()

my_bool acquire_related(const char *service_name, my_h_service service, my_h_service *out_service);

release()

my_bool release(const char *service_name, my_h_service service)

As per the spec:

  • Takes a read lock
  • finds the service or the implementation by name
  • decreases the reference count
  • returns boolean status

registry_registration service

register_service()

my_bool register_service(const char *_implementation_name, my_h_service ptr)

  • Allocates a new mysql_service_implementation instance to hold the pointer.
  • takes a write lock on the registry.
  • tries to insert the implementation and fails if the service implementation exists.
  • registers the service name entry (pointing to this implementation as the default) unless it is already present, i.e. unless another implementation of the same service was registered earlier.

unregister()

my_bool unregister(const char *service_implementation_name)

  • Takes an implementation name as an argument
  • takes a write lock.
  • Fails if the use count of the implementation is nonzero or it doesn't exist.
  • checks if the implementation is the default implementation and removes the default entry if it is, then sets the first remaining implementation of the same service (if any) as the new default by reusing the pointer to its mysql_service_implementation.
  • removes the implementation from the registry, deletes it and releases the lock.

set_default()

my_bool set_default(const char *_implementation_name)

  • Searches for the implementation by name.
  • sets the service name entry to point to the implementation found.
  • holds a write lock for the duration of the operation.
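
The three registration methods can be sketched together as follows. This is a single-threaded illustration: the write lock is left out, std::string keys are used, and impl_sketch, registration_sketch and find() are hypothetical names. All methods return false on success and true on failure.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for mysql_service_implementation.
struct impl_sketch {
  void *handle;
  long use_count;
};

class registration_sketch {
 public:
  ~registration_sketch() {
    // Only the "service.implementation" entries own their object.
    for (auto &e : m_map)
      if (e.first.find('.') != std::string::npos) delete e.second;
  }

  bool register_service(const std::string &impl_name, void *handle) {
    const std::string::size_type dot = impl_name.find('.');
    if (dot == std::string::npos || m_map.count(impl_name)) return true;
    impl_sketch *impl = new impl_sketch{handle, 0};
    m_map[impl_name] = impl;
    // No-op when the service already has a default entry.
    m_map.insert(std::make_pair(impl_name.substr(0, dot), impl));
    return false;
  }

  bool unregister(const std::string &impl_name) {
    const std::string::size_type dot = impl_name.find('.');
    if (dot == std::string::npos) return true;
    auto it = m_map.find(impl_name);
    if (it == m_map.end() || it->second->use_count != 0) return true;
    impl_sketch *impl = it->second;
    m_map.erase(it);
    const std::string service = impl_name.substr(0, dot);
    auto def = m_map.find(service);
    if (def != m_map.end() && def->second == impl) {
      // Promote the first remaining implementation or drop the default.
      auto next = m_map.lower_bound(service + ".");
      if (next != m_map.end() &&
          next->first.compare(0, dot + 1, service + ".") == 0)
        def->second = next->second;
      else
        m_map.erase(def);
    }
    delete impl;
    return false;
  }

  bool set_default(const std::string &impl_name) {
    const std::string::size_type dot = impl_name.find('.');
    auto it = m_map.find(impl_name);
    if (dot == std::string::npos || it == m_map.end()) return true;
    m_map[impl_name.substr(0, dot)] = it->second;
    return false;
  }

  impl_sketch *find(const std::string &name) const {
    auto it = m_map.find(name);
    return it == m_map.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, impl_sketch *> m_map;
};
```

This also shows the overriding pattern: after set_default("auth.ldap") a lookup of plain "auth" returns the externally provided implementation instead of the built-in one (the "auth.*" names are examples only).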

registry_query service

create_iterator()

my_bool create_iterator(const char *service_name_pattern, my_h_service_iterator *out_iterator)

  • takes a read lock and keeps it until the cursor is closed.
  • searches only for complete matches (as to where to start from).
  • allocates a new STL map::const_iterator and returns a pointer to it.

release_iterator()

void release_iterator(my_h_service_iterator iterator)

  • deletes the iterator
  • releases the read lock

get_next()

my_bool get_next(my_h_service_iterator iterator, const char **out_name)

  • returns the service name found. Doesn't copy it. You need to copy before closing the iterator if you need it.
  • doesn't increase the usage count of the service.
  • advances the iterator by one and returns true if no more.

registry_metadata_enumerate

metadata_create_iterator()

my_bool metadata_create_iterator(my_h_service_iterator iterator, my_h_service_metadata_iterator *out_iterator)

  • works on a service iterator
  • doesn't do any locking
  • returns an iterator over the metadata from the referenced mysql_service_implementation

metadata_release_iterator()

my_bool metadata_release_iterator(my_h_service_metadata_iterator iterator)

  • assumes the argument is a valid metadata iterator
  • doesn't do any locking (expects to have a service cursor open for that)
  • deletes the iterator

metadata_query service

No locking again

metadata_update service

No locking. Straight metadata array wrappers.

The Dynamic Loader Module

These are the implementation details of the Dynamic loader suite of services.

We will have a class "dynamic_loader" that will implement the services described in HLS, Dynamic_Loader_Service. It will have the following members

components_list

An ordered list of active components identified via a unique name.

DynamicLoaderQuery service will iterate over it.

DynamicLoader service will add batches of components to it and remove (possibly different) batches of components from it.

It will consist of instances of the Component class.

Access to it will be guarded by the same read/write lock as the registry.

Component class

This will hold all the information about a loaded component.

component_name

The name of the component will be encoded in UTF-8.

component_urn

The component urn will be encoded in UTF-8.

It will use the HTTP naming scheme. The following formats will be available :

  • Built-in component URNs : "builtin://" + component_name
    All components statically compiled into the server will have a URN like this. The server at this point will not provide a list of these built-in components. We will rely on documentation for that.
  • External component URNs : "file://" + component_file_name
    All the components available in a designated pre-configured directory (configuration property of the dynamic loader) can be loaded through this scheme using a file name. The directory name will be settable as a global-only configuration variable. For that we create an atomic pointer to contain it and copy whatever it points to.
    Providing a metadata item with the full path to the shared library loaded was considered and was ruled out for memory optimization reasons (not having to copy the array of metadata and just reuse the one provided by the component author).

metadata list

Need a set of metadata name/value strings (name is unique).

list of the services provided by the component

Needed to :

  • check if the component is unloadable (through making sure none of its services are being actively referenced)
  • implement Registry::acquire_related

list of the services consumed by the component

A static array of { "name", service handle }.

Provided as a convenience service to the component authors willing to have the dynamic loader service handle all of their service references for them. All the component authors will need to do is provide a list of the services/service implementations they want to consume and then use the service implementation handles the dynamic loader will fill in for them.

Methods to implement the Dynamic_Loader_Service interfaces

DynamicLoader::load()

This takes a list of component URNs.

For each component URN it will take the scheme (the part in front of the "://") and use it to dynamically find a DynamicLoader_Scheme sub-service implementation in the registry. This sub-service will be very similar to the DynamicLoader service, but instead of dealing with persistent state it will materialize the component from persistent media and return an in-memory descriptor of the loaded component so that DynamicLoader::load() will be able to complete the component registration.

We will provide two sub-service implementations in the server/bootstrap component: the DynamicLoader_builtin service and the DynamicLoader_file service. These will take care of component materialization/dematerialization for the "builtin://" and "file://" schemes.

The sub-service API will be something like:

my_bool load(const char *urn, struct component_description_block **out_component_block);
my_bool unload(const char *urn, struct component_description_block *component_block);

So DynamicLoader::load() will be :

DynamicLoader::load(const char *urns[]) {

  scheme_to_service_map s_map;
  component_map new_components;
  for each urn in urns do
  {
     const char *scheme = extract_scheme(urn);
     const char *path = extract_path(urn);
     h_service load_service= s_map[scheme];
     if (not load_service)
     {
        load_service = Registry::acquire(concat("DynamicLoader_", scheme));
        s_map[scheme]= load_service;
     }
     component_block *block;
     load_service.load(path, &block);  
     register_provided_services(block);
     new_components.add(block);
  }
  for each component in new_components
  {
     satisfy_service_dependencies(component);
     component->init();
     components_list.add(component);
  }
  for each load_service in s_map
     Registry::release(load_service);

}
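
The extract_scheme() and extract_path() helpers used in the pseudocode above could look like the sketch below. It works on std::string rather than const char *, and both the helper bodies and loader_service_name() are assumptions made for the illustration. A URN without "://" yields empty strings.

```cpp
#include <string>

// Returns the part in front of "://", e.g. "file" for "file://x".
std::string extract_scheme(const std::string &urn) {
  const std::string::size_type pos = urn.find("://");
  return pos == std::string::npos ? std::string() : urn.substr(0, pos);
}

// Returns the part after "://", e.g. "x" for "file://x".
std::string extract_path(const std::string &urn) {
  const std::string::size_type pos = urn.find("://");
  return pos == std::string::npos ? std::string() : urn.substr(pos + 3);
}

// Builds the name of the sub-service to look up in the registry.
std::string loader_service_name(const std::string &urn) {
  return "DynamicLoader_" + extract_scheme(urn);
}
```

For example, "builtin://server" maps to the sub-service name "DynamicLoader_builtin" and the path "server".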

DynamicLoader initialization

This is the part of mysql_component_init() relevant to the dynamic loader initialization:

DynamicLoaderInit() {

 /* clean start: nothing persisted yet, so load the built-ins */
 bool is_default= !have_persistent_state();
 components_list.add(this) or continue;
 registry.add(this) or continue;
 if (is_default)
 {
    int n_builtins= 0;
    const char **urns = cmake_builtins_list_to_const_char_array(&n_builtins);
    DynamicLoader::load(urns, n_builtins,  ? false : true);
 }
 else
   persistent_state_deserialize();

}

This will load all the builtins into a single batch.

If we need to support more discrete interdependencies between built-ins we will extend the CMAKE macro to specify a batch number then have the cmake code sort the builtins by that number in batches of components and load them 1 batch at a time instead of a component at a time.

Order of loading (if there are no batch numbers) is irrelevant. If we go with batch numbers in cmake the order will become explicit.

DynamicLoader shutdown

DynamicLoaderUninit() {

 for each component in components_list
   component.deinit();
 registry.deregister(this);

}