WL#8020: Add a UDF registration service

Affects: Server-8.0   —   Status: Complete

Currently the server requires a bunch of "CREATE FUNCTION ... SONAME" statements
to be executed for each function in a UDF library.

Define a new plugin service allowing e.g. plugin_init() to dynamically register
UDF functions and e.g. plugin_deinit() to de-register them.

Obsolete CREATE FUNCTION ... SONAME in favor of this.

This will allow defining all the functions with a single INSTALL PLUGIN command.

Motivation:

Take a look at e.g.
http://dev.mysql.com/doc/refman/5.7/en/enterprise-encryption-installation.html

There's 9 commands to "install" the individual functions from a single component
with convoluted syntax etc.

If we get this worklog this is how the above page may look like:
http://dev.mysql.com/doc/refman/5.7/en/audit-log-installation.html

A single command. And a way to monitor the installed extensions. Which is not
present with UDF: https://bugs.mysql.com/bug.php?id=71262

Sometimes it's even more complex:
http://dev.mysql.com/doc/refman/5.7/en/firewall-installation.html
This one combines the need for:
1. INSTALL PLUGIN
2. a couple of CREATE FUNCTION SONAME
3. a system table.

We will combine 1 and 2. And will have to work on allowing plugins to handle #3. 
* FRQ1: A new component service will be added to register and unregister UDFs to/from the global hash of UDFs
* FRQ2: The components/plugins will be responsible for matching register and unregister
* FRQ3: The service will not be loading or trying to unload any more new shared objects.
* FRQ4: All UDFs registered by a shared object (a plugin) will have to be unregistered before the plugin's shared object is unloaded
* FRQ5: A new component service will be added to allow registering an UDF aggregate function.
* FRQ6: The existing UDF interface will continue to work.
* FRQ7: the UDFs registered via the service will not be stored into the mysql.func table. Persistence will be achieved indirectly via the mysql.component or the mysql.plugin tables.
* FRQ8: A new PERFORMANCE_SCHEMA.USER_DEFINED_FUNCTIONS will be added to display the contents of the global hash of all registered UDFs. 
API description
----------------

    /**
      Service for adding and removing UDF functions
    */
    BEGIN_SERVICE_DEFINITION(udf_registration)
      /**
        Registers a UDF function with a given name.

        The name must be unique. Does not store in any table, just
        updates the global function list.
        Plugins/components need to handle registration deregistration
        during their initialization and deinitialization.
        Registers a scalar UDF by default.
        @sa udf_aggregates

        @param name name of the function
        @param return_type return type.
        @param func function to call
        @param init_func function to call at query start
        @param deinit_func function to call at query cleanup
        @return Status of performed operation
        @retval false success
        @retval true failure
      */
      DECLARE_BOOL_METHOD(udf_register,
      (const char *func_name,
       enum Item_result return_type,
       Udf_func_any func,
       Udf_func_init init_func,
       Udf_func_deinit deinit_func));

      /**
        Unregisters a UDF function with a given name.

        Does not store in any table, just updates the global function list.
        Plugins/components need to handle registration deregistration
        during their initialization and deinitialization.

        @param name name of the function
        @param[out] was_present set to non-zero if the UDF was present, but locked
        @return Status of performed operation
        @retval false success
        @retval true failure
      */
      DECLARE_BOOL_METHOD(udf_unregister,
      (const char *name, int *was_present));    END_SERVICE_DEFINITION(udf_registration)

    /**
      Service for turning
    */
    BEGIN_SERVICE_DEFINITION(udf_registration_aggregate)
      /**
        Registers an aggregate UDF function with a given name.

        The name must be unique. Does not store in any table, just
        updates the global function list.
        Plugins/components need to handle registration deregistration
        during their initialization and deinitialization.
        You can use udf_registration::unregister to unregister the
        function.

        @sa udf_registration

        @param name name of the function
        @param return_type return type.
        @param func function to call
        @param init_func function to call at query start
        @param deinit_func function to call at query cleanup
        @return Status of performed operation
        @retval false success
        @retval true failure
      */
      DECLARE_BOOL_METHOD(udf_register,
      (const char *func_name,
       enum Item_result return_type,
       Udf_func_any func,
       Udf_func_init init_func,
       Udf_func_deinit deinit_func,
       Udf_func_add add_func,
       Udf_func_clear clear_func));

      /**
        Unregisters an aggregate UDF function with a given name.

        Does not store in any table, just updates the global function list.
        Plugins/components need to handle registration deregistration
        during their initialization and deinitialization.

        Currently it's a synonym for udf_registration::udf_unregister.

        @param name name of the function
        @param[out] was_present set to non-zero if the UDF was present, but locked
        @return Status of performed operation
        @retval false success
        @retval true failure
      */
      DECLARE_BOOL_METHOD(udf_unregister,
      (const char *name, int *was_present));
    END_SERVICE_DEFINITION(udf_registration_aggregate)


Structure of the PERFORMANCE_SCHEMA.USER_DEFINED_FUNCTIONS table
----------------------------------------------------------------

    CREATE TABLE PERFORMANCE_SCHEMA.USER_DEFINED_FUNCTIONS (
       UDF_NAME  VARCHAR(255) NOT NULL, -- empty string if registered via the CREATE FUNCTION and unloading
       UDF_RETURN_TYPE VARCHAR(20) NOT NULL, -- one of string, int, decimal, real, row
       UDF_TYPE varchar(20) NOT NULL, -- one of aggregate, scalar
       UDF_LIBRARY VARCHAR(1024) NULL, -- null if registered via the service API
       UDF_USAGE_COUNT INTEGER NOT NULL
    )


Migration path for the existing UDF libraries
---------------------------------------------

Currently it's enough to just expose a symbol from a shared library. Then you can do CREATE FUNCTION SONAME and start calling that function.

This is convenient but a bit dangerous since this can be used to call system shared libraries. This is why the UDF interface also requires there's at least x_init symbol or x_deinit symbol too. 

In order for such UDF libraries to migrate to the new structure they'll need to be converted into components (expose a single symbol with the component description) and have init() and deinit() function to register and unregister the UDFs. 

This extra effort will pay up in not having to issue CREATE FUNCTION commands for all individual functions in a UDF library or a plugin.
It will also resolve the issues with calling dlopen() twice for shared objects containing both a plugin/component and a udf function. 
And finally there'll be no need to expose the UDF function symbols as public symbols from the shared objects thus improving the encapsulation of the UDF implementations. 
How traditional UDF CREATE/DROP FUNCTION SONAME works
------------------------------------------------------

UDFs are held into a global hash, guarded by the THR_LOCK_udf rwlock.

When one registers UDFs the dynamic library name is stored too, together with a dynamic library handle (received via dlopen()). 

The hash entries also have a reference count, which is set to 1 at registration time.

When one tries to use an UDF function in SQL the udfs are searched in the hash by name (and the reference count is bumped). 
After the SQL statement is done the reference counts to the UDFs are decreased and if they're zero the UDFs are removed from the list. 

When DROP FUNCTION is issued to unregister the udf, the udf name in the hash will be set to NULL, thus it can't be found by name anymore and the reference count will be decreased. If the reference count drops to zero the UDF is removed from the list. 

There's also a global boolean flag (also protected by the same rwlock) that's set to true if there are any registered UDFs and to false when the last one is unloaded. 

Differences for the UDFs registered via the service
----------------------------------------------------

* No zombie UDFs: When an unregistration attempt is made just the reference count is decreased and the name is kept intact. Thus new uses of the UDF can appear at any time. But decrease of the UDF count to 0 will still unregister the UDF. 

* No dynamic library handles for the UDFs: unlike their siblings registered via the SQL interface the UDFs registered via the service will not store the dynamic library handle or the dynamic library name. This also means that, when the reference count drops to zero the dynamic library handle will not be unloaded. 

Components wanting to handle a set of UDF registrations should work as follows:

* At registration time they will create a global list of all registered UDFs.
* At unregistration time they will iterate over the list and for each UDF in it will try to unregister it, and, if successful, remove the UDF from the list.
* If the list is empty component unregistration can continue
* If the list is not empty the component unregistration fails. 

In the above state only the currently used UDFs will be registered. And subsequent attempts to unregister the UDFs can be made that will eventually be successful when there're no more active references to the UDFs.


Here's an example class that implements the above list:

    /**
      Helper class to maintain a list of registered UDFs per component.

      Instantiate one per component. Initialize and add the UDFs at init().
      At deinit() if unregister() passes deinitialize and allow component unload.
      Otherwise fail the component unload and keep the set for subsequent unloads.
    */
    class udf_list
    {
      typedef std::list udf_list_t;
    public:
      ~udf_list()
      {
        unregister();
      }
      bool add_scalar(const char *func_name,
                      enum Item_result return_type,
                      Udf_func_any func,
                      Udf_func_init init_func= NULL,
                      Udf_func_deinit deinit_func= NULL)
      {
        if (!mysql_service_udf_registration->udf_register(func_name,
                                                          return_type,
                                                          func,
                                                          init_func,
                                                          deinit_func))
        {
          set.push_back(func_name);
          return false;
        }
        else
          return true;
      }

      bool add_aggregate(const char *func_name,
                         enum Item_result return_type,
                         Udf_func_any func,
                         Udf_func_add add_func= NULL,
                         Udf_func_clear clear_func= NULL,
                         Udf_func_init init_func= NULL,
                         Udf_func_deinit deinit_func= NULL)
      {
        if (!mysql_service_udf_registration_aggregate->udf_register(func_name,
                                                                    return_type,
                                                                    func,
                                                                    init_func,
                                                                    deinit_func,
                                                                    add_func,
                                                                    clear_func))
        {
          set.push_back(func_name);
          return false;
        }
        else
          return true;
      }

      bool unregister()
      {
        udf_list_t delete_set;
        /* try to unregister all of the udfs */
        for (auto udf : set)
        {
          int was_present= 0;
          if (!mysql_service_udf_registration->udf_unregister(udf.c_str(),
                                                              &was_present) ||
              !was_present)
            delete_set.push_back(udf);
        }

        /* remove the unregistered ones from the list */
        for (auto udf : delete_set)
          set.remove(udf);

        /* success: empty set */
        if (set.empty())
          return false;

        /* failure: entries still in the set */
        return true;
      }

    private:
      udf_list_t set;
    }

Here's how the class can be used:

    static mysql_service_status_t init()
    {
      /*
       Use the global list pointer without a lock
       assuming serialization by the component infrastructure
      */
      list= new udf_list();

      if (list->add_scalar("dynamic_udf",
                           Item_result::INT_RESULT,
                           (Udf_func_any) udf_impl::dynamic_udf,
                           udf_impl::dynamic_udf_init,
                           udf_impl::dynamic_udf_deinit) ||
          list->add_aggregate("dynamic_agg",
                              Item_result::INT_RESULT,
                              (Udf_func_any) udf_impl::dynamic_agg,
                              udf_impl::dynamic_agg_add,
                              udf_impl::dynamic_agg_clear,
                              udf_impl::dynamic_udf_init,
                              udf_impl::dynamic_agg_deinit))
      {
        delete list;
        return 1; /* failure: one of the UDF registrations failed */
      }

      /* success */
      return 0;
    }


    static mysql_service_status_t deinit()
    {
      if (list->unregister())
        return 1; /* failure: some UDFs still in use */

      delete list;
        return 0; /* success */
    }

What other databases are doing
-------------------------------

* Oracle: https://docs.oracle.com/cd/E11882_01/appdev.112/e41502/adfns_externproc.htm#ADFNS759
* PostgresQL: https://www.postgresql.org/docs/9.1/static/sql-createfunction.html
* MS SQL: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql