WL#6380: Formulate framework for API for DD

Affects: Server-5.7   —   Status: Complete

A Data Dictionary is a collection of information about database objects, such
as tables, views and stored programs -- also known as metadata. In other
words, the Data Dictionary stores information about, e.g., table structure:
which columns, indexes, etc. every table has.

The Data Dictionary also stores information about tables in INFORMATION_SCHEMA
and PERFORMANCE_SCHEMA. These tables don't physically exist in a storage
engine; they are populated by the server on the fly.

As we're developing a new Data Dictionary (where all data is stored in
transactional tables instead of just simple files or MyISAM tables
as it is now), we want to come up with a clean and straightforward API
to deal with the new Data Dictionary.

The Data Dictionary API means a few things in this WL:

  - mainly, it is the internal API the server code will use to get and
    update Data Dictionary information.

  - external API, which should be used for accessing Data Dictionary
    information from plugins (including 3rd party Storage Engines),
    will also be discussed.

  - discussing API (both internal and external) means discussing public
    interface and key implementation concepts.

The Data Dictionary API is an intraprocess API, i.e. it can be used
within the server process only.

Developing a new Data Dictionary is a large project. This WL is intended to:

  - outline our vision of the new Data Dictionary API, including how
    Data Dictionary information will be represented in main memory.
    I.e. define how we would like it to be eventually.

  - describe every major aspect of the new API and discuss possible
    alternatives.

The new Data Dictionary project will be implemented step by step. There is a
large legacy, which should be supported and maintained. Also, the current
server code does not have clear boundaries between various modules, which
means there is no "old API" that can simply be replaced by the new API.
Implementing and using a new API means refactoring the server to break the
code into separate modules and create proper interfaces among them.
Most likely, the new API will be formulated and implemented iteratively.
This WL serves as a guideline in that process.

Strictly speaking, the Data Dictionary consists of two parts:
  - the "operational part" -- Data Dictionary information which is in memory (RAM);
  - the "stored part" -- Data Dictionary information which is stored in tables
    (on disk).
This WL describes the "operational part".

See also:
  - WL#6379: Schema definitions for new DD

Terminology

- Data Dictionary (shortcut: DD) -- a collection of metadata (information
  about RDBMS objects).

- Data Dictionary API (shortcut: DD API, API) -- this is what this WL is
  about. The Data Dictionary *internal* API is discussed here.

- Virtual table -- a table in INFORMATION_SCHEMA or PERFORMANCE_SCHEMA.

  Virtual tables don't exist in a storage engine (on disk); they are
  populated by the server on the fly. However, from the user's view point
  they behave as regular tables.

- Transactions

  This is probably the most confusing term in this specification.
  The problem is that "transaction" is associated with different things
  in different areas of the server. Here are the definitions which are used
  in this WL. Any implicit meaning should be rejected.

  * InnoDB mini transaction -- an InnoDB internal transaction, which has
    nothing to do with regular transactions in the server.

    InnoDB mini transactions do not participate in (are not visible
    through) DD API. DD-API-implementation however will use InnoDB mini
    transactions to read/write Data Dictionary records.

  * Regular transaction -- a transaction that disallows DDL-statements.

    Regular transactions are common transactions that we have now in the
    server. They can be read-only (SELECT) or read-write (UPDATE).
    Regular transactions only read the Data Dictionary (metadata).

  * DDL transaction -- a transaction that allows DDL-statements.
    This is what the server doesn't support at the moment, but the ultimate
    goal is to support such transactions. Basically, that means DML and DDL
    statements can participate in the same transaction, or DDL
    statements/changes can be rolled back.

    DDL transactions read and modify (write) the Data Dictionary.

Domain description

General API requirements

By all means, the new API should be:

  - clean, self-descriptive and straight-forward, which means:
    - most common use-cases should be implemented easily
    - API should be easy to learn

  - defensive against misuse

  - extendable without revolutionary changes

  - it should be hard to break the boundaries between the interface and
    implementation parts during future development, and doing so should
    require an explicit action

  - internal data representation should be encapsulated to
    make internal changes possible without updating code everywhere

  - facilitate ACID properties of the Data Dictionary
    (Atomicity, Consistency, Isolation and Durability)

API users

New Data Dictionary API will be used by the following components:
  - core server functionality (SQL-query execution in general sense);
  - internal storage engines;
  - plugins (including 3rd-party storage engines).

Storage engines want to associate custom data with the information stored
in the Data Dictionary. For example, InnoDB needs to store some specific
data for every InnoDB table.

Plugins might want to store custom metadata (in other words: add a new
object type to the Data Dictionary).

Both storage engines and plugins also need a way to introduce new virtual
tables.

Data Dictionary domain traits

  - Fixed set of core object types (tables, views, ...).

    NOTE: schema definitions of core object types are specified in WL#6379.

      - Storage Engines need to extend object type attributes
        (store custom data);
      - Plugins need to introduce new object types;

  - Every virtual table (e.g. table in INFORMATION_SCHEMA / PERFORMANCE_SCHEMA)
    will have a pre-defined set of columns.

    However, storage engines and plugins need to be able to introduce new
    virtual tables. These tables could be stand-alone tables or tables
    extending existing virtual tables (by including an existing record key).

  - Small size of all data (compared to size of data managed by RDBMS);

  - Read-only use cases are more frequent
    (thus, the server should be optimized for DD read-only use cases);

  - Although read-write use cases are less frequent,
    there is a vast variety of them (thus, the DD API design should
    simplify coding of them);

  - ACL will eventually be a part of the new Data Dictionary
    - ACL does not use MDL now -- that should be changed eventually;
    - ACL loads all data at once now -- that should be changed eventually;

Data Dictionary objects

The new Data Dictionary will have at least the following object types:
  - Catalog
  - Schema
  - Table-space
  - Table (regular and temporary tables)
  - Partitioning information
  - Table column
  - Table index
  - Primary key
  - Foreign key
  - View
  - Stored procedure
  - Stored function
  - Trigger
  - Event
  - ACL user
  - ACL privilege
  - Sequence


Two API layers

It's proposed to have two API layers: for internal usage (the core server
functions, core Storage Engines) and for external usage (plugins).

The main difference between the API layers is interface stability. Maintaining
a public stable interface (preserving backward compatibility, preserving
binary compatibility, ...) requires additional effort.

Internal API:
  - this API is more for refactoring the server -- identifying interfaces
    between different modules;
  - the API will be formulated in object-oriented terms (not row-based API);
  - the API will use object-oriented language (C++);
  - the API will not be binary compatible;
  - generally, this API will not be backward compatible.
    We don't want to restrict ourselves too much.

External API:
  - this API is designed with compatibility in mind;
  - the API is intended to be used by 3rd-party plugins;
  - the API will be formulated in object-oriented terms (not row-based API);
  - however, the API will use C-language;
  - the API should be backward compatible;
  - the API should be binary compatible;

Third-party storage engines and plugins are technically able to use both
the internal API and the external API. However, usage of the internal API
is neither recommended nor supported. We don't plan to introduce technical
limitations against it, but that use case clearly will not be supported.

However, internal storage engines and plugins will probably benefit from
using the internal API, because they don't need binary compatibility (there
is no need to use an internal component built for an older version).

In the rest of this WL, only the internal API is discussed.

API classes

DD API will contain the following groups of classes:

  - Data Dictionary object classes -- classes for tables, columns, indexes,
    views, stored programs, ...

  - Service classes -- Dictionary, Factory, ...

  - Auxiliary utilities

The Coding Style

Generally, the standard MySQL Coding Style should be used.
However, there are a few exceptions and restrictions:

  - Prefer an API in terms of iterators rather than in terms of index/size

  - Use RAII (Resource Acquisition Is Initialization)

  - Explicit ownership scope

  - C++ exceptions should not be used;

  - Use anonymous namespaces;

  - The order of functions / operations in .h and .cc files should be the
    same;

  - Use const references instead of const pointers for function arguments
    when a passing object can not be NULL;

  - Do not use non-const references.

  - Try to avoid passing classes/structs by value.

    There is one exception to this rule: LEX_STRING. Traditionally,
    it can be passed by value (note that it is basically just a pointer and
    a length).
  - If a class semantic does not allow for copy constructor / operator,
    forbid them explicitly.
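    For illustration, forbidding copy operations in the pre-C++11 style can
    look like this (the class is hypothetical; the copy constructor and
    assignment operator are declared private and left undefined):

```cpp
// Hypothetical class whose semantics do not allow copying: the copy
// constructor and assignment operator are declared private and left
// undefined (the pre-C++11 idiom available at the time of this WL).
class Dictionary_service
{
public:
  Dictionary_service() : m_ref_count(0) { }

  int ref_count() const { return m_ref_count; }
  void acquire() { ++m_ref_count; }

private:
  // Forbid copying explicitly: any attempt to copy fails to compile
  // (or to link, if attempted from within a member function).
  Dictionary_service(const Dictionary_service &);
  Dictionary_service &operator=(const Dictionary_service &);

private:
  int m_ref_count;
};
```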

  - Use spaces consistently -- it might be "X *x" or "X* x", but it must be
    consistent across the source file. The same goes for the references
    ("X &x" or "X& x").

  - String in-parameters must be passed as (const std::string &).
    String out-parameters must be passed as (std::string *).

  - String can be returned as:

    - (const std::string &) -- if the operation just returns a string
      member attribute;

    - (std::string) -- if the operation creates a new string, and it should
      be copied to the caller. NOTE: generally speaking, this is bad
      style and should be avoided, since returning objects by value leads
      to extra copying.

  - STL containers (std::vector, std::map, std::set) should be used instead
    of the MySQL-home-grown containers.

  - std::auto_ptr should be used to maintain strict ownership and prevent
    leaking objects/memory.

  - Header include guards must be used.

  - Include only needed header files.
  - For the public interface files: a header file should be included only
    when the definitions provided in that header file are required.
    Otherwise, use forward class declarations.

  - Forward class declarations must be grouped together in the beginning of
    the header file.


      - Do:

        class A;
        class B;

        class C
        {
          const A *a() const;
          const B *b() const;
        };

      - Do NOT:

        class C
        {
          const class A *a() const;
          const class B *b() const;
        };

  - A header file must be self-descriptive. In other words, it should
    include or declare all entities used in this header file. Or: there
    should be no unresolved dependency.

  - Header files should be included in the order that minimizes effect on
    other includes.

    For example:

      - The following order is not optimal, because if '#include <some_header.h>'
        (a header that 'c1.h' itself depends on) is missing in 'c1.h',
        the compiler will not throw an error:

        #include <some_header.h>
        #include "c1.h"

      - The following order is better:

        #include "c1.h"
        #include <some_header.h>

  - Use '#include <...>' for system headers.

  - Use '#include "..."' for MySQL headers.

Naming conventions

The following naming conventions will be used:

  - The MySQL class name convention is used.


  - Private attributes are prefixed with 'm_'.

      int m_id;
      bool m_enabled;
      std::string m_name;

  - Getters are named without the 'get_' or 'is_' prefix.

      int id() const;
      bool enabled() const;
      const std::string &name() const;

  - Setters are prefixed with 'set_'.

      void set_id(int id);
      void set_enabled(bool enabled);
      void set_name(const std::string &name);

  - We prefer non-abbreviated names for attributes (to the reasonable extent).

  - A few guidelines for naming operations:
    - Operation name should be prefixed with 'get_' if the operation
      returns a non-attribute. However, the 'get_' prefix implies that the
      operation is somewhat light -- it does not perform heavy I/O or
      anything like that. The relative cost of the operation is small.

    - Operation name should be prefixed with 'fetch_' or 'retrieve_' if the
      operation returns a non-attribute, but it might take significant
      efforts to "get" that non-attribute.
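Putting the conventions together, a DD-style class might look like this
(the class itself is hypothetical, for illustration only):

```cpp
#include <string>

// Hypothetical class illustrating the naming conventions: 'm_' prefix
// for private attributes, plain getters, 'set_' setters.
class Example_object
{
public:
  Example_object() : m_id(0), m_enabled(false) { }

  // Getters: no 'get_'/'is_' prefix; cheap, no I/O.
  int id() const { return m_id; }
  bool enabled() const { return m_enabled; }
  const std::string &name() const { return m_name; }

  // Setters: 'set_' prefix.
  void set_id(int id) { m_id = id; }
  void set_enabled(bool enabled) { m_enabled = enabled; }
  void set_name(const std::string &name) { m_name = name; }

private:
  // Private attributes: 'm_' prefix, non-abbreviated names.
  int m_id;
  bool m_enabled;
  std::string m_name;
};
```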

Data Dictionary and Character sets

The Data Dictionary API just passes on what is in the
stored-data-dictionary without any change.

The stored-data-dictionary data is usually in utf8mb3.
There are a few exceptions, where raw CREATE-strings are stored
in the original character set (because of character set introducers).

More details about the stored-data-dictionary are
in WL#6379 (Schema definitions for new DD).

Object ID

Every object (table, view, stored program, ...) in the Data Dictionary will
be identified by an Object ID.

The proposal is to have the following traits of Object IDs:

  - Object ID is synthetic, which means other business attributes must not
    be used to form it.

    In other words, Object ID is NOT a name (for example, it is NOT
    {schema-name, table-name} for tables).

    It could be some number, or generic GUID-string, or something else.

    Having Object ID synthetic simplifies handling of objects (especially
    object references) in case of renames (e.g. we don't need to update
    columns/indexes if a table name changes).
  - Object ID is opaque to Data Dictionary.

    That's related to the previous statement -- the Data Dictionary must not
    have any assumption about the format of ObjectID.

  - Object ID is persistent.

  - There is no way to get object type out of Object ID.

  - Object ID is not globally unique.

  - Object ID is a 64-bit integer.

The main purpose of Object ID is to establish relations between objects in
the Data Dictionary.

Object IDs are not hidden by the Data Dictionary. They can be used by the
other parties as well.

For example, InnoDB uses identifiers prior to WL#6380 (tablespace id, table
id, ...). With the new Data Dictionary, InnoDB will start to use Object IDs
from the Data Dictionary.

However, Object ID users must not rely on its structure and must treat it as
opaque data. The Data Dictionary API will also provide operations which
return an object pointer (reference / handler) based on name lookup.

Because of the current mix, it is important to distinguish the two operations:
  - getting an object by ID;
  - looking up an object by some attributes
    (most common: schema name, object name).

Object ID most likely will be the primary key of DD-object-record.
However, that's out of scope of this WL. It will be specified
in WL#6379 (Schema definitions for new DD). 

There is one predefined object id named INVALID_ID, with the value -1.
This constant is used to designate an invalid, non-existing,
or not-yet-assigned object id.
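A minimal sketch of how these decisions might translate into code (the
exact typedef is an assumption; the WL only fixes the 64-bit width and
the -1 value):

```cpp
namespace dd {

// Object ID: a 64-bit integer, opaque to its users (assumed typedef).
typedef unsigned long long Object_id;

// Predefined id designating an invalid, non-existing,
// or not-yet-assigned object: -1 converted to unsigned 64-bit.
const Object_id INVALID_ID = static_cast<Object_id>(-1);

} // namespace dd
```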

Object ID scope

There are the following alternatives for the Object ID scope:

  A. Globally unique Object IDs

    In other words:
      - object ids are unique on the Earth;
      - every DD object on the Earth has a unique id.

    That means, Object ID should be long enough to address every DD-object.
    That can be done, for example, using 128 bit GUID.

    Globally unique Object IDs seem to help in the following cases:

    - transportable tablespaces, to avoid adjustment/translation of ids on
      the import phase.

      However, it seems it is not possible to get rid of such a
      translation anyway, in particular because of ALTER TABLE ... IMPORT
      TABLESPACE.

    - backup, replication and cluster scenarios, to identify objects among
      groups of servers. That's not clear, however; there are more questions
      than answers at the moment. The value of that is also questionable.

    It has been decided that this alternative might introduce
    unnecessary complexity and performance slowdowns, while the potential
    benefits are not quite clear now. Thus, this alternative is rejected.

  B. Server unique Object IDs

    In other words:
      - object ids are unique within one deployment;
      - every DD object in one deployment has a unique id.

    This alternative itself can be broken down into two approaches:

    - B1: server unique Object IDs and a directory to identify the object
      type by the given object id.

      This is the way ORMs implement object identity. This approach makes
      it possible to have "virtual functions" or to hold an object
      reference of unknown type.

    - B2: server unique Object IDs, no way to identify the object type for
      the given object id.

      This approach mostly makes sense as an intermediate step towards
      implementing B1. Basically, there is no sensible way to use the
      fact that IDs are server-unique in this alternative. If one wants to
      access any object by ID, one needs to know its type, which means that
      either the object has a fixed type in a given context (and thus
      server-uniqueness is not necessary for referencing it) or there is a
      way to find out the object type from the ID (this is B1).

  C. Object-type unique Object IDs

    In other words:
      - object ids are unique within each type (within one deployment);
      - every DD object is identified by a pair {object-type, object-id};
      - two different objects of different types might have the same object
        id.
The core MySQL Data Dictionary (in its current state) can be implemented
using (C), because we always know the final object type of the object we're
referencing. There is no need for object references of unknown (any) type.

"Core" Data Dictionary here means the data dictionary we currently have
without future extensions.

InnoDB will have transportable tablespaces which can store any object
information. This might be seen as a need for (B1). However, that would
actually mean that InnoDB would have to use the Data Dictionary as a
translation service to find out what kinds of objects a tablespace stores,
even for interpreting its contents on a very basic level.
It would increase the dependency on the DD. For example, it would complicate
debugging/digging in a tablespace in case of disaster.

It is better for a tablespace to have basic knowledge about the types of
objects it contains, e.g. to have an idea about its structure and be able
to skip some objects when we don't have the DD around (the same reasoning
why Support wants us to have a serialized representation of objects in
the tablespace).
Taking all this into account, it's been decided to go with the (C)
alternative, which is Object-type unique Object IDs.

Object ID implementation

The question is how to maintain object ids. Basically, we need a sequence,
but we don't have sequences in MySQL. The requirements are:

  - we need a way to get the next value;
  - the object id "counter" should be crash-safe.

There are two options here:

  1. Use AUTO_INCREMENT (it works for object-type-unique object ids)

    This alternative has the simplest implementation: every DD table
    (every table storing information about an object type) has an ID column;
    that ID column should be declared as AUTO_INCREMENT and that's it.

      - simple

      - generally, AUTO_INCREMENT might cause trouble as it combines two
        things into one: 1) a place to store data, 2) the way new data is
        generated. It might be a problem when we have to create an object
        with a pre-defined id. That might be the case (now or in the
        future) in upgrade, backup/restore, or replication scenarios.

        However, in order to solve that problem, globally-unique object ids
        should be used, which is not acceptable at the moment.

  2. For server-unique object ids we need a workaround for global sequence.

    It might be something along the following lines:

      - reuse the AUTO_INCREMENT feature -- have an artificial table with a
        single column, whose only purpose is to generate new object ids.

        The problems are:
          - concurrency;
          - in order to get the next object id, we need to insert a new row,
            so the table would grow as we go.

      - invent something clever, like an in-memory counter. It would be
        written to a special place in InnoDB on shutdown and read from there
        on startup. The tricky part is restoring the counter after a crash.
        A possible way to do this is to scan all DD tables and determine the
        max id (or we can play with timestamps).

It's been decided to go with option (1), i.e. to have an AUTO_INCREMENT
column for the object id in every table corresponding to a DD-object-type.

Object type

As it's been outlined in the previous section, every DD-object is
identified by a pair {object-type, object-id}. This section describes what
the object-type is.

While discussing object-type, we should keep in mind that the set of
object types is not fixed -- virtually any plugin can introduce new
object types to the Data Dictionary. The plugins can be written by external
companies, so we don't have any control over them.

Thus, the Data Dictionary will have extendable object types: there are core
object types used by the server, and custom object types introduced by
3rd-party plugins. The infrastructure that allows dynamic type management
is also very much affected by the nature of object types.

There are two possibilities to represent Object type:

  - A string (object type name)

    Object type names are better when the set of object types can be
    extended by 3rd parties. The server can have a reserved prefix (let's
    say "mysql") which can only be used for the core object types.

    Object type name examples:
    - "mysql::dd::Table"
    - "mysql::dd::User"
    - "My_plugin_type::My_cool_type"

  - A number (object type code)

    Object type codes are faster to compare and smaller to store.

    Flexible type management based on object codes alone requires 
    dynamic object-type-code assignment. In other words, object-type code
    for a new object type is assigned by some central component when the
    object type is being registered in the Data Dictionary. That means,
    it's pretty difficult to have stable object-type-codes to be
    persistently stored.

It's been decided to build the dynamic type management on the object type
names. That means, every object type is primarily identified by a string
(object-type-name), which must be unique in the server context. If a plugin
registers an object type with a conflicting name, an error occurs.

However, to gain from the speed provided by numbers, the core object types
will also have corresponding object type codes.

The Data Dictionary will provide a service which allows getting an
object type code by object type name and vice versa.
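Such a name/code mapping service could be sketched as a simple registry
(class name, signatures and conflict handling are assumptions, not the
final API):

```cpp
#include <map>
#include <string>

// Hypothetical registry mapping object type names to codes and back.
// Code 0 is reserved for special needs; plugin/core types are assigned
// codes starting at 101 (the first 100 numbers are reserved).
class Object_type_registry
{
public:
  Object_type_registry() : m_next_code(101) { }

  // Returns the assigned code, or 0 on a conflicting name.
  unsigned int register_type(const std::string &type_name)
  {
    if (m_name_to_code.count(type_name))
      return 0;  // name conflict -- an error for the caller
    unsigned int code = m_next_code++;
    m_name_to_code[type_name] = code;
    m_code_to_name[code] = type_name;
    return code;
  }

  unsigned int code_by_name(const std::string &type_name) const
  {
    std::map<std::string, unsigned int>::const_iterator it =
      m_name_to_code.find(type_name);
    return it == m_name_to_code.end() ? 0 : it->second;
  }

  const std::string &name_by_code(unsigned int code) const
  {
    static const std::string empty;
    std::map<unsigned int, std::string>::const_iterator it =
      m_code_to_name.find(code);
    return it == m_code_to_name.end() ? empty : it->second;
  }

private:
  std::map<std::string, unsigned int> m_name_to_code;
  std::map<unsigned int, std::string> m_code_to_name;
  unsigned int m_next_code;
};
```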

Reserved Object type codes

It's been decided to reserve object type code 0 (zero) for special needs.

It's been also decided to start object type code numbering with 101
(reserve first 100 numbers).

Object references

Data Dictionary objects have relations between each other, and we need to
represent them.

Object references are represented by Object ID in the stored-data-dictionary.

In memory, object references can be represented either by a C-pointer (if the
corresponding object has also been loaded into memory), or again by Object
ID (if the corresponding object has not been loaded into memory yet).

From an API user's view point, these differences are hidden by the API and
the user receives a C-pointer to an object in any case: either an iterator
over a collection of objects is returned, or a name lookup is done.

So, from the implementation point of view, there are two kinds of object
references:
  - strong object reference (C-pointer);
  - weak object reference (Object ID).

Objects tied by strong references must be loaded simultaneously and can
not be flushed from the cache separately. On the contrary, objects related
by Object ID can be loaded later and flushed from the cache independently.
We could have only weak object references, but that would mean an extra
cache lookup. We want to avoid that for performance reasons.

Examples:

  - Table -> Columns -- strong reference
  - Schema -> Table  -- weak reference
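A weak reference could be sketched as a holder that always carries the
Object ID and caches a C-pointer once the target object is loaded (the
names are hypothetical; the real API hides this behind iterators and
lookups):

```cpp
typedef unsigned long long Object_id;  // assumed 64-bit id type

class Table;  // the referenced object type (forward declaration)

// Hypothetical weak reference from Schema to Table: the Object ID is
// always present; the C-pointer is filled in once the Table object has
// been loaded into memory, effectively turning the reference into a
// strong one.
class Table_ref
{
public:
  explicit Table_ref(Object_id id) : m_id(id), m_table(0) { }

  Object_id id() const { return m_id; }
  bool is_loaded() const { return m_table != 0; }

  // Called by the implementation when the target enters the cache.
  void set_object(Table *table) { m_table = table; }

  // Valid only after the target has been loaded.
  Table *object() const { return m_table; }

private:
  Object_id m_id;   // weak part: always available
  Table *m_table;   // strong part: null until the object is loaded
};
```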

Interface and implementation classes

"Interface class" here means a purely abstract class (Java-like interface).

There are the following approaches when designing an API:

  - use classes with public and private parts;

  - use interfaces and put all the implementation into classes behind the
    interfaces;
  - have a mix: use interfaces for main components, use just classes for
    "simple" things.

Having separate interface and implementation classes means:

  - there is a clear separation between declared behavior and the means to
    achieve it;

  - it's possible to change the implementation at some point, or to have two
    versions of the implementation -- although it sounds good, the practical
    value in our case is unclear;

  - there are parallel hierarchies of interfaces and their implementations
    (i.e. more code).

It's proposed to go with interfaces in all cases:

  - although this approach requires writing more code, it also lowers
    the chances that some implementation details will squeeze into the
    contract part (the interface). The latter seems to be more important
    than the former in our organization (pretty large and distributed);

  - we could go with "a mix", but again, it's difficult to clearly define
    the border;

  - when you have an interface and an implementation and have to make a
    hack, you have to hack the interface (add something there). That is
    obvious, so there is less chance it will pass the review.

Class naming convention

All Data Dictionary classes should follow the same naming convention.

The following two options were discussed.

  - use a common prefix:

    - Dict_table, Dict_stored_program, ...
    - ddTable, ddStored_program, ... (against MySQL coding style)

  - use namespace:

    dd::Table, dd::Stored_program, ...

The decision is:

  - Use 'dd' namespace for both interface and implementation classes;

  - The interface classes are named like 'dd::Table';

  - The implementation classes are named like 'dd::Table_impl'
    (i.e. the "_impl" suffix is added);


Inheritance

It's proposed to use inheritance.

Multiple inheritance can be used for interfaces, as that does not have any
risk. For the implementation classes, it's proposed to use single
inheritance only as multiple inheritance could cause problems.

In other words: multiple inheritance can be used for purely abstract
classes (without data members). I.e. one implementation class can inherit
from one other implementation class, and 1-n interfaces.

Multiple inheritance details

Virtual base classes have to be used even for interface classes.

Example (struct instead of class for brevity):

  struct A_interface
  {
    virtual void f() = 0;

    virtual ~A_interface() { }
  };

  struct B_interface : virtual A_interface
  {
    virtual void g() = 0;
  };

  struct A_impl : virtual A_interface
  {
    virtual void f()
    { ... }
  };

  struct B_impl : virtual B_interface, virtual A_impl
  {
    virtual void g()
    { ... }

    // (1)
  };

  A_interface *create_a();

  void f()
  {
    // (2)
    B_interface *b= dynamic_cast<B_interface *> (create_a());
  }

(1) Virtual inheritance is required so that f() does not have to be
re-implemented here. Otherwise, it would have to be:
  virtual void f()
  { A_impl::f(); }
Such "re-routing" code would have to be written for every operation in every
base interface.
(2) In very few cases we need to cast from base interface to a derived one.
Putting off all the discussion about how bad the cast is and how it must be
avoided, the thing is that in rare cases, it's needed. And then, virtual
inheritance means dynamic_cast has to be used (instead of static_cast).

The server is already built with RTTI enabled, so that should not be a
problem.

dynamic_cast is slower than static_cast, but DDL/DD does not seem to be
that performance-critical.

The top-most interface

It's proposed to use a single parent interface (dd::Dictionary_object)
for all data-dictionary entities (Table, View, ...). Data Dictionary
service classes/interfaces (Dictionary, Factory, ...) will not inherit
from this interface.

The interface might look like the following:

  class Dictionary_object
  {
  public:
    virtual Object_id id() const = 0;

    virtual const std::string &type_name() const = 0;
    virtual const Object_name &name() const = 0;

    virtual bool store(...) const = 0;
    virtual bool restore(...) = 0;

    virtual ~Dictionary_object()
    { }
  };

Custom dynamic type information

C++ RTTI is not used in MySQL. We'll need to implement our own
RTTI-replacement for the interfaces inheriting from dd::Dictionary_object.
This can be done by having a method like dd::Dictionary_object::type_name(),
returning a string constant which is specific to every leaf interface.

This mechanism will be also used in the Data Dictionary.

Example of type name strings:
  - "mysql::dd::Table"
  - "mysql::dd::User"
  - "My_plugin_type::My_cool_type"
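A sketch of the RTTI-replacement (the interface is simplified here; the
real dd::Dictionary_object has more members):

```cpp
#include <string>

// Simplified top-most interface carrying the custom type information.
class Dictionary_object
{
public:
  virtual const std::string &type_name() const = 0;
  virtual ~Dictionary_object() { }
};

// A leaf implementation returns its own type-name constant, which
// serves as the RTTI replacement.
class Table_impl : public Dictionary_object
{
public:
  virtual const std::string &type_name() const
  {
    static const std::string s_type_name("mysql::dd::Table");
    return s_type_name;
  }
};
```

Given only a Dictionary_object pointer, the caller can then compare
type_name() against the known type-name strings.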

Entry point

It's proposed to have an interface like dd::Dictionary, which provides all
the needed methods to work with the new API. The server will have a single
instance of that interface, which is used by all connections (shared
between them).
A pointer to that instance is returned by a global function
like get_data_dictionary().

Stateful DD objects

Data Dictionary objects will be stateful. An object state is a set of
object attributes read from the stored-data-dictionary.

In other words, let's say we have object type X and it has the 'name'
attribute. The name value is read when an object instance is
loaded/constructed and then remembered in the X::m_name class attribute. 
X::name() just returns m_name.

The alternative could be to have a stateless object and query
the stored-data-dictionary each time upon request. That option is not good
for performance. Also, error handling becomes complicated
as every get-operation might fail.
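The stateful pattern can be sketched as follows (restore() is stubbed
with a literal here; in the real implementation it would read the
stored-data-dictionary):

```cpp
#include <string>

// Hypothetical stateful DD object of type X: the 'name' attribute is
// read once and cached in m_name; name() serves it from memory.
class X
{
public:
  // Stubbed with a literal; in reality this reads the object's row(s)
  // from the stored-data-dictionary tables.
  bool restore()
  {
    m_name = "t1";  // imagine: fetched from the stored DD
    return false;   // false = success (server error convention)
  }

  // Cheap getter: no I/O and no error path -- just the cached state.
  const std::string &name() const { return m_name; }

private:
  std::string m_name;
};
```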

DDL transactions

DDL transactions are not supported at the moment. However, there are plans
to add such support in the future. So, it is proposed to plan the Data
Dictionary API with transactional DDL in mind.

For a given dictionary object we can have many "regular transactions"
(which only read the object, i.e. transactions containing only DML
statements) and one "DDL transaction" that modifies the object (using DDL
statements). Currently, that is guaranteed by the Metadata Locking (MDL)
subsystem.

"Regular transactions" can share one object instance (of course, that
object instance must be read-only). "DDL transaction" needs a separate
object instance. Ultimately, that means that one Data Dictionary object
(e.g. a table) can be represented by two C++ instances.

Object name

Since object names are what we get from the parser, fast lookup by
name is important. For most object types, the name will currently be
catalog name + schema name + object name. The suggestion is to implement
a class encapsulating the name, referred to as dd::Object_name below.

Here are object names for most common object types:
  - Catalog: just catalog name
  - Schema: catalog name, schema name
  - Table: catalog name, schema name, table name

The idea is that the API user instantiates a new Object-name and then calls
corresponding lookup-method passing that Object-name instance.
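A sketch of such an encapsulated name for tables (the class shape is an
assumption; only the catalog + schema + object composition comes from the
text above):

```cpp
#include <string>

// Hypothetical encapsulated object name for a table:
// catalog name + schema name + object name.
class Object_name
{
public:
  Object_name(const std::string &catalog,
              const std::string &schema,
              const std::string &object)
    : m_catalog(catalog), m_schema(schema), m_object(object)
  { }

  const std::string &catalog() const { return m_catalog; }
  const std::string &schema() const { return m_schema; }
  const std::string &object() const { return m_object; }

  // A single key string, convenient for map-based name lookup.
  std::string key() const
  { return m_catalog + "." + m_schema + "." + m_object; }

private:
  std::string m_catalog;
  std::string m_schema;
  std::string m_object;
};
```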

Dictionary singleton

All the Data Dictionary objects which reside in memory are owned by
the Dictionary (i.e. stored in the Dictionary instance). There is a single
instance of Dictionary per server process.

The Dictionary can be seen as a collection of all objects with one or more
indexes. The primary index is Object ID. We will also need fast lookup on
object name.
The Dictionary is the main entry point for any code that needs to get an
object instance. There are the following main use cases:

  - lookup an existing object by object name.

    The Dictionary tries to find needed object instance in the cache:

    - If it's found, it's returned to the caller.

    - If it is not found, the Dictionary uses a Factory to create a new
      instance, asks the object to load its data from the tables, adds the
      object to cache and returns the object to the caller.

    If two objects are linked, you can also get one object by asking
    the other, e.g. ask a table about its indexes or columns.

  - retrieve an existing object by Object ID

    This will be similar, but for internal use (remember that Object ID
    is not exposed outside the DD API).

  - create an empty object

    The Dictionary just uses a Factory to create a new instance, and returns it
    to the caller.
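
The lookup-by-name path can be sketched as below. This is a simplified
illustration (a single object type, a plain std::map as the cache, and a
stand-in for loading from the DD tables), not the real implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

namespace dd {

struct Table {
  std::string name;
  bool loaded = false;
  void restore_from_tables() { loaded = true; }  // stand-in for real loading
};

class Dictionary {
 public:
  Table *get_table(const std::string &name) {
    auto it = m_cache.find(name);
    if (it != m_cache.end()) return it->second.get();  // cache hit

    // Cache miss: create the instance, ask it to load its data,
    // add it to the cache and return it.
    auto obj = std::make_unique<Table>();
    obj->name = name;
    obj->restore_from_tables();
    Table *raw = obj.get();
    m_cache.emplace(name, std::move(obj));
    return raw;
  }

 private:
  // The Dictionary owns all object instances.
  std::map<std::string, std::unique_ptr<Table>> m_cache;
};

}  // namespace dd
```

Note that repeated lookups of the same name return the same instance, which
is what allows "regular transactions" to share one read-only object.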

The main ideas here are:

  - API users deal with Dictionary to get or create object instances.
    API users do not create object instances directly.

  - Dictionary owns all object instances.

  - Dictionary is a Data Dictionary service class, which is able to handle
    any object type derived from dd::Object.

  - Dictionary uses Factories to create object instances.

  - It is the object's responsibility to store/restore its state to
    the stored-data-dictionary.

  - Dictionary is a central module, which is used by many connections.
    It employs mutexes internally to be MT-safe.

  - It is important that all get-object-by-id, lookup-object-by-name,
    store-object and drop-object requests go through the Dictionary so that
    it can maintain a consistent cache.

  - All the operations that require THD (connection handler), accept it
    (THD *thd) explicitly.

    NOTE: an alternative could be to create a Dictionary-proxy object for
    connection having THD-pointer inside, and then using that proxy object
    without passing THD-pointer explicitly.
    I.e. something like:

      class Dictionary_connection : public Dictionary {
        // implements the Dictionary interface
        THD *m_thd;
      public:
        Dictionary_connection(THD *thd);
      };

      class THD {
      public:
        Dictionary *get_dictionary()
        { return new Dictionary_connection(this); }
      };

      dd::Dictionary *d= thd->get_dictionary();
      dd::Table *t= d->retrieve_table("table_name");

    That alternative has been rejected because:

      1. There are other operations that still require a THD context,
      so it will not be possible to get rid of passing THD pointers
      completely.

      2. It hides dependencies.

For the first few Data Dictionary project increments, the Dictionary will not
have its own caches. At some point, legacy caches will be replaced by
Dictionary caches.

This also means that we will keep existing structures around for the first
increment (e.g. TABLE_SHARE). So we will mainly change how existing caches
are populated. This limits the amount of code changes needed to get the
first increment to a state where it can be pushed to trunk.

It is important to note that Dictionary should be able to deal with any
object type, derived from dd::Dictionary_object. That's important to
handle object types imported by external components like plugins.

Dictionary interface looks like:

class Dictionary
  // Get object by name.
  //  - core types
  dd::Table *get_table(Object_name *name) = 0;
  dd::View *get_view(Object_name *name) = 0;
  //  - generic operation
  dd::Dictionary_object *get_object(const std::string &object_type,
                                    Object_name *name) = 0;

  // Create a new (empty) object.
  //   - core types
  dd::Table *create_table() = 0;
  dd::View *create_view() = 0;
  //   - generic operation
  dd::Dictionary_object *create_object(const std::string &object_type) = 0;

  // Register an object factory (see below).
  bool register_object_factory(const std::string &object_type_name,
                               Object_factory *object_factory);

Data Dictionary Object Factories

In order for the Dictionary to be able to create an object instance by
object type name, every object type should provide a Factory class.

The Factory interface should look like the following:

  class Factory
    dd::Dictionary_object *create_object() = 0;

The Dictionary will then have a map of registered object types:
  { object type name -> object Factory instance }

This way the Dictionary will be able to instantiate an object by type/class
name.
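
Putting the Dictionary interface and the Factory together, the registry can
be sketched as below. The error convention (true on failure) follows the
interface above; the concrete Table/Table_factory pair is purely
illustrative.

```cpp
#include <map>
#include <string>

namespace dd {

struct Dictionary_object {
  virtual ~Dictionary_object() = default;
};

struct Object_factory {
  virtual ~Object_factory() = default;
  virtual Dictionary_object *create_object() = 0;
};

class Dictionary {
 public:
  // Returns true on error (e.g. the type name is already registered).
  bool register_object_factory(const std::string &type,
                               Object_factory *factory) {
    return !m_factories.emplace(type, factory).second;
  }

  // Instantiate an object by type name via the registered factory;
  // returns nullptr for unknown types.
  Dictionary_object *create_object(const std::string &type) {
    auto it = m_factories.find(type);
    return it == m_factories.end() ? nullptr : it->second->create_object();
  }

 private:
  std::map<std::string, Object_factory *> m_factories;
};

// Example concrete type and factory (illustrative only).
struct Table : Dictionary_object {};
struct Table_factory : Object_factory {
  Dictionary_object *create_object() override { return new Table; }
};

}  // namespace dd
```

This is what makes it possible for plugins to register their own object
types without the Dictionary knowing about them at compile time.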

Object life cycle

Life cycle of C++ (in-memory) objects

Object instance construction and destruction should be invisible to the
API user:
  - object instance *can* be constructed when user asks for it
  - object instance *can* be destroyed when user finishes a transaction
  - when a user needs an empty (new) object, the user asks for it
    (instead of explicitly constructing it).

Internally, all Data Dictionary objects are owned by one central module,
which is called Dictionary. The user deals with it to acquire and release
object instances.

At some point, we'll be introducing a notion of "DDL transactions" in the
server. At that point, the API users will start to use Transaction contexts
to acquire and release object instances (instead of Dictionary). However,
that will be just an interface change -- internally Data Dictionary objects
will continue to be owned by the Dictionary.

The Dictionary uses Factories to instantiate an object when needed.

So, the big picture is like this:

          <--> Dictionary  --> Factories
API user           | 1:n
          <--> DDL_transaction

Object life cycle and Object ID

The question here is: when is a new object id assigned to a new object?

In other words, let's assume a DD user gets a new C++ instance of a
DD-object. Does this instance have a valid object id? If not, when does it
get one?

Let's assume, we have the following (typical) use cases:

  * Create a new table

    dd::Dictionary *dict= ...;
    dd::Table *table = dict->create_table(...);

  * Alter existing table

    dd::Dictionary *dict= ...;
    dd::Table *table = dict->lookup_table(...);

  * Drop existing table

    dd::Dictionary *dict= ...;
    dd::Table *table = dict->lookup_table(...);

The following alternatives have been discussed:

  1. Assign a valid object id right from the start
    (in the create_table() method).

    As it has been decided to use AUTO_INCREMENT for object-id generation,
    assigning a valid object id in create_table() means that we have to
    execute an 'INSERT INTO dd_table' statement. That INSERT statement
    will insert default/fake/dummy attribute values (because the actual
    values are not known at this point); its whole point is to get a valid
    object id.

    Later actual attribute values will be set. When the user issues
    Dictionary::store_object(), we should issue UPDATE statement, that
    will update already existing row with the actual data.

    The only good thing about this approach is that we have a valid
    object-id right from the start, so theoretically, an object graph
    (several objects referencing each other) can be created first in memory
    and then just stored persistently. However this possibility is not
    needed for the Data Dictionary.

    As for the drawbacks of this approach, they are:

      - Two SQL-statements to create one DD-object;

      - Inconsistent state between INSERT and UPDATE.

  2. Assign a valid object id when the object is persistently stored.

    In other words, the object id assignment is delayed until
    Dictionary::store_object() is called. store_object() issues an INSERT
    statement, which inserts a new row with the actual data (all needed
    attributes must be set by now) and obtains a new object-id.

It has been decided to use the 2nd approach.
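
The chosen approach can be sketched as follows. The id counter here merely
simulates the AUTO_INCREMENT value the storage engine would generate on
INSERT; all names are illustrative.

```cpp
#include <cassert>

namespace dd {

using Object_id = unsigned long long;
const Object_id INVALID_OBJECT_ID = 0;

class Table {
 public:
  Object_id id() const { return m_id; }
  void set_id(Object_id id) { m_id = id; }

 private:
  Object_id m_id = INVALID_OBJECT_ID;  // not assigned until stored
};

class Dictionary {
 public:
  // Stand-in for 'INSERT INTO dd.tables ...'; the real code would read
  // back the AUTO_INCREMENT value generated by the storage engine.
  void store_object(Table *table) {
    if (table->id() == INVALID_OBJECT_ID)
      table->set_id(++m_last_id);  // simulates AUTO_INCREMENT on INSERT
    // else: the object already exists -- UPDATE the existing row.
  }

 private:
  Object_id m_last_id = 0;
};

}  // namespace dd
```

One INSERT per object, no inconsistent intermediate row, and the id stays
invalid until the object is actually persisted.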

Persistent operations

As already mentioned, storing and loading of object data is the
responsibility of the corresponding object type class. The use
cases in the previous section, however, contain Dictionary::store_object()
/ Dictionary::drop_object() instead of Table::store() / Table::drop().

The idea is as follows: the Dictionary class contains high-level methods
(store_object(), ...), which prepare some context and then call the
corresponding object type methods (store(), ...), passing this context. The
object type methods use the context to deal with the storage.

Persistency support

Classes like dd::Catalog, dd::Schema, dd::Table, etc represent one row
in the corresponding table. There will be another set of classes
representing the whole table. Those classes will contain TABLE_LIST, TABLE,
TABLE_SHARE objects of the corresponding Data Dictionary table.

Loading of DD objects

We need to define when to load Data Dictionary objects from the database
into memory. 

The proposal is to:

  - have a loading strategy, which can be changed independently;

  - load all the information related to objects like tables (e.g. columns,
    indexes) and stored programs when the object is first
    accessed/requested.

    Basically, this approach combines "lazy loading" with "aggressive
    loading":
      - we postpone loading information of a Data Dictionary object until
        it is needed;
      - when some information is needed, we load all of it;
      - it's assumed that we're going to need that information later.

    As mentioned above, strong object references are resolved (objects are
    loaded) at once. Weak object references can be resolved on demand.

Iterating child objects

There are a few "weak" bidirectional relationships between DD objects:

  - Catalog (1) -- (0..N) Schema
  - Catalog (1) -- (0..N) Tablespace
  - Schema (1)  -- (0..N) Table
  - Schema (1)  -- (0..N) View
  - Schema (1)  -- (0..N) Stored routine
  - Schema (1)  -- (0..N) Trigger

The "child" object class in these relations provides a method to get its
parent:

  Ref Schema::catalog();
  Ref Tablespace::catalog();
  Ref Table::schema();

The "parent" object class should also provide a method to get an iterator
over its children -- something like:

  Schema_iterator Catalog::schemas();

The question is how this iterator can be implemented. There are the
following alternatives.

1. Maintain bidirectional relations in memory

The parent class can have an in-memory collection of child objects, which
is filled when the parent object is restored from the database.

  For example:

    class Catalog {
      std::vector<Schema *> m_schemas;
    };

Then, the iterator-operation just returns an iterator over the collection
in memory.

The downside of this alternative is the cost of maintaining such
bidirectional relations in memory, making sure they don't go out of sync
with the relations in the database and resolving conflicts (if any).

2. Query data from the database

The iterator-operation can fetch needed data from the database -- open a
table, prepare an access key and iterate through the matched records.

The downside here is that any not-yet-stored/committed change will not be
visible. So, although this operation is part of the in-memory Dictionary
API, it actually deals with the database.

There are two options to implement this alternative:

  2a. Prefetch all rows at the time the iterator is created, store the
  collection in memory and return an iterator over that in-memory
  collection.

  2b. Open a new transaction at the time the iterator is created, store the
  transaction context in the implementation of the iterator, keep the
  transaction open/active, fetch rows from the database gradually and close
  the transaction when the iterator is closed.

The problem of (2a) is potential memory usage because *all* child records
should be prefetched and all child objects should be created and stored.

The problem of (2b) is that the transaction is open/active while the
iterator is open, which means the DD-user must not deal with the database
while iterating over the objects.

It seems that the 2b-problem is more serious because:

  - it's error prone -- the DD API has no way to prevent DD-users from
    accessing the database before the iterator is closed;

  - due to this limitation, the DD-users will have to do basically the same
    "full-prefetch" manually;

  - we're talking about metadata, so the potential number of rows to
    prefetch should not be that big;

  - the memory usage in the (2a) case can be significantly decreased by
    providing a way to "push down" the user condition -- the idea is that
    the DD-user can supply a functor, which chooses which objects should be
    restored based on the user's criteria.

The decision is to go with the (2a) alternative.
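
A sketch of the (2a) iterator, including the condition push-down functor,
might look as below. The vector of rows stands in for the records fetched
from the DD tables in one go; all names are illustrative.

```cpp
#include <functional>
#include <string>
#include <vector>

namespace dd {

struct Table {
  std::string name;
};

class Table_iterator {
 public:
  // All matching rows are prefetched at construction time, so no
  // transaction has to stay open while the user iterates. The optional
  // filter functor implements the condition push-down, limiting how many
  // child objects are materialized in memory.
  Table_iterator(const std::vector<Table> &rows,
                 const std::function<bool(const Table &)> &filter) {
    for (const Table &t : rows)
      if (!filter || filter(t)) m_tables.push_back(t);
  }

  // Returns the next element, or nullptr when the iteration is done.
  const Table *next() {
    return m_pos < m_tables.size() ? &m_tables[m_pos++] : nullptr;
  }

 private:
  std::vector<Table> m_tables;  // all (filtered) children live in memory
  std::size_t m_pos = 0;
};

}  // namespace dd
```

Because the underlying read finishes before next() is ever called, the
DD-user is free to access the database while holding the iterator, which is
exactly what (2b) could not allow.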

Read-only and read-write objects

As it's been previously mentioned, there are read-only and read-write
objects (for "regular transactions" and "DDL transactions").

How should they be represented?

Option 1: constant and non-constant operations within a single class:

  class Table
    const X *x() const;
    void set_x(const X *x);

Option 2: class for read-write objects is derived from class for
read-only objects:

  class Table_ro
    const X *x() const;

  class Table : public Table_ro
    void set_x(const X *x);

The 1st option requires less code. However, it's easy to cast away
constness and bypass the read-only protection.

The 2nd option implies more code, but is more defensive against abuse.

Tentative decision: 2nd option (different classes for read-only and
read-write objects).
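
A compilable sketch of the 2nd option, with hypothetical member names: the
read-only base class exposes only accessors, and "regular transactions"
would be handed a Table_ro pointer, so the setters are simply not available
to them.

```cpp
#include <string>

namespace dd {

class Table_ro {
 public:
  const std::string &comment() const { return m_comment; }

 protected:
  std::string m_comment;  // state shared with the read-write subclass
};

class Table : public Table_ro {
 public:
  // Mutators exist only on the read-write class, which is handed out
  // exclusively to the (single) DDL transaction.
  void set_comment(const std::string &c) { m_comment = c; }
};

}  // namespace dd
```

Unlike a const_cast on a single class, abusing this scheme requires a
downcast, which is easier to spot in review and can be checked at runtime.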


Existing table-related structures

Currently the server has three different (but related) structures that
represent a table object on different levels:

  - TABLE_SHARE represents a table in the current data dictionary. In other
    words, it contains table metadata. TABLE_SHARE will be encapsulated and
    eventually replaced by the new Data Dictionary object (dd::Table);

  - TABLE_LIST represents a table in the parser and executor.
    This object will not be replaced by the Data Dictionary.

  - TABLE provides access to table data.
    This object will not be replaced by the Data Dictionary.

TABLE and TABLE_LIST will probably be affected by the Data Dictionary,
but will remain in the server codebase. Although these objects don't belong
to the Data Dictionary level, the boundaries are not quite clear.
For example, TABLE_LIST objects do contain Data Dictionary information
(e.g for view definitions). Anyway, refactoring of these objects is beyond
the scope of this WL.

Customizing the Data Dictionary

One of the requirements for the new Data Dictionary is to provide a way for
Storage Engines and Plugins to
  a) store custom data in the Data Dictionary

There are the following use cases:

  1. Plugin needs to store its own data.

  Basically, there should be some persistence service for plugins.
  Plugins may or may not be concerned where and how the data is stored.

  2. Plugin needs to add its own attributes to existing data
  (for example, a plugin may need to add a few attributes for each table).

  In other words, a plugin may need to associate its own data with some
  other existing object.
  3. Plugin needs to add new virtual tables, e.g. add new tables to
  INFORMATION_SCHEMA or PERFORMANCE_SCHEMA.
  4. Plugin needs to add new columns to the existing virtual tables
  (for example, a plugin may need to add a few columns to an existing
  INFORMATION_SCHEMA table).
Store plugin data

The proposal is to use separate tables to store custom plugin data. (This means
both case 1 and 2 above.) In other words, even if a plugin needs to associate
some custom data with the existing records, it should be implemented using a
new table, which has the existing record key and custom data.

The reasons are:
  - it seems to be good to preserve "the core" data dictionary. The server
    should be able to start with any combination of plugins (or without any
    plugin at all), thus the plugins should be prevented from polluting
    "the core" data.

  - if the core data was extended, how would other server components insert
    records there? They don't know anything about new extensions, so they
    can't provide meaningful data.

We also need a way for plugins to create such tables in the storage engine
at server startup (if they are not present already).

Virtual tables

A virtual table is a table which does not have persistent data.
Currently this is only INFORMATION_SCHEMA and PERFORMANCE_SCHEMA tables.

Virtual tables can be provided by the server core. But plugins should be
able to add new virtual tables. For now we will limit this to adding new
virtual tables to INFORMATION_SCHEMA and PERFORMANCE_SCHEMA. If the server 
discovers two virtual tables with the same name, we should abort and return an 
error. (Rather than having the available table depend on plugin load order.)
The server core will have no knowledge about how to populate virtual tables
provided by plugins; the server just provides a framework for dealing with
such tables.
Any plugin wishing to extend a virtual table, should add a new virtual table. 
This new virtual table should include the record key of the existing virtual 
table - similarly to how extensions to normal tables are handled.

This means that the definition of a given virtual table won't depend on
which plugins are loaded or their load order. Either the virtual table
exists or it doesn't. If it exists it always has the same definition for a 
given version of its provider (either the server core or a plugin).
We also won't support changing the definition of virtual tables online.

Since the definition of a given virtual table is fixed, it can be hard-coded
so that the in-memory DD object can be constructed from code (at server
startup or on cache miss).

One option is to not write virtual table definitions to the data dictionary
storage engine. This means that INFORMATION_SCHEMA queries accessing these
definitions cannot simply be sent to the storage engine, since the
virtual table definitions do not exist in the DD storage engine.

However, the current suggestion is to write the virtual table definitions
to disk during server startup. This will allow INFORMATION_SCHEMA
queries accessing them to be processed normally by the DD storage engine.
Since virtual table definitions can change when a) the server is upgraded or
b) a plugin is upgraded, the definitions can be recreated at each server
startup to simplify upgrade/downgrade.

Sequences

Sequences will replace AUTO_INCREMENT. There are also plans to support
sequences on the SQL layer (CREATE SEQUENCE, ...).

From the Data Dictionary API point of view, Sequence is just another object
type.

From the overall Data Dictionary view point, Sequences are a somewhat
special object type, as they are updated very often. So, there should be a
split between Sequence metadata (which is updated rarely) and Sequence data
(which is updated often).

Common API attribute guidelines

DD API for different object types will have the same groups / kinds of
attributes. This section outlines common traits of these groups.

Properties

Properties is a collection of key-value pairs. It is stored as a string in
the DD tables, and it is represented as the dd::Properties interface to the
DD API users.

An attribute of the properties type (e.g. dd.tables.options or
dd.columns.se_private_data) is represented by the following three
operations:
  - An operation returning a read-only reference (const dd::Properties &)
    to get or iterate over the set of key-value pairs in a convenient way.


      virtual const Properties &options() const = 0;
  - An operation returning an interface (read-write) which also allows to
    add/delete/modify key-value pairs in a convenient way.


      virtual Properties &options() = 0;

  - An operation which takes a new raw option string (in an internal
    format) and tries to parse it to build a new set of key-value pairs. If
    the parsing succeeds, the new set replaces the old one and the
    operation returns false. Otherwise, the current set of key-value pairs
    remains untouched and the operation fails (returns true).


      virtual bool set_options_raw(const std::string &options_raw) = 0;
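
The three operations can be sketched together as below. The "key=value;"
raw format is an assumption for illustration, not the real on-disk
encoding; the error convention (true on failure, old pairs preserved)
matches the description above.

```cpp
#include <map>
#include <sstream>
#include <string>

namespace dd {

class Properties {
 public:
  // Read-only access to a value; returns an empty string for missing keys.
  const std::string &value(const std::string &key) const {
    static const std::string empty;
    auto it = m_map.find(key);
    return it == m_map.end() ? empty : it->second;
  }

  // Read-write access: add or modify a key-value pair.
  void set(const std::string &key, const std::string &val) {
    m_map[key] = val;
  }

  // Parse a raw "k1=v1;k2=v2" string. Returns true on error, in which
  // case the current set of key-value pairs remains untouched.
  bool set_raw(const std::string &raw) {
    std::map<std::string, std::string> parsed;
    std::istringstream in(raw);
    std::string pair;
    while (std::getline(in, pair, ';')) {
      if (pair.empty()) continue;
      std::size_t eq = pair.find('=');
      if (eq == std::string::npos) return true;  // parse error
      parsed[pair.substr(0, eq)] = pair.substr(eq + 1);
    }
    m_map.swap(parsed);  // replace the old set only on success
    return false;
  }

 private:
  std::map<std::string, std::string> m_map;
};

}  // namespace dd
```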

Collections

Collections come into play when a "Parent-class" object is tightly coupled
with a group of "Child-class" objects.

One collection can contain items of one type.

Child and Parent must be in a tightly-coupled relationship, meaning that
the Parent owns its Child objects. For example:

  - Table - Columns -- a collection
  - Table - Indexes -- a collection
  - Schema - Tables -- not a collection, since Schema does not own its
    tables.

One child class can participate in only one collection (tightly-coupled)
relationship. For instance, if the Column class is included in Table,
the Column class cannot be included in any other class.

A collection can be either ordered or unordered. Ordered collections
maintain order of their items.

An attribute of the collection type (e.g. Table::columns) is represented by
the following operations.

Parent class operations

  - An operation to add new child element.


      class Table
        virtual Column *add_column() = 0;

  - An operation to return a const iterator over the child elements (const
    iterators return a const reference/pointer to an element, meaning that
    the element can not be modified by that iterator).


      class Table
        virtual Const_iterator *columns() const = 0;

  - An operation to return a non-const iterator over the child elements
    (non-const iterators return a non-const reference/pointer to an
    element, meaning that the element can be modified by that iterator).


      class Table
        virtual Iterator *columns() = 0;

  - For ordered collections only: an operation changing the order of items.

      class Table
        virtual void move_element(int old_index, int new_index) = 0;

    The old_index and new_index must be valid, otherwise the behaviour is
    undefined.

Child class operations

  - An operation to remove the object from the collection.

    This operation marks the child object to be dropped, but it neither
    actually drops it in the database nor deletes the corresponding C++
    object.

    The actual record is deleted the next time the parent object is stored
    to the database.

    The child C++ object is deleted either when the parent object is stored
    to the database, or when the parent object is deleted as C++ object.


      class Column
        virtual void drop() = 0;

  - An attribute returning a reference to the parent object.


      class Column
        const Table &table() const;
        Table &table();

  - For ordered collections only: an attribute returning the current
    ordinal position of the item in the parent's collection.

      class Column
        virtual uint ordinal_position() const = 0;
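
The parent and child operations above can be sketched together in one
compilable fragment. The method signatures are simplified (e.g. add_column()
takes a name, and ordinal_position() lives on the parent), so this is an
illustration of the ownership and drop-marking semantics, not the real API.

```cpp
#include <memory>
#include <string>
#include <vector>

namespace dd {

class Table;  // forward declaration for the back-reference

class Column {
 public:
  Column(Table *table, const std::string &name)
      : m_table(table), m_name(name) {}

  const Table &table() const { return *m_table; }  // parent reference
  const std::string &name() const { return m_name; }

  void drop() { m_dropped = true; }  // mark only; no database access here
  bool is_dropped() const { return m_dropped; }

 private:
  Table *m_table;
  std::string m_name;
  bool m_dropped = false;  // acted upon when the parent is stored
};

class Table {
 public:
  // Add a new child element; the Table owns the Column instance.
  Column *add_column(const std::string &name) {
    m_columns.push_back(std::make_unique<Column>(this, name));
    return m_columns.back().get();
  }

  // Ordinal position of a column within the ordered collection (1-based);
  // returns 0 if the column is not in this collection.
  unsigned int ordinal_position(const Column *col) const {
    for (unsigned int i = 0; i < m_columns.size(); ++i)
      if (m_columns[i].get() == col) return i + 1;
    return 0;
  }

 private:
  std::vector<std::unique_ptr<Column>> m_columns;  // ordered, owned children
};

}  // namespace dd
```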