WL#4211: Online Backup: Dependency checking

Affects: Server-6.0   —   Status: Complete

Rationale
---------
Develop a mechanism that identifies a dependency order for the creation of 
objects on restore.

Dependency order checking
-------------------------
Ensure that views are stored in dependency order, that is, view A on which 
view B depends should be stored before B.

Analysis
--------

Objects will be restored in the following order:

1. users
2. table spaces
3. databases
4. tables
5. procedures and functions
6. views
7. triggers
8. events 

This ensures that most of the dependencies are satisfied. For example tables are
created before views so when a view is created its base tables should exist.
However, views and tables need special handling since they can depend on each
other and also on objects of the same type. For example a table can depend (via
foreign key constraint) on another table or on a view. A view usually depends on
a table but it can also depend on another view.

Dependencies for tables are handled by turning off foreign key constraint 
checking during restore operation. Checks are activated when all objects are 
restored.

For views we have no other choice but to take into account their mutual 
dependencies. If view v1 depends on v2 then we need to store v2 *before* v1 in 
the meta-data section so that v2 will be created before v1. Let us say that 
views are stored in "dependency order" if the following holds for any two views 
in the sequence:

 If view v1 depends on v2 then v2 is stored before v1.

The order in which objects are restored is determined by the order in which
their meta-data is saved inside backup image. Saved meta-data is divided into
three main sections:

1. global objects: (users, table spaces, databases),
2. tables,
3. per-database objects: (stored routines, views, triggers and events).

Global objects are already stored in the correct order (first users then table
spaces then databases). Therefore it remains to correctly order per-database
objects inside their section.

DELIVERABLE
-----------

The goal of this task is to modify backup kernel code so that meta-data for 
per-database objects is stored in the following order:

1. stored procedures and functions,
2. all views in dependency order,
3. triggers,
4. events.
Dependency list for per-database objects
----------------------------------------

The order in which meta-data of per-database objects is stored inside backup
image is determined by the order in which they are listed by the iterator
obtained from bcat_iterator_get() with type= BSTREAM_IT_PERDB. 

Currently this iterator lists objects in an arbitrary order. To enforce the
dependency order described in HLD, objects inserted into the catalogue will be
arranged into a dependency list. This list will be used by the iterator to list
them in the correct order.

The dependency list is divided into four main sections:

1. stored routines
2. views
3. triggers 
4. events

Normally, objects are appended at the end of each section. However, it is
possible to create an empty slot in the list which will be used if an object
with given name is added to the catalogue later. 

Using slots, the dependency order between views stored in the list can be enforced.

For example, suppose that view v depends on views v1, v2, v3. Before v is added
to the catalogue, first slots for v1, v2 and v3 are created on the list. Then
view v is added after these slots. If later one of v1, v2 or v3 is put on the
dependency list, it will be placed in the corresponding slots, i.e. before view v.

When a new object is inserted into the dependency list, first it is checked if
an empty slot for that object exists. If yes, then this slot is used and the
object is stored at the position of the slot. Otherwise the object is appended
at the end of the section corresponding to its type.

Handling dependencies for views
-------------------------------

Before inserting a view into the catalogue, slots will be created for all the
views on which it depends using the following recursive procedure.

add_view_deps(v)=
  for every base view v' of v do
    add_view_deps(v')
  append slot for v to the dependency list.

This ensures a correct order between slots for all the views on which a given
one depends.

Actually the algorithm is a bit more complicated to ensure correct termination 
even when there are cyclic dependencies. To that end, set A of processed views 
is maintained. If a view is in this set then it is not processed again and 
recursion terminates:

add_view_deps(v)=
  if A contains v then return else add v to A.
  for every base view v' of v do
    add_view_deps(v')
  append slot for v to the dependency list.

Thus even if view v has itself as an (indirect) dependency the algorithm will 
terminate. When processing v for the first time it will be added to A and then, 
when add_view_deps() comes across v for the second time it will return 
immediately.


Note that for all other object types, their dependencies are handled by grouping
them within sections inside the dependency list.
Implementation of the dependency list
-------------------------------------

The dependency list will be implemented as a linked list of nodes inside
Backup_info class. Each node can hold a catalogue item - in that case a pointer
to Dbobj instance is stored in it. Alternatively, it can represent an empty slot
- in this case the stored pointer is NULL. Dbobj_iterator will iterate over the
dependency list ignoring the empty slots.

All nodes of the dependency list are kept inside a hash table 
Backup_info::dep_hash indexed by the name of an object qualified with the name
of the database to which it belongs.

The nodes are allocated using mem_root inherited from Image_info and thus will 
dissapear when Backup_info instance is destroyed.

Maintaining the list
--------------------

When new object is inserted to the catalogue with Image_info::add_db_object() 
an instance of Dbobj class is created for it. Then it is added to the dependency
list as follows.

1. First a Dep_node for that object is created or an existing one
found if it was created before.

2. If an existing node was found, then it is already placed at correct position 
in the list and the pointer to the new catalogue item is stored in it.

3. Otherwise, the new node is added to the correct section of the dependency 
list. However, if the new object is a view, then its dependencies are handled 
first using insert_view_deps() procedure described above.

Two methods will be added to Backup_info class for manipulating the dependency
list:

int get_dep_node(db_name, name, ptr); 

It saves in ptr a pointer to Dep_node for an object with given name. If such 
node was created before a pointer to the existing node is returned. Otherwise a 
new node is created. Whether new or existing node was returned is indicated by 
the return value.

int add_to_dep_list(type, node);

Adds node to the dependency list, appending it to the correct section of the 
list based on the type of object.