WL#8017: Infrastructure for Optimizer Hints

Affects: Server-5.7   —   Status: Complete

In order to implement optimizer hints (WL#3996), some general infrastructure
common to all hints are needed.  Such infrastructure includes:

- A data structure for associating hints with relevant query block during parsing
- Utility functions for resolving names of database objects (e.g. table names, 
  index names) and associate hints with actual database objects (e.g., table 
  object)
- A data structure for optimizer to look up hints on current query


Some key ideas:

1. Comment syntax will be used for new hints.

   The syntax will be /*+ */

2. Multiple hints may be specified in the same comment. E.g., 

   /*+ HINT1(...) HINT2 HINT3(...) */

3. A statement block can have only one comment containing hints, and
   that comment must follow the SELECT, UPDATE, INSERT, REPLACE, or
   DELETE keyword.

4. Hints specified in the query block affect only current query block
   if QB_NAME is not specified. If QB NAME is specified for the hint,
   this hint affects query block with appropriate query block name.
   QB_NAME can not be used in views.  

5. Bad syntax in a hint will cause a warning.

6. The first of conflicting hints will have effect, subsequent
   conflicting/duplicating hints are ignored with warning.

7. Hints in views are not supported. Access to view's objects
   can be done only via system identifiers. System identifiers
   should be visible in EXPLAIN FORMAT=JSON.
   We already have query block identifier('select_id' field).
   Table identifier should be added('table_id' field).It displays
   table number in the query block(TABLE_LIST::m_tableno). Note
   that m_tableno = 0  for the first table.
   Note: Will be implemented in separate WL.

8. Naming convention for system identifiers.
   Query block system identifier could be determined
   using 'SEL#' prefix. Table identifier uses prefix 'TAB#'.

   Examples of notations:

   SEL#1 - first query block.
   TAB#0 - first table in the query block.
   TAB#0@SEL#1 - first table in the first query block.
   Note: Will be implemented in separate WL.

9. All existing hint-like things are merged with new hints. In case of
   duplication or conflict hints in old style are ignored with warning.

   MySQL already supports some hints(USE|FORCE|IGNORE INDEX).
   They should comply with new style hints.

   For example,
   /*+ NO_INDEX(t1 idx1) */ ... FROM t1 USE INDEX (idx1);

   Hints above are conflicting and hints in old style should be
   ignored with warning. There won't be any change of the old
   style hints processing. We just add verification procedure
   of the result of old style hint processing and new style hints.

   For example, for INDEX hints it's TABLE_LIST::process_index_hints()
   procedure. This procedure sets various key maps according to
   old style hints. New procedure which verifies populated key maps
   with new style hints could be added there.
   Note: Will be implemented in separate WL.


10. Hints always have higher priority against system variables.

11. Multilevel hints are supported for TABLE level.

    For instance, BKA/NO_BKA hint:

    SELECT /*+ BKA() */ * FROM t1, t2 ...; //syntax is supported
    SELECT /*+ BKA(t1) BKA(t2) */ * FROM t1, t2 ...; //syntax is supported
    SELECT /*+ BKA(t1, t2) */ * FROM t1, t2 ...; //syntax is supported

Parser for HINTs:

We will use version4 from WL#8016 Parser for optimizer hints.
(see WL#8016 for details).
Functional requirements:

F-1: Hints must be enclosed into /*+ */ comment.
F-2: Hints must be specified after SELECT|INSERT|REPLACE|UPDATE|DELETE key words.
F-3: EXPLAIN must understand hints.
F-4: Active hints must be printed in EXPLAIN warning.
F-5: Subsequent conflicting/duplicating hints are ignored with warning.
F-6: Unresolved hints cause warning.

Non-Functional requirements:

NONE
HINT parser parses each /*+ */ comment following
SELECT|INSERT|REPLACE|UPDATE|DELETE key words and provides
list of the hints declared in the current query block
(PT_hint_list, see WL#8016). EXPLAIN is also supported.
When parsing is complete, ::contextualize() method is called.
This method creates global Opt_hints object(linked to LEX structure).
If the object already exists, this object is reused.

Opt_hints class is hierarchical structure.
It consists of list of 'QUERY BLOCK' objects
(Opt_hints_qb class). Each 'QUERY BLOCK' object
is linked to appropriate SELECT_LEX structure and
contains list of 'TABLE' hints(class Opt_hints_table).
Each 'TABLE' hint contains list of 'KEY' hints
(Opt_hints_key class).


Consider we have /*+ index( t1 idx1, idx2 ) */ hint. 

At ::contextualize() stage:

PT_hint_index::contextualize(Parse_context *pc)
{
   1. Get global hint object(create if not exists).
   2. Get query block object(create if not exists),
      link it with appropriate SELECT_LEX structure.

   3. Get table level object(create if not exists).
   4. Get index level object for each index(create if not exists).
   5. Set appropriate flag.
}


How does it find the appropriate SELECT_LEX?

If query block is not specified, we just use current SELECT_LEX.
If  query block hint is not created yet(SELECT_LEX::opt_hints_qb is null)
then we create query block hint, set SELECT_LEX::opt_hints_qb and use it.
If the hint is already created, we use this query block hint.

If query block is specified, the processing is more tricky.

Consider we have following hint:

/*+ QB_NAME(qb1) INDEX(@qb1 t1 PRIMARY) */

When we process QB_NAME, we find appropriate query block and
update current SELECT_LEX::opt_hints_qb with the pointer to the
query block. At INDEX hint processing stage we try to find 'qb1'
block in the list of query block hints using query block name.
If block is not found, we issue warning. QB name is case insensitive.


At st_select_lex::setup_tables() stage:

{
   1. Find table level hint for the real table(TABLE_LIST).
   2. Merge TABLE_LIST::process_index_hints() with new
      style hints, i.e. update TABLE::keys_in_use_for_query,
      TABLE::keys_in_use_for_order_by,
      TABLE::keys_in_use_for_group_by with new hints.
}



Hints have four levels of applicability:
1. GLOBAL
2. QUERY BLOCK
3. TABLE
4. KEY


LEVELS of the HINTS.

Global level hints:
  hints affect whole query
  EXAMPLE: MAX_EXECUTION_TIME

  Specification:
  HINT([arguments])

Query Block level hints:
  hints affect query block(SELECT_LEX)
  EXAMPLE: atm SEMIJOIN, SUBQUERY

  Specification:

  HINT([@QB_NAME] [arguments])
  If QB NAME is specified for the hint
  then this hint affects query block with
  appropriate query block name, otherwise
  it belongs to current query block.

Table level hints:
  hints control table behavior
  EXAMPLE: BNL, BKA with arguments(tables)
  
  Specification:
  HINT([@QB_NAME] table[, table] ...)
  or 
  HINT(table[@QB_NAME] [, table[@QB_NAME]] ...)

  Both styles are Oracle styles, QB_NAME is used
  to specify appropriate query block. We could have some hints with
  tables from different query block in future. 'table@QB_NAME' syntax
  allows to specify such a tables. 'table@QB_NAME' syntax is more
  flexible. So it's good to have it.

  NOTE: Alias is used as a table name, if alias is not
  specified than table name is used. Schema name is not
  allowed in specification. Table alias can not be used
  to get an access to view's objects. System identifiers
  should be used in views.


Index level hints:
  hints control index behavior
  EXAMPLE: ICP, MRR

  Specification:
  HINT([@QB_NAME] table [index[, index]...])
  or 
  HINT(table[@QB_NAME] [index[, index]...])

  Both styles are Oracle styles, QB_NAME is used
  to specify appropriate query block.

How it should work(for MRR, for example):

  MRR can be on two levels. If no indexes is specified, then
  it's table level and appropriate table hint map bit is set.
  If indexes are specified, then key level object is created for
  each key and appropriate key hint map bit is set.
  When we need to check if index  can be used for MRR, we check
  appropriate key object and parent(table)object.



There are also two possible types of the hints,
SWITCHES and COMPLEX.

TYPES of the HINTS:

SWITCHES:
  These hints work like switches, i.e. they turn
  ON|OFF some features. It does not require a special
  container, several hints can be stored in a single
  bitmap(for example, MRR & NO_MRR hints).

COMPLEX:
  Some hints need a special object since they have
  additional data(for example, MAX_EXECUTION_TIME(milliseconds)).
  Such a object is explicitly declared in the hint class of
  appropriate level.

  
Internal structures for hints:

1. Global hints.

class Opt_hints : public Sql_alloc
{
  Mem_root_array query_blocks;
  Opt_hints_map hints_map;   // Hint map

public:
  // Complex objects
  PT_hint_max_execution_time *hint_max_exec_time;
  ...
};

  
2. Query block hints.

class Opt_hints_qb : public Sql_alloc
{
  Opt_hints *parent;
  const LEX_CSTRING *name;
  Mem_root_array tables;
  Opt_hints_map hints_map;   // Hint map

public:
  // Complex objects
  ...

};

3. Table level hints.

class Opt_hints_table : public Sql_alloc
{
  Opt_hints_qb *parent;
  const LEX_CSTRING *name;
  Mem_root_array keys;
  Opt_hints_map hints_map;   // Hint map

public:
  // Complex objects
  ...
};

  
4. Index level hints.

class Opt_hints_key : public Sql_alloc
{
  Opt_hint_table *parent;
  const LEX_CSTRING *name;
  Opt_hints_map hints_map;   // Hint map

public:
  // Complex objects
  ...

};


Processing of the HINT objects:

1. At ::contextualize() stage new Opt_hints is
   created with appropriate child objects. At the
   same time conflicting and duplicated hints are
   processed. If two conflicting|duplicated hints
   are found, the first specified processed hint
   is marked as specified and all subsequent objects
   of the same type are just ignored with warning.

2. At compile() stage 'switches' & 'switches_specified'
   maps are populated based on the hint list. This stage
   happens in resolve phase. 'switches_specified' is ON
   if the hint was specified and 'switches' value is ON|OFF
   depending on the hint value.
   Along with the populating of the maps each hint object is
   linked to appropriate optimizer structure.
   Opt_hints_table is linked to TABLE_LIST structure,
   Opt_hints_key to keyinfo_array alement.
   Old style hints are also merged with new hints.

3. Obtaining of the hint value. Hint valued can be
   obtained via auxiliary functions(see hint_key_state,
   hint_table_state for example)  

4. Printing hints.
   ::print() method is used for printing the hints. It is necessary
   to print hint info in explain warning. Printing is also important
   for views. If hints are used in the view we should print this info
   when view is created. 
/**
  Opt_hints_map contains information
  about hint state(specified or not, hint value).
*/

class Opt_hints_map : public Sql_alloc
{
  Bitmap<64> hints;           // hint state
  Bitmap<64> hints_specified; // true if hint is specified
public:

  /**
     Check if hint is specified.

     @param type_arg   hint type
     
     @return true if hint is specified,
             false otherwise
  */
  my_bool is_specified(opt_hints_enum type_arg) const
  {
    return hints_specified.is_set(type_arg);
  }
  /**
     Set switch value and set hint into specified state.

     @param type_arg           hint type
     @param switch_state_arg   switch value
  */
  void set_switch(opt_hints_enum type_arg,
                  bool switch_state_arg)
  {
    if (switch_state_arg)
      hints.set_bit(type_arg);
    else
      hints.clear_bit(type_arg);
    hints_specified.set_bit(type_arg);
  }
  /**
     Get switch value.

     @param type_arg    hint type

     @return switch value.
  */
  bool switch_on(opt_hints_enum type_arg) const
  {
    return hints.is_set(type_arg);
  }
};


/**
  Opt_hints class is used as ancestor for Opt_hints_global,
  Opt_hints_qb, Opt_hints_table, Opt_hints_key classes.

  Opt_hints_global class is hierarchical structure.
  It contains information about global hints and also
  conains array of QUERY BLOCK level objects (Opt_hints_qb class).
  Each QUERY BLOCK level object contains array of TABLE level hints
  (class Opt_hints_table). Each TABLE level hint contains array of
  KEY lelev hints (Opt_hints_key class).
  Hint information(specified, on|off state) is stored in hints_map object.
*/

class Opt_hints : public Sql_alloc
{
  /*
    Name of object referred by the hint.
    This name is empty for global level,
    query block name for query block level,
    table name for table level and key name
    for key level.
  */
  const LEX_CSTRING *name;
  /*
    Parent object. There is no parent for global level,
    for query block level parent is Opt_hints_global object,
    for table level parent is Opt_hints_qb object,
    for key level parent is Opt_hints_key object.
  */
  Opt_hints *parent;

  Opt_hints_map hints_map;   // Hint map

  /* Array of child objects. i.e. array of the lower level objects */
  Mem_root_array child_array;
  /* true if hint is connected to the real object */
  bool resolved;
  /* Number of resolved children */
  uint resolved_children;

public:

  Opt_hints(const LEX_CSTRING *name_arg,
            Opt_hints *parent_arg,
            MEM_ROOT *mem_root_arg)
    : name(name_arg), parent(parent_arg), child_array(mem_root_arg),
      resolved(false), resolved_children(0)
  { }

  bool is_specified(opt_hints_enum type_arg) const
  {
    return hints_map.is_specified(type_arg);
  }

  /**
    Function sets switch hint state.

    @param switch_state_arg  switch hint state
    @param type_arg          hint type
    @param check_parent      true if hint can be on parent level

    @return  true if hint is already specified,
             false otherwise
  */
  bool set_switch(bool switch_state_arg,
                  opt_hints_enum type_arg,
                  bool check_parent)
  {
    if (is_specified(type_arg) ||
        (check_parent && parent->is_specified(type_arg)))
      return true;

    hints_map.set_switch(type_arg, switch_state_arg);
    return false;
  }

  /**
    Function returns switch hint state.

    @param type_arg          hint type

    @return  hint value if hint is specified,
            false otherwise
  */
  bool get_switch(opt_hints_enum type_arg) const;

  virtual const LEX_CSTRING *get_name() const { return name; }
  void set_name(const LEX_CSTRING *name_arg) { name= name_arg; }
  Opt_hints *get_parent() const { return parent; }
  void set_resolved() { resolved= true; }
  bool is_resolved() const { return resolved; }
  void incr_resolved_children() { resolved_children++; }
  Mem_root_array *child_array_ptr() { return &child_array; }

  bool is_all_resolved() const
  {
    return child_array.size() == resolved_children;
  }

  void register_child(Opt_hints* hint_arg)
  {
    child_array.push_back(hint_arg);
  }

  /**
    Returns pointer to complex hint for a given type

    @param type  hint type

    @return  pointer to complex hint for a given type.
  */
  virtual PT_hint *get_complex_hints(uint type)
  {
    DBUG_ASSERT(0);
    return NULL; /* error C4716: must return a value */
  };

  /**
    Find hint among lower-level hint objects.

    @param name_arg        hint name
  
    @return  hint if found,
             NULL otherwise
  */
  Opt_hints *find_by_name(const LEX_CSTRING *name_arg) const;
  /**
    Print all hints except of QB_NAME hint.

    @param thd             Pointer to THD object
    @param str             Pointer to String object
  */
  void print(THD *thd, String *str);
  /**
    Check if there are any unresolved hint objects and
    print warnings for them.

    @param thd             Pointer to THD object
  */
  void check_unresolved(THD *thd);
  virtual void append_name(THD *thd, String *str)= 0;

private:
  /**
    Append hint type.

    @param str             Pointer to String object
    @param type            Hint type
  */
  void append_hint_type(String *str, opt_hints_enum type);
  /**
    Print warning for unresolved hint name.

    @param thd             Pointer to THD object
  */
  void print_warn_unresolved(THD *thd);
};


/**
  Global level hints.
*/

class Opt_hints_global : public Opt_hints
{

public:
  PT_hint_max_execution_time *max_exec_time;
 
  Opt_hints_global(MEM_ROOT *mem_root_arg)
    : Opt_hints(NULL, NULL, mem_root_arg)
  {
    max_exec_time= NULL;
  }

  virtual void append_name(THD *thd, String *str) {}
  virtual PT_hint *get_complex_hints(uint type);
};


/**
  Query block level hints.
*/

class Opt_hints_qb : public Opt_hints
{
  uint select_number;     // SELECT_LEX number
  LEX_CSTRING sys_name;   // System QB name
  char buff[32];          // Buffer to hold sys name

public:

  Opt_hints_qb(Opt_hints *opt_hints_arg,
               MEM_ROOT *mem_root_arg,
               uint select_number_arg);

  const LEX_CSTRING *get_print_name()
  {
    const LEX_CSTRING *str= Opt_hints::get_name();
    return str ? str : &sys_name;
  }

  /**
    Append query block hint.

    @param thd   pointer to THD object
    @param str   pointer to String object
  */
  void append_qb_hint(THD *thd, String *str)
  {
    if (get_name())
    {
      str->append(STRING_WITH_LEN("QB_NAME("));
      append_identifier(thd, str, get_name()->str, get_name()->length);
      str->append(STRING_WITH_LEN(") "));
    }
  }
  /**
    Append query block name.

    @param thd   pointer to THD object
    @param str   pointer to String object
  */
  virtual void append_name(THD *thd, String *str)
  {
    str->append(STRING_WITH_LEN("@"));
    append_identifier(thd, str, get_print_name()->str, get_print_name()->length);
  }

  /**
    Function finds Opt_hints_table object corresponding to
    table alias in the query block and attaches corresponding
    key hint objects to appropriate KEY structures.

    @param table      Pointer to TABLE object
    @param alias      Table alias

    @return  pointer Opt_hints_table object if this object is found,
             NULL otherwise.
  */
  Opt_hints_table *adjust_table_hints(TABLE *table, const char *alias);
};


/**
  Table level hints.
*/

class Opt_hints_table : public Opt_hints
{
public:
  Mem_root_array keyinfo_array;

  Opt_hints_table(const LEX_CSTRING *table_name_arg,
                  Opt_hints_qb *qb_hints_arg,
                  MEM_ROOT *mem_root_arg)
    : Opt_hints(table_name_arg, qb_hints_arg, mem_root_arg),
      keyinfo_array(mem_root_arg)
  { }

  /**
    Append table name. 

    @param thd   pointer to THD object
    @param str   pointer to String object
  */
  virtual void append_name(THD *thd, String *str)
  {
    append_identifier(thd, str, get_name()->str, get_name()->length);
    get_parent()->append_name(thd, str);
  }
  /**
    Function sets correlation between key hint objects and
    appropriate KEY structures.

    @param table      Pointer to TABLE object
  */
  void adjust_key_hints(TABLE *table);
};


/**
  Key level hints.
*/

class Opt_hints_key : public Opt_hints
{
public:

  Opt_hints_key(const LEX_CSTRING *key_name_arg,
                Opt_hints_table *table_hints_arg,
                MEM_ROOT *mem_root_arg)
    : Opt_hints(key_name_arg, table_hints_arg, mem_root_arg)
  { }

  /**
    Append key name.

    @param thd   pointer to THD object
    @param str   pointer to String object
  */
  virtual void append_name(THD *thd, String *str)
  {
    get_parent()->append_name(thd, str);
    str->append(' ');
    append_identifier(thd, str, get_name()->str, get_name()->length);
  }
};


/**
  Returns key hint value if hint is specified,
  returns optimizer switch value if hint is not
  specified.

  @param thd               Pointer to THD object
  @param tab               Pointer to TABLE object
  @param keyno             Key number
  @param type_arg          Hint type
  @param optimizer_switch  Optimizer switch flag

  @return key hint value if hint is specified,
          otherwise optimizer switch value.
*/
bool hint_key_state(const THD *thd, const TABLE *table,
                    uint keyno, opt_hints_enum type_arg,
                    uint optimizer_switch);

/**
  Returns table hint value if hint is specified,
  returns optimizer switch value if hint is not
  specified.

  @param thd                Pointer to THD object
  @param tab                Pointer to TABLE object
  @param type_arg           Hint type
  @param optimizer_switch   Optimizer switch flag

  @return table hint value if hint is specified,
          otherwise optimizer switch value.
*/
bool hint_table_state(const THD *thd, const TABLE *table,
                      opt_hints_enum type_arg,
                      uint optimizer_switch);