WL#5217: Add explicit partition selection

Affects: Server-5.6   —   Status: Complete   —   Priority: Medium

To allow the user to only use a specific sets of partitions, i.e. explicit
pruning, the syntax should be extended with 'PARTITION(one or more
partitions)':

SELECT * FROM t1 PARTITION(p0,p1) WHERE ...;
DELETE FROM t1 PARTITION(p1,p4) WHERE ...;
INSERT INTO t1 PARTITION(p3) VALUES ...;
REPLACE INTO t1 PARTITION(p3) VALUES ...;
UPDATE t1 PARTITION(p1,p8) SET ... WHERE ...;
LOAD ... INTO TABLE t1 PARTITION (p3) ...


This worklog supersedes WL#2682.

This would decrease the impact of BUG#37252, allowing the user to avoid locking
non used partitions.

Requested in BUG#20279 (several customers).

Also asked about in the partitioning forum:
http://forums.mysql.com/read.php?106,387707,387707#msg-387707
Syntax extension (taken from the 5.5 manual):

SELECT syntax (explained in JOIN syntax)
Current syntax:
FROM table_references

table_references:
    table_reference [, table_reference] ...

table_reference:
    table_factor
  | join_table

table_factor:
    tbl_name [[AS] alias] [index_hint_list]
  | table_subquery [AS] alias
  | ( table_references )
  | { OJ table_reference LEFT OUTER JOIN table_reference
        ON conditional_expr }

...

TO:
table_factor:
    tbl_name [PARTITION (partitions)] [[AS] alias] [index_hint_list]
  | ...

partitions:
    partition_name [, partition_name] ... 

partition_name could be either a partition or a subpartition. Duplicate names or
overlapping partitions/subpartitions are accepted, but will still only mark the
partition/subpartition for use once. The order of the elements in the list
does not matter. (All elements in the list will set a bit in a bitmap, resulting
in duplicate and overlapping elements will only set the same bits again).


DELETE syntax
Single-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]

TO:
DELETE ... FROM/USING tbl_name [PARTITION (partitions)]

Multiple-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
    tbl_name[.*] [, tbl_name[.*]] ...
    FROM table_references
    [WHERE where_condition]

Or:

DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
    FROM tbl_name[.*] [, tbl_name[.*]] ...
    USING table_references
    [WHERE where_condition]

Both uses the same table_references as SELECT, so no syntax change specific to
multi-table delete.

INSERT syntax change:
INSERT ... [INTO] tbl_name [PARTITION (partitions)]

LOAD XML/DATA INFILE:
LOAD ... INTO TABLE tbl_name
TO:
LOAD ... INTO TABLE tbl_name [PARTITION (partitions)]

REPLACE (Only different from INSERT if PK/UNIQUE is used, which then will not
move any row to different partition. Note: this may make global indexes harder
to implement.):
REPLACE ... [INTO] tbl_name
TO:
REPLACE ... [INTO] tbl_name [PARTITION (partitions)] 

UPDATE:
UPDATE uses table_reference[s] from SELECT, so no syntax change specific for UPDATE.
Note: if PARTITION is given for a table to be updated, it restricts both the
read AND write to that set of partitions! (i.e if a row is updated to be in a
partition not in the set of given partitions, the command will fail, or not
update such rows (but give a warning) if IGNORE was given.

HANDLER: 
Is not supported for partitioned tables (disabled table flag).

ANALYZE/CHECK/OPTIMIZE/REPAIR/TRUNCATE TABLE already have partition aware
counterparts in ALTER TABLE ... ANALYZE/CHECK/OPTIMIZE/REPAIR/TRUNCATE
PARTITION, so no extension of these commands in this WL.

CACHE INDEX and LOAD INDEX INTO CACHE already have PARTITION in their
syntax, so no extension of these commands in this WL.

LOCK, GRANT, REVOKE, CHECKSUM, SHOW COLUMNS, SHOW CREATE TABLE, SHOW INDEX,
CREATE, ALTER, DROP, RENAME, should all operate on table level only (except
ALTER which is already extended to handle partitions)


Error handling:
If PARTITION () clause is used for a non partitioned table, a new error will be
returned (ER_PARTITION_CLAUSE_ON_NONPARTITIONED).
If there are partition names in the partition list that does not match an
existing partition, a new error will be returned
(ER_PARTITION_CLAUSE_DID_NOT_MATCH).
Duplicate partition names and subpartition names within a given overlapping
partition name is will not be considered as errors (so the list: p0, p1, p0sp1,
p1,p2sp0 will be accepted and handled as the list p0,p1,p2sp0).

In Select there will be no new errors, only limitation of the result set.
For the other DML:s error HA_ERR_NO_PARTITION_FOUND will be given from the
engine, resulting in error ER_NO_PARTITION_FOR_GIVEN_VALUE.


PARTITION will be a new reserved word.
The higher level description:
* Change the parser to allow a PARTITION () clause according to the HLD.
* Make the partitioning names accessible to ha_partition during locking.

LOCK TABLES handling:
LOCK TABLES works by locking all tables when the 'LOCK TABLES ...' command is
executed.
It then only calls handler::start_stmt for each statement instead of
ha_external_lock (and never calls ha_external_lock again for unlocking for each
statement).
It still calls handler::ha_reset() for resetting the handlers.
UNLOCK TABLES calls handler::ha_external_lock(F_UNLCK) to unlock all previously
locked tables.

handler::open() will not be pruned, since ::open() calls will not be called for
each statement due to the table cache. Only when there is no opened handler, for
the table, available in the table cache it will need to create a new handler and
issue handler::ha_open() for that handler. I.e. opened handlers are reused and
not closed until they are evicted from the table cache.


Lower level description:

To handle the partition names per table, a new list of strings must be added to
the TABLE_LIST struct, so that the list of partition names
can be added with add_table_to_list in sql_yacc.yy and included for each table's
TABLE_LIST node.

Split partition_info.used_partitions bitmap into two:
* read_partitions - partitions that satisfy both
                    the explicit PARTITION () clause, and
                    the WHERE clause, handled in prune_partitions().
* lock_partitions - partitions that satisfy the explicit PARTITION () clause
                    and needs to be locked (or call start_stmt() to execute
                    the statement.

Add an internal bitmap in ha_partition to keep track of the locked partitions,
since partition_info->lock_partitions may be different for each statement under
LOCK TABLE, so that it can unlock the partitions it previously have locked.

Add an entry point in open_and_lock_tables() between open_tables() and
lock_tables() to:
* Set all the newly opened table's partition_info->lock_partitions,
  according to the PARTITION () clause (and later in WL#4443 also call
  prune_partitions()).

In ha_partition::store_lock, ha_partition::start_stmt() and
ha_partition::reset(): only forward the call to the partitions marked
in partition_info->lock_partitions.

In ha_partition::external_lock: if not UNLOCK only forward the call to the
partitions marked in partition_info->lock_partitions and mark the locked
partitions in the handler internal bitmap locked_partitions, so that when
a UNLOCK call comes it can unlock previous locked partitions.

Update ha_partition::write_row() to give error if a partition is not in
part_info->lock_partitions. If IGNORE, give warning and skip write.

Update ha_partition::delete_row() to debug assert that only partitions marked in
part_info->read_partitions is used and that it gives error if a partition is not
in part_info->lock_partitions.

Update ha_partition::update_row() to assert that update_row()/delete_row() are
only calles for partitions marked in part_info->read_partition (should never
been scanned)
and error is returned before a partition that is not marked in
part_info->lock_partitions calls write_row() (to prevent updating partitions
with PARTITION () clause).
In case of IGNORE, issue warning instead of error if the partition is not set in
the lock_partitions bitmap (and skip the write).

write_row()/delete_row()/update_row() is only asserting that a row never should
be inserted, deleted or updated in a partition that is not scanned.
(update_row() is a special case when it have to move a row to another partition,
where the new partition must be included in the lock_partitions bitmap).