WL#4443: Remove scalability problem with many locks and external locks on partitioned tables with many partitions
Affects: Server-Prototype Only — Status: Complete — Priority: Medium
Since tables are locked before any preparation and optimizing is done, it is not possible to prune partition locks in the optimizing phase. This leads to bad performance and unnecessary work (all partitions are locked for each and every query, if not explicit pruning, WL#5217, is used). Therefore if we delay the locking until after prune_partitions is called we can skip locking of pruned/non used partitions. This can be done by also call prune_partitions in JOIN::prepare() to prune on constants (non constants and subqueries cannot be locked pruned). We must also enforce that nothing in the prepare phase rely on locked tables. One side effect of this worklog will be shorter table locks too, since the prepare phase will be done before locking all tables. This may increase the overall performance. This will fix BUG#37252. Since locking is the major problem for many partitions, this worklog will not try to delay any open calls (since open tables are cached). That will be wl#3513, which may need refactoring of the table cache.
There should be no user visible change by this worklog. There is no syntax changes. To make it possible to prune away locks of unused partitions, locking have to be placed after prune_partitions is done. Currently prune_partitions is called in JOIN::optimize after tables are both opened and locked. This worklog will change this so lock_tables() will be called after prune_partitions() are executed. We should not use open_and_lock_tables(), but instead use open_normal_and_derived_tables(), do preparations and pre-locking optimizations, call prune_partitions, and then call lock_tables() and continue with post-locking optimizations and execution. Affected DML statements: SELECT (including UNION and JOIN) should only lock the same partitions that needs to read. single table UPDATE will only be pruned for locking if no partitioning field is given for update. (Not part of this worklog: One could optimize this if the partitioning fields to be updated is constants only, then one could add those matching partitions to be locked too). multi table UPDATE will only prune locks for tables that does not update any partitioning fields. (for enhancements see single table UPDATE). REPLACE/INSERT ... VALUES should only lock the partitions having rows to be inserted/replaced. This means that we need to iterate over every row to be inserted/replaced and evaluate which partition it belongs to and mark that partition for use/lock. NOTE: If a AUTO_INCREMENT will be generated and it is a part of the partitioning fields then all partitions must be locked (Not yet part of this worklog: Optimization may be possible for range partitioning). INSERT ... VALUES ... ON DUPLICATE KEY will only be pruned for use/lock if no partition field is given for update (for optimization, see single-table UPDATE) INSERT SELECT, for the SELECT clause see SELECT above, for the INSERT table all partitions needs to be locked. (One could optimize this if the SELECT will return a known/constant set of values for the partitioning key, but that is not a part of this worklog). LOAD will not be pruned, since it is mainly used for inserting a large batch of rows. It would need the input file to be read and parsed and each row to evaluate its partitioning function to determine which partitions to use prior locking. In case of large batches the gain (pruning of locks) seems smaller than the cost (reading, parsing and calculate the partition id). (One could optimize this if all partitioning fields are static, and only non partitioning fields are read from the file. Not part of this worklog. Note: special care with REPLACE, see INSERT). There might be other positive performance side effects, since locks on table/partition level is held for a shorter period of time, which may affect concurrent queries. If there are BEFORE INSERT/UPDATE triggers on a table which uses any partitioning column, it cannot be pruned for INSERT/UPDATE, since the trigger can alter its values: If there are BEFORE INSERT trigger, then INSERT/REPLACE cannot be pruned, since the BEFORE INSERT trigger may change the partitioning columns before it is inserted, forcing it to a different partition. If there are BEFORE UPDATE trigger, then INSERT ON DUPLICATE KEY UPDATE cannot be pruned and UPDATE cannot prune locks.
The main files that needs changes is: runtime area: * sql_base.cc - Create open_query_tables() and lock_query_tables() which stores its state in Query_tables_list. move the explicit lock pruning added in WL#5217 from open_tables() to open_table(). * sql_parse.cc - remove open_and_lock_tables() where partitioning lock pruning should be enables and use open_query_tables() instead and making use of the modified handle_select. optimizer area: * sql_select.cc - Split the body of mysql_select() into two parts, mysql_prepare_select() and mysql_execute_select() where locking can be done in between. * sql_union.cc - Add open and lock handling with open_query_tables/lock_query_tables to mysql_union().. * sql_update.cc - same as select but for updates. * sql_insert.cc - same as select but for inserts. * opt_range.cc - prune locks by clearing bits in part_info->lock_partitions bitmap. * item* - fix various problems with the requirement that prepare and fix_fields only should access table metadata, as the tables are only opened, not locked. Currently it is possible that during the prepare phase/fix_fields call, subqueries is executed etc. which needs the tables to be locked. This worklog will not affect Database Administration statements. It will affect the following DDL's: * CREATE VIEW will no longer lock any tables, since it is not needed. Side effect of using open_query_tables. * EXCHANGE PARTITION does now prune locks (i.e. only the exchanged table and the exchanged partition is locked. * ALTER TABLE t TRUNCATE PARTITION does now prune locks. NOTEs: * INSERT DELAYED is not supported by partitioned tables (so it will not need to be enhanced). * LOCK TABLES cannot prune partition locks. * CALL stored_procedure(arg_expr) supports lock pruning, but evaluating the arg_expr does not. * DO does NOT support partitioning lock pruning. * SET does NOT support partitioning lock pruning. NO - it is a DB Admin statement. To investigate: * effects on replication (statement and row)
Copyright (c) 2000, 2016, Oracle Corporation and/or its affiliates. All rights reserved.