WL#3996: Add more hints
Affects: Server-5.7
—
Status: Complete
A hint is a substring within an SQL statement (usually a DML statement) which the user intends as a "suggestion" or "for your information" for the DBMS (usually the server). The DBMS may disregard a hint, but by documenting a hint we are telling users that we don't always disregard. A subclass of hints is the "if/endif" comment, which tells the DBMS to parse a clause only if some condition is true. Typical hint classes are "optimizer" and "portability". MySQL has used a variety of things that might be hints: "Index Hint Syntax" e.g. "FORCE INDEX (x)" http://dev.mysql.com/doc/refman/5.0/en/index-hints.html "View algorithm" e.g. "CREATE VIEW ALGORITHM=TEMPTABLE" http://dev.mysql.com/doc/refman/5.0/en/create-view.html "Comments" e.g. "/*! STRAIGHT_JOIN */" or "/*!32302 TEMPORARY */" http://dev.mysql.com/doc/refman/5.0/en/comments.html There was discussion of using hints for subqueries in 5.2, but the optimizer team decided to use a server variable instead. So the overall "hints" job (deciding a policy, changing the parser, deprecating old ways, and implementing) is something that probably won't come up again until after version 6. Decisions --------- I'll just list the things that require a decision someday. 1. Most hints should be /* Comment */ hints. Or some hints should be OPTION (option list) hints. 2. Hints may appear anywhere in any statement, and have effect throughout the statement. Or hints in a particular clause have effect only for that clause. Or hints in a particular subquery have effect only for that subquery. Or hints must be in a fixed place, at statement start/end. 3. Bad syntax in a hint causes the hint to be ignored. Or bad syntax causes an error. Or bad syntax causes a warning. Or there's a session variable that decides whether hints are ignored, or cause an error, or cause a warning. 4. Hints, even /* Comment */ hints, appear in SHOW CREATE and mysqldump. Or they don't. 5. Hints in main statements override hints in views. Or hints in views are ignored totally. 6. All existing hint-like things should be moved to the new style. Or they don't need to be. References ---------- "Using Optimizer Hints" (Oracle) http://download.oracle.com/docs/cd/B28359_01/server.111/b28274/hintsref.htm "Giving optimization hints to DB2" http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.pmtg/p9li375.htm "DB2 SELECTIVITY clause" http://www.ibm.com/developerworks/db2/library/tips/dm-0312yip/index.html "hints for DML queries" (SQL Server) http://blogs.msdn.com/queryoptteam/archive/2006/07/14/666196.aspx "Hints (Transact-SQL)" (SQL Server) http://msdn2.microsoft.com/en-us/library/ms187713.aspx "Using a SELECTIVITY clause to influence the optimizer" (DB2) http://www.ibm.com/developerworks/db2/library/tips/dm-0312yip/index.html "query enable" (PostgreSQL) http://www.postgresql.org/docs/8.1/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-ENABLE BUG#21300 Index hints for update statements BUG#28371 Add SQL 'hints' for fine grained control of engine_condition_pushdown "MySQL Forums :: Partitioning :: Are partition hints needed?" http://forums.mysql.com/read.php?106,154391,154391
Syntax: ------- 1. Comment syntax will be used for new hints. The syntax will be /*+ */ Reasoning: MySQL already supports the /*! */ syntax, but that is interpreted as part of the ordinary SQL syntax. Oracle uses /*+ */ syntax for hints. 2. Multiple hints may be specified in the same comment. E.g., /*+ HINT1(...) HINT2 HINT3(...) */ 3 A statement block can have only one comment containing hints, and that comment must follow the SELECT, UPDATE, INSERT, REPLACE, or DELETE keyword. It will have effect for the statement block where it is placed and other statement blocks nested within. In other words, hints in the outer-most query block of a statement will have effect on all sub-query blocks. A hint in a subquery block will override conflicting hints specified at outer query blocks. Conflicting hints for same query block, e.g. /*+ SEMIJOIN() NO_SEMIJOIN() */, will give a warning. 4. Bad syntax in a hint will cause a warning. 5. Object references (e.g., table names, index names) to objects that are not part of the query will give a warning. 6. Hints in views are not supported. 7. Over time we will deprecate old hint syntax. Optimizer decisions that should be possible to override with hints: ------------------------------------------------------------------- 1. Join order 2. Access method choice 3. Semi-Join/subquery strategy 4. Use of join buffering and BKA 5. Use of MRR 6. Index condition push-down 7. Use of index_merge 8. Strategy for filesort - Priority queue for LIMIT queries - Inclusion of additional fields in sort buffer Items 1 and 2 above is covered by existing hints; STRAIGHT_JOIN and USE/FORCE/IGNORE INDEX, respectively. Items 3 - 7 is currently possible to override by optimizer_switch. There are several drawbacks with with using optimizer_switch: - Requires extra round-trips to server to set/reset session variables. - Can be set only for whole statement, not for specific query block or table. E.g., it may be useful to turn on BKA for a specific index access, but not for whole table, or to turn off a specific semi-join strategy for one query block, but not for another. - Turning off an optimizer_switch flag is like IGNORE INDEX; it restricts available choices. We miss the equivalent of FORCE (with the exception of a few flags which turns off cost optimization) or USE. For Item 8 there is currently no way to override the decision of the optimizer. Setting session variables through hints --------------------------------------- In addition to overriding optimizer decision, it would be useful to be able to override the current setting of session variables that affect query execution (e.g., sort_buffer_size, join_buffer_size). One option is to provide this through hints, another option is to support a specific syntax for this purpose. For example: SELECT ... USING VARIABLES join_buffer_size=64*1024*1024; One problem with using hints for this purpose is that it may not be desirable to execute different query blocks within a statement with different values for session variables. One option could be to only allow setting of session variables in top-level query block. Hints that should be implemented -------------------------------- 1. Semi-Join strategy a. Specify which semi-join strategies can be used Syntax: /*+ SEMIJOIN( [strategy [strategy] ... ] ) */ strategy ::= MATERIALIZATION|FIRST_MATCH|LOOSE_SCAN|DUPS_WEEDOUT If no strategy is specified, all strategies are allowed b. Specify which semi-join strategies should NOT be used Syntax: /*+ NO_SEMIJOIN( [strategy [strategy] ... ] ) */ strategy ::= MATERIALIZATION|FIRST_MATCH|LOOSE_SCAN|DUPS_WEEDOUT If no strategy is specified, semi-join will not be used. TBD: How do we force a particular join order for semi-join? 2. Subquery strategy Specify whether Subquery Materialiation or In-to-exists should be used. Syntax: /*+ SUBQUERY(strategy) */ strategy ::= MATERIALIZATION|IN_TO_EXISTS 3. Join buffering (Block nested loop) a. Specify that join buffering (BNL) should be used for the specified tables. Syntax: /*+ BNL( table [table ...] ) */ BNL will be used for the specified table(s) when appropriate. That is, the table must be an inner table of a join operation and ref access is not used. For other tables, the hint is ignored without any warning. b. Specify that join buffering (BNL) should NOT be used for the specified tables. Syntax: /*+ NO_BNL( table [table ...] ) */ 4. Batched Key Access (BKA) a. Specify that batched key access (BKA) should be used for the specified tables. Syntax: /*+ BKA( table [table ...] ) */ BKA will be used for the specified table(s) when appropriate. That is, the table must be an inner table of a join operation and ref access is used. For other tables, the hint is ignored without any warning. b. Specify that batched key access (BNL) should NOT be used for the specified tables. Syntax: /*+ NO_BKA( table [table ...] ) */ 5. Multi-range-read (MRR) a. Specify that multi-range read (MRR) should be used for the specified index(es). Syntax: /*+ MRR( table [index [index] ...] ) */ If no indexes are specified, the hint will apply to all indexes of the given table. This hint will not impact which access method is chosen for a given table, but given that a specified index is chosen for range (or ref?) access, MRR will be used. b. Specify that multi-range read (MRR) should NOT be used for the specified index(es). Syntax: /*+ NO_MRR( table [index [index] ...] ) */ If no indexes are specified, the hint will apply to all indexes of the given table. 6. Index Condition Push-Down (ICP) a. Specify that index conditioni push-down (ICP) should be used for the specified index(es). Syntax: /*+ ICP( table [index [index] ...] ) */ If no indexes are specified, the hint will apply to all indexes of the given table. This hint will not impact which access method is chosen for a given table, but given that a specified index is chosen for range or ref access, ICP will be used. b. Specify that index condition push-down (ICP) should NOT be used for the specified index(es). Syntax: /*+ NO_ICP( table [index [index] ...] ) */ If no indexes are specified, the hint will apply to all indexes of the given table. 7. Use of Index Merge a. Specify that index merge should be used for the specified table/indexes Syntax: /*+ INDEX_MERGE( table [index [index] ...] ) */ If no indexes are specified the optimizer will choose whatever indexes makes sense to merge for the particular query. Note: There is no specific hints for the different kinds of index merge. It is assumed that it will be obvious for a given query which type of index merge that applies. b. Specify that index merge should NOT be used for the specified table. Syntax: /*+ NO_INDEX_MERGE( table ) */ Note: It is not possible to specify indexes for this hint. This means that a user can not specify that index merge is generally allowed, except for one particular combination of indexes. In such cases, the user will have to use force one particular index merge combination using the INDEX_MERGE hint. 8. Strategies for filesort a. Use priority queue for LIMIT queries Syntax: /*+ ORDER_BY_LIMIT_OPTIMIZATION */ b. Do NOT use priority queue for LIMIT queries Syntax: /*+ NO_ORDER_BY_LIMIT_OPTIMIZATION */ c. Include additional fields in sort buffer Syntax: /*+ ORDER_BY_ADD_FIELDS */ d. Do not include additional fields in sort buffer Syntax: /*+ NO_ORDER_BY_ADD_FIELDS */ 9. Statement timeout Specify statement specific timeout. Syntax: /*+ MAX_EXECUTION_TIME(milliseconds) */ Additional hints that could be implemented ----------------------------------------- 1. Name a query block Syntax: /* QB_NAME( name ) */ This name will replace the automatic name (e.g., select#1) used in EXPLAIN output etc. TBD: Maybe we should allow this hint also in views? 2. Specify that a hint should be applied to a specific query block Syntax: /* HINT(@name ...) */ Specify that a hint is to be applied to a specific (sub)-query block. This could be used for at least two purposes: - Collect all hints at top-level - Specify hints for views The name may either be explicitly assigned by a QB_NAME hint or implicitly assigned by the optimizer (e.g., select#1). 3. Join Order a. Specify join order for (subset of) tables Syntax: /*+ STRAIGHT_JOIN(table ... ) */ MySQL will join the specified tables in the order specified in the argument list. If no table arguments are given, MySQL will join tables in the order in which they appear in the FROM clause. Not all tables of the join needs to be specified. In that case, MySQL will be free to places non-specified tables wherever in the join sequence, it sees fit. b. Specify leading join order for join Syntax: /*+ JOIN_PREFIX(table ...) */ Instructs the optimizer to use the specified set of tables as the prefix in the execution plan. Note: In order to be able to specify the join order when semi-join is involved, including tables of sub-queries should be allowed for the above two hints. Specifying a join order that conflict with dependencies given by outer join will cause a warning. 4. Index usage a. Use one of the specified indexes for range and ref access Syntax: /*+ INDEX(table [index [index] ...] ) */ Specifying no index arguments, will have same effect as specifying all indexes. TBD: Should specifying an INDEX hint prohibit table scan? Or should only INDEX hints with no index arguments do so? b. Use none of the specified indexes for range and ref access Syntax: /*+ NO_INDEX(table [index [index] ...] ) */ Specifying no index arguments, will have same effect as specifying all indexes. That is, table scan than will be used. c. Use one of the specified indexes for ref access in nested loops join. Syntax: /*+ JOIN_INDEX(table [index [index] ...] ) */ Specifying no index arguments, will have same effect as specifying all indexes. d. Do not invoke range optimizer on specified indexes. Purpose is to avoid range optimization when it requires much resources. Syntax: /*+ NO_RANGE_OPTIMIZATION(table [index [index] ...] ) */ Specifying no index arguments, will have same effect as specifying all indexes.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.