WL#6987: Refactoring: Remove TABLE object from updatable views
Status: Complete
This worklog consists of three smaller refactoring tasks: - Remove TABLE object from updatable views - Change the resolver code for INSERT statement clauses - Change deletion through multi-table DELETE. The first task is the major task of this worklog. The other tasks are done because they made the first task easier in some specific areas, and because they generally make sense. User Documentation ================== Internal refactoring without user-visible effects. No user documentation required.
Functional requirements: ------------------------ F1: Reduce complexity of data change code by removing TABLE object from updatable views. F2: Reduce complexity of resolver code for INSERT statement clauses F3: Reduce complexity of deletion through multi-table DELETE. F4: Increase performance of multi-table delete for DELETE statements involving more than one table with eq_ref access method. The change is a side-effect of F3 and shows up in test cases, but is hardly noticeable in practical queries. F5: Provide improved lifetime control over TABLE objects for views against information schema table. Such objects are no longer available after they are deleted. This is a side-effect of F1 and manifests as a correct error message when a non-existing table is referenced. Non-Functional requirements: ---------------------------- NF1: The functionality of data change operations shall remain unchanged, except for F4 and F5, which are side-effects of the major functional requirements. NF2: Performance degradations are not accepted.
1. Remove TABLE object from updatable views. -------------------------------------------- The code for INSERT, UPDATE and DELETE has a special handling for views: Normally, a TABLE_LIST object representing a view has the table pointer set to NULL. However, when a view is used in an update statement and is determined to be updatable, then only one underlying base table of the view may be updated, and thus it is possible to patch the pointer of the corresponding base table into the view`s TABLE_LIST. However, this is a hack, and a more consistent way to process such tables is to keep track of both the view and the base table being updated in those functions. This patch removes this hack and performs some moderate refactoring of the update code. 2. Change the resolver code for INSERT statement clauses. --------------------------------------------------------- Currently, we first resolve INSERT values, followed by INSERT columns. With this patch, the work is now done in opposite order, which coincide with syntactical order. It also means that we check for the updatable base table of a view earlier (it is determined by looking at INSERT columns), and we later check that insert values, update columns and update values match this table. (Even though a multi-table view can be specified for an INSERT, we are not allowed to insert into, update nor reference columns from more than one table of the view.) By this two things are achieved: The updatable base table is located as soon as possible, and the resolve order follows the syntax. Update columns and update values are resolved in the same order as before. 3. Change deletion through multi-table DELETE. ---------------------------------------------- Deletion code in e.g. multi_delete::send_data() uses list of tables to update, while JOIN engine uses tables from JOIN_TAB array. In order to get proper join order, original list of tables is updated according to tables in JOIN_TAB array, but it still seems simpler to delete from tables according to order of tables in JOIN_TAB array. Deletion is modified as follows: - delete_table_map is a new table_map that identifies the tables to be deleted from. It is a subset of the overall map of tables associated with a JOIN object. - delete_immediate is another table_map. It is a subset of delete_table_map and identifies the tables that can be deleted from immediately. Essentially, these are the first table in the join order (if deleted from) and any table with an eq_ref relation to this table. - To keep track of transactional vs. non-transactional deletes, we have also added transactional_table_map and non_transactional_table_map, which are also subsets of delete_table_map. Notice that some multi_delete code is processed after the JOIN object has been deleted, hence multi_delete::~multi_delete() still needs to traverse the list of tables to delete from. We also needed to create the tables array to be used together with tempfiles in ::do_deletes().
1. Remove TABLE object from updatable views. -------------------------------------------- Functional changes performed: - check_insert_fields() now identifies base table being updated, returned in insert_table_ref. - mysql_insert() now identifies explicitly the base table being updated. It also calls query_cache_invalidate_single() on the single table being inserted into. This was necessary, as the list of TABLE_LIST objects no longer identify base tables through table pointer when processing a view. - mysql_delete() performs an earlier check for an updatable table. It also keeps an explicit pointer to the updatable base table: delete_table_ref It calls query_cache_invalidate_single since there is only one table. - mysql_multi_delete_prepare() filters out "corresponding_table" more explicitly. - In find_dup_table(), removed description of table pointer "hack", and calling updatable_base_table() to locate updatable table for a view. - invalidate_delete_tables() now wraps calls to query_cache_invalidate_single(). - check_table_access() passes over "correspondent_table" used in multi-table DELETE. - multi_delete_precheck() no longer sets table pointer for view. - mysql_load() explicitly identifies base table to be inserted into. - mysql_test_insert() explicitly identifies base table to be inserted into. - mysql_test_update() explicitly identifies base table to be updated. - mysql_test_delete() explicitly identifies base table to be deleted from and adds early explicit check for update table and multi-table view. - mysql_update() explicitly identifies the base table to be updated, and calls query_cache_invalidate_single() to invalidate a single table. - invalidate_update_tables() wraps invalidation of multiple tables being updated. - check_key_in_view() has gotten an added argument for actual updated table. - TABLE_LIST::set_underlying_merge() has stopped setting view's table pointer. - updatable_base_table() is a new function that returns the updatable base table of an updatable view. This can be used from DELETE and UPDATE code, which requires that the view is not a multi-table view. Note that INSERT uses the list of insert columns to identify an updatable base table, so it does not need to call this function. Notice that there is a small behavioral change here: Earlier, an information schema table could be available after it was deleted, but this is no longer possible. There are a couple of changed test results because of this. 2. Change the resolver code for INSERT statement clauses. --------------------------------------------------------- Functional changes performed: - check_view_single_update() only checks columns to be inserted into. Updated columns are now matched against the table located by this function. - check_valid_table_refs() is a new function which checks insert values, update columns and update values against base table determined by check_insert_fields(). - In mysql_prepare_insert(), the check for a single table insert has been simplified: It is now done solely based on insert columns, and then all remaining columns and values are matched against this table. 3. Change deletion through multi-table DELETE. ---------------------------------------------- Data changes performed for class multi_delete: - table_being_deleted is removed as it is no longer needed. - Added tables, which is an array of pointers to table objects matching tempfiles. - transactional_tables, normal_tables and delete_while_scanning are replaced with other data members. - Added delete_table_map, which is a map over all tables that are being deleted from. - Added delete_immediate, which is a map over all tables to be deleted from immediately (ie not through tempfiles). - Added transactional_table_map, a map over all tables being deleted from that are transactional. - Added non_transactional_table_map, a map over all tables being deleted from that are non-transactional. - Added non_transactional_deleted, which keeps track of whether the statement has deleted any rows from non-transactional tables yet. Functional changes performed: - multi_delete::multi_delete(), creation of tempfiles array is moved to ::initialize_tables(), as the constructor cannot report an OOM error. - multi_delete::initialize_tables() now allocates tempfiles and tables arrays, and calculates the table maps delete_table_map, transactional_table_map, non_transactional_table_map and delete_immediate. - multi_delete::send_data() uses JOIN_TAB array to identify tables to delete from: * Deleting from tables that are part of delete_table_map, * Immediately deleting from tables that are also present in delete_immediate, * Collecting rows from other tables to be deleted from through tempfiles. - multi_delete::abort_result_set() has a simplified check before call to thd->binlog_query(). - multi_delete::do_table_deletes() calculates the map non_transactional_deleted. - check_grant() needed explicit code to keep track of "correspondent_table". (Not sure the solution is a good idea, may reconsider). Notice that there is a behavioral change because of this: The old code could only keep track of one table being deleted from immediately. By extending this to be handled by a table_map, we can now delete immediately from all const tables of a query, as well as the first non-const table. 4. Test changes --------------- mysql-test/r/partition_locking.result - Due to the way we collect information about immediately deleted tables, we can now delete immediately from two tables instead of one. mysql-test/r/ps_ddl.result - A view against an information schema table is no longer available after it is deleted, hence we get an error message here. mysql-test/suite/perfschema/r/multi_table_io.result - Probably caused by the optimization for multi-table delete. (Needs checking). mysql-test/t/ps_ddl.test - A view against an information schema table is no longer available after it is deleted, hence we get an error message here. mysql-test/t/delete.test - Added coverage of error message ER_UPDATE_WITHOUT_KEY_IN_SAFE_MODE. Lack of coverage revealed through DGCOV testing. mysql-test/t/view.test - Added better coverage of error message ER_NON_UPDATABLE_TABLE. Lack of coverage revealed through DGCOV testing.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.