WL#7057: Make --replication-* filter settings dynamic
Affects: Server-5.7 — Status: Complete — Priority: Medium
Make the slave options --replicate-* dynamic variables so that these options can be changed dynamically while the server is running, which enables users to modify replication filtering rules without having to stop and restart the server. In the current replication version of MySQL, We can tell the slave SQL thread to restrict/avoid few statements by setting “Replication Filter rules”. replicate_do_db replicate_ignore_db replicate_do_table replicate_ignore_table replicate_wild_do_table replicate_wild_ignore_table replicate_rewrite_db Currently, either through command line parameters or by setting in my.cnf file we can set these variables. In either case Mysqld must be restarted if we want to change replicate filtering rules. But it is not easy to restart a MySQL server in real time scenarios. It would be quite helpful in many situations if there was some way to dynamically configure these filtering rules. This would be particularly helpful in environments where slaves are configured dynamically to replication certain databases and/or tables based on load and usage. Related Bug: bug#15877941 / BUG#67362 Make Replication filter settings dynamic
Slave SQL Thread is the only thread which uses these replication filter rules. To Make the replication filter rules dynamic, there are three areas which we have to consider in the High level design 1) Should we introduce new system variables or through a new command 2) Are these changes to rules should be Temporary or Permanent? 3) Should allow users to set these rules while SQL Thread is active? Area:1: New command Vs System Variables: ---------------------------------------- Option:1: Introduce system variables and user should be able set these system variables when server is up and running Default Value for System Variables: The default value of these system variables can be taken from initial values specified at the time of server startup ( command line params or my.cnf file). If it is specified in both the places, command line params will be considered over the values in my.cnf file. If nothing is specified in both places, then the default value will be NULL for these system variables. Pros: Set command is very commonly used one. Cons: 1) Through set command, users needs to give comma separated rules. This will create problems since commas are also allowed in meta data names. 2) Users needs to specify the meta data names names in non-standard way i.e., to give d`1 => Standard way is `d``1` , but with this implementation the should give it d`1 way. 3) Parser gives a comma separted rules and Server should write a custom parse to separate the rules which might be error prone. It would be better if parser does the separation. Option:2: Introduce a new command 'CHANGE REPLICATION FILTER' command to set these rules when server is up and runnning Pros: Same as Cons of the above approach 1) Users can give standard way of specifying metadata names. Hence no problem for specifying special characters like ",". 2) Parser can separate out rules and no customer parsing is required. Cons: Introduction of new command might be confusing to customers. Area:2 : Should SQL Thread is active or down? --------------------------------------------- Option1: we can restrict setting of these rule when slave SQL thread is not running. Stop Slave SQL thread Set any of these filter rules using “SET” command. Start Slave SQL thread to make those new rules be affective Pros: Implementation will be easy Cons: Should make slave SQL thread down Option2: Allow setting of these filter rules when the slave SQL thread is running, but they will be affective only if SQL thread is restarted. Set any of these filter rules using “SET” command. ( when SQL thread is running/stopped) Generate Warning “New filter rule will be affective only after SQL thread restarted” Pros: Rules can be set even while SQL thread is running. cons: Implemention will be little more complex than the option 1 (with Multiple slave SQL threads (appliers) running ). Is it really required to go with this option? Even in this option also, SQL thread needs to be restarted to make these variables effective. Area3: TEMPORARY or PERMANENT changes ------------------------------ What should happen after slave mysqld shutdown and new start-up: Option-1: Permanent change: should slave remember all dynamically added/deleted filters? Implementation: When we are setting the variables, these variables can be stored in system table and at the time restarting the server, we can load the values from this system table. At the new start-up, the preference will be given to command-line parameters. If not mentioned, the values from the system tables will be read. If it is NULL, then values from my.cnf will be read. If it is not specified in my.cnf also, these system variables will be NULLs. Pros: User does not have to set the variables again and again. Cons: What if User wants to make these settings temporary only? They have to specify in the command lines each time they do start up? Or should we give some other system variable which will be decided whether these should be permanent or temporary? TODO: Currently no framework in the system i.e., none of system variables changes are persistent. Need to do analysis little more if there is any specific reasons behind that. Option-2: Temporary changes: Should it forget dynamically added and reload the current filters from my.cnf? Cons: an administrator has to modify filters both dynamically in the current server and at the same time in my.cnf for the future start ups. This is quite error prone and may cause data loss in case of mistakes (e.g. correct filter is added dynamically, but my.cnf is updated with a mistake). After review was done, it has been decided that, we will go with the following options: 1) Introduce CHANGE REPLICATION FILTER command Add parse rules to support: CHANGE REPLICATION FILTER REPLICATE_DO_DB=(comma_separated_db_names_list); CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB=(comma_separated_db_names_list); CHANGE REPLICATION FILTER REPLICATE_DO_TABLE=(comma_separated_table_names_list); CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE = (comma_separated_table_names_list); CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE= (comma_separated_wild_pattern_list); CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE = (comma_separated_wild_pattern_list); CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE = (comma_separated_wild_pattern_list); CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB =(comma_separated_db_pair_list); and each db_pair will be in the form of (db1, db2); Following are the few behavioral points about the command: a) Users should be able to set multiple filter rules in one command like CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1), REPLICATE_IGNORE_TABLE= (db1.t1), ..; b) To reset the filter value, they have to use void brackets "()" syntax, i.e, empty list will clear the existing values and set it empty value For example: CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE = (); c) The non-empty list filter rules will clear the existing values and set the value to new list. CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE = (db1.t1); will reset all existing values and set it the rule to db1.t1 value. d) Unspecified filter rules will be unchanged. For example: CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE = (db1.t1); will change REPLICATION_IGNORE_TABLE rule only and rest of the filter rules will not be changed and existing values will be continued. e) If some rule is specified multiple times, the latter list will be considered and the earlier list will be ignored. For example: CHANGE REPLICATION FILTER REPLICATION_DO_DB=(db1,db2), REPLICATE_DO_DB=(db3,db4); db1 and db2 list will be ignored and REPLICATE_DO_DB list will be updated with db3 and db4 values. f) In case of OUT OF MEMORY error, the command might get executed partially i.e., few of the filter rules might get executed and few of them might not get executed. User needs to verify them manually to see which ones are executed. 2)Restrict setting of these system variables when slave SQL thread is not running. 3) The changes to these system variables are not persistent. Now after this feature is implemented, the filter rules can be set using 1) command line parameters at the time of server startup (--replicate-* ) 2) Set it in my.cnf file before server startups 3) Set it dynamically when the server is running through a CHANGE REPLICATION FILTER COMMAND.
Add Parse rule: CHANGE REPLICATION FILTER opt_filter_def_list; opt_filter_def_list can be either NULL, () to reset the values or list of filter_defs. filter_def_list: filter_def_list, filter_def. A new sub class will be inherited from Sql_cmd class to implement the new command(Sql_cmd_change_repl_filter). This new class also contains 7 List<Item> * members one for each replication filter rule and parser will prepare the List and depends on what type of replication filter is being set , the corresponding member will be set. Executor will have 7 set functions to add these list of rules to corresponding data structures. In the current code, initial value of replicate_[do | ignore]_table is maintained in a hash data structure, initial value of replicate_[do | ignore]_db is maintained in a string_list data structure, initial value of replicate_wild_[do | ignore]_table is maintained in a dynamic array. initial value of replicate_rewrite_db is maintained in a string_pair_list change_master(): if (slave is running) error( slave must be stopped before setting the variable) And if slave is not running, we will parse the input list of rules and add those values into appropriate data structures by clearing the existing data. When we are setting these variables, we have to take active_mi's slave run_lock lock just to make sure that no parallel start slave sql_thread happens. Get() functions for all these 7 variables are already present in the existing code. The only way users can view the values of filter rules through "show slave status" and show slave status acquires active_mi. Since new change_master ( set() functions) is protected through active_mi and Get() is only through show slave status command which is also protected through active_mi lock, there will not be any issues in parallel set and get commands.
Copyright (c) 2000, 2018, Oracle Corporation and/or its affiliates. All rights reserved.