WL#7648: Decoupling provisioning from updates to state store
Status: In-Documentation
GOAL
====
The aim of this work is to decouple operations that update the state store
from those that set up servers, such as provisioning them or configuring
replication or any other high-availability solution. This is key to having
an easy-to-use product that can be readily adopted and deployed in different
environments. In other words, this work is a stepping stone towards
supporting different provisioning mechanisms (e.g. MySQL Enterprise Backup,
AWS Support, etc.) and different high-availability solutions (e.g. MySQL
Cluster, DRBD, etc.). It also allows users who want to use Fabric but rely
on their own scripts to continue doing so.

CONTEXT
=======
Fabric provides three main sub-systems: High-availability, Sharding and
Provisioning. Fabric organizes servers into high-availability groups to
provide resilience to failures, and shards are assigned to groups in order
to take advantage of their high-availability features. Currently, only
standard MySQL Replication is supported.

Appropriate interfaces are provided so that groups and shards can be set up
and provisioned by a database administrator. On the other hand, connectors
and, in particular, users' applications don't need to manage groups and
shards but need to fetch information on them to find out which servers are
responsible for a group or shard.

So Fabric must provide interfaces that allow users to:

. fetch information - Retrieve information on high-availability groups and
  shards stored in the state store, etc.

. update information - Add/remove servers into/from a group or
  define/move/split/remove shards, etc. In other words, update the state
  store.

. provision the system - Install new servers; take backups and restore
  them; configure replication among servers, etc.

Usually when information on groups or shards is updated, provisioning steps
are executed as well.
For example, moving a shard from one group to another may require restoring
a backup of the source group to the destination group before updating any
information which maps a shard to a group.

However, in some cases, users may want to execute their own provisioning
steps because:

. Fabric provisioning steps are not a good fit for their environment;

. Their provisioning steps provide better performance;

. Integrating their provisioning steps into Fabric requires time, etc.

PROPOSAL
========
Any command that may execute a provisioning step must provide the
--update_only option so that users may choose whether they want to execute
the provisioning steps or skip them. By default the option is false, meaning
that the provisioning steps are executed. If the option is set to true, the
command only updates the state store.

In the future, any provisioning method shall be available as a command as
well. That way, users will be able to choose the appropriate provisioning
method while setting up their environment.

Currently, Fabric only supports a provisioning method that is deeply rooted
in the shard.move/split routines. Besides, group.add/set_status configures
replication, which can also be considered a provisioning method. Note though
that we will not extract these provisioning methods as commands in the
context of this WL.

REMARKS
=======
. We will not provide support for different provisioning solutions in the
  context of this work.

. We will not provide support for different high-availability solutions in
  the context of this work. See WL#7392 for further details.

. This work depends on WL#7528, which provides the means to use/access
  optional parameters in a command.

. We also took the opportunity to make the following changes in this patch:

  . Removed group.import_topology(...). It required servers to be started
    with the following options: --report-host and --report-port. This was
    not very useful.

  . Renamed group.check_group_availability(...) to group.health(...) and
    removed "is_master" from the returned value. Users can check the same
    information in the "status" value.
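
The --update_only behaviour described in the PROPOSAL section can be
illustrated with a minimal sketch. This is not Fabric's implementation: the
function move_shard, the dictionary used as a state store and the provision
callable are all hypothetical names chosen for illustration. The sketch only
shows the intended control flow: provisioning runs by default and is skipped
when update_only is true, while the state store is updated in both cases.

```python
def move_shard(state_store, shard_id, destination_group,
               update_only=False, provision=None):
    """Hypothetical command: remap a shard to another group.

    state_store is a plain dict (shard_id -> group_id) standing in for the
    real state store; provision is a caller-supplied callable standing in
    for the backup/restore/synchronize steps.
    """
    if not update_only:
        if provision is None:
            raise ValueError(
                "a provisioning callable is required unless update_only=True")
        # Provisioning happens before the shard-to-group mapping changes.
        provision(state_store[shard_id], destination_group)
    # The state store is updated regardless of update_only.
    state_store[shard_id] = destination_group
    return destination_group


# Usage: record which provisioning calls were made.
provisioned = []

def fake_backup_restore(source_group, destination_group):
    provisioned.append((source_group, destination_group))

store = {"shard-1": "group-a"}
# Default: provisioning runs, then the mapping is updated.
move_shard(store, "shard-1", "group-b", provision=fake_backup_restore)
# update_only=True: only the mapping is updated (the user's own scripts are
# assumed to have done the provisioning).
move_shard(store, "shard-1", "group-c", update_only=True)
```

After both calls, the mapping points to the last destination group and only
the first call has triggered the provisioning callable.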
CHANGED COMMANDS
================
We shall add the --update_only parameter to the following commands:

. group.add(server, ..., update_only=False, ...) - This command adds a
  server into a group: updates the state store and configures replication.

. group.promote(group_id, slave_uuid=None, update_only=False, ...) - This
  command demotes the current master, if there is one, and promotes a new
  server to master. Only secondaries are automatically chosen to become
  primary. If users want to promote a spare to master, the --slave_uuid
  parameter must be provided.

. group.demote(group_id, slave_uuid=None, update_only=False, ...) - Demotes
  the current master, if there is one.

. server.set_status(server, ..., update_only=False, ...) - This command
  changes a server's status: updates the state store and configures
  replication.

. sharding.move(shard_id, ..., update_only=False, ...) - This command moves
  a shard from one group to another: updates the state store, takes a backup
  of the source group, restores it to the destination group and synchronizes
  the source and destination groups.

. sharding.split(shard_id, ..., update_only=False, ...) - This command
  splits a shard between two groups: updates the state store, takes a backup
  of the source group, restores it to the destination group and synchronizes
  the source and destination groups.

STATUS TRANSITIONS
==================
A server can have one of the following statuses:

. Primary denotes that a server may accept write transactions and that
  secondaries connect to it to fetch updates.

. Secondary denotes that a server accepts read-only transactions and
  connects to a primary to fetch updates.

. Spare is a secondary that is not automatically elected to become a
  primary if there is a need to do so.

. Faulty is a server that is not behaving as expected or is unreachable.

+-----------+---------+-----------+-------+--------+
| FROM/TO   | PRIMARY | SECONDARY | SPARE | FAULTY |
|-----------|---------|-----------|-------|--------|
| PRIMARY   |         | *         |       | +      |
|-----------|---------|-----------|-------|--------|
| SECONDARY | *       |           | x     | +      |
|-----------|---------|-----------|-------|--------|
| SPARE     | *       | x         |       | +      |
|-----------|---------|-----------|-------|--------|
| FAULTY    |         |           | x     |        |
+-----------+---------+-----------+-------+--------+

The operations that can change a server's status are the following:

. group.promote() and group.demote(), denoted with *.

. threat.report_faulty() and threat.report_error(), denoted with +.

. server.set_status(), denoted with x.
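
The transition table above can also be written down as a lookup structure,
which makes it easy to check whether a given status change is permitted and
by which operations. The following is a minimal sketch, not code taken from
Fabric: the names OPERATIONS, TRANSITIONS and operations_for are
hypothetical, and the command names are copied from the list above.

```python
# Marker symbols as used in the table, mapped to the operations they denote.
OPERATIONS = {
    "*": ("group.promote", "group.demote"),
    "+": ("threat.report_faulty", "threat.report_error"),
    "x": ("server.set_status",),
}

# TRANSITIONS[from_status][to_status] -> marker; absent entries are the
# empty cells of the table, i.e. transitions that no operation performs.
TRANSITIONS = {
    "PRIMARY": {"SECONDARY": "*", "FAULTY": "+"},
    "SECONDARY": {"PRIMARY": "*", "SPARE": "x", "FAULTY": "+"},
    "SPARE": {"PRIMARY": "*", "SECONDARY": "x", "FAULTY": "+"},
    "FAULTY": {"SPARE": "x"},
}


def operations_for(current, target):
    """Return the operations that may move a server from current to target.

    An empty tuple means the table does not allow the transition.
    """
    marker = TRANSITIONS.get(current.upper(), {}).get(target.upper())
    return OPERATIONS.get(marker, ())


# Usage: a faulty server can only be brought back as a spare.
faulty_to_spare = operations_for("faulty", "spare")
faulty_to_primary = operations_for("faulty", "primary")
```

Here faulty_to_spare contains only server.set_status, while
faulty_to_primary is empty because the table has no entry for that
transition.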