WL#2860: INFORMATION_SCHEMA table for SHOW SLAVE HOSTS
Affects: Server-9.x
—
Status: On-Hold
SUMMARY ------- Master should store information about slaves which have been connected in a table instead of memory. REQUIREMENTS ------------ - (Elliot) The merlin folks definitely want to see slaves which are not currently connected. Since Merlin monitors health, it is valuable for it to connect to a master, and then draw some picture with a big alert showing the slaves that were not able to be contacted DECISIONS REGARDING THIS WL --------------------------- - Elliot 9 Nov 2005: In the September 2005 Chicago dev-mtx planning meeting, during a Merlin discussion, it was decided that we should change SHOW SLAVE HOSTS to store information about slaves which have been connected in a table instead of memory. This should be done in 5.1, and is needed for Merlin advisors. - Lars 21 Feb 2006: According to Scrum meeting this should be in 5.2. OPEN ISSUES ----------- - How to purge table? - What format of table? - How to update table? DRAFT (TO BE DISCUSSED) ---------------------- 1. What should be done MySQL is extended with a new INFORMATION_SCHEMA.SLAVEHOSTS table. The table has the same columns as displayed by SHOW SLAVE HOSTS statement plus an additional column CONNECTED which defines whether the connection to the slave is OK. Column name Description ----------- ----------- MASTER_ID integer SERVER_ID integer HOST varchar PORT integer USER varchar PASSWORD varchar RPL_RECOVERY_RANK integer CONNECTED bool REMARK. SLAVEHOSTS and SHOW SLAVE HOSTS display slaves which are "registered" within the master. Currently, registration of a slave occurs each time the slave connects to the master. A slave can become unregistered only on master restart. Because of that SLAVE HOSTS generally does not display the overall "replication topology" (as defined by which master and slaves are installed and how they are configured). E.g., - Suppose the following occurs: slave Si shutdowns, master M shutdowns, master M restarts. In this case, Si will not be displayed by SLAVE HOSTS until Si restart. - Suppose Si is uninstalled. In this case Si will stay visible by SLAVE HOSTS until master M shutdown. Note also, that it is not quite obvious how to add a direct slave unregistering (consider the case when slave wants to unregister itself but cannot connect to master). REMARK When connections are to be checked? One alternative is to check connections during SLAVEHOSTS table filling (like other INFORMATION_SCHEMA tables, SLAVEHOSTS is filled when a reference to this table is encountered in a query). But this can make the table filling too slow (checking is based on waiting for response from slave). Other alternative is to check periodically the connections to all registered slaves in a separate thread and to use the result of the last check in the SLAVEHOST. Yet one more alternative is to use the result of each attempt of the master to connect to a slave (e.g. when sending an event). 2. Merlin's case The overall "replication topology" is to be described by an ordinary table, say 'repology', created within master server. It defines which slaves are configured to this master. Something like this: CREATE TABLE repology (name VARCHAR, server_id INT); The table is managed by an administrator who adds/deletes entries according to changes in the topology): INSERT INTO repology ('slave01', 101); INSERT INTO repology ('slave02', 102); INSERT INTO repology ('slave03', 103); ... DELETE FROM repology WHERE name='slave12'; ... To get the current state of slaves some kind of outer join can be used, e.g. SELECT * FROM repology AS t1 LEFT JOIN INFORMATION_SCHEMA.SLAVEHOSTS AS t2 ON (t1.server_id=t2.server_id); +---------+-----------+-----------+-----+-----------+ | name | server_id | SERVER_ID | ... | CONNECTED | +---------+-----------+-----------+-----+-----------+ | slave01 | 101 | 101 | ... | 1 | - connected | slave02 | 102 | 102 | ... | 0 | - registered, not connected | slave03 | 103 | NULL | ... | NULL | - not registered | ... | ... | ... | ... | ... | +---------+-----------+-----------+-----+-----------+ REMARK. Using right join instead of the left one above will show the slaves which are registered to the master but are not included into the repology: SELECT * FROM repology AS t1 RIGHT JOIN INFORMATION_SCHEMA.SLAVEHOSTS AS t2 ON (t1.server_id=t2.server_id); +---------+-----------+-----------+-----+ | name | server_id | SERVER_ID | ... | +---------+-----------+-----------+-----+ | slave01 | 101 | 101 | ... | | slave02 | 102 | 102 | ... | | ... | ... | ... | ... | | NULL | NULL | 200 | ... | - not included in repology | ... | ... | ... | ... | +---------+-----------+-----------+-----+ SOME MORE IDEAS FROM COMMUNITY - http://code.openark.org/blog/mysql/impressions-from-mysql-conf-2011-part-iii-bof-replication
SEMANTICS OF SHOW SLAVE HOSTS ============================= Before indulging into details of implementation one fundamental issue must be clarified: what "SHOW SLAVE HOSTS" statements is supposed to produce. There are two general understandings which are not clearly distinguished in the docs. a) "SHOW SLAVE HOSTS" generates list of slaves connected (potentially or actually) to the host on which query was issued. b) "SHOW SLAVE HOSTS" generates global table of all known master->slave connections. That is, the list produced by the query is *the same* on all hosts participating in replication. The current code and some formulations in the reference manual suggest that currently we (try to) support meaning b) although the support is faulty in many respects. One could argue in fawour of a) however: 1. It might be more natural that "SHOW SLAVE HOSTS" issued on master M talks only about slaves directly connected to M. 2. Implementing meaning b) is much more complex. Basically one needs to maintain a distributed database of all known master->slave connections. Some kind of replication would be necessary so that any host in the replication network knows about a slave connecting/disconnecting to/from master in any other part of that network. For example, consider replication topology: (A <- ) B <- C -> D If new host A connects to B as slave, D would have to know about that, so that "SHOW SLAVE HOSTS" on D reports that A is connected to B. 3. Even when "SHOW SLAVE HOSTS" has meaning a), it is possible to discover global replication topology. To do that, it is enough to connect to all involved masters, execute "SHOW SLAVE HOSTS" on each of them and compile the global list from the results. Of course, changing meaning of the query must be done carefully as some clients might rely on the old meaning. However, this is probably not such big problem because: - meaning b) is not clear from the documentation, - it is not correctly implemented in the current code anyway. ACTION: someone should decide which meaning of "SHOW SLAVE HOSTS" we are going to implement.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.