WL#2860: INFORMATION_SCHEMA table for SHOW SLAVE HOSTS
Affects: Server-9.x
—
Status: On-Hold
SUMMARY
-------
Master should store information about slaves which have been connected
in a table instead of memory.
REQUIREMENTS
------------
- (Elliot) The merlin folks definitely want to see slaves which are
not currently connected. Since Merlin monitors health, it is valuable
for it to connect to a master, and then draw some picture with a big
alert showing the slaves that were not able to be contacted
DECISIONS REGARDING THIS WL
---------------------------
- Elliot 9 Nov 2005:
In the September 2005 Chicago dev-mtx planning meeting, during a
Merlin discussion, it was decided that we should change SHOW SLAVE
HOSTS to store information about slaves which have been connected in a
table instead of memory. This should be done in 5.1, and is needed
for Merlin advisors.
- Lars 21 Feb 2006:
According to Scrum meeting this should be in 5.2.
OPEN ISSUES
-----------
- How to purge table?
- What format of table?
- How to update table?
DRAFT (TO BE DISCUSSED)
----------------------
1. What should be done
MySQL is extended with a new INFORMATION_SCHEMA.SLAVEHOSTS table.
The table has the same columns as displayed by SHOW SLAVE HOSTS
statement plus an additional column CONNECTED which defines whether
the connection to the slave is OK.
Column name Description
----------- -----------
MASTER_ID integer
SERVER_ID integer
HOST varchar
PORT integer
USER varchar
PASSWORD varchar
RPL_RECOVERY_RANK integer
CONNECTED bool
REMARK.
SLAVEHOSTS and SHOW SLAVE HOSTS display slaves which are "registered"
within the master. Currently, registration of a slave occurs each time
the slave connects to the master. A slave can become unregistered only
on master restart. Because of that SLAVE HOSTS generally does not display
the overall "replication topology" (as defined by which master and
slaves are installed and how they are configured). E.g.,
- Suppose the following occurs: slave Si shutdowns, master M shutdowns,
master M restarts. In this case, Si will not be displayed by
SLAVE HOSTS until Si restart.
- Suppose Si is uninstalled. In this case Si will stay visible by
SLAVE HOSTS until master M shutdown.
Note also, that it is not quite obvious how to add a direct slave
unregistering (consider the case when slave wants to unregister itself
but cannot connect to master).
REMARK
When connections are to be checked?
One alternative is to check connections during SLAVEHOSTS table
filling (like other INFORMATION_SCHEMA tables, SLAVEHOSTS is filled
when a reference to this table is encountered in a query). But this
can make the table filling too slow (checking is based on waiting
for response from slave).
Other alternative is to check periodically the connections to all
registered slaves in a separate thread and to use the result of the
last check in the SLAVEHOST.
Yet one more alternative is to use the result of each attempt of the
master to connect to a slave (e.g. when sending an event).
2. Merlin's case
The overall "replication topology" is to be described by an ordinary
table, say 'repology', created within master server. It defines which
slaves are configured to this master. Something like this:
CREATE TABLE repology (name VARCHAR, server_id INT);
The table is managed by an administrator who adds/deletes entries
according to changes in the topology):
INSERT INTO repology ('slave01', 101);
INSERT INTO repology ('slave02', 102);
INSERT INTO repology ('slave03', 103);
...
DELETE FROM repology WHERE name='slave12';
...
To get the current state of slaves some kind of outer join can be used,
e.g.
SELECT * FROM repology AS t1
LEFT JOIN INFORMATION_SCHEMA.SLAVEHOSTS AS t2
ON (t1.server_id=t2.server_id);
+---------+-----------+-----------+-----+-----------+
| name | server_id | SERVER_ID | ... | CONNECTED |
+---------+-----------+-----------+-----+-----------+
| slave01 | 101 | 101 | ... | 1 | - connected
| slave02 | 102 | 102 | ... | 0 | - registered, not
connected
| slave03 | 103 | NULL | ... | NULL | - not registered
| ... | ... | ... | ... | ... |
+---------+-----------+-----------+-----+-----------+
REMARK.
Using right join instead of the left one above will show the slaves
which are registered to the master but are not included into the repology:
SELECT * FROM repology AS t1
RIGHT JOIN INFORMATION_SCHEMA.SLAVEHOSTS AS t2
ON (t1.server_id=t2.server_id);
+---------+-----------+-----------+-----+
| name | server_id | SERVER_ID | ... |
+---------+-----------+-----------+-----+
| slave01 | 101 | 101 | ... |
| slave02 | 102 | 102 | ... |
| ... | ... | ... | ... |
| NULL | NULL | 200 | ... | - not included in repology
| ... | ... | ... | ... |
+---------+-----------+-----------+-----+
SOME MORE IDEAS FROM COMMUNITY
-
http://code.openark.org/blog/mysql/impressions-from-mysql-conf-2011-part-iii-bof-replication
SEMANTICS OF SHOW SLAVE HOSTS
=============================
Before indulging into details of implementation one fundamental issue must be
clarified: what "SHOW SLAVE HOSTS" statements is supposed to produce. There are
two general understandings which are not clearly distinguished in the docs.
a) "SHOW SLAVE HOSTS" generates list of slaves connected (potentially or
actually) to the host on which query was issued.
b) "SHOW SLAVE HOSTS" generates global table of all known master->slave
connections. That is, the list produced by the query is *the same* on all hosts
participating in replication.
The current code and some formulations in the reference manual suggest that
currently we (try to) support meaning b) although the support is faulty in many
respects.
One could argue in fawour of a) however:
1. It might be more natural that "SHOW SLAVE HOSTS" issued on master M talks
only about slaves directly connected to M.
2. Implementing meaning b) is much more complex. Basically one needs to maintain
a distributed database of all known master->slave connections. Some kind of
replication would be necessary so that any host in the replication network knows
about a slave connecting/disconnecting to/from master in any other part of that
network. For example, consider replication topology:
(A <- ) B <- C -> D
If new host A connects to B as slave, D would have to know about that, so that
"SHOW SLAVE HOSTS" on D reports that A is connected to B.
3. Even when "SHOW SLAVE HOSTS" has meaning a), it is possible to discover
global replication topology. To do that, it is enough to connect to all involved
masters, execute "SHOW SLAVE HOSTS" on each of them and compile the global list
from the results.
Of course, changing meaning of the query must be done carefully as some clients
might rely on the old meaning. However, this is probably not such big problem
because:
- meaning b) is not clear from the documentation,
- it is not correctly implemented in the current code anyway.
ACTION: someone should decide which meaning of "SHOW SLAVE HOSTS" we are going
to implement.
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.