There is currently no official solution for providing failover between master and slaves in the event of a failure. With the currently available features, you must set up a master and one or more slaves, write a script that monitors the master to check whether it is up, and then instruct your applications and the slaves to change master if it fails.
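
The monitoring script mentioned above could be sketched as follows. This is a minimal illustration, not an official tool: the function names, the retry count, and the polling interval are assumptions, and a successful TCP connection only shows that the port is reachable, not that the server is answering queries.

```python
import socket
import time


def is_master_up(host, port=3306, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the master as down.
        return False


def wait_for_failure(check, max_failures=3, interval=5.0, sleep=time.sleep):
    """Block until check() has failed max_failures times in a row.

    Requiring several consecutive failures avoids triggering a failover
    on a single dropped connection. Returns the failure count reached.
    """
    failures = 0
    while failures < max_failures:
        failures = 0 if check() else failures + 1
        if failures < max_failures:
            sleep(interval)
    return failures
```

A caller might run `wait_for_failure(lambda: is_master_up("master.example.com"))` (the hostname is a placeholder) and start the failover procedure when it returns.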
Remember that you can tell a slave to change its master at any
time, using the CHANGE MASTER TO statement. The
slave does not check whether the databases on the master are
compatible with those on the slave; it simply starts executing events
from the specified log and position on the new master. In a
failover situation, all the servers in the group are probably
executing the same events from the same binary log, so changing
the source of the events should not affect the database structure
or integrity, provided that you are careful.
Run your slaves with the --log-bin option and
without --log-slave-updates. In this way, a
slave is ready to become a master as soon as you issue
STOP SLAVE and RESET MASTER on it,
and a CHANGE MASTER TO statement on the other
slaves. For example, assume that you have the structure shown in
Figure 15.4, “Redundancy using replication, initial structure”.
In this diagram, the MySQL Master holds the
master database, the MySQL Slave computers are
replication slaves, and the Web Client machines
are issuing database reads and writes. Web clients that issue only
reads (and would normally be connected to the slaves) are not
shown, as they do not need to switch to a new server in the event
of failure. For a more detailed example of a read/write scaleout
replication structure, see
Section 15.2.3, “Using Replication for Scale-Out”.
Each MySQL Slave (Slave 1, Slave
2, and Slave 3) is a slave running
with --log-bin and without
--log-slave-updates. Because updates received by
a slave from the master are not logged in the binary log unless
--log-slave-updates is specified, the binary log
on each slave is empty initially. If for some reason
MySQL Master becomes unavailable, you can pick
one of the slaves to become the new master. For example, if you
pick Slave 1, all Web
Clients should be redirected to Slave
1, which will log updates to its binary log.
Slave 2 and Slave 3 should
then replicate from Slave 1.
The reason for running the slave without
--log-slave-updates is to prevent slaves from
receiving updates twice in case you cause one of the slaves to
become the new master. Suppose that Slave 1 has
--log-slave-updates enabled. Then it will write
updates that it receives from Master to its own
binary log. When Slave 2 changes from
Master to Slave 1 as its
master, it may receive updates from Slave 1
that it has already received from Master.
Make sure that all slaves have processed any statements in their
relay log. On each slave, issue STOP SLAVE
IO_THREAD, then check the output of SHOW
PROCESSLIST until you see Has read all relay
log. When this is true for all slaves, they can be
reconfigured to the new setup. On the slave Slave
1 being promoted to become the master, issue
STOP SLAVE and RESET MASTER.
On the other slaves Slave 2 and Slave
3, use STOP SLAVE and CHANGE
MASTER TO MASTER_HOST='Slave1' (where
'Slave1' represents the real hostname of
Slave 1). In the CHANGE MASTER TO statement,
include all the information that Slave 2 or
Slave 3 needs to connect to Slave
1 (user, password, port). There is no need to
specify the name of
Slave 1's binary log or the binary log position to
read from: because RESET MASTER was issued on
Slave 1, they are the first binary log and position 4,
which are the defaults for CHANGE MASTER TO.
Finally, use START SLAVE on Slave
2 and Slave 3.
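
The procedure above can be sketched as a small helper that decides when the slaves are ready and builds the statements to send to each server. This is only an illustration under stated assumptions: the function names are hypothetical, the statements are returned as strings rather than executed, and the check simply looks for the "Has read all relay log" text in the thread states reported by SHOW PROCESSLIST.

```python
def relay_log_drained(processlist_states):
    """True if any thread state reports that the whole relay log was read."""
    marker = "has read all relay log"
    # A state can be NULL (None) in SHOW PROCESSLIST output.
    return any(marker in (state or "").lower() for state in processlist_states)


def all_slaves_drained(states_by_slave):
    """True once every slave has finished executing its relay log."""
    return all(relay_log_drained(s) for s in states_by_slave.values())


def sql_quote(value):
    """Quote a string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def promotion_statements():
    """Statements for the slave that becomes the new master."""
    return ["STOP SLAVE", "RESET MASTER"]


def repoint_statements(new_master_host, user, password, port=3306):
    """Statements for each remaining slave.

    No log file or position is given, so the slave starts from the
    first binary log of the new master.
    """
    change = (
        "CHANGE MASTER TO "
        f"MASTER_HOST={sql_quote(new_master_host)}, "
        f"MASTER_USER={sql_quote(user)}, "
        f"MASTER_PASSWORD={sql_quote(password)}, "
        f"MASTER_PORT={int(port)}"
    )
    return ["STOP SLAVE", change, "START SLAVE"]
```

In this sketch, the statements from `promotion_statements()` go to Slave 1 and those from `repoint_statements(...)` go to Slave 2 and Slave 3, but only after `all_slaves_drained(...)` returns true.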
Once the new replication setup is in place, you need to
instruct each Web Client to direct its
statements to Slave 1. From that point on, all
update statements sent by Web Client to
Slave 1 are written to the binary log of
Slave 1, which then contains every update
statement sent to Slave 1 since
Master died.
The resulting server structure is shown in Figure 15.5, “Redundancy using replication, after master failure”.
When Master is up again, you must issue on it
the same CHANGE MASTER TO statement as that issued on
Slave 2 and Slave 3, so that
Master becomes a slave of Slave 1
and picks up the Web Client writes that it
missed while it was down.
To make Master a master again (because it is
the most powerful machine, for example), use the preceding
procedure as if Slave 1 were unavailable and
Master was to be the new master. During this
procedure, do not forget to run RESET MASTER on
Master before making Slave
1, Slave 2, and Slave
3 slaves of Master. Otherwise, they
may pick up old Web Client writes from before
the point at which Master became unavailable.
Note that there is no synchronization between the different slaves of a master, so some slaves might be ahead of others. This means that the procedure outlined in the previous example is not guaranteed to work. In practice, however, the relay logs of the different slaves are most likely not far behind the master, so the procedure usually works.
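
One way to detect the situation described above is to compare how far each slave has executed before reconfiguring anything. The sketch below is an assumption-laden illustration: it expects one dictionary per slave holding the Relay_Master_Log_File and Exec_Master_Log_Pos columns from SHOW SLAVE STATUS, and the function names are hypothetical.

```python
def executed_position(status_row):
    """Return (master log file, position) that the slave has executed up to."""
    return (
        status_row["Relay_Master_Log_File"],
        int(status_row["Exec_Master_Log_Pos"]),
    )


def slaves_at_same_position(status_by_slave):
    """True if every slave has executed up to the same master coordinates."""
    positions = {executed_position(row) for row in status_by_slave.values()}
    return len(positions) <= 1
```

If this returns false, at least one slave is ahead of another, and switching all of them to a new master as described may leave them with different data.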
A good way to keep your applications informed of the location
of the master is to have a dynamic DNS entry for the master.
With BIND, you can use
nsupdate to update your DNS dynamically.
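
As a rough sketch of the DNS approach, the function below builds the command text that nsupdate reads from standard input to repoint the master's name at a new address. The record name, TTL, and addresses are placeholders, the function name is hypothetical, and authentication (for example, a key file passed to nsupdate) is deliberately left out.

```python
def nsupdate_script(name, new_ip, ttl=60, server=None):
    """Build nsupdate input that replaces the A record for name with new_ip."""
    lines = []
    if server:
        # Send the update to a specific name server instead of the default.
        lines.append(f"server {server}")
    lines.append(f"update delete {name} A")
    lines.append(f"update add {name} {ttl} A {new_ip}")
    lines.append("send")
    return "\n".join(lines) + "\n"
```

The resulting text could then be passed to the nsupdate program, for instance with `subprocess.run(["nsupdate"], input=script, text=True, check=True)`. A short TTL keeps clients from caching the old master address for long.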


User Comments
An alternative to dynamic DNS is to use a network virtual IP (VIP), which can be read-only, read-write, or write-only.
Each MySQL server, master or slave, has two IPs. The first is the server's base IP. The second is a floating IP that can be reassigned at will.
If the master dies, just assign the master's floating IP to one of the slaves.
If the master comes back up, it should check whether the floating IP is in use before assigning it back to itself.