8.7.4.6 Configure Corosync and Pacemaker

At this point, the DRBD file system is configured and initialized, MySQL Fabric and MySQL Server have been installed, and the required files are set up on the replicated DRBD file system. Pacemaker and Corosync are installed, but they are not yet managing the MySQL Fabric process, MySQL Server, and DRBD resources to provide a clustered solution; the next step is to set that up.

First, set up some network-specific parameters from the Linux command line and in the Corosync configuration file. The multicast address should be unique in your network, but the port can be left at 5405. The bind address should be based on the IP addresses used by the servers and should take the form XX.YY.ZZ.0.

Copy an example to make your life easier:

shell> cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf

After editing it, its content should be similar to the following:

  totem {
          version: 2
          crypto_cipher: none
          crypto_hash: none
          interface {
                  ringnumber: 0
                  bindnetaddr: 192.168.1.0
                  mcastaddr: 239.255.1.1
                  mcastport: 5405
                  ttl: 1
          }
  }
  logging {
          to_syslog: yes
  }
  quorum {
      provider: corosync_votequorum
      two_node: 1
      wait_for_all: 1
  }
  nodelist {
      node {
          ring0_addr: 192.168.1.101
          nodeid: 1
      }
      node {
          ring0_addr: 192.168.1.102
          nodeid: 2
      }
  }

Be careful while setting up the network address that the Corosync binds to. For example, according to the Corosync documentation, if the local interface is 192.168.5.92 with netmask 255.255.255.0, set bindnetaddr to 192.168.5.0. If the local interface is 192.168.5.92 with netmask 255.255.255.192, set bindnetaddr to 192.168.5.64, and so forth.

This makes Corosync automatically pick the network interface based on the network address provided. It is also possible to set up a specific address, such as 192.168.5.92, but in this case the configuration file is different per machine.
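
If you are unsure which network address to use for bindnetaddr, it can be derived from the interface address and netmask. The following sketch is an optional check, not part of the original procedure; it assumes the interface is eth0 and that the ipcalc utility is available (option syntax and output vary between distributions; the form shown here is the Red Hat style):

shell> ip addr show eth0
shell> ipcalc --network 192.168.5.92/26

A netmask of 255.255.255.192 corresponds to the /26 prefix, and the second command prints the network address (192.168.5.64 in this example), which is the value to use for bindnetaddr.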

Create the /etc/corosync/service.d/pcmk file to tell Corosync to load the Pacemaker plug-in:

service {
  # Load the Pacemaker Cluster Resource Manager
  name: pacemaker
  ver: 1
}

Change the /etc/default/corosync file as follows:

# start corosync at boot [yes|no]
START=yes

To avoid any mismatches, the configuration files can be copied across by using these commands on host1:

shell> scp /etc/corosync/corosync.conf host2:/etc/corosync/corosync.conf
shell> scp /etc/corosync/service.d/pcmk host2:/etc/corosync/service.d/pcmk
shell> scp /etc/default/corosync host2:/etc/default/corosync
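
As an optional sanity check (assuming ssh access to host2 is available), confirm that each copy matches its original by comparing checksums; the two sums for each file should be identical:

shell> md5sum /etc/corosync/corosync.conf
shell> ssh host2 md5sum /etc/corosync/corosync.conf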

Start Corosync on both hosts using:

shell> /etc/init.d/corosync start

Run tcpdump to check whether Corosync is working:

shell> tcpdump -i eth0 -n port 5405
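
Where the corosync-cfgtool utility is installed, the ring status can also be queried directly on each host; it should report the local node address and no faults:

shell> corosync-cfgtool -s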

To start Pacemaker on host1, execute the following command:

shell> /etc/init.d/pacemaker start

Run Pacemaker's cluster resource monitoring command on host1 to view the status of the cluster:

shell> crm_mon --one-shot -V

As we are configuring a cluster made up of just two hosts, when one host fails (or loses contact with the other) there is no node majority (quorum) left, and so by default the surviving node (or both nodes, if they are still running but isolated from each other) would be shut down by Pacemaker. This is not the desired behavior, as it does not offer high availability, and so that default should be overridden. (We later add an extra behavior whereby each node shuts itself down if it cannot ping a third node that is external to the cluster, thus preventing a split-brain situation.)

[root@host1]# crm configure property no-quorum-policy=ignore

We turn STONITH (Shoot The Other Node In The Head) off as this solution relies on each node shutting itself down in the event that it loses connectivity with the independent host:

[root@host1]# crm configure property stonith-enabled=false

Roughly speaking, STONITH refers to one node trying to kill another in the event that it believes the other has partially failed and should be stopped in order to avoid any risk of a split-brain scenario. To prevent a healthy resource from being moved around the cluster when a node is brought back on-line, Pacemaker has the concept of resource stickiness, which controls how much a service prefers to stay running where it is.

[root@host1]# crm configure rsc_defaults resource-stickiness=100
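
To verify that the property and default changes above have been recorded, the current cluster configuration can be displayed at any time:

[root@host1]# crm configure show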

In the next steps, we describe how to configure the different resources of the cluster:

[root@host1]# crm configure edit

This opens your default text editor, and you should use it to add the following lines into the cluster configuration:

primitive p_drbd_mysql ocf:linbit:drbd \
        params drbd_resource="clusterdb_res" \
        op monitor interval="15s"
primitive p_fabric_mysql ocf:heartbeat:mysql-fabric \
        params binary="/usr/local/bin/mysqlfabric" \
               config="/var/lib/mysql_drbd/fabric.cfg" \
        op start timeout="120s" interval="0" \
        op stop timeout="120s" interval="0" \
        op monitor interval="20s" timeout="30s"
primitive p_fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/var/lib/mysql_drbd" \
               fstype="ext4"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.200" cidr_netmask="24" nic="eth0"
primitive p_mysql ocf:heartbeat:mysql \
        params binary="/usr/sbin/mysqld" \
               config="/var/lib/mysql_drbd/my.cnf" \
               datadir="/var/lib/mysql_drbd/data" \
               pid="/var/run/mysqld/mysqld.pid" \
               socket="/var/run/mysqld/mysqld.sock" \
               user="mysql" group="mysql" \
               additional_parameters="--bind-address=localhost" \
        op start timeout="120s" interval="0" \
        op stop timeout="120s" interval="0" \
        op monitor interval="20s" timeout="30s"
group g_mysql p_fs_mysql p_ip_mysql p_mysql p_fabric_mysql
ms ms_drbd_mysql p_drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" \
             clone-node-max="1" notify="true"
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
primitive p_ping ocf:pacemaker:ping params name="ping" \
          multiplier="1000" host_list="192.168.1.1" \
          op monitor interval="15s" timeout="60s" \
          op start timeout="60s"
clone cl_ping p_ping meta interleave="true"
location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" \
    -inf: not_defined ping or ping number:lte 0

As the MySQL service (group) has a dependency on the host it is running on being the DRBD master, that relationship is expressed with a co-location constraint and an ordering constraint: the MySQL group must be co-located with the DRBD master, and the DRBD promotion of the host to master must happen before the MySQL group can be started:

colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start

In order to prevent a split-brain scenario in the event of network partitioning, Pacemaker can ping independent network resources (such as a network router) and then prevent the host from being the DRBD master in the event that it becomes isolated:

primitive p_ping ocf:pacemaker:ping params name="ping" multiplier="1000" \
        host_list="192.168.1.1" \
        op monitor interval="15s" timeout="60s" \
        op start timeout="60s"
clone cl_ping p_ping meta interleave="true"
location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" \
        -inf: not_defined ping or ping number:lte 0
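
Before relying on the cluster, the live configuration can also be validated with Pacemaker's crm_verify tool, which reports any syntax or constraint problems it finds (-L checks the live cluster, -V increases verbosity):

[root@host1]# crm_verify -L -V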

Check that everything is running correctly using the following command:

[root@host1]# crm_mon --one-shot -V

Ensure the correct daemons are started at system boot

At this point, a reliable MySQL service is in place, but it is also important to check that the correct cluster services are started automatically as part of the servers' system startup. The Linux startup process must start the Corosync and Pacemaker services, but not DRBD, the MySQL Fabric process, or MySQL Server, as those services are started on the correct server by Pacemaker. To this end, execute the following commands on each host:

[root@host1]# sysv-rc-conf drbd off
[root@host1]# sysv-rc-conf corosync on
[root@host1]# sysv-rc-conf mysql off
[root@host1]# sysv-rc-conf pacemaker on

[root@host2]# sysv-rc-conf drbd off
[root@host2]# sysv-rc-conf corosync on
[root@host2]# sysv-rc-conf mysql off
[root@host2]# sysv-rc-conf pacemaker on
Note

MySQL Fabric is not installed as a service so there is nothing to do here for it.
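
The sysv-rc-conf commands above assume a SysV-style init where that utility is available. On RPM-based distributions, the equivalent commands (assuming the chkconfig utility is present; note that the MySQL service script may be named mysqld rather than mysql) would be:

[root@host1]# chkconfig drbd off
[root@host1]# chkconfig corosync on
[root@host1]# chkconfig mysql off
[root@host1]# chkconfig pacemaker on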

