MySQL Utilities 1.4 Manual  /  ...  /  Sharding Scenario Sharding Scenario

In the sections that follow, we take the example of a employee table that must be sharded across three MySQL groups. The following procedure lists the sequence of commands to run to perform each step.

Unsharded Data

Assume that we have an unsharded table named employees that contains Employee IDs, on which we want to create the following shards. Each of these ranges is placed in one shard:

  • 1-99999: shard-1

  • 100000-199999: shard-2

  • 200000-: shard-3

In addition to creating the ranges themselves (in a range based sharding scheme) we also must define where this range of values should reside.

Starting Multiple MySQL Servers

MySQL Servers must be started on the directories that were copied. Assume that MySQL servers are started on the following hosts and ports:

  • localhost:13001

  • localhost:13002

  • localhost:13003

  • localhost:13004

  • localhost:13005

  • localhost:13006

  • localhost:13007

  • localhost:13008

Creating the Groups in Fabric

A logical group in Fabric maps to a shard. So as a first step to sharding we must implement the Fabric groups that store the shards. This can be done as follows:

shell> mysqlfabric group create group_id-1
shell> mysqlfabric group create group_id-2
shell> mysqlfabric group create group_id-3

The preceding commands create three high-availability groups: group_id-1, group_id-2, and group_id-3. Each group will store a shard.

Then we must define a global group which is used to propagate schema updates to all tables in the sharding setup and updates to global tables throughout the sharding scheme.

shell> mysqlfabric group create group_id-global
Registering the Servers

The MySQL servers must be registered with the groups. Each group contains two servers.

13001, 13002 belong to group_id-1

shell> mysqlfabric group add group_id-1 localhost:13001
shell> mysqlfabric group add group_id-1 localhost:13002

13003, 13004 belong to group_id-2

shell> mysqlfabric group add group_id-2 localhost:13003
shell> mysqlfabric group add group_id-2 localhost:13004

13005, 13006 belong to group_id-3

shell> mysqlfabric group add group_id-3 localhost:13005
shell> mysqlfabric group add group_id-3 localhost:13006

13007, 13008 belong to group_id-global

shell> mysqlfabric group add group_id-global localhost:13007
shell> mysqlfabric group add group_id-global localhost:13008

Then promote one server to master in each group:

shell> mysqlfabric group promote group_id-global
shell> mysqlfabric group promote group_id-1
shell> mysqlfabric group promote group_id-2
shell> mysqlfabric group promote group_id-3
Define a Shard Mapping

When we define a shard mapping, we basically do three things:

  1. Define the type of sharding scheme we want to use (RANGE, HASH, LIST, and so forth).

  2. Define a global group that stores all the updates that must be propogated to all the shards that are part of this sharding scheme.

  3. We generate a unique shard mapping id to which we can later associate database objects (tables).

shell> mysqlfabric sharding create_definition RANGE group_id-global
{ uuid        = 195bca1e-c552-464b-b4e3-1fa15e9b49d5,
  finished    = True,
  success     = True,
  return      = 1,
  activities  =
Add Tables to Defined Shard Mappings

Add the database table to the shard mapping defined previously.

shell> mysqlfabric sharding add_table 1 employees.employees emp_no
Add Shards for the Registered Tables
shell> mysqlfabric sharding add_shard 1 "group_id-1/1, group_id-2/100000, group_id-2/200000" --state=ENABLED
Executing Updates and Queries

Now you can write some example code for querying the sharded system.

import random
import mysql.connector
from mysql.connector import fabric

def prepare_synchronization(cur):
    # We need to keep track of what we have executed so far to guarantee
    # that the employees.employees table exists at all shards.
    gtid_executed = None
    cur.execute("SELECT @@global.gtid_executed")
    for row in cur:
        gtid_executed = row[0]
    return gtid_executed

def synchronize(cur, gtid_executed):
    # Guarantee that a slave has created the employees.employees table
    # before reading anything.
        (gtid_executed, )

def add_employee(conn, emp_no, first_name, last_name, gtid_executed):
    conn.set_property(tables=["employees.employees"], key=emp_no,
    cur = conn.cursor()
    synchronize(cur, gtid_executed)
    cur.execute("USE employees")
        "INSERT INTO employees VALUES (%s, %s, %s)",
        (emp_no, first_name, last_name)

def find_employee(conn, emp_no, gtid_executed):
    conn.set_property(tables=["employees.employees"], key=emp_no,
    cur = conn.cursor()
    synchronize(cur, gtid_executed)
    cur.execute("USE employees")
    for row in cur:
        print "Had to synchronize", row, "transactions."
        "SELECT first_name, last_name FROM employees "
        "WHERE emp_no = %s", (emp_no, )
    for row in cur:
        print row

def pick_shard_key():
    shard = random.randint(0, 2)
    shard_range = shard * 100000
    shard_range = shard_range if shard != 0 else shard_range + 1
    shift_within_shard = random.randint(0, 99999)
    return shard_range + shift_within_shard

# Address of the Fabric, not the host we are going to connect to.
conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274,
            "username": "admin", "password" : "adminpass"
    user="webuser", password="webpass", autocommit=True

conn.set_property(tables=["employees.employees"], scope=fabric.SCOPE_GLOBAL,
cur = conn.cursor()
cur.execute("CREATE DATABASE IF NOT EXISTS employees")
cur.execute("USE employees")
cur.execute("DROP TABLE IF EXISTS employees")
    "CREATE TABLE employees ("
    "   emp_no INT, "
    "   first_name CHAR(40), "
    "   last_name CHAR(40)"
gtid_executed = prepare_synchronization(cur)


first_names = ["John", "Buffalo", "Michael", "Kate", "Deep", "Genesis"]
last_names = ["Doe", "Bill", "Jackson", "Bush", "Purple"]

list_emp_no = []
for count in range(10):
    emp_no = pick_shard_key()
    add_employee(conn, emp_no,
                 first_names[emp_no % len(first_names)],
                 last_names[emp_no % len(last_names)],

for emp_no in list_emp_no:
    find_employee(conn, emp_no, gtid_executed)
Shard Move

If the current set of servers for a shard is not powerful enough, we can move the shard to a more powerful server set.

The shard-move functionality can be used to move a shard from one group to another. These are the steps necessary to move a shard.

  1. Set up the required group or groups.

    shell> mysqlfabric group create group_id-MOVE
    shell> mysqlfabric group add group_id-MOVE localhost:13009
    shell> mysqlfabric group add group_id-MOVE localhost:13010
    shell> mysqlfabric group promote group_id-MOVE
  2. Execute the shard move.

    shell> mysqlfabric sharding move_shard 1 group_id-MOVE
  3. Verify that the move has happened.

    shell> mysqlfabric sharding lookup_servers employees.employees 4
Shard Split

If the shard becomes overloaded, we may need to split the shard into another group. The shard-split feature can be used to split the data in a given shard into another shard. The following demonstrates how to do this.

  1. Set up the required group or groups.

    shell> mysqlfabric group create group_id-SPLIT
    shell> mysqlfabric group add group_id-SPLIT localhost:13011
    shell> mysqlfabric group add group_id-SPLIT localhost:13012
    shell> mysqlfabric group promote group_id-SPLIT
  2. Execute the shard split.

    shell> mysqlfabric sharding split_shard 2 group_id-SPLIT --split_value=150
  3. Verify that the shard Split happened.

    shell> mysqlfabric sharding lookup_servers employees.employees 152
    shell> mysqlfabric sharding lookup_servers employees.employees 103

User Comments
Sign Up Login You must be logged in to post a comment.