Because using InnoDB
in combination with
memcached involves writing all data to disk,
whether immediately or sometime later, raw performance is
expected to be somewhat slower than using
memcached by itself. When using the
InnoDB
memcached plugin,
focus tuning goals for memcached operations
on achieving better performance than equivalent SQL operations.
Benchmarks suggest that queries and DML operations (inserts, updates, and deletes) that use the memcached interface are faster than traditional SQL. DML operations typically see a larger improvements. Therefore, consider adapting write-intensive applications to use the memcached interface first. Also consider prioritizing adaptation of write-intensive applications that use fast, lightweight mechanisms that lack reliability.
Adapting SQL Queries
The types of queries that are most suited to simple
GET
requests are those with a single clause
or a set of AND
conditions in the
WHERE
clause:
SQL:
SELECT col FROM tbl WHERE key = 'key_value';
memcached:
get key_value
SQL:
SELECT col FROM tbl WHERE col1 = val1 and col2 = val2 and col3 = val3;
memcached:
# Since you must always know these 3 values to look up the key,
# combine them into a unique string and use that as the key
# for all ADD, SET, and GET operations.
key_value = val1 + ":" + val2 + ":" + val3
get key_value
SQL:
SELECT 'key exists!' FROM tbl
WHERE EXISTS (SELECT col1 FROM tbl WHERE KEY = 'key_value') LIMIT 1;
memcached:
# Test for existence of key by asking for its value and checking if the call succeeds,
# ignoring the value itself. For existence checking, you typically only store a very
# short value such as "1".
get key_value
Using System Memory
For best performance, deploy the
daemon_memcached
plugin on machines that are
configured as typical database servers, where the majority of
system RAM is devoted to the InnoDB
buffer pool, through the
innodb_buffer_pool_size
configuration option. For systems with multi-gigabyte buffer
pools, consider raising the value of
innodb_buffer_pool_instances
for maximum throughput when most operations involve data that is
already cached in memory.
Reducing Redundant I/O
InnoDB
has a number of settings that let you
choose the balance between high reliability, in case of a crash,
and the amount of I/O overhead during high write workloads. For
example, consider setting the
innodb_doublewrite
to
0
and
innodb_flush_log_at_trx_commit
to 2
. Measure performance with different
innodb_flush_method
settings.
For other ways to reduce or tune I/O for table operations, see Section 10.5.8, “Optimizing InnoDB Disk I/O”.
Reducing Transactional Overhead
A default value of 1 for
daemon_memcached_r_batch_size
and
daemon_memcached_w_batch_size
is intended for maximum reliability of results and safety of
stored or updated data.
Depending on the type of application, you might increase one or
both of these settings to reduce the overhead of frequent
commit operations. On a busy
system, you might increase
daemon_memcached_r_batch_size
,
knowing that changes to data made through SQL may not become
visible to memcached immediately (that is,
until N
more get
operations are processed). When processing data where every
write operation must be reliably stored, leave
daemon_memcached_w_batch_size
set to 1
. Increase the setting when
processing large numbers of updates intended only for
statistical analysis, where losing the last
N
updates in an unexpected exit is an
acceptable risk.
For example, imagine a system that monitors traffic crossing a
busy bridge, recording data for approximately 100,000 vehicles
each day. If the application counts different types of vehicles
to analyze traffic patterns, changing
daemon_memcached_w_batch_size
from 1
to 100
reduces I/O
overhead for commit operations by 99%. In case of an outage, a
maximum of 100 records are lost, which may be an acceptable
margin of error. If instead the application performed automated
toll collection for each car, you would set
daemon_memcached_w_batch_size
to 1
to ensure that each toll record is
immediately saved to disk.
Because of the way InnoDB
organizes
memcached key values on disk, if you have a
large number of keys to create, it may be faster to sort the
data items by key value in the application and
add
them in sorted order, rather than create
keys in arbitrary order.
The memslap command, which is part of the
regular memcached distribution but not
included with the daemon_memcached
plugin,
can be useful for benchmarking different configurations. It can
also be used to generate sample key-value pairs to use in your
own benchmarks.