In addition to choosing which tables to compress (and the page
size), the workload is another key determinant of performance.
If the application is dominated by reads, rather than updates,
fewer pages need to be reorganized and recompressed after the
index page runs out of room for the per-page
“modification log” that InnoDB maintains for
compressed data. If the updates predominantly change
non-indexed columns or those containing
BLOBs or large strings that happen to be
stored “off-page”, the overhead of compression
may be acceptable. If the only changes to a table are
INSERTs that use a monotonically increasing
primary key, and there are few secondary indexes, there is
little need to reorganize and recompress index pages. Since
InnoDB can “delete-mark” and delete rows on
compressed pages “in place” by modifying
DELETE operations on a
table are relatively efficient.
For some environments, the time it takes to load data can be as important as run-time retrieval. Especially in data warehouse environments, many tables may be read-only or read-mostly. In those cases, it might or might not be acceptable to pay the price of compression in terms of increased load time, unless the resulting savings in fewer disk reads or in storage cost is significant.
Fundamentally, compression works best when the CPU time is available for compressing and uncompressing data. Thus, if your workload is I/O bound, rather than CPU-bound, you may find that compression can improve overall performance. Therefore when you test your application performance with different compression configurations, it is important to test on a platform similar to the planned configuration of the production system.