Some operating systems implement compression at the file system level. Files are typically divided into fixed-size blocks that are compressed into variable-size blocks, which easily leads to fragmentation. Every time something inside a block is modified, the whole block is recompressed before it is written to disk. These properties make this compression technique unsuitable for use in an update-intensive database system.
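The behavior described above can be illustrated with a minimal sketch. It does not model any particular file system: the 16 KB block size, the helper names compress_blocks and update_block, and the sample data are all assumptions chosen for illustration. It shows that equally sized logical blocks compress to different sizes, and that a small in-place change forces the whole block to be recompressed, usually to a different size.

```python
import random
import zlib

BLOCK_SIZE = 16 * 1024  # hypothetical fixed logical block size


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size logical block independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def update_block(blocks, index, offset, new_bytes):
    """Patch bytes inside one block; the whole block must be recompressed."""
    plain = bytearray(zlib.decompress(blocks[index]))
    plain[offset:offset + len(new_bytes)] = new_bytes
    blocks[index] = zlib.compress(bytes(plain))
    return len(blocks[index])


# Two logical blocks of identical size but different redundancy.
data = b"A" * BLOCK_SIZE + bytes(range(256)) * (BLOCK_SIZE // 256)
blocks = compress_blocks(data)
sizes_before = [len(b) for b in blocks]

# Overwrite 512 bytes of the first block with hard-to-compress noise.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(512))
new_size = update_block(blocks, 0, 100, noise)

print("compressed sizes before update:", sizes_before)
print("first block after update:", new_size)
```

Because the recompressed block no longer fits the space it previously occupied, a file system using this scheme has to relocate it or leave gaps, which is the source of the fragmentation mentioned above.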
The InnoDB Plugin implements a novel type of compression with the help of the well-known zlib library, which implements the LZ77-based DEFLATE compression algorithm. This compression algorithm is mature, robust, and efficient in both CPU utilization and reduction of data size. The algorithm is “lossless”: the original uncompressed data can always be reconstructed from the compressed form. LZ77 compression works by finding sequences of data that are repeated within the data to be compressed. The patterns of values in your data determine how well it compresses, but typical user data often compresses by 50% or more.
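As a rough illustration of these two properties (lossless reconstruction, and a compression ratio that depends on repeated patterns), the following sketch uses the zlib module from the Python standard library. The sample strings and the helper compression_ratio are hypothetical, and the code exercises zlib directly rather than InnoDB, so the ratios it prints say nothing about how a real table compresses.

```python
import os
import zlib


def compression_ratio(data):
    """Return the compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)


# Hypothetical sample data: repetitive row-like text versus random bytes.
repetitive = b"customer_id=42;status=ACTIVE;region=EMEA\n" * 1000
random_like = os.urandom(len(repetitive))

# Lossless: decompressing returns exactly the original bytes.
restored = zlib.decompress(zlib.compress(repetitive))
print("round trip intact:", restored == repetitive)

# Repeated sequences compress well; data without repeats does not.
print(f"repetitive: {compression_ratio(repetitive):.1%} of original size")
print(f"random:     {compression_ratio(random_like):.1%} of original size")
```

Data with few repeated sequences, such as already-compressed or encrypted values, behaves like the random sample here and gains little from compression.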
Unlike compression performed by an application, or compression features of some other database management systems, InnoDB compression applies both to user data and to indexes. In many cases, indexes can constitute 40-50% or more of the total database size, so this difference is significant. When compression is working well for a data set, the size of the InnoDB data files (the .ibd files) is 25% to 50% of the uncompressed size or possibly smaller. Depending on the workload, this smaller database can in turn lead to a reduction in I/O, and an increase in throughput, at a modest cost in terms of increased CPU utilization.