WL#4037: Online backup: Use zlib compression to reduce backup file size

Affects: Server-6.0 — Status: Complete

The rationale is:

Make backup images smaller.

Compressing at the time that we produce the image file,
rather than producing an uncompressed file and compressing
afterwards, means less total time to write and less
disk use.

Compressing during backup is a feature of other DBMSs.

The rationale is not:

Make data transfer from drivers quicker.

Syntax
------

BACKUP DATABASE ...
[ WITH COMPRESSION [ COMPRESSION_ALGORITHM = algorithm_name ] ]
...;

WITH COMPRESSION means "compress the entire image file".
If it is omitted, there is no compression.
But in any case MySQL uses the binlog format of records,
instead of the handler format (to make rows smaller),
which was done for BUG#31538 "Backup file is too large".

The optional COMPRESSION_ALGORITHM=algorithm_name clause
has no significance at this time, because
the only legal algorithm name is gzip.
We ignore that gzip is not really the 'name' of the algorithm.
We ignore that Oracle prefers to say 'zlib'.
If the COMPRESSION_ALGORITHM=algorithm_name clause is not
specified, gzip is assumed.

According to WL#4271 "Online Backup: Encryption" there might be
encryption clauses, for example "ENCRYPTION_ALGORITHM = AES".
The order does not matter, that is, WITH COMPRESSION
may come either before or after ENCRYPTION_ALGORITHM.

The WITH COMPRESSION clause modifies the entire statement.
We will not have syntax that allows compression of only one
database, or of only one image file.

Our syntax will differ from syntax of other DBMSs:
BACKUP DATABASE ... TO ... WITH [NO] COMPRESSION /* SQL Server */
BACKUP AS COMPRESSED INCREMENTAL LEVEL 1 TAG = WEEKLY DATABASE; /* Oracle */
BACKUP DATABASE ... COMPRESS [ COMPRLIB name [ EXCLUDE] ] ...; /* DB2 */

Examples:

BACKUP DATABASE test TO '1.bak.gz'
WITH COMPRESSION;

BACKUP DATABASE test TO '1.bak.gz'
WITH COMPRESSION
COMPRESSION_ALGORITHM = gzip;

BACKUP DATABASE a,b,c TO '2.bak.gz'
WITH COMPRESSION
COMPRESSION_ALGORITHM = gzip
ENCRYPTION_ALGORITHM = aes;

BACKUP DATABASE a,b,c TO '2.bak.gz'
ENCRYPTION_ALGORITHM = aes
WITH COMPRESSION
COMPRESSION_ALGORITHM = gzip;

Privileges
----------

No special privileges are necessary.

Effects
-------

The image file will be compressed using zlib (see "The zlib library" section).

It will be possible to uncompress the image file with gunzip,
or uncompress the image file with winzip,
or handle the image file directly with RESTORE.

This fixed string will be in the comment field of the file header:
'MySQL 6.0 image file backup WITH COMPRESSION COMPRESSION_ALGORITHM = gzip'
not "Compressed MySQL Online Backup Stream v1.0".
For information about the null-terminated comment field in the header, see:
http://www.gzip.org/zlib/rfc-gzip.html or http://www.faqs.org/rfcs/rfc1952.html
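
To illustrate the header layout that RFC 1952 describes, here is a
minimal Python sketch that builds a gzip header carrying the comment
string above and then reads the comment back. The helper names
(build_gzip_header, read_gzip_comment) are hypothetical, and the sketch
assumes a header whose only optional field is the comment. It shows the
format only; it is not the server's code.

```python
import struct

FCOMMENT = 0x10  # FLG bit 4 in RFC 1952: a zero-terminated comment follows
COMMENT = (b"MySQL 6.0 image file backup WITH COMPRESSION "
           b"COMPRESSION_ALGORITHM = gzip")

def build_gzip_header(comment: bytes) -> bytes:
    """Return a gzip member header whose only optional field is a comment."""
    fixed = struct.pack(
        "<BBBBIBB",
        0x1F, 0x8B,   # ID1, ID2: gzip magic numbers
        8,            # CM: deflate
        FCOMMENT,     # FLG: comment present, nothing else
        0,            # MTIME: not recorded
        0,            # XFL: no extra flags
        255,          # OS: unknown
    )
    return fixed + comment + b"\x00"

def read_gzip_comment(header: bytes) -> bytes:
    """Extract the comment from a header that has no FEXTRA or FNAME field."""
    if header[:3] != b"\x1f\x8b\x08":
        raise ValueError("not a gzip/deflate header")
    if header[3] != FCOMMENT:
        raise ValueError("this sketch handles FLG == FCOMMENT only")
    end = header.index(b"\x00", 10)  # comment starts after the 10 fixed bytes
    return header[10:end]

header = build_gzip_header(COMMENT)
```

The compressed deflate data would follow the terminating zero byte of
the comment.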

We will not use the null-terminated "original file name" field
of the file header. The compressed-file file name is the same as
the original-file file name that the user says in the BACKUP
statement; MySQL does not add an extension like ".gz".

The PKZIP format (.zip) uses the same method as zlib, but
has a fancier file format. We will not use .zip file format.

The original image header (not to be confused with the gzip
header) will be compressed.

Choice of compression library
-----------------------------

The functionality is in a separate library, which is
optionally included in the server at build time (not a plugin).

However, zlib (the only current legal option) often is
in the server already and we won't need to duplicate it.
So there is no change to build / configure specification.

We had three stream-based compression zlib-variant APIs which
might already be optionally in the server now:
zlib, azio, and NDB azio.

1) zlib (from zlib/gzio.c)
Often the server is linked against the system zlib.

Anything compressed with gzio can be uncompressed with /bin/gunzip (and
similar, such as winzip).
The gzio routines can be used to read uncompressed files too. It checks
for gzip magic numbers, and if they're not present, just passes IO
through to normal buffered IO routines.

2) azio (from storage/archive/azio.c)

Forked from gzio.c. The file format is slightly different, but
it is trivial to implement an azio-format gunzip (aunzip?).
This is gzio with these additional features:
* Contains code to read/write gzip comment field.
* Removes some old crufty things from gzio.
* Adds ability to store "frm", although anything could be stored.

3) NDB azio (from storage/ndb/src/common/util/azio.c)

Forked from azio, that is, formed from "2)" above.
This is an extension of azio with these additional features:
* use of O_DIRECT, i.e. 512 byte aligned and sized IOs
* no dynamic memory allocation (all done on startup)
* broken flush (due to item 1, O_DIRECT support... and no need for ndb)
We would only do ndb azio format (512 byte aligned header,
limit of 512 bytes on header). It would not be gzip format compatible.
It would probably break "frm" storage.
It would probably handle errors better than archive azio.

4) deflate/inflate library (from zlib, I guess)

We do not really need either azio or gzio. We just need
to write trivial gzip-compatible headers/trailers. The
deflate and inflate routines are available and sufficient.

The only compression method that we need from the library is
deflate, and inflate to decompress. That's all that gzip is
(gzip is a simple header around the deflate stream).

Because we want to read/write the comment field, and
because we want to maintain gzip format compatibility,
we expect the implementor to choose to use "4)".
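
As a rough illustration of option "4)", the following Python sketch
wraps a raw deflate stream in a handwritten gzip header and trailer.
The function name gzip_member is hypothetical. The header here is the
minimal 10-byte form with no optional fields, which is an assumption
made to keep the example short. Python's zlib module is used because it
calls the same deflate routines as the C library.

```python
import gzip
import struct
import zlib

def gzip_member(payload: bytes) -> bytes:
    """Wrap raw deflate output in a minimal gzip header and trailer."""
    # ID1, ID2, CM=deflate, FLG=0, MTIME=0 (4 bytes), XFL=0, OS=255 (unknown)
    header = b"\x1f\x8b\x08\x00" + b"\x00" * 4 + b"\x00\xff"

    # A negative wbits value asks zlib for a raw deflate stream,
    # that is, one without any zlib or gzip wrapper.
    deflater = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                                zlib.DEFLATED,
                                -zlib.MAX_WBITS)
    body = deflater.compress(payload) + deflater.flush()

    # Trailer: CRC-32 of the uncompressed data, then its length mod 2**32,
    # both little-endian.
    trailer = struct.pack("<II",
                          zlib.crc32(payload) & 0xFFFFFFFF,
                          len(payload) & 0xFFFFFFFF)
    return header + body + trailer

image = b"backup image bytes " * 1000
packed = gzip_member(image)
restored = gzip.decompress(packed)  # any gzip-compatible reader should accept it
```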

RESTORE
-------

There is no new syntax for RESTORE. RESTORE can determine
from the image file header whether WITH COMPRESSION
was used.

When beginning restore:

* read initial N bytes from backup image, where N is
max(backup_image_header_size, gzip_header_size);
* if the server is compiled with zlib support and gzip headers are present,
initialize decompressor and read + decompress backup_image_header_size bytes.
If the server is compiled without zlib support, skip this step.
* check backup image header.

If the library is not available
-------------------------------

During BACKUP, if the zlib library cannot be accessed,
then BACKUP proceeds anyway, without warnings, and produces
an uncompressed image file.

During RESTORE, if the zlib library cannot be accessed,
RESTORE fails with the error
"Cannot find gzip; suggest you gunzip/winzip then try again".

Other errors, for example an error return from deflate(),
have the same effect as a disk read/write error.
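
The fallback rules above can be sketched as follows. This is an
illustration in Python, not server code: the have_zlib parameter is a
hypothetical stand-in for "the library can be accessed", and the
function and exception names are invented for the example.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # ID1, ID2 and CM=deflate

class RestoreError(Exception):
    """Hypothetical error type raised when RESTORE cannot proceed."""

def backup_bytes(image: bytes, with_compression: bool, have_zlib: bool) -> bytes:
    """Return the bytes that would be written to the image file."""
    if with_compression and have_zlib:
        deflater = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                                    zlib.DEFLATED,
                                    zlib.MAX_WBITS + 16)
        return deflater.compress(image) + deflater.flush()
    # Library missing (or no WITH COMPRESSION): proceed without warning.
    return image

def restore_bytes(stored: bytes, have_zlib: bool) -> bytes:
    """Return the uncompressed image, or raise if it cannot be read."""
    if not stored.startswith(GZIP_MAGIC):
        return stored  # image was written uncompressed
    if not have_zlib:
        raise RestoreError(
            "Cannot find gzip; suggest you gunzip/winzip then try again")
    return zlib.decompress(stored, zlib.MAX_WBITS + 16)

image = b"row data " * 500
stored = backup_bytes(image, with_compression=True, have_zlib=True)
```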

Things that we will not do
--------------------------

The following items are related to backup compression,
and perhaps they are desirable for somebody someday.
But they are not part of this task. We will not do them.

1. New COMPRESSION column in mysql.online_backup.

2. Letting users create plugins; helping plugin communication
by passing a generalized comma-delimited "field=value" string.
Although there's no provision for that now, probably there
eventually will be. An example, which may be inexact, is:
BACKUP ... WITH COMPRESSION COMPRESSION_ALGORITHM=rar
(DICTIONARY_SIZE=50 COMPRESSION_LEVEL=3
TEXT_COMPRESSION_PARAM=12 DELTA_COMPRESSION=OFF MODE=SOLID);

3. Letting the engine (storage engine) be aware of compression
so that the storage engine could pass an already-compressed
string back to the main server, thus saving on transfer time.

4. Using any other library besides good old gzip, e.g.
Parallel gzip:
http://groups.google.com/group/comp.compression/browse_thread/thread/4d3a0d95053f581a/5514b6c38b0d4dfc
source:
http://freebsd.ntu.edu.tw/FreeBSD/ports/local-distfiles/chinsan/pigz17.c.gz
bzip2:
http://www.bzip.org/
http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html#std-rdwr
Parallel bzip2:
http://compression.ca/pbzip2/
LZO:
http://www.oberhumer.com/opensource/lzo/

5. There was a suggestion to "modify the default and consistent
snapshot drivers to compress their output to the kernel" (with
the goal of minimizing the stream size). But we're not trying
to minimize the stream size; the rationale is just to minimize
the image file size; see the High Level Description.
(The current default and consistent snapshot backup drivers
produce uncompressed file streams which are many times
the size of the raw database files.)

6. There was a suggestion to "Extend the zlib API to generalize
it for use in other areas of MySQL. Specifically, to create a
series of methods that will allow the backup drivers to compress
their data prior to sending it to the kernel during backup.
Likewise, methods would be made available to decompress the data
during restore." (This is approximately the same as number 5.)

Documentation Hints
-------------------

The following are hints for docs people who may read this
worklog task when it's done. We may email them later to docs-private.

The MySQL Reference Manual refers to zlib a few times:
" If you have problems with configure trying to link with -lz
when you don't have zlib installed, you have two options:
* If you want to be able to use the compressed communication
protocol, you need to get and install zlib from ftp.gnu.org.
* Run configure with the --with-named-z-libs=no option when
building MySQL."
http://dev.mysql.com/doc/refman/5.0/en/solaris.html
"This [COMPRESS] function requires MySQL to have been compiled
with a compression library such as zlib."
http://dev.mysql.com/doc/refman/5.0/en/encryption-functions.html
"The ARCHIVE engine uses zlib lossless data compression"
http://dev.mysql.com/doc/refman/5.0/en/archive-storage-engine.html
"have_compress YES if the zlib compression library is available
to the server, NO if not. If not, the COMPRESS() and UNCOMPRESS()
functions cannot be used."
http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html
...
The above references don't mention the importance of zlib
for backup compression. You (O docs people) have to decide
whether they should.

We recommend that file names end with '.bak.gz'.

References
----------

Discussion of the syntax took place in the dev-backup thread
"Re: Oracle compression options"
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=230&mail=940
"Backup Meeting 2008-05-13"
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=230&mail=915
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=230&mail=922
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=230&mail=924

Oracle documentation mentioning algorithm names ZLIB, BZIP2.
http://download-uk.oracle.com/docs/cd/B28359_01/backup.111/b28270/rcmconfa.htm#CHDEHCEB

Opening output stream
---------------------
If compression is requested, initialize compressed stream for writing in
Output_stream::open():
- allocate compression output buffer. Its size is defined by ZBUF_SIZE (65K).
- initialize zstream structure.
- call deflateInit2() with the default compression level. The windowBits
parameter must be set to MAX_WBITS + 16, so that it writes a simple gzip
header and trailer.

The backup header must be compressed, which means that
Output_stream::write_magic_and_version() must write the header using the
stream_write() function.
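
The effect of the MAX_WBITS + 16 setting can be sketched with Python's
zlib module, whose compressobj() takes the same windowBits values as
deflateInit2(). BACKUP_HEADER below is a placeholder: the actual bytes
written by write_magic_and_version() are not given in this worklog.

```python
import zlib

# Placeholder for the backup magic and version bytes (hypothetical content).
BACKUP_HEADER = b"<backup magic and version>"

def open_compressed_stream():
    """Create a deflate stream that emits a simple gzip header and trailer."""
    return zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,  # default level
                            zlib.DEFLATED,
                            zlib.MAX_WBITS + 16)          # +16 selects gzip framing

deflater = open_compressed_stream()
# The backup header goes through the compressor like any other data,
# so the file starts with the gzip header, not with the backup magic.
out = deflater.compress(BACKUP_HEADER)
out += deflater.flush()
```

A file produced this way starts with the gzip magic bytes, and the
backup header only becomes visible after decompression.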

Closing output stream
---------------------
- flush compression output buffer. This needs to be done after
bstream_close() as it flushes backup output buffer.
- deallocate compression output buffer.
- call deflateEnd().

Writing compressed stream
-------------------------
If compression is requested, it will be performed using deflate() in stream_write().
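
The ordering constraint in the closing and writing steps above can be
sketched as follows. This is a simplified Python model, not the
Output_stream class itself: the OutputStream class, its BLOCK size, and
the in-memory "image" are all assumptions made for the example.

```python
import zlib

class OutputStream:
    """Toy model: a buffered backup stream on top of an optional deflater."""

    BLOCK = 4096  # hypothetical size of the backup stream's own buffer

    def __init__(self, compress: bool):
        self.image = bytearray()      # stands in for the image file
        self._pending = bytearray()   # stands in for the backup output buffer
        self._deflater = (
            zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                             zlib.MAX_WBITS + 16)
            if compress else None)

    def write(self, data: bytes) -> None:
        """Buffered write; full blocks are handed to _stream_write()."""
        self._pending += data
        while len(self._pending) >= self.BLOCK:
            self._stream_write(bytes(self._pending[:self.BLOCK]))
            del self._pending[:self.BLOCK]

    def _stream_write(self, block: bytes) -> None:
        """Compress the block if requested, then append it to the image."""
        if self._deflater is not None:
            block = self._deflater.compress(block)
        self.image += block

    def close(self) -> None:
        # 1. Flush the backup buffer first; it may still hold data
        #    that has not reached the compressor.
        if self._pending:
            self._stream_write(bytes(self._pending))
            self._pending.clear()
        # 2. Only then finish the deflate stream, which emits the
        #    remaining compressed data and the gzip trailer.
        if self._deflater is not None:
            self.image += self._deflater.flush(zlib.Z_FINISH)
            self._deflater = None

data = bytes(range(256)) * 100
stream = OutputStream(compress=True)
stream.write(data[:10000])
stream.write(data[10000:])
stream.close()
```

If the two steps in close() were swapped, the data still sitting in the
backup buffer would arrive after the gzip trailer had been written.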

Opening input stream
--------------------
- read 10 bytes from the stream.
- check if first 3 bytes match gzip header. If they do not match continue
initialization as usual.
- allocate compression input buffer.
- initialize zstream structure.
- copy 10 bytes from header buffer to compression input buffer, so next call to
inflate() can skip gzip header.
- call inflateInit2() with windowBits parameter set to MAX_WBITS + 16, so it
informs inflate() that gzip header is present and must be skipped.
- read backup header using stream_read(), so backup header is decompressed.
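
A minimal Python sketch of this probe-and-initialize sequence follows.
The names open_input and read_all are hypothetical, and Python's
decompressobj() plays the role of inflateInit2(). The point of the
example is that the 10 probed bytes are not thrown away: they are fed
to the decompressor, or returned unchanged when the image is not
compressed.

```python
import io
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # ID1, ID2 and CM=deflate
PROBE_SIZE = 10               # fixed part of a gzip header

def open_input(f):
    """Return (inflater or None, bytes already decoded from the probe)."""
    probe = f.read(PROBE_SIZE)
    if probe[:3] != GZIP_MAGIC:
        return None, probe  # uncompressed image: continue as usual
    # MAX_WBITS + 16 tells zlib to expect and skip a gzip header.
    inflater = zlib.decompressobj(zlib.MAX_WBITS + 16)
    # Hand the probed bytes to the decompressor so it sees the header.
    return inflater, inflater.decompress(probe)

def read_all(f) -> bytes:
    """Read a whole image, decompressing it if it is gzip-framed."""
    inflater, head = open_input(f)
    rest = f.read()
    if inflater is None:
        return head + rest
    return head + inflater.decompress(rest) + inflater.flush()

plain = b"backup header + table data " * 200
deflater = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                            zlib.MAX_WBITS + 16)
packed = deflater.compress(plain) + deflater.flush()
```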

Closing input stream
--------------------
- deallocate compression input buffer.
- call inflateEnd().

Reading compressed stream
-------------------------
Decompression will be performed using inflate() in stream_read().
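
One way to picture a stream_read() that returns a bounded number of
decompressed bytes is the Python sketch below. It is an assumption about
the shape of such a function, not the actual implementation: the
parameter names and the 4096-byte chunk size are invented, and Python's
max_length argument stands in for the size of the caller's buffer.

```python
import io
import zlib

def stream_read(inflater, source, want: int, chunk: int = 4096) -> bytes:
    """Return up to `want` decompressed bytes, reading `source` as needed."""
    out = b""
    while len(out) < want and not inflater.eof:
        # Reuse compressed input left over from an earlier call before
        # reading more from the file.
        data = inflater.unconsumed_tail or source.read(chunk)
        piece = inflater.decompress(data, want - len(out))
        if not data and not piece:
            break  # input ended before the end of the gzip stream
        out += piece
    return out

plain = bytes(range(256)) * 300
deflater = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                            zlib.MAX_WBITS + 16)
packed = deflater.compress(plain) + deflater.flush()

inflater = zlib.decompressobj(zlib.MAX_WBITS + 16)
source = io.BytesIO(packed)
first = stream_read(inflater, source, 100)        # e.g. a fixed-size header
rest = stream_read(inflater, source, len(plain))  # everything that remains
```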