WL#12039: Add support for zstd compression to classic protocol in the server
SUMMARY
The goal of this worklog is to add support for specifying the zstd compression algorithm for network communication between client and server. The worklog depends on WL#12475, which adds the user interface that clients (replication and classic clients) need to specify compression algorithms in their interaction with the server.
USER STORIES
As an application developer against MySQL, I want to be able to specify in my client application the zstd compression algorithm in my connection, so that I am able to minimize the amount of bytes transferred over my limited bandwidth network.
As an application developer against MySQL, I want to be able to specify in my client application the zstd compression algorithm in my connection, so that I do not saturate the server NIC by transferring a smaller amount of bytes at the cost of a little higher CPU usage.
As a MySQL DBA, I want to set up a specific slave [channel] to use zstd compression in its client connection to the master, so that I am able to minimize the network bandwidth used to transfer binary logs from the master to the slave.
As a MySQL DBA, I want to set up a slave [channel] to use zstd compression in its client connection to the master, so that I do not saturate the server NIC, transferring a smaller amount of bytes at the cost of slightly higher CPU usage.
As an application developer against MySQL, I want to be able to specify in my client application a compression level for the zstd algorithm. This enables me to trade off between CPU usage and network latency based on requirements.
As a MySQL DBA, I want to associate a compression level with the zstd compression algorithm for a slave [channel]. This shall allow me to trade off between CPU usage and network latency based on requirements.
SCOPE
The scope of work involves the following:
Add a build option that specifies whether zstd is bundled with mysqld or mysqld is linked with the system-provided library. This is in line with the existing compression libraries added to the mysqld server (lz4, zlib).
Add stripped down version of the zstd library (providing the minimal required APIs for compression and decompression) in the server.
Add a generic compression API to the vio layer. This generic API will present a unified interface to the server and client code. It will call the underlying library API based on the compression algorithm specified.
Make the necessary code changes in the server and client, and add test cases demonstrating the use of zstd compression when a connection is made by a replication client or a classic client.
REFERENCES
- BUG#88567 Add zstd compression option for replication.
FR#1 The client application developer or the client SHALL be able to specify zstd compression algorithm (and an optional compression level) when connecting to mysql server.
FR#2 The replication slave client SHALL be able to specify zstd compression algorithm (and an optional zstd compression level) based on new syntax change added to CHANGE MASTER TO SQL by WL#12475.
FR#3 The classic protocol and related clients (like mysql, mysqltest) SHALL be able to specify zstd compression algorithm and an optional zstd compression level.
FR#4 A developer building the mysql source SHALL have a new cmake option WITH_ZSTD={bundled|system}.
FR#5 By default the server SHALL have zstd bundled with it.
FR#6 If user wishes to use the system zstd, the mysql source SHALL be built with WITH_ZSTD=system.
FR#7 If no zstd compression level is specified, the zstd compression level by default SHALL be 3.
Non-Functional Requirements
NF#1 With compression enabled, performance is expected to decrease because of the extra CPU work. This worklog benefits deployments where network bandwidth is low, result sets are large, and the network transfer time dominates the cost of compression and decompression. See the use cases in the HLD for the user scenarios.
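The trade-off in NF#1 can be made concrete with a rough cost model. The sketch below is illustrative only and is not part of the worklog: the LinkModel struct, its fields, and the assumption that compression, transfer, and decompression run one after another without overlap are all hypothetical, and every rate would have to be measured on the target system.

```cpp
#include <cassert>

// Hypothetical cost model (not from the worklog). All rates are in bytes
// per second and must be measured on the system in question.
struct LinkModel {
  double bandwidth;        // network throughput
  double compress_rate;    // compressor throughput over uncompressed input
  double decompress_rate;  // decompressor throughput over uncompressed output
  double ratio;            // compressed size / original size, in (0, 1]
};

// Seconds needed to send `bytes` uncompressed.
double plain_seconds(const LinkModel &m, double bytes) {
  assert(m.bandwidth > 0);
  return bytes / m.bandwidth;
}

// Seconds needed to compress, send, and decompress `bytes`, assuming the
// three stages run strictly one after another (no pipelining).
double compressed_seconds(const LinkModel &m, double bytes) {
  assert(m.bandwidth > 0 && m.compress_rate > 0 && m.decompress_rate > 0);
  return bytes / m.compress_rate + (bytes * m.ratio) / m.bandwidth +
         bytes / m.decompress_rate;
}

// True when the compressed path is faster end to end under this model.
bool compression_pays_off(const LinkModel &m, double bytes) {
  return compressed_seconds(m, bytes) < plain_seconds(m, bytes);
}
```

With made-up numbers, a 100 MB result set over a 1 MB/s link favors compression under this model (the transfer term dominates), while the same result set over a 1 GB/s link does not (the CPU terms dominate).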
Contents
- SUMMARY OF THE APPROACH
- SECURITY CONTEXT
- UPGRADE/DOWNGRADE AND CROSS-VERSION REPLICATION
- USER INTERFACE
- OBSERVABILITY
- DEPLOYMENT AND INSTALLATION
- PROTOCOL
- FAILURE MODEL SPECIFICATION
SUMMARY OF THE APPROACH
This WL makes three major changes: it adds a mysys compression API, a stripped-down version of the zstd library, and the cmake changes. The necessary changes are made in the client, server and replication client code to ensure that zstd compression is used when the zstd compression algorithm is specified by a client or replication slave in its connection to the server. A summary of the major changes is presented below:
mysys Library Changes (Compression APIs)
The mysys compression APIs shall be refactored. They already provide a generic interface to other components, and must now accommodate other compression methods. my_compress and my_uncompress shall be modified to take a compression context structure. The compression context shall contain the compression method enum value and a union of method-specific contexts (each holding the context information for one compression method). This makes the API generic. Ad hoc APIs are also added to check whether a compression method is supported, to convert a compression method name to the corresponding enum value, to allocate a compression context (based on the method and level arguments), and to deallocate a context structure. The zstd and zlib compression/decompression code lives in separate files, which keeps the interface and the implementation clean and modular.
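The tagged-context idea described above can be sketched as follows. This is a minimal illustration, not the worklog's code: everything in the sketch namespace (Method, ZlibState, ZstdState, Context, Choice, choose_backend) is a hypothetical stand-in, and the real per-method routines would compress data rather than just report which backend was selected.

```cpp
#include <cassert>

namespace sketch {

enum class Method { NONE = 0, ZLIB = 1, ZSTD = 2 };

// Method-specific state; a real context may hold more than a level.
struct ZlibState { unsigned int level; };
struct ZstdState { unsigned int level; };

// Generic context: a method tag plus a union of method-specific state.
struct Context {
  Method method;
  union {
    ZlibState zlib;
    ZstdState zstd;
  } u;
};

// What the dispatcher decided: which backend to call and at what level.
struct Choice {
  Method backend;
  unsigned int level;
};

// Single entry point: callers pass the context and never name a library.
Choice choose_backend(const Context *ctx) {
  assert(ctx != nullptr);
  switch (ctx->method) {
    case Method::ZLIB:
      return {Method::ZLIB, ctx->u.zlib.level};
    case Method::ZSTD:
      return {Method::ZSTD, ctx->u.zstd.level};
    case Method::NONE:
      break;
  }
  return {Method::NONE, 0};  // compression disabled
}

}  // namespace sketch
```

The design point is that only the dispatcher reads the method tag, so supporting another method means adding one union member and one case, while callers stay unchanged.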
CMAKE Build Changes
A new source build option WITH_ZSTD is introduced. It can take two values: bundled (use the zstd bundled with the distribution) or system (use the system zstd library).
SECURITY CONTEXT
Currently, zlib-compressed connections at the default compression level are allowed for all clients (via the compress option, or the mysql option MYSQL_OPT_COMPRESS). zstd-compressed connections shall likewise be allowed for all clients at the default compression level. The limits on packet buffer sizes, result set sizes and the number of connections, together with other controls, ensure that the server keeps functioning at an optimal level even when the number of compressed connections is high. Clients cannot specify a compression_level, for reasons of simplicity and security; this may be considered in future extensions. Compression options are specified at connect time and cannot be changed on the fly during the connection. Any change requires a reconnection.
In the case of slave connections, a compression level can be specified at the slave with the CHANGE MASTER TO command. CHANGE MASTER TO is a privileged command and requires the REPLICATION_SLAVE_ADMIN or SUPER privilege. If no compression level is specified, the slave connection shall use the default compression level associated with the corresponding compression method.
UPGRADE/DOWNGRADE AND CROSS-VERSION REPLICATION
The WL doesn't bring any changes to the downgrade/upgrade process, because it neither changes nor removes any existing behavior. Similarly, there is no problem with respect to rolling upgrades. Note, however, that the feature applies only when the user explicitly specifies that compression is to be used. Thus a reconnection is required to turn the feature on. In the case of replication, this involves briefly interrupting replication to restart the receiver thread. Note also that if an 8.0 slave server (with a compression library specified) or a client connects to a 5.7 server, the client or the 8.0 server shall return the error CR_COMPRESSION_SPECIFICATION_UNSUPPORTED and terminate the connection.
USER INTERFACE
Please refer to WL#12475 for the user interface specification of client compression algorithms and levels.
Monitoring:
See WL#12475.
OBSERVABILITY
See WL#12475.
DEPLOYMENT AND INSTALLATION
Server release builds shall have the WITH_ZSTD=bundled option enabled; with this option the server links statically with libzstd.a. Hence no specific changes to the installation scripts are needed. Developers can also build with WITH_ZSTD=system. With this option, the build system shall automatically detect the zstd headers available on the system and dynamically link mysqld with the library libzstd.so.
PROTOCOL
N/A
FAILURE MODEL SPECIFICATION
N/A
Compression API
The mysys library (mysql system library) contains APIs relating to zlib compression. With this WL, we add support for zstd compression. The APIs are made generic so that they present an abstract interface for dealing with the different kinds of compression. Accordingly, the following types are added:
/**
  Enumeration of what kind of compression to use. We currently support
  ZLIB and ZSTD compression for network data transfers.
*/
enum mysql_compression_method {
  /**
    No Compression represented by string "none". Even if compress flag is
    set, the compression is not turned on.
  */
  MYSQL_COMPRESSION_NONE = 0,
  /** ZLIB Compression represented by string "zlib". */
  MYSQL_COMPRESSION_ZLIB = 1,
  /** ZSTD Compression represented by string "zstd". */
  MYSQL_COMPRESSION_ZSTD = 2
};

/** Compress context information relating to zlib compression. */
typedef struct mysql_zlib_compress_context {
  /** Compression level to use in zlib compression. */
  unsigned int compression_level;
} mysql_zlib_compress_context;

typedef struct ZSTD_CCtx_s ZSTD_CCtx;
typedef struct ZSTD_DCtx_s ZSTD_DCtx;

/** Compress context information relating to zstd compression. */
typedef struct mysql_zstd_compress_context {
  /** Pointer to compressor context. */
  ZSTD_CCtx *cctx;
  /** Pointer to decompressor context. */
  ZSTD_DCtx *dctx;
  /** Compression level to use in zstd compression. */
  unsigned int compression_level;
} mysql_zstd_compress_context;

/**
  Compression context information. It encapsulates the context information
  based on compression method and presents a generic struct.
*/
typedef struct mysql_compress_context {
  enum mysql_compression_method method;  ///< Compression method.
  union {
    mysql_zlib_compress_context zlib_ctx;  ///< Context information of zlib.
    mysql_zstd_compress_context zstd_ctx;  ///< Context information of zstd.
  } u;
} mysql_compress_context;
APIs Provided:
/**
  Check if a given compression method is supported.

  @param method  Compression method.

  @return true if the method is supported, else false.
*/
bool mysql_is_compression_method_supported(
    enum mysql_compression_method method);

/**
  Compression method enum value corresponding to a given name.

  @param name  Name of the compression method. The value can be "none",
               "zstd" or "zlib".

  @return enum value corresponding to the given compression method.
*/
enum mysql_compression_method mysql_compression_method_enum_val(
    const char *name);

/**
  Construct the compression context to be associated with a NET object.

  @param method             Compression method.
  @param compression_level  Compression level to be associated with the
                            compression method.

  @return pointer to the compression context if successful, else a null
          pointer.
*/
mysql_compress_context *mysql_compress_context_alloc(
    enum mysql_compression_method method, unsigned int compression_level);

/**
  Deallocate the compression context allocated.

  @param compress_context  Pointer to the compression context.
*/
void mysql_compress_context_dealloc(mysql_compress_context *compress_context);

/**
  Get the default compression level corresponding to a given compression
  method.

  @param method  Compression method. Possible values are zlib or zstd.

  @return an unsigned int representing the default compression level.
          6 is the default compression level for zlib and 3 is the default
          compression level for zstd.
*/
uint mysql_default_compression_level(enum mysql_compression_method method);

/**
  This replaces the packet with a compressed packet.

  @param comp_ctx  Compression context info containing the compression
                   method and the context relating to that method.
  @param packet    Data to compress. This is replaced with the compressed
                   data.
  @param len       Length of data to compress at 'packet'.
  @param complen   [out] 0 if the packet was not compressed.

  @return true on error (len is not changed), else false (len contains the
          size of the compressed packet).
*/
bool my_compress(mysql_compress_context *comp_ctx, uchar *packet, size_t *len,
                 size_t *complen);

/**
  Allocate zstd compression contexts if necessary and compress the buffer
  using zstd.

  @param comp_ctx  Compression context info relating to zstd.
  @param packet    Data to compress.
  @param len       Length of data to compress at 'packet'.
  @param complen   [out] 0 if the packet was not compressed.

  @return nullptr on error (len is not changed), else a pointer to the
          buffer holding the compressed data (len contains the size of the
          compressed packet).
*/
uchar *zstd_compress_alloc(mysql_zstd_compress_context *comp_ctx,
                           const uchar *packet, size_t *len, size_t *complen);

/**
  Uncompress zstd compressed data.

  @param comp_ctx  Pointer to the compression context.
  @param packet    Packet which contains zstd compressed data.
  @param len       Length of the zstd compressed packet.
  @param complen   [out] Length of the uncompressed packet.

  @return true on error, else false.
*/
bool zstd_uncompress(mysql_zstd_compress_context *comp_ctx, uchar *packet,
                     size_t len, size_t *complen);

/**
  Uncompress a packet.

  @param comp_ctx  Pointer to the compression context.
  @param packet    Compressed data. This is replaced with the original data.
  @param len       Length of the compressed data.
  @param complen   [out] Length of the packet buffer after uncompression
                   (must be enough for the original data).

  @return true on error, else false on success.
*/
bool my_uncompress(mysql_compress_context *comp_ctx, uchar *packet, size_t len,
                   size_t *complen);
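As an illustration of the two lookup helpers declared above, the following sketch shows one way their bodies could look. The function and enum names mirror the declarations, but the bodies are assumptions and are kept in a sketch namespace for that reason. In particular, the worklog does not say how unknown or null names are handled, whether matching is case sensitive, or what level is reported for "none"; the choices below (fall back to MYSQL_COMPRESSION_NONE, case-sensitive match, level 0) are guesses. Only the defaults of 6 for zlib and 3 for zstd come from the worklog.

```cpp
#include <cassert>
#include <cstring>

namespace sketch {

enum mysql_compression_method {
  MYSQL_COMPRESSION_NONE = 0,
  MYSQL_COMPRESSION_ZLIB = 1,
  MYSQL_COMPRESSION_ZSTD = 2
};

// Map a method name to its enum value.
// Assumption: null or unrecognized names map to MYSQL_COMPRESSION_NONE,
// and the comparison is case sensitive.
mysql_compression_method mysql_compression_method_enum_val(const char *name) {
  if (name == nullptr) return MYSQL_COMPRESSION_NONE;
  if (std::strcmp(name, "zlib") == 0) return MYSQL_COMPRESSION_ZLIB;
  if (std::strcmp(name, "zstd") == 0) return MYSQL_COMPRESSION_ZSTD;
  return MYSQL_COMPRESSION_NONE;
}

// Default level per method: 6 for zlib and 3 for zstd, as stated in the
// worklog. Assumption: 0 is reported when compression is off.
unsigned int mysql_default_compression_level(mysql_compression_method method) {
  switch (method) {
    case MYSQL_COMPRESSION_ZLIB:
      return 6;
    case MYSQL_COMPRESSION_ZSTD:
      return 3;
    case MYSQL_COMPRESSION_NONE:
      break;
  }
  assert(method == MYSQL_COMPRESSION_NONE);  // only "none" reaches this point
  return 0;
}

}  // namespace sketch
```

A caller would typically chain the two, for example looking up the enum value for "zstd" and then asking for its default level when the user supplied none.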
Cmake related Changes
A new cmake file containing the changes required to 'bundle' the zstd library is added. (The Facebook Inc. patch doesn't have the bundled value for the WITH_ZSTD build option; to maintain consistency with the WITH_(ZLIB|LZMA|LZ4) options that we have in MySQL, we allow both bundled and system.)
A new zstd.cmake file is added to the cmake directory. It defines several useful cmake macros, such as FIND_SYSTEM_ZSTD (which finds the library and include paths for zstd when the system option is used) and MYSQL_USE_BUNDLED_ZSTD (which sets the library and include paths for the bundled option). The compile definitions ZSTD and HAVE_ZSTD_COMPRESS are available when zstd is made available as part of the build. The necessary changes are made in the existing cmake files to link mysqld with zstd.
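A compile definition such as HAVE_ZSTD_COMPRESS is typically consumed with a preprocessor guard. The sketch below shows that pattern under stated assumptions: the function method_compiled_in is hypothetical, and treating zlib as always available is an assumption made for the example, not something the worklog states.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reports whether support for a named compression
// method was compiled into this binary.
bool method_compiled_in(const char *name) {
  assert(name != nullptr);
  // Assumption for this sketch: zlib support is always built in.
  if (std::strcmp(name, "zlib") == 0) return true;
#ifdef HAVE_ZSTD_COMPRESS
  // Present only when the build made zstd available.
  if (std::strcmp(name, "zstd") == 0) return true;
#endif
  return false;
}
```

Built without the definition, the zstd branch is compiled out entirely, so a request for "zstd" is reported as unavailable rather than failing at link time.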