MySQL 8.0.39
Source Code Documentation
Client and Server implementations of the protocol should make use of the following:
to reduce the latency and CPU usage.
The client should decode the messages it receives from the server in a generic way and track the possible messages with a state-machine.
The client may send several messages to the server without waiting for a response for each message.
Instead of waiting for the response to a message like in:
the client can generate its messages and send them to the server without waiting:
When pipelining messages, the client has to ensure that, in case of an error, the following messages also error out correctly:
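The error rule for pipelined messages can be pictured with a toy model. This is a minimal sketch, not the protocol's actual mechanism: `run_pipeline`, `execute`, and the message names are hypothetical, and the batch is processed in memory instead of over a socket. It only shows the intended outcome: once one message in the batch fails, every later message is answered with an error instead of being executed.

```python
def run_pipeline(messages, execute):
    """Process a batch of pipelined messages in order.

    `execute` is a hypothetical callable that raises on failure. After the
    first failure, the remaining messages are not executed; each one gets
    an error response so the client still sees one response per message.
    """
    responses = []
    failed = False
    for msg in messages:
        if failed:
            responses.append(("ERROR", "skipped after earlier failure"))
            continue
        try:
            responses.append(("OK", execute(msg)))
        except Exception as exc:
            failed = True
            responses.append(("ERROR", str(exc)))
    return responses


def execute(msg):
    # Hypothetical executor: one message is made to fail for the demo.
    if msg == "insert-duplicate":
        raise ValueError("duplicate key")
    return msg + " done"


batch = ["begin", "insert-duplicate", "commit"]
responses = run_pipeline(batch, execute)
for msg, response in zip(batch, responses):
    print(msg, "->", response)
```

The point of the model is that "commit" is never executed: running it after a failed insert would be exactly the kind of incorrect follow-up the rule is meant to prevent.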
In network programming it is common to prefix the message payload with a header:
concat before send: leads to wasteful reallocations and copy operations if the payload is huge.
multiple syscalls: wasteful for small messages, because the whole machinery of copying data between user land and kernel land has to be started for only a few bytes.
vectored I/O: combines the best of both approaches by sending multiple buffers to the OS in one syscall, which lets the OS optimize sending multiple buffers in one TCP packet.
On Unix this is handled by writev(2); on Windows, WSASend() exists.
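As an illustration of vectored I/O, the sketch below sends a header and a payload with one `writev` call, using Python's `os.writev` (Unix only). The 4-byte little-endian length header and the helper names `send_framed` and `recv_exact` are assumptions made for the example, and a local socket pair stands in for a real connection.

```python
import os
import socket
import struct


def send_framed(sock, payload):
    """Send a length header and the payload with a single writev() call."""
    # Assumed framing for this sketch: 4-byte little-endian payload length.
    header = struct.pack("<I", len(payload))
    # Both buffers are handed to the kernel in one syscall, without first
    # concatenating them in user space. A short write is possible on a real
    # socket; a full client would loop until everything is written.
    return os.writev(sock.fileno(), [header, payload])


def recv_exact(sock, size):
    """Read until `size` bytes have arrived or the peer closes."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return bytes(data)


left, right = socket.socketpair()
payload = b"hello"
sent = send_framed(left, payload)
received = recv_exact(right, sent)
left.close()
right.close()
print(sent, received)
```

The payload buffer is never copied into a larger header-plus-payload buffer, which is the saving over "concat before send", and only one syscall is made, which is the saving over "multiple syscalls".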
Known good implementation:
Further control over how and when data is actually sent to the other endpoint can be achieved with "corking":
TCP_CORK: http://linux.die.net/man/7/tcp
TCP_NOPUSH: https://www.freebsd.org/cgi/man.cgi?query=tcp&sektion=4&manpath=FreeBSD+9.0-RELEASE
They work in combination with TCP_NODELAY (the option that disables Nagle's algorithm).
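A minimal sketch of corking on Linux follows. It assumes a TCP connection over the loopback interface; `send_corked` is a hypothetical helper, and the example frame bytes are made up. `socket.TCP_CORK` is only defined on Linux, so the sketch falls back to plain sends elsewhere.

```python
import socket


def send_corked(sock, parts):
    """Send several buffers while the socket is corked, then uncork."""
    cork = getattr(socket, "TCP_CORK", None)  # Linux only
    if cork is not None:
        # While corked, the kernel holds back partial frames.
        sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        for part in parts:
            sock.sendall(part)
    finally:
        if cork is not None:
            # Clearing the option flushes whatever is still queued.
            sock.setsockopt(socket.IPPROTO_TCP, cork, 0)


# Loopback connection used only for the demonstration.
listener = socket.create_server(("127.0.0.1", 0))
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

send_corked(client, [b"\x05\x00\x00\x00", b"hello"])
client.shutdown(socket.SHUT_WR)

corked_data = bytearray()
while True:
    chunk = server_side.recv(64)
    if not chunk:
        break
    corked_data += chunk

for s in (client, server_side, listener):
    s.close()
print(bytes(corked_data))
```

The receiver sees the same byte stream either way; corking only influences how the bytes are grouped into packets on the wire.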
The protocol is structured in a way that the messages can be decoded completely without knowing the state of the message sequence.
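To make the idea of state-independent decoding concrete, here is a minimal sketch of a frame splitter. The frame layout (a 4-byte little-endian length that covers a 1-byte message type plus the payload) is an assumption for illustration, and `encode_frame` and `decode_frames` are hypothetical helpers. The decoder needs only the bytes in the buffer, not any knowledge of which message came before.

```python
import struct


def encode_frame(msg_type, payload):
    """Build one frame: length (type byte + payload), type, payload."""
    return struct.pack("<IB", len(payload) + 1, msg_type) + payload


def decode_frames(buffer):
    """Split a byte buffer into complete (type, payload) frames.

    Returns the decoded frames and the unconsumed tail, which holds an
    incomplete frame (if any) to be retried once more data arrives.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("<I", buffer, offset)
        if length < 1:
            raise ValueError("frame must contain at least the type byte")
        if len(buffer) - offset - 4 < length:
            break  # frame not fully received yet
        msg_type = buffer[offset + 4]
        payload = bytes(buffer[offset + 5:offset + 4 + length])
        frames.append((msg_type, payload))
        offset += 4 + length
    return frames, buffer[offset:]


partial = encode_frame(3, b"xyz")[:6]  # truncated third frame
stream = encode_frame(1, b"ab") + encode_frame(2, b"") + partial
frames, rest = decode_frames(stream)
print(frames, rest)
```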
If data is available on the network, the server has to:
Instead of a synchronous read-execution cycle:
the Reader and the Executor can be decoupled into separate threads:
which allows the cost of decoding a message to be hidden behind the execution of the previous message.
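The decoupling of Reader and Executor can be sketched with two threads and a bounded queue. This is an illustrative model only: `run_decoupled`, `decode`, `execute`, and `prefetch_depth` are hypothetical names, and a list of byte strings stands in for the network.

```python
import queue
import threading


def run_decoupled(raw_messages, decode, execute, prefetch_depth=2):
    """Decode in a reader thread while the caller's thread executes."""
    # The queue bound limits how many decoded messages may wait ahead of
    # the executor; the reader blocks once the queue is full.
    decoded = queue.Queue(maxsize=prefetch_depth)
    done = object()  # sentinel marking the end of the stream

    def reader():
        try:
            for raw in raw_messages:
                decoded.put(decode(raw))
        finally:
            # Always signal the end, even if decoding fails, so the
            # executor loop below cannot wait forever.
            decoded.put(done)

    thread = threading.Thread(target=reader)
    thread.start()

    results = []
    while True:
        item = decoded.get()
        if item is done:
            break
        results.append(execute(item))
    thread.join()
    return results


prefetch_results = run_decoupled(
    [b"1", b"2", b"3"],
    decode=lambda raw: int(raw.decode()),
    execute=lambda n: n * 10,
)
print(prefetch_results)
```

While the executor works on one message, the reader can already decode the next ones, up to `prefetch_depth` messages ahead.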
The number of messages that are prefetched this way should be configurable to allow a trade-off between: