15.8 Implementation Notes

Client and server implementations of the protocol should make use of the following:

  • vectored I/O

  • pipelining

to reduce latency and CPU usage.


Out-of-Band Messages

The client should decode the messages it receives from the server in a generic way and track which messages are possible at each point with a state machine.

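def getMessage(self, message):
  # decode every message generically, based only on its type
  msg = messageFactory(message.type).fromString(message.payload)

  # out-of-band messages are handled here and are not returned
  # to the caller as a regular message
  if message.type is Notification:
    raise NoMessageError()

  if message.type is Notice:
    raise NoMessageError()

  return msg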
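
As a rough, self-contained illustration of the idea above, the following sketch separates out-of-band notices from the messages that the current state expects. The type tags (`notice`, `row`, `done`), the `next_response` helper and the `on_notice` callback are hypothetical names chosen for this example; they are not part of the protocol definition.

```python
from collections import deque

NOTICE = "notice"  # hypothetical type tag, not a real protocol constant


def next_response(incoming, expected, on_notice):
    """Return the next message whose type is in `expected`.

    Notices may arrive at any point of a message sequence, so they are
    handed to `on_notice` and the loop keeps reading.
    """
    while incoming:
        msg_type, payload = incoming.popleft()
        if msg_type == NOTICE:
            on_notice(payload)  # out-of-band: handle and continue
            continue
        if msg_type not in expected:
            raise ValueError(f"unexpected message {msg_type!r}")
        return msg_type, payload
    raise LookupError("no message available")


def demo():
    notices = []
    incoming = deque([
        (NOTICE, "warning 1"),
        ("row", "r1"),
        (NOTICE, "warning 2"),
        ("done", ""),
    ])
    expected = {"row", "done"}  # what the current state allows
    first = next_response(incoming, expected, notices.append)
    second = next_response(incoming, expected, notices.append)
    return first, second, notices
```

The caller only ever sees the messages its state machine expects, while notices are delivered through a separate path.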


Pipelining

The client may send several messages to the server without waiting for a response for each message.

Instead of waiting for the response to each message, as in:

Figure 15.21 Client Pipeline

the client can generate its messages and send them to the server without waiting:

Figure 15.22 Client Pipeline

When pipelining messages, the client has to ensure that, if one message fails, the messages following it also error out correctly:

Figure 15.23 Client Pipeline
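
A minimal sketch of client-side pipelining, under several assumptions: the framing (4-byte little-endian length, then a type byte, then the payload), the `OK`/`ERROR` response types, the request type `2` and the toy `fake_server` are all made up for this example. The toy server fails every request after the first failing one; how a real server is instructed to behave this way is outside the scope of the sketch.

```python
import socket
import struct
import threading

OK, ERROR = 0, 1  # hypothetical response types


def frame(msg_type, payload):
    # assumed framing: length covers the type byte plus the payload
    return struct.pack("<IB", len(payload) + 1, msg_type) + payload


def read_exact(sock, n):
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def read_frame(sock):
    (length,) = struct.unpack("<I", read_exact(sock, 4))
    body = read_exact(sock, length)
    return body[0], body[1:]


def fake_server(sock, count):
    """Toy server: once one request fails, all later ones fail too."""
    failed = False
    for _ in range(count):
        _, payload = read_frame(sock)
        failed = failed or payload == b"bad"
        sock.sendall(frame(ERROR if failed else OK, payload))


def pipeline(sock, payloads):
    # send all requests first, without waiting for any response ...
    sock.sendall(b"".join(frame(2, p) for p in payloads))
    # ... then collect exactly one response per request, in order
    return [read_frame(sock) for _ in payloads]


def demo():
    client, server = socket.socketpair()
    payloads = [b"a", b"bad", b"c"]
    worker = threading.Thread(target=fake_server, args=(server, len(payloads)))
    with client, server:
        worker.start()
        responses = pipeline(client, payloads)
        worker.join()
    return responses
```

Because responses arrive in request order, the client can match each response to its request by position and treat everything after the first error as failed.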

Vectored I/O

In network programming it is common to prefix a message payload with a header:

  • HTTP header + HTTP content

  • a pipeline of messages

  • message header + protobuf message

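import struct
import socket

s = socket.create_connection(("", 33060))

msg_type = 1
msg_payload = b"abc"  # bytes, so that it can be concatenated with the header
# header: 4-byte little-endian length (type byte + payload), then the type byte
msg_header = (struct.pack("<I", len(msg_payload) + 1) +
              struct.pack("B", msg_type))

## concat before send
s.send(msg_header + msg_payload)

## multiple syscalls
s.send(msg_header)
s.send(msg_payload)

## vectored I/O
s.sendmsg([msg_header, msg_payload])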

Concatenating before sending leads to wasteful reallocations and copy operations if the payload is large.

Using multiple syscalls is wasteful for small messages, because the whole machinery of copying data between user space and kernel space is started for only a few bytes.

Vectored I/O combines the best of both approaches: it passes multiple buffers to the OS in one syscall, and the OS can optimize by sending them in one TCP packet.

On Unix this is handled by writev(2); on Windows, WSASend() serves the same purpose.
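
The following sketch shows writev(2) from Python through `os.writev`, which is available on Unix only. It writes a header and a payload from two separate buffers with one syscall, using a pipe so that it runs without a server. The `send_vectored` helper and the header layout (little-endian length followed by a type byte) are assumptions made for this example.

```python
import os
import struct


def send_vectored(fd, msg_type, payload):
    """Write header and payload with a single writev(2) call."""
    header = struct.pack("<IB", len(payload) + 1, msg_type)
    # no concatenation: the kernel gathers both buffers itself
    return os.writev(fd, [header, payload])


def demo():
    read_fd, write_fd = os.pipe()
    try:
        written = send_vectored(write_fd, 1, b"abc")
        return written, os.read(read_fd, 64)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Like write(2), writev(2) may write fewer bytes than requested on a socket, so production code has to check the return value and resend the remainder.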


Any good buffered iostream implementation should already make use of vectored I/O.

Known good implementations:

  • Boost.Asio

  • GIO's GBufferedIOStream


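Further control over when data is actually sent to the other endpoint can be achieved with "corking", for example with the TCP_CORK socket option on Linux or TCP_NOPUSH on FreeBSD and macOS.

These options work in combination with TCP_NODELAY, the socket option that disables Nagle's algorithm.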
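
A hedged sketch of corking from Python: on Linux the `socket` module exposes `TCP_CORK`, and the hypothetical `corked` context manager below sets it while several small writes are queued and clears it afterwards so the queued data is flushed. On platforms where `socket.TCP_CORK` is not defined, the helper does nothing. The loopback connection in `demo` exists only to make the example runnable.

```python
import contextlib
import socket


@contextlib.contextmanager
def corked(sock):
    """Hold back partial frames while the block runs (Linux only)."""
    cork = getattr(socket, "TCP_CORK", None)  # not defined on every platform
    if cork is not None:
        sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        yield sock
    finally:
        if cork is not None:
            # uncork: queued data is sent now
            sock.setsockopt(socket.IPPROTO_TCP, cork, 0)


def demo():
    with socket.create_server(("127.0.0.1", 0)) as listener:
        client = socket.create_connection(listener.getsockname())
        conn, _ = listener.accept()
        with client, conn:
            with corked(client):
                client.sendall(b"\x04\x00\x00\x00\x01")  # header
                client.sendall(b"abc")                    # payload
            received = b""
            while len(received) < 8:
                chunk = conn.recv(64)
                if not chunk:
                    break
                received += chunk
            return received
```

Corking does not change what the peer receives, only how the bytes are grouped into packets on the wire.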



The protocol is structured so that messages can be decoded completely without knowing the state of the message sequence.

If data is available on the network, the server has to:

  • read the message

  • decode the message

  • execute the message

Instead of a synchronous read-execution cycle:

Figure 15.24 Server Pipeline

the Reader and the Executor can be decoupled into separate threads:

Figure 15.25 Separate Threads

which hides the cost of decoding a message behind the execution of the previous message.

The number of messages that are prefetched this way should be configurable to allow a trade-off between:

  • resource usage

  • parallelism
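
The decoupled reader and executor described above can be sketched with a bounded queue, where the queue size plays the role of the configurable prefetch limit. This is a simplified model rather than the server's actual design: the `serve` function and its `decode`, `execute` and `prefetch` parameters are hypothetical, and the "network" is just an iterable of raw messages.

```python
import queue
import threading


def serve(raw_messages, decode, execute, prefetch=4):
    """Decode in a reader thread while the caller's thread executes."""
    pending = queue.Queue(maxsize=prefetch)  # bounds resource usage
    done = object()                          # end-of-stream marker

    def reader():
        for raw in raw_messages:
            # blocks once `prefetch` decoded messages are waiting
            pending.put(decode(raw))
        pending.put(done)

    worker = threading.Thread(target=reader)
    worker.start()

    results = []
    while True:
        msg = pending.get()
        if msg is done:
            break
        results.append(execute(msg))  # reader decodes ahead meanwhile
    worker.join()
    return results


def demo():
    return serve(
        [b"1", b"2", b"3"],
        decode=lambda raw: int(raw.decode()),
        execute=lambda n: n * 2,
        prefetch=1,
    )
```

A larger `prefetch` lets the reader run further ahead of the executor at the cost of holding more decoded messages in memory; a value of 1 keeps memory use minimal while still overlapping one decode with one execution.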