This section describes the general properties of events as byte sequences as they are written to binary or relay log files.
All events have a common general structure consisting of an event header followed by event data:
+===================+
| event header      |
+===================+
| event data        |
+===================+
The details about what goes in the header and data parts have changed over time, which gives rise to different versions of the binary log format:
v1: Used in MySQL 3.23
v3: Used in MySQL 4.0.2 through 4.1
v4: Used in MySQL 5.0 and up
A v2 format was used briefly (in early MySQL 4.0.x versions), but it is obsolete and no longer supported.
Some details of event structure are invariant across binary log versions; others depend on the version. Within any given version, different types of events vary in the structure of the data part.
The first event in a log file is special. It is a descriptor event that provides information such as the binary log version and the server version. The information in the descriptor event enables programs to determine which version of the binary log format applies to the file so that the remaining events in the file can be properly read and interpreted.
For details about the initial descriptor event and how to use it to determine the format of a binary log file, see Binary Log Versions. For additional information about other types of events, see Event Data for Specific Event Types.
The following event diagrams contain field descriptions written using these conventions:
A field line has a name describing the contents of the field.
The name is followed by two numbers in offset : length format, where offset is the 0-based offset (position) of the field within the event and length is the length of the field. Both values are given in bytes.
The overall structure for events in the different binary log versions is shown here. The following sections describe the header and data parts in more detail.
v1 event structure:
+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
+=====================================+
| event  | fixed part       13 : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+
header length = 13 bytes
data length = (event_length - 13) bytes
y is specific to the event type.
v3 event structure:
+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
+=====================================+
| event  | fixed part       19 : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+
header length = 19 bytes
data length = (event_length - 19) bytes
y is specific to the event type.
v4 event structure:
+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
|        +----------------------------+
|        | extra_headers    19 : x-19 |
+=====================================+
| event  | fixed part        x : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+
header length = x bytes
data length = (event_length - x) bytes
fixed data length = y bytes
variable data length = (event_length - (x + y)) bytes
x is given by the header_length field in the format description event (FDE). Currently, x is 19, so the extra_headers field is empty.
y is specific to the event type, and is given by the FDE. The fixed-part length is the same for all events of a given type, but may vary for different event types.
The fixed part of the event data is sometimes referred to as the "post-header" part. The variable part is sometimes referred to as the "payload" or "body."
For information about how to use the FDE to interpret v4 events, see Binary Log Formats.
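The length formulas above translate directly into slicing arithmetic. The following Python sketch is illustrative only: the function name split_event and its parameters are hypothetical, and the caller is assumed to have already obtained x (the header length) and y (the fixed-part length for the event type) from the FDE.

```python
import struct

EVENT_LEN_OFFSET = 9  # offset of event_length within the header


def split_event(buf, header_length=19, fixed_length=0):
    """Split one event into (header, fixed part, variable part).

    header_length is x and fixed_length is y; both are assumed to have
    been read from the format description event by the caller.
    """
    # event_length is a 4-byte little-endian unsigned integer.
    (event_length,) = struct.unpack_from("<I", buf, EVENT_LEN_OFFSET)
    if event_length > len(buf):
        raise ValueError("buffer is shorter than event_length")
    if header_length + fixed_length > event_length:
        raise ValueError("header and fixed part exceed event_length")
    variable_start = header_length + fixed_length
    return (buf[:header_length],
            buf[header_length:variable_start],
            buf[variable_start:event_length])
```

With x = 19 and y = 8, a 40-byte event yields a 19-byte header, an 8-byte fixed part, and a 13-byte variable part.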
Event contents are written using these conventions:
Numbers are written in little-endian format (least significant byte first), unless otherwise indicated.
Values that represent positions or lengths are given in bytes and should be considered unsigned.
Some numbers are written as Packed Integers. The format is described later in this section.
Strings are written in varying formats:
A string may be written to a fixed-length field and null-padded (with 0x00 bytes) on the right.
A variable-length string may be preceded by a length field that indicates the length of the string.
Some variable-length strings are null-terminated; others are not. The descriptions for individual string fields indicate which is the case.
For null-terminated strings that are preceded by a length field, the length does not include the terminating null byte, unless otherwise indicated.
If there is a variable-length string at the end of an event and no length field precedes it, its length may be determined as the event length minus the length of the other fields in the event.
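The string conventions above can be illustrated with a few small readers. This is a sketch, not code from the server: the function names are hypothetical, and the length field is assumed to be a single byte, whereas real events define the width of each length field individually.

```python
def read_fixed_string(buf, offset, width):
    """Fixed-length field: drop the 0x00 padding on the right."""
    return buf[offset:offset + width].rstrip(b"\x00")


def read_length_prefixed_string(buf, offset, null_terminated=False):
    """String preceded by a length field (assumed here to be 1 byte).

    Returns (string, offset of the next field). When null_terminated is
    true, the terminating 0x00 is skipped but not counted in the length.
    """
    length = buf[offset]
    start = offset + 1
    end = start + length
    next_offset = end + 1 if null_terminated else end
    return buf[start:end], next_offset


def read_trailing_string(buf, offset, event_length):
    """Last field with no length prefix: it runs to the end of the event."""
    return buf[offset:event_length]
```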
Some events use Packed Integers, a special format for efficient representation of unsigned integers. A Packed Integer can store integers of up to 8 bytes, while smaller integers take only 1, 3, or 4 bytes. The value of the first byte determines how to read the number, according to the following table.
First byte | Format
0-250 | The first byte is the number (in the range 0-250). No additional bytes are used.
252 | Two more bytes are used. The number is in the range 251-0xffff.
253 | Three more bytes are used. The number is in the range 0x10000-0xffffff.
254 | Eight more bytes are used. The number is in the range 0x1000000-0xffffffffffffffff.
Packed Integer format derives from the "Length Coded Binary" representation used in the MySQL client/server network protocol (see Chapter 15, MySQL Client/Server Protocol). That representation allows a first byte value of 251 to represent the SQL NULL value, but 251 is apparently unused for Packed Integers in the binary log.
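A decoder for this format might look like the following Python sketch. The function name read_packed_integer is hypothetical. Rejecting the first-byte values 251 and 255 is an assumption made for the example, since the table above assigns them no meaning for Packed Integers.

```python
# Number of additional bytes that follow each special first-byte value.
EXTRA_BYTES = {252: 2, 253: 3, 254: 8}


def read_packed_integer(buf, offset=0):
    """Decode one Packed Integer; return (value, offset of the next field)."""
    first = buf[offset]
    if first <= 250:
        return first, offset + 1      # the first byte is the number itself
    if first not in EXTRA_BYTES:
        # Assumption: 251 and 255 are treated as invalid here.
        raise ValueError("unexpected first byte %d" % first)
    start = offset + 1
    end = start + EXTRA_BYTES[first]
    if end > len(buf):
        raise ValueError("truncated Packed Integer")
    # The additional bytes hold the number in little-endian order.
    return int.from_bytes(buf[start:end], "little"), end
```

The returned offset lets the caller continue with the field that follows the Packed Integer.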
Each event starts with a header of size
LOG_EVENT_HEADER_LEN. The value of this
constant is 13 in MySQL 3.23 (v1 format), and 19 in MySQL 4.0
and up (v3 format and up). The value is larger as of 4.0 because
next position and flags fields were added to the header format
then:
v1: 13 bytes: timestamp + type code + server ID + event length
v3: 19 bytes: v1 fields + next position + flags
v4: 19 bytes or more: v3 fields + possibly other information
The header for any version is a superset of the header for all earlier versions:
The first 13 bytes for v3 and v4 are the same as those for v1.
The first 19 bytes for v4 are the same as those for v3.
Because the event header in a newer binary log format starts with the header of the old formats, headers in different formats are backward compatible.
v1 event header:
+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+============================+
The 13 bytes of the v1 header are also present in the header of all subsequent binary log versions.
v3 event header:
+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+----------------------------+
| next_position    13 : 4    |
+----------------------------+
| flags            17 : 2    |
+============================+
Compared to v1, the header in v3 and up contains two additional fields, for a total of 19 bytes.
v4 event header:
+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+----------------------------+
| next_position    13 : 4    |
+----------------------------+
| flags            17 : 2    |
+----------------------------+
| extra_headers    19 : x-19 |
+============================+
The v4 format includes an extra_headers
field; this is a mechanism for adding extra fields to the header
without breaking the format. This extension mechanism is
implemented via the format description event that appears as the
first event in the file. (See
Binary Log Versions
for details.) Currently, x = 19, so the
extra_headers field is empty; thus, the v4
header is the same as the v3 header.
Note: The extra_headers field does not appear
in the FORMAT_DESCRIPTION_EVENT or
ROTATE_EVENT header.
The offsets of several fields within the event header are available as constants in log_event.h:
EVENT_TYPE_OFFSET = 4
SERVER_ID_OFFSET = 5
EVENT_LEN_OFFSET = 9
LOG_POS_OFFSET = 13
FLAGS_OFFSET = 17
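As an illustration, these offsets are enough to decode the common header from a raw byte buffer. The sketch below is in Python rather than the server's own C++. The EventHeader tuple and the parse_event_header function are hypothetical names, and the version argument is assumed to have been determined already from the descriptor event.

```python
import struct
from collections import namedtuple

# Offsets as defined in log_event.h (timestamp is at offset 0).
EVENT_TYPE_OFFSET = 4
SERVER_ID_OFFSET = 5
EVENT_LEN_OFFSET = 9
LOG_POS_OFFSET = 13
FLAGS_OFFSET = 17

EventHeader = namedtuple(
    "EventHeader",
    "timestamp type_code server_id event_length next_position flags")


def parse_event_header(buf, version=4):
    """Decode the common header fields; all numbers are little-endian."""
    (timestamp,) = struct.unpack_from("<I", buf, 0)
    type_code = buf[EVENT_TYPE_OFFSET]
    (server_id,) = struct.unpack_from("<I", buf, SERVER_ID_OFFSET)
    (event_length,) = struct.unpack_from("<I", buf, EVENT_LEN_OFFSET)
    next_position = flags = None      # these fields do not exist in v1
    if version >= 3:
        (next_position,) = struct.unpack_from("<I", buf, LOG_POS_OFFSET)
        (flags,) = struct.unpack_from("<H", buf, FLAGS_OFFSET)
    return EventHeader(timestamp, type_code, server_id,
                       event_length, next_position, flags)
```

Because the first 13 bytes are identical in every version, the same function reads a v1 header when called with version=1; only the two later fields are skipped.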
The header fields contain the following information:
timestamp
4 bytes. This is the time at which the statement began executing. It is represented as the number of seconds since 1970 (UTC), like the TIMESTAMP SQL data type.
type_code
1 byte. The type of event. 1 means
START_EVENT_V3, 2 means
QUERY_EVENT, and so forth. These numbers are
defined in the Log_event_type
enumeration in log_event.h. (See
Event Classes and
Types.)
server_id
4 bytes. The ID of the mysqld server that
originally created the event. It comes from the
server-id option that is set in the server
configuration file for the purpose of replication. The server ID
enables endless loops to be avoided when circular replication is
used (with option --log-slave-updates on).
Suppose that M1, M2, and M3 have server ID values of 1, 2, and
3, and that they are replicating in circular fashion: M1 is the
master for M2, M2 is the master for M3, and M3 is the master
for M1. The master/slave relationships look like this:
M1---->M2
 ^      |
 |      |
 +--M3<-+
A client sends an INSERT statement to M1.
This is executed on M1 and written to its binary log with an
event server ID of 1. The event is sent to M2, which executes it
and writes it to its binary log; the event is still written with
server ID 1 because that is the ID of the server that originally
created the event. The event is sent to M3, which executes it
and writes it to its binary log, still with server ID 1.
Finally, the event is sent to M1, which sees server ID = 1 and
understands this event originated from itself and therefore must
be ignored.
event_length
4 bytes. The total size of this event. This includes both the
header and data parts. Most events are less than 1000 bytes,
except when using LOAD DATA INFILE (where
events contain the loaded file, so they can be big).
next_position (not present in v1 format)
4 bytes. The position of the next event in the master's binary log. The format differs between binlogs and relay logs, and depending on the version of the server (and for relay logs, depending on the version of the master):
binlog on a v3 server: Offset to the beginning of
the event, counting from the beginning of the binlog
file. In other words, equal to the value of
tell() just before the event was
written.
The next_position is used on the slave in two cases:
for SHOW SLAVE STATUS to be able
to show coordinates of the last executed event
in the master's coordinate
system.
for START SLAVE UNTIL MASTER_LOG_FILE=x,
MASTER_LOG_POS=y, so that the master's
coordinates can be used.
In 5.0 and up, next_position is called "end_log_pos" in
the output from mysqlbinlog and SHOW BINLOG
EVENTS. In 4.1, next_position is called
"log_pos" in the output from mysqlbinlog and
"orig_log_pos" in the output from SHOW BINLOG
EVENTS.
flags (not present in v1 format)
2 bytes. The possible flag values are described at Event Flags.
extra_headers (not present in v1, v3 formats)
Variable-sized. The size of this field is determined by the
format description event that occurs as the first event in the
file. Currently, the size is 0, so, in effect, this field never
actually occurs in any event. At such time as the size becomes
nonzero, this field still will not appear in events of type
FORMAT_DESCRIPTION_EVENT or
ROTATE_EVENT.
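The loop-avoidance rule described for the server_id field can be sketched as a small simulation. This is a simplified model, not replication code: the function name servers_that_apply is hypothetical, and every server in the ring is assumed to log the updates it receives (--log-slave-updates on).

```python
def servers_that_apply(ring, origin_id):
    """Return the server IDs that execute an event, in order.

    ring lists server IDs in replication order: each server is the
    master of the next one, and the last is the master of the first.
    """
    applied = [origin_id]          # the originating server executes it first
    event_server_id = origin_id    # kept unchanged as the event is relayed
    i = ring.index(origin_id)
    while True:
        i = (i + 1) % len(ring)
        receiver_id = ring[i]
        if receiver_id == event_server_id:
            break                  # the event is back at its origin: ignore it
        applied.append(receiver_id)
    return applied
```

For the M1, M2, M3 example, servers_that_apply([1, 2, 3], 1) visits servers 1, 2, and 3 once each and stops when the event returns to server 1.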
Event headers for v3 format and up contain event flags in the
two flag bytes at position FLAGS_OFFSET =
17. There are comments about these flags in log_event.h, in
addition to the remarks in this section.
Current event flags:
LOG_EVENT_BINLOG_IN_USE_F = 0x1 (New in
5.0.3)
Used to indicate whether a binary log file was closed
properly. This flag makes sense only for
FORMAT_DESCRIPTION_EVENT. It is set
when the event is written to the log file. When the log
file is closed later, the flag is cleared. (This is the
only case when MySQL modifies an already written part of a
binary log file).
LOG_EVENT_THREAD_SPECIFIC_F = 0x4 (New
in 4.1.0)
Used only by mysqlbinlog (not by the
replication code at all) to be able to deal properly with
temporary tables. mysqlbinlog displays
events from the binary log in printable format, so that
you can feed the output into mysql (the
command-line interpreter), to achieve incremental backup
recovery. But suppose that the binary log is as follows,
where two simultaneous threads used temporary tables with
the same name (which is allowed because temporary tables
are visible only in the thread which created them):
<thread id 1> CREATE TEMPORARY TABLE t (a INT);
<thread id 2> CREATE TEMPORARY TABLE t (a INT);
In this case, simply feeding this into
mysql will lead to a "table t already
exists" error. This is why events that use temporary tables
are marked with the flag, so that
mysqlbinlog knows it has to set the
pseudo_thread_id system variable beforehand,
like this:
SET PSEUDO_THREAD_ID=1;
CREATE TEMPORARY TABLE t (a INT);
SET PSEUDO_THREAD_ID=2;
CREATE TEMPORARY TABLE t (a INT);
This way there is no confusion for the server that receives
these statements. Always printing SET
PSEUDO_THREAD_ID, even when temporary tables are not
used, would not cause a bug; it would only slow down processing.
LOG_EVENT_SUPPRESS_USE_F = 0x8 (New in
4.1.7)
Suppresses generation of a USE statement before the actual
statement to be logged. This flag should be set for any
event that does not need to have the default database set
to function correctly, such as CREATE DATABASE and DROP
DATABASE. This flag should only be used in exceptional
circumstances because it introduces a significant change
in behavior regarding the replication logic together with
the --binlog-do-db and
--replicate-do-db options.
LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F =
0x10 (New in 5.1.4)
Causes the table map version internal to the binary log to be increased after the event has been written to the log.
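The flag values listed above are single bits, so the 2-byte flags field can be decoded with bitwise tests. A minimal Python sketch follows; the decode_flags function is a hypothetical helper, and reporting unrecognized bits separately is a design choice made for the example.

```python
# Current event flags and their bit values, as listed above.
EVENT_FLAGS = {
    0x1: "LOG_EVENT_BINLOG_IN_USE_F",
    0x4: "LOG_EVENT_THREAD_SPECIFIC_F",
    0x8: "LOG_EVENT_SUPPRESS_USE_F",
    0x10: "LOG_EVENT_UPDATE_TABLE_MAP_VERSION_F",
}


def decode_flags(flags):
    """Return (names of the known flags that are set, leftover unknown bits)."""
    names = []
    unknown = flags
    for bit, name in EVENT_FLAGS.items():
        if flags & bit:
            names.append(name)
            unknown &= ~bit        # clear each bit that was recognized
    return names, unknown
```

For example, a flags value of 0x5 decodes to LOG_EVENT_BINLOG_IN_USE_F and LOG_EVENT_THREAD_SPECIFIC_F with no unknown bits.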
Obsolete event flags:
LOG_EVENT_TIME_F (obsolete as of
4.1.1). This flag was never set.
LOG_EVENT_FORCED_ROTATE_F (obsolete as
of 4.1.1). This flag was set in events of type
ROTATE_EVENT on the master, but was not
used for anything useful.
They are now commented out in log_event.h
and their values are available for reuse or have already been
reused. (But see the associated comments in
log_event.h for various cautions!)
The structure of an event's data part depends on the event type:
In v1 and v3, the event type entirely determines the data format.
In v4, interpretation of the data part depends on the event
type in conjunction with information from the format
description event. This is because v4 allows for an
extra headers field, the size of which is
defined in the format description event. In practice, the
extra headers field currently is empty.
The data part of an event consists of a fixed-size part and a
variable-size part. Either or both parts may be empty, depending
on the event type. (For example, a STOP_EVENT
consists only of the header part; the fixed and variable data
parts are both empty.)
The size of the event data part is the event size (contained in the header) minus the header size. The size of the fixed data part is a function of the event type. The size of the variable data part is the event size minus the size of the header, minus the size of the fixed data part.
The following principles hold across all events in a binary log file:
The fixed part of the event data is the same size for all events of a given type.
The variable part of the event data can differ in size among events of a given type.
For details about the fixed and variable parts of event data for different events, see Event Data for Specific Event Types.
The fixed part of the event data goes under different names, depending on which source file, work log, bug report, etc. you are reading:
Sometimes it is called the "fixed data" part, as in this discussion.
Sometimes it is called the "post-headers" part.
To make things notationally interesting, sometimes the
fixed data part is referred to as the "event-specific
headers" part of the event. That is, the word "header" is
used in reference to a portion of the data part. One
manifestation of this notational phenomenon appears in
log_event.h, where you will find the
symbol LOG_EVENT_MINIMAL_HEADER_LEN
defined as 19 (the header length for v3 and v4), plus
other symbols with names of the form
XXX_HEADER_LEN for different event
types. The former symbol is the size of the event header
(always 19). The latter symbols define the size of the
fixed portion of the data part that is to be treated as
the event-specific headers. For example,
ROTATE_HEADER_LEN is 8 because a
ROTATE_EVENT has an 8-byte field in the
fixed data part that indicates the position in the next
log file of the first event in that file.
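To show how the two kinds of "header length" symbols combine, the following Python sketch reads the fixed data part of a ROTATE_EVENT. It is illustrative only. The function name parse_rotate_event is hypothetical, the 8-byte field is assumed to be a little-endian unsigned integer in line with the general number conventions, and the variable part is returned as uninterpreted bytes.

```python
import struct

LOG_EVENT_MINIMAL_HEADER_LEN = 19  # size of the event header
ROTATE_HEADER_LEN = 8              # size of the fixed data part for ROTATE_EVENT


def parse_rotate_event(event):
    """Return (position in the next log file, raw variable part)."""
    # A ROTATE_EVENT header never carries extra_headers, so the fixed
    # data part always starts at offset 19.
    (position,) = struct.unpack_from("<Q", event, LOG_EVENT_MINIMAL_HEADER_LEN)
    variable_start = LOG_EVENT_MINIMAL_HEADER_LEN + ROTATE_HEADER_LEN
    return position, event[variable_start:]
```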
The variable part of event data also goes under different names, such as the event "payload" or "body."
