This page contains:
A high-altitude "summary" picture of the parts of a
MySQL/InnoDB record structure.
A description of each part.
An example.
After reading this page, you will know how
MySQL/InnoDB stores a physical record.
The chart below shows the three parts of a physical record.
| Name | Size |
| Field Start Offsets | (F*1) or (F*2) bytes |
| Extra Bytes | 6 bytes |
| Field Contents | depends on content |
Legend: The letter 'F' stands for 'Number Of Fields'.
The meaning of the parts is as follows:
The FIELD START OFFSETS is a list of numbers containing the information "where a field starts".
The EXTRA BYTES is a fixed-size header.
The FIELD CONTENTS contains the actual data.
An Important Note About The Word "Origin"
The "Origin" or "Zero Point" of a record is the first byte of the Field Contents --- not the first byte of the Field Start Offsets. If there is a pointer to a record, that pointer is pointing to the Origin. Therefore the first two parts of the record are addressed by subtracting from the pointer, and only the third part is addressed by adding to the pointer.
The Field Start Offsets is a list in which each entry is the position, relative to the Origin, of the start of the next field. The entries are in reverse order, that is, the first field's offset is at the end of the list.
An example: suppose there are three columns. The first
column's length is 1, the second column's length is 2, and the
third column's length is 4. In this case, the offset values
are, respectively, 1, 3 (1+2), and 7 (1+2+4). Because values
are reversed, a core dump of the Field Start Offsets would
look like this: 07,03,01.
There are two complications for special cases:
Complication #1: The size of each offset can be either one byte or two bytes. One-byte offsets are only usable if the total record size is less than 127. There is a flag in the "Extra Bytes" part which will tell you whether the size is one byte or two bytes.
Complication #2: The most significant bits of an offset may contain flag values. The next two paragraphs explain what the contents are.
When The Size Of Each Offset Is One Byte
1 bit = NULL, = NULL
7 bits = the actual offset, a number between 0 and 127
When The Size Of Each Offset Is Two Bytes
1 bit = NULL, = NULL
1 bit = 0 if field is on same page as offset, = 1 if field and offset are on different pages
14 bits = the actual offset, a number between 0 and 16383
It is unlikely that the "field and offset are on different
pages" unless the record contains a large
BLOB.
The Extra Bytes are a fixed six-byte header.
Name |
Size |
Description |
info_bits: |
?? |
?? |
() |
1 bit |
unused or unknown |
() |
1 bit |
unused or unknown |
deleted_flag |
1 bit |
1 if record is deleted |
min_rec_flag |
1 bit |
1 if record is predefined minimum record |
n_owned |
4 bits |
number of records owned by this record |
heap_no |
13 bits |
record's order number in heap of index page |
n_fields |
10 bits |
number of fields in this record, 1 to 1023 |
1byte_offs_flag |
1 bit |
1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag) |
next 16 bits |
16 bits |
pointer to next record in page |
TOTAL |
48 bits |
?? |
Total size is 48 bits, which is six bytes.
If you're just trying to read the record, the key bit in the Extra Bytes is 1byte_offs_flag — you need to know if 1byte_offs_flag is 1 (that is, "short 1-byteoffsets") or 0 (that is, "2-byte offsets").
Given a pointer to the Origin, InnoDB finds
the start of the record as follows:
Let X = n_fields (the number of fields is by definition equal to the number of entries in the Field Start Offsets Table).
If 1byte_offs_flag equals 0, then let X = X * 2 because there are two bytes for each entry instead of just one.
Let X = X + 6, because the fixed size of Extra Bytes is 6.
The start of the record is at (pointer value minus X).
The Field Contents part of the record has all the data. Fields are stored in the order they were defined in.
There are no markers between fields, and there is no marker or filler at the end of a record.
Here's an example.
I made a table with this definition:
CREATE TABLE T
(FIELD1 VARCHAR(3), FIELD2 VARCHAR(3), FIELD3 VARCHAR(3))
Type=InnoDB;
To understand what follows, you must know that table
T has six columns — not three —
because InnoDB automatically added three
"system columns" at the start for its own housekeeping. It
happens that these system columns are the row ID, the
transaction ID, and the rollback pointer, but their values
don't matter now. Regard them as three black boxes.
I put some rows in the table. My last three
INSERT statements were:
INSERT INTO T VALUES ('PP', 'PP', 'PP');
INSERT INTO T VALUES ('Q', 'Q', 'Q');
INSERT INTO T VALUES ('R', NULL, NULL);
I ran Borland's TDUMP to get a hexadecimal dump of the
contents of \mysql\data\ibdata1,
which (in my case) is the MySQL/InnoDB
data file (on Windows).
Here is an extract of the dump:
Address Values in Hexadecimal |
Values in ASCII |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A reformatted version of the dump, showing only the relevant bytes, looks like this (I've put a line break after each field and added labels):
Reformatted Hexadecimal Dump
19 17 15 13 0C 06 Field Start Offsets /* First Row */ 00 00 78 0D 02 BF Extra Bytes 00 00 00 00 04 21 System Column #1 00 00 00 00 09 2A System Column #2 80 00 00 00 2D 00 84 System Column #3 50 50 Field1 'PP' 50 50 Field2 'PP' 50 50 Field3 'PP' 16 15 14 13 0C 06 Field Start Offsets /* Second Row */ 00 00 80 0D 02 E1 Extra Bytes 00 00 00 00 04 22 System Column #1 00 00 00 00 09 2B 80 System Column #2 00 00 00 2D 00 84 System Column #3 51 Field1 'Q' 51 Field2 'Q' 51 Field3 'Q' 94 94 14 13 0C 06 Field Start Offsets /* Third Row */ 00 00 88 0D 00 74 Extra Bytes 00 00 00 00 04 23 System Column #1 00 00 00 00 09 2C System Column #2 80 00 00 00 2D 00 84 System Column #3 52 Field1 'R'
You won't need explanation if you followed everything I've said, but I'll add helpful notes for the three trickiest details.
Helpful Notes About "Field Start Offsets":
Notice that the sizes of the record's fields, in forward
order, are: 6, 6, 7, 2, 2, 2. Since each offset is for the
start of the "next" field, the hexadecimal offsets are 06, 0c
(6+6), 13 (6+6+7), 15 (6+6+7+2), 17 (6+6+7+2+2), 19
(6+6+7+2+2+2). Reversing the order, the Field Start Offsets of
the first record are: 19,17,15,13,0c,06.
Helpful Notes About "Extra Bytes":
Look at the Extra Bytes of the first record: 00 00 78
0D 02 BF. The fourth byte is 0D
hexadecimal, which is 1101 binary
... the 110 is the last bits of n_fields (110
binary is 6 which is indeed the number of fields in
the record) and the final 1 bit is 1byte_offs_flag. The fifth
and sixth bytes, which contain 02 BF,
constitute the "next" field. Looking at the original
hexadecimal dump, at address 0D42BF (which
is position 02BF within the page), you'll
see the beginning bytes of System Column #1 of the second row.
In other words, the "next" field points to the "Origin" of the
following row.
Helpful Notes About NULLs:
For the third row, I inserted NULLs in
FIELD2 and FIELD3. Therefore in the Field Start Offsets the
top bit is on for these fields (the values
are 94 hexadecimal, 94
hexadecimal, instead of 14
hexadecimal, 14 hexadecimal). And
the row is shorter because the NULLs take
no space.
