A .MYI file for a MyISAM
table contains the table's indexes.
The .MYI file has two parts: the header
information and the key values. So the next sub-sections will be
"The .MYI Header" and "The
.MYI Key Values".
The .MYI
Header
A .MYI file begins with a header, with
information about options, about file sizes, and about the "keys".
In MySQL terminology, a "key" is something that you create with
CREATE [UNIQUE] INDEX.
Program files which read and write .MYI
headers are in the ./myisam directory:
mi_open.c has the routines that write each
section of the header, mi_create.c has a
routine that calls the mi_open.c routines in
order, and myisamdef.h has structure
definitions corresponding to what we're about to describe.
These are the main header sections:
Section Occurrences ------- ----------- state Occurs 1 time base Occurs 1 time keydef (including keysegs) Occurs once for each key recinfo Occurs once for each field
Now we will look at each of these sections, showing each field.
We are going to use an example table throughout the description. To make the example table, we executed these statements:
CREATE TABLE T (S1 CHAR(1), S2 CHAR(2), S3 CHAR(3));
CREATE UNIQUE INDEX I1 ON T (S1);
CREATE INDEX I2 ON T (S2,S3);
INSERT INTO T VALUES ('1', 'aa', 'b');
INSERT INTO T VALUES ('2', 'aa', 'bb');
INSERT INTO T VALUES ('3', 'aa', 'bbb');
DELETE FROM T WHERE S1 = '2';
We took a hexadecimal dump of the resulting file,
T.MYI.
In all the individual descriptions below, the column labeled
“Dump From Example File” has the exact bytes that are
in T.MYI. You can verify that by executing
the same statements and looking at a hexadecimal dump yourself.
With Linux this is possible using od -h T.MYI;
with Windows you can use the command-line debugger.
Along with the typical value, we may include a comment. The comment usually explains why the value is what it is. Sometimes the comment is derived from the comments in the source code.
state
This section is written by mi_open.c,
mi_state_info_write().
Name Size Dump From Example File Comment
---- ---- ---------------------- -------
file_version 4 FE FE 07 01 from myisam_file_magic
options 2 00 02 HA_OPTION_COMPRESS_RECORD
etc.
header_length 2 01 A2 this header example has
0x01A2 bytes
state_info_length 2 00 B0 = MI_STATE_INFO_SIZE
defined in myisamdef.h
base_info_length 2 00 64 = MI_BASE_INFO_SIZE
defined in myisamdef.h
base_pos 2 00 D4 = where the base
section starts
key_parts 2 00 03 a key part is a column
within a key
unique_key_parts 2 00 00 key-parts+unique-parts
keys 1 02 here are 2 keys --
I1 and I2
uniques 1 00 number of hash unique
keys used internally
in temporary tables
(nothing to do with
'UNIQUE' definitions)
language 1 08 "language for indexes"
max_block_size 1 01
fulltext_keys 1 00 # of fulltext keys.
= 0 if version <= 4.0
not_used 1 00 to align to 8-byte
boundary
state->open_count 2 00 01
state->changed 1 39 set if table updated;
reset if shutdown (so
one can examine this
to see if there was an
update without proper
shutdown)
state->sortkey 1 FF "sorted by this key"
(not used)
state->state.records 8 00 00 00 00 00 00 00 02 number of actual,
un-deleted, records
state->state.del 8 00 00 00 00 00 00 00 01 # of deleted records
state->split 8 00 00 00 00 00 00 00 03 # of "chunks" (e.g.
records or spaces left
after record deletion)
state->dellink 8 00 00 00 00 00 00 00 07 "Link to next removed
"block". Initially =
HA_OFFSET_ERROR
state->state.key_file_length 8 00 00 00 00 00 00 0c 00 2048
state->state.data_file_length 8 00 00 00 00 00 00 00 15 = size of .MYD file
state->state.empty 8 00 00 00 00 00 00 00 00
state->state.key_empty 8 00 00 00 00 00 00 00 00
state->auto_increment 8 00 00 00 00 00 00 00 00
state->checksum 8 00 00 00 00 00 00 00 00
state->process 4 00 00 09 E6 from getpid(). process
of last update
state->unique 4 00 00 00 0B initially = 0
state->status 4 00 00 00 00
state->update_count 4 00 00 00 04 updated for each write
lock (there were 3
inserts + 1 delete,
total 4 operations)
state->key_root 8 00 00 00 00 00 00 04 00 offset in file where
I1 keys start, can be
= HA_OFFSET_ERROR
00 00 00 00 00 00 08 00 state->key_root occurs
twice because there
are two keys
state->key_del 8 FF FF FF FF FF FF FF FF delete links for keys
(occurs many times if
many delete links)
state->sec_index_changed 4 00 00 00 00 sec_index = secondary
index (presumably)
not currently used
state->sec_index_used 4 00 00 00 00 "which extra indexes
are in use"
not currently used
state->version 4 3F 3F EB F7 "timestamp of create"
state->key_map 8 00 00 00 03 "what keys are in use"
state->create_time 8 00 00 00 00 3F 3F EB F7 "time when database
created" (actually:
time when file made)
state->recover_time 8 00 00 00 00 00 00 00 00 "time of last recover"
state->check_time 8 00 00 00 00 3F 3F EB F7 "time of last check"
state->rec_per_key_rows 8 00 00 00 00 00 00 00 00
state->rec_per_key_parts 4 00 00 00 00 (key_parts = 3, so
00 00 00 00 rec_per_key_parts
00 00 00 00 occurs 3 times)
base
This section is written by mi_open.c,
mi_base_info_write(). The corresponding
structure in myisamdef.h is
MI_BASE_INFO.
In our example T.MYI file, the first byte of
the base section is at offset 0x00d4. That's
where it's supposed to be, according to the header field
base_pos (above).
Name Size Dump From Example File Comment
---- ---- ---------------------- -------
base->keystart 8 00 00 00 00 00 00 04 00 keys start at offset
1024 (0x0400)
base->max_data_file_length 8 00 00 00 00 00 00 00 00
base->max_key_file_length 8 00 00 00 00 00 00 00 00
base->records 8 00 00 00 00 00 00 00 00
base->reloc 8 00 00 00 00 00 00 00 00
base->mean_row_length 4 00 00 00 00
base->reclength 4 00 00 00 07 length(s1)+length(s2)
+length(s3)=7
base->pack_reclength 4 00 00 00 07
base->min_pack_length 4 00 00 00 07
base->max_pack_length 4 00 00 00 07
base->min_block_length 4 00 00 00 14
base->fields 4 00 00 00 04 4 fields: 3 defined,
plus 1 extra
base->pack_fields 4 00 00 00 00
base->rec_reflength 1 04
base->key_reflength 1 04
base->keys 1 02 was 0 at start
base->auto_key 1 00
base->pack_bits 2 00 00
base->blobs 2 00 00
base->max_key_block_length 2 04 00 length of block = 1024
bytes (0x0400)
base->max_key_length 2 00 10 including length of
pointer
base->extra_alloc_bytes 2 00 00
base->extra_alloc_procent 1 00
base->raid_type 1 00
base->raid_chunks 2 00 00
base->raid_chunksize 4 00 00 00 00
[extra] that is, filler 6 00 00 00 00 00 00
keydef
This section is written by mi_open.c,
mi_keydef_write(). The corresponding structure
in myisamdef.h is
MI_KEYDEF.
This is a multiple-occurrence structure. Since there are two
indexes in our example (I1 and I2), we will see that
keydef occurs two times below. There is a
subordinate structure, keyseg, which also
occurs multiple times (once within the keydef
for I1 and two times within the keydef for I2).
Name Size Dump From Example File Comment
---- ---- ---------------------- -------
/* key definition for I1 */
keydef->keysegs 1 01 there is 1 keyseg (for
column S1).
keydef->key_alg 1 01 algorithm = Rtree or
Btree
keydef->flag 2 00 49 HA_NOSAME +
HA_SPACE_PACK_USED +
HA_NULL_PART_KEY
keydef->block_length 2 04 00 that is, 1024
key def->keylength 2 00 06 field-count+sizeof(S1)
sizeof(ROWID)
keydef->minlength 2 00 06
keydef->maxlength 2 00 06
/* keyseg for S1 in I1 */
keyseg->type 1 01 /* I1(S1) size(S1)=1,
column = 1 */
= HA_KEYTYPE_TEXT
keyseg->language 1 08
keyseg->null_bit 1 02
keyseg->bit_start 1 00
keyseg->bit_end 1 00
[0] that is, filler 1 00
keyseg->flag 2 00 14 HA_NULL_PART +
HA_PART_KEY
keyseg->length 2 00 01 length(S1) = 1
keyseg->start 4 00 00 00 01 offset in the row
keyseg->null_pos 4 00 00 00 00
/* key definition for I2 */
keydef->keysegs 1 02 keysegs=2, for columns
S2 and S3
keydef->key_alg 1 01 algorithm = Rtree or
Btree
keydef->flag 2 00 48 HA_SPACE_PACK_USED +
HA_NULL_PART_KEY
keydef->block_length 2 04 00 that is, 1024
key def->keylength 2 00 0B field-count+ sizeof(all fields)+
sizeof(RID)
keydef->minlength 2 00 0B
keydef->maxlength 2 00 0B
/* keyseg for S2 in I2 */
keyseg->type 1 01 /* I2(S2) size(S2)=2,
column = 2 */
keyseg->language 1 08
keyseg->null_bit 1 04
keyseg->bit_start 1 00
keyseg->bit_end 1 00
[0] that is, filler 1 00
keyseg->flag 2 00 14 HA_NULL_PART +
HA_PART_KEY
keyseg->length 2 00 02 length(S2) = 2
keyseg->start 4 00 00 00 02
keyseg->null_pos 4 00 00 00 00
/* keyseg for S3 in I2 */
keyseg->type 1 01 /* I2(S3) size(S3)=3,
column = 3 */
keyseg->language 1 08
keyseg->null_bit 1 08
keyseg->bit_start 1 00
keyseg->bit_end 1 00
[0] thatis, filler 1 00
keyseg->flag 2 00 14 HA_NULL_PART +
HA_PART_KEY
keyseg->length 2 00 03 length(S3) = 3
keyseg->start 4 00 00 00 04
keyseg->null_pos 4 00 00 00 00
recinfo
The recinfo section is written by
mi_open.c,
mi_recinfo_write(). The corresponding structure
in myisamdef.h is
MI_COLUMNDEF.
This is another multiple-occurrence structure. It appears once for each field that appears in a key, including an extra field that appears at the start and has flags (for deletion and for null fields).
Name Size Dump From Example File Comment ---- ---- ---------------------- ------- recinfo->type 2 00 00 extra recinfo->length 2 00 01 recinfo->null_bit 1 00 recinfo->null_pos 2 00 00 recinfo->type 2 00 00 I1 (S1) recinfo->length 2 00 01 recinfo->null_bit 1 02 recinfo->null_pos 2 00 00 recinfo->type 2 00 00 I2 (S2) recinfo->length 2 00 02 recinfo->null_bit 1 04 recinfo->null_pos 2 00 00 recinfo->type 2 00 00 I2 (S3) recinfo->length 2 00 03 recinfo->null_bit 1 08 recinfo->null_pos 2 00 00
We are now at offset 0xA2 within the file
T.MYI. Notice that the value of the third
field in the header, header_length, is 0xA2.
Anything following this point, up till the first key value, is
filler.
The .MYI Key
Values
And now we look at the part which is not the information header: we look at the key values. The key values are in blocks (MySQL's term for pages). A block contains values from only one index. To continue our example: there is a block for the I1 key values, and a block for the I2 key values.
According to the header information
(state->key_root above), the I1 block starts
at offset 0x0400 in the file, and the I2 block starts at offset
0x0800 in the file.
At offset 0x0400 in the file, we have this:
Name Size Dump From Example File Comment
---- ---- ---------------------- -------
(block header) 2 00 0E = size (inclusive)
(first bit of word =
0 meaning this is a
B-Tree leaf, see the
mi_test_if_nod macro)
(first key value) 2 01 31 Value is "1" (0x31).
(first key pointer) 4 00 00 00 00 Pointer is to Record
#0000.
(second key value) 2 01 33 Value is "3" (0x33).
(second key pointer) 4 00 00 00 02 Pointer is to Record
#0002.
(junk) 1010 .. .. .. .. .. .. .. rest of the 1024-byte
block is unused
At offset 0800x in the file, we have this:
Name Size Dump From Example File Comment
---- ---- ---------------------- -------
(block header) 2 00 18 = size (inclusive)
(first key value) 7 01 61 61 01 62 20 20 Value is "aa/b "
(first key pointer) 4 00 00 00 00 Pointer is to Record
#0000.
(second key value) 7 01 61 61 01 62 62 62 Value is "aa/bbb"
(second key pointer) 4 00 00 00 02 Pointer is to Record
#0002.
(junk) 1000 .. .. .. .. .. .. .. rest of the 1024-byte
block is unused
From the above illustrations, these facts should be clear:
Each key contains the entire contents of all the columns,
including trailing spaces in CHAR columns.
There is no front truncation. There is no back truncation.
(There can be space truncation if
keyseg->flag
HA_SPACE_PACK flag is on.)
For fixed-row tables: The pointer is a fixed-size (4-byte)
number which contains an ordinal row number. The first row is
Record #0000. This item is analogous to the ROWID, or RID (row
identifier), which other DBMSs use. For dynamic-row tables:
The pointer is an offset in the .MYD
file.
The normal block length is 0x0400 (1024) bytes.
These facts are not illustrated, but are also clear:
If a key value is NULL, then the first byte
is 0x00 (instead of 001 as in the preceding examples) and
that's all. Even for a fixed CHAR(3)
column, the size of the key value is only 1 byte.
Initially the junk at the end of a block is filler bytes,
value = A5. If MySQL shifts key values up after a
DELETE, the end of the block is not
overwritten.
A normal block is at least 65% full, and typically 80% full.
(This is somewhat denser than the typical B-tree algorithm
would cause, it is thus because myisamchk
-rq will make blocks nearly 100% full.)
There is a pool of free blocks, which increases in size when deletions occur. If all blocks have the same normal block length (1024), then MySQL will always use the same pool.
The maximum number of keys is 32
(MI_MAX_KEY). The maximum number of
segments in a key is 16 (MI_MAX_KEY_SEG).
The maximum key length is 500
(MI_MAX_KEY_LENGTH). The maximum block
length is 16384 (MI_MAX_KEY_BLOCK_LENGTH).
All these MI_... constants are expressed by #defines in the
myisamdef.h file.
Some notes about MyISAM file handling:
If a table is never updated, MySQL will never touch the table files, so it would never be marked as closed or corrupted.
If a table is marked readonly by the OS, it will only be opened in readonly mode. Any updates to it will fail.
When a normal table is opened for reading by a
SELECT, MySQL will open it in read/write
mode, but will not write anything to it.
A table can be closed during one of the following events:
Out of space in table cache
Someone executed flush tables
MySQL was shut down
flush_time expired (which causes an automatic flush-tables to be executed)
When MySQL opens a table, it checks if the table is clean.
If it isn't and the server was started with the
--myisam-recover option, check the table
and try to recover it if it's crashed. (The safest automatic
recover option is probably
--myisam-recover=BACKUP.)
