MySQL 9.0.0
Source Code Documentation
The redo log consists of multiple log files, all of which share the same format. Consecutive files hold data for consecutive ranges of lsn values: when a file ends at end_lsn, the next log file begins at that same end_lsn. The number of log files is fixed, and the files are reused in a circular manner, so the successor of the last log file is the first log file.
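The circular succession of files can be illustrated with a minimal sketch. The names Log_file_range, next_file_index and is_successor are hypothetical and exist only for this example; they are not taken from the server code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical description of one redo log file: it covers lsn values
// in the half-open range [start_lsn, end_lsn).
struct Log_file_range {
  lsn_t start_lsn;
  lsn_t end_lsn;
};

// Index of the file that follows file `i` when `n_files` files are
// reused in a circular manner (the last file is followed by the first).
inline std::size_t next_file_index(std::size_t i, std::size_t n_files) {
  assert(n_files > 0 && i < n_files);
  return (i + 1) % n_files;
}

// True if `next` begins exactly at the lsn at which `prev` ended.
inline bool is_successor(const Log_file_range &prev,
                         const Log_file_range &next) {
  return next.start_lsn == prev.end_lsn;
}
```

With four files, next_file_index(3, 4) yields 0, which mirrors the statement that the first file succeeds the last one.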
The log files are named #ib_redo0, #ib_redo1, ... and are stored in the #innodb_redo subdirectory, which is located inside the directory specified by innodb_log_group_home_dir (or inside the datadir if that option is not set).
Whenever a new log file is created, it is first created with the _tmp suffix in its name. Once the file has been prepared, it is renamed (the suffix is removed from the name).
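The naming scheme can be sketched as follows, assuming the on-disk names are #innodb_redo/#ib_redoN as in recent MySQL releases. The function redo_file_path and its parameters are hypothetical helpers for this example, and a '/' path separator is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the path of redo log file number `file_no` below `log_home`
// (the innodb_log_group_home_dir, or the datadir if that is not set).
// When `temporary` is true, the _tmp suffix used during creation is added.
inline std::string redo_file_path(const std::string &log_home,
                                  std::size_t file_no, bool temporary) {
  assert(!log_home.empty());
  std::string path = log_home;
  if (path.back() != '/') path += '/';  // assumes '/' as the separator
  path += "#innodb_redo/#ib_redo" + std::to_string(file_no);
  if (temporary) path += "_tmp";
  return path;
}
```

A caller would first create the file under redo_file_path(dir, n, true) and, once it is prepared, rename it to redo_file_path(dir, n, false).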
While a new data directory is being initialized, all log files that are created have the LOG_HEADER_FLAG_NOT_INITIALIZED flag enabled in the log_flags field of the header. After the data directory has been initialized, this flag is disabled (the file header of the newest log file is then re-flushed).
The file header contains the log_uuid field. Its value is chosen randomly when the data directory is initialized. It is used to detect the situation in which a user has mixed log files from different data directories.
The file header also contains start_lsn, which is the lsn of the first log block within that file.
Each log file starts with a header of LOG_FILE_HDR_SIZE bytes. It contains:
Binding of an offset within the file to the lsn value.
This binding makes it possible to map any lsn value that is represented within the file to the corresponding offset within the file.
Two checkpoint blocks - LOG_CHECKPOINT_1 and LOG_CHECKPOINT_2.
Each checkpoint block contains OS_FILE_LOG_BLOCK_SIZE bytes:
checkpoint_lsn - lsn to start recovery at.
log.buf_size - size of the log buffer when the checkpoint write was started.
It remains a mystery why this field is needed. It is neither used by recovery nor required for MEB. Rumour has it that it could be useful for external auto-configuration tools to detect which MySQL configuration should be used.
There are two checkpoint blocks because they are updated alternately. If a crash happens in the middle of such an update, the other block remains valid (this is the same reason the doublewrite buffer is used for pages).
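How a reader of the header might pick between the two checkpoint blocks can be sketched as below. This is only an illustration under stated assumptions: the struct Checkpoint_block, its valid flag (standing for whatever integrity check the block passed) and the rule of preferring the larger checkpoint_lsn when both blocks are valid are assumptions of this example, not a description of the server's recovery code.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical in-memory view of one checkpoint block after it was read.
// `valid` stands for "the block passed its integrity check"; how that is
// established is not modelled here.
struct Checkpoint_block {
  bool valid;
  lsn_t checkpoint_lsn;
};

// Returns the block to start recovery from, or nullptr if neither block
// is usable. Assumption: when both are valid, the one with the larger
// checkpoint_lsn is the more recent one.
inline const Checkpoint_block *choose_checkpoint(const Checkpoint_block &cp1,
                                                 const Checkpoint_block &cp2) {
  if (cp1.valid && cp2.valid) {
    return cp1.checkpoint_lsn >= cp2.checkpoint_lsn ? &cp1 : &cp2;
  }
  if (cp1.valid) return &cp1;
  if (cp2.valid) return &cp2;
  return nullptr;
}
```

If a crash tears the block that was being updated, that block fails validation and the other one is returned, which is the property the alternating update is meant to provide.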
After the header, there are consecutive log blocks. Each log block has the same format and consists of OS_FILE_LOG_BLOCK_SIZE bytes (512). These bytes are enumerated by lsn values.
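Since every byte of every log block is enumerated by an lsn value and the file header records the file's start_lsn, the binding between lsn and file offset reduces to simple arithmetic. The sketch below assumes a header size of 2048 bytes purely for illustration; the constant kAssumedFileHdrSize and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Assumed value standing in for LOG_FILE_HDR_SIZE (illustration only).
constexpr std::uint64_t kAssumedFileHdrSize = 2048;

// Maps an lsn that lies within the file to its byte offset in that file.
inline std::uint64_t lsn_to_offset(lsn_t lsn, lsn_t file_start_lsn) {
  assert(lsn >= file_start_lsn);
  return kAssumedFileHdrSize + (lsn - file_start_lsn);
}

// Inverse mapping: byte offset (past the file header) back to the lsn.
inline lsn_t offset_to_lsn(std::uint64_t offset, lsn_t file_start_lsn) {
  assert(offset >= kAssumedFileHdrSize);
  return file_start_lsn + (offset - kAssumedFileHdrSize);
}
```

For a file whose start_lsn is 8704, the first log block sits right after the header and the second block starts 512 bytes later.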
Each log block contains:
This is a block number. Consecutive blocks have consecutive numbers. Hence this is basically lsn divided by OS_FILE_LOG_BLOCK_SIZE. However it is also wrapped at 1G (due to limited size of the field). It should be possible to wrap it at 2G (only the single flush bit is reserved as the highest bit) but for historical reasons it is 1G.
Number of bytes within the log block (data_len). Possible values:
value within [LOG_BLOCK_HDR_SIZE, OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_TRL_SIZE), which means that this is the last block and it is an incomplete block.
In that case the value can be treated as an offset that points to the end of the data within the block. This value includes the LOG_BLOCK_HDR_SIZE bytes of the header.
Offset within the log block to the beginning of the first group of log records that starts within the block or 0 if none starts. This offset includes LOG_BLOCK_HDR_SIZE bytes of the header.
Log epoch number. Set by the log writer thread just before a write starts for the block. For details
It can be used during recovery to detect that an old block of the redo log (a stale tail) has been read because the log files wrapped around.
data part - bytes up to data_len byte.
Actual data bytes are followed by 0x00 if the block is incomplete.
checksum
The algorithm used for the checksum depends on the configuration. Note that there is a potential problem if a crash happens just after switching to "checksums enabled". During recovery, some log blocks would have checksum = LOG_NO_CHECKSUM_MAGIC and some would have a valid checksum. Recovery with checksums enabled would then report errors for the blocks without a valid checksum, and the user would have to disable checksums for that recovery.
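Two of the block header fields described above lend themselves to a short sketch: the block number derived from the lsn and wrapped at 1G, and the check for an incomplete last block based on the number of bytes in the block. The header and trailer sizes used here (12 and 4 bytes) are assumed values for illustration, all names are hypothetical, and the actual server code may differ in details such as the number assigned to the first block.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

constexpr std::uint32_t kBlockSize = 512;           // OS_FILE_LOG_BLOCK_SIZE
constexpr std::uint32_t kAssumedBlockHdrSize = 12;  // assumed LOG_BLOCK_HDR_SIZE
constexpr std::uint32_t kAssumedBlockTrlSize = 4;   // assumed LOG_BLOCK_TRL_SIZE
constexpr std::uint64_t kBlockNoWrap = 1ULL << 30;  // block numbers wrap at 1G

// Block number as described in the text: lsn divided by the block size,
// wrapped at 1G.
inline std::uint32_t block_no_for_lsn(lsn_t lsn) {
  return static_cast<std::uint32_t>((lsn / kBlockSize) % kBlockNoWrap);
}

// True if data_len lies in [header size, block size - trailer size),
// i.e. the block is the last one and is incomplete.
inline bool is_incomplete_last_block(std::uint32_t data_len) {
  return data_len >= kAssumedBlockHdrSize &&
         data_len < kBlockSize - kAssumedBlockTrlSize;
}

// Number of payload bytes in an incomplete block; data_len counts the header.
inline std::uint32_t payload_bytes(std::uint32_t data_len) {
  assert(is_incomplete_last_block(data_len));
  return data_len - kAssumedBlockHdrSize;
}
```

Because the number of bytes in the block includes the header, an incomplete block holding 88 bytes of log records would store the value 100 under these assumed sizes.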