WL#7142: InnoDB: Simplify tablespace discovery during crash recovery

Status: Complete

The objective of this worklog is to eliminate the use of the file system as a
‘data dictionary’ during redo log processing (before applying redo log):

(1) Do not read the first page of all $datadir/*/*.ibd files
(2) Do not check the contents of $datadir/*/*.isl files

After these changes, *.isl files (introduced in WL#5980) will still be
consulted when opening tables after the redo log has been applied.
Also, the existence of *.ibd files will be checked in
dict_check_tablespaces_and_store_max_id(). The *.isl files and the
remaining scan for *.ibd files would be removed as part of the data dictionary work.

The changes (1) and (2) will improve reliability as follows:

(3) We will ignore extra *.ibd files that are not attached to the
InnoDB instance. For example, if the system crashes before the completion
of IMPORT TABLESPACE, there could be files with duplicate space_id that
could currently cause trouble. Thanks to the MLOG_FILE_NAME redo log records
introduced in this worklog, redo log apply can sometimes safely ignore such
files, and in other cases will issue an error message that explains how to
resolve the conflict manually.

(4) We will not silently discard redo log records if some *.ibd file is
missing without the redo log containing an MLOG_FILE_DELETE record. For
example, if a file rename went bad, the DBA can manually rename the file and
restart crash recovery. In innodb_force_recovery mode, missing *.ibd files will
continue to be ignored.

(5) Failure scenarios related to inconsistent *.isl files will be
eliminated during redo log apply. Redo log records will contain
references to *.ibd file names; the *.isl files will only be used
after redo log apply when opening tables.

This worklog covers changes to InnoDB redo log processing. It can be
implemented independently of the Global Data Dictionary.

The InnoDB redo log format will be changed as follows:

New redo log record types:

MLOG_FILE_NAME(space_id, first_page_number, filename): Identifies a data file.
MLOG_FILE_RENAME2(space_id, first_page_number, filename, new_filename): Rename.
MLOG_CHECKPOINT (1 byte): Indicates the end of log checkpoint activity.
At least one MLOG_CHECKPOINT must be present after the latest log checkpoint,
or the entire redo log will be ignored.

Repurposed redo log record type (no format change):

MLOG_FILE_DELETE(space_id, first_page_number, filename): Delete a file.
Also, identifies a data file during redo log scan.

For future compatibility with multi-file tablespaces, the new redo log records
will identify the first page number of each file. The first implementation will
write and expect first_page_number=0.

All existing file-based redo log records except MLOG_FILE_DELETE
will be removed and replaced as follows:

MLOG_FILE_CREATE, MLOG_FILE_CREATE2: Replaced with MLOG_FILE_NAME.
MLOG_FILE_RENAME(space_id, table, new_table): Replaced with MLOG_FILE_RENAME2.

FR1. Redo log can be applied to any tablespace without having access
to the data dictionary contents.

FR2. Redo log can be scanned and applied without searching the file system
for the location and space_id of all tablespace files upfront.
(Only the tablespaces referred to by redo log records since the latest
checkpoint will be accessed. Clean tablespaces will not be accessed.)

FR3. We will have applied the redo log before recovering transactions.

Redo log application will be completely detached from data dictionary
changes. To support the discovery and renaming of files, we will introduce
a file-level redo log record MLOG_FILE_NAME(space_id,name).

A mini-transaction that has modified any page (space_id,page_no) in a
persistent tablespace file must emit an MLOG_FILE_NAME(space_id,name)
record for the file before the MLOG_MULTI_REC_END marker, unless a record
MLOG_FILE_NAME(space_id,name) was already emitted since the latest redo log
checkpoint. The MLOG_FILE_NAME will also be emitted before renaming a
file. Based on these records, the redo log scanner will construct a
mapping of possible file names for a given space_id since the latest
redo log checkpoint. The space_id will be checked by opening the file.
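
The emit-once rule above can be sketched as follows. This is a minimal model under stated assumptions, not InnoDB code: RedoLog, NameTracker and commit_mtr() are hypothetical names, records are plain strings, and each mini-transaction is assumed to modify a single page of one non-predefined tablespace.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the redo log: one string per record.
struct RedoLog {
    std::vector<std::string> records;
};

// Remembers which tablespaces already had MLOG_FILE_NAME written since the
// latest checkpoint. Illustrative only; not the InnoDB data structure.
class NameTracker {
public:
    // Commit a mini-transaction that modified one page of `space_id`.
    void commit_mtr(RedoLog& log, std::uint32_t space_id,
                    const std::string& file, const std::string& page_rec) {
        assert(!file.empty());
        log.records.push_back(page_rec);
        // insert().second is true only for the first modification of this
        // tablespace since the checkpoint, so the name is emitted once,
        // and always before the MLOG_MULTI_REC_END marker.
        if (named_.insert(space_id).second) {
            log.records.push_back("MLOG_FILE_NAME(" +
                                  std::to_string(space_id) + "," + file + ")");
        }
        log.records.push_back("MLOG_MULTI_REC_END");
    }

private:
    std::set<std::uint32_t> named_;
};
```

Two consecutive mini-transactions on the same tablespace would produce one MLOG_FILE_NAME record in this model; the second commit only appends its page record and the end marker.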

Special handling is needed on a redo log checkpoint, because upon completing
a redo log checkpoint there may exist buffered log entries for tablespaces
that depend on MLOG_FILE_NAME records. Before starting the log header update
to signal a checkpoint, we will re-append MLOG_FILE_NAME records for all
tablespaces that were modified since the end of the previous checkpoint.
Finally, we will append a MLOG_CHECKPOINT marker record and will mark as ‘clean’
those tablespaces that were not modified after our checkpoint.
After this, we will append all pending log records to the redo log file, and
actually do the checkpoint.
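
One possible reading of the checkpoint procedure above, as a sketch. The Space struct and checkpoint_records() are hypothetical; max_lsn here simply means "LSN of the latest modification, or 0 if clean", and records are modeled as strings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-tablespace state for this sketch.
struct Space {
    std::string file;
    std::uint64_t max_lsn;  // LSN of the latest modification; 0 means clean
};

// Returns the records to append for a checkpoint taken at `checkpoint_lsn`,
// and marks as clean the tablespaces that were not modified after that LSN.
std::vector<std::string> checkpoint_records(
    std::map<std::uint32_t, Space>& spaces, std::uint64_t checkpoint_lsn) {
    assert(checkpoint_lsn > 0);
    std::vector<std::string> out;
    for (auto& kv : spaces) {
        Space& s = kv.second;
        if (s.max_lsn == 0) {
            continue;  // clean since the previous checkpoint: nothing to copy
        }
        // Re-append the name so that log written after the checkpoint LSN
        // can still be mapped to a file.
        out.push_back("MLOG_FILE_NAME(" + std::to_string(kv.first) + "," +
                      s.file + ")");
        if (s.max_lsn < checkpoint_lsn) {
            s.max_lsn = 0;  // not modified after the checkpoint: now clean
        }
    }
    out.push_back("MLOG_CHECKPOINT");  // marks the end of checkpoint activity
    return out;
}
```

In this model a tablespace last modified before the checkpoint LSN gets one final MLOG_FILE_NAME and is then treated as clean, while a tablespace modified after the checkpoint LSN stays dirty.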

NOTE: We cannot re-append MLOG_FILE_NAME records for tablespaces that were
dropped since the previous checkpoint. Therefore, the tablespace discovery
will also gather information from MLOG_FILE_DELETE records, noting that the
tablespace has been deleted.

On crash recovery, we will first scan the log up to the MLOG_CHECKPOINT
record. If there is no MLOG_CHECKPOINT record, it means that the system
crashed before the log checkpoint completed. In this case, no redo log will
be applied, and we will recover the system as it was at the checkpoint LSN.
There must be only one MLOG_CHECKPOINT record since the latest checkpoint.
Seeing multiple MLOG_CHECKPOINT records is a fatal error (corrupted redo log).

Until we scan an MLOG_CHECKPOINT log record, we will allow redo log records
to refer to tablespaces for which no MLOG_FILE_NAME or MLOG_FILE_DELETE has
been emitted yet. Once we see the MLOG_CHECKPOINT record, there must be no
missing MLOG_FILE_NAME or MLOG_FILE_DELETE for the records scanned so far.

After the occurrence of the MLOG_CHECKPOINT record, every log record that
refers to a non-predefined tablespace must be preceded by a corresponding
MLOG_FILE_NAME record. (It does not make sense to have redo log records
after an MLOG_FILE_DELETE, but we do allow this and ignore such records.)
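
The naming rules around MLOG_CHECKPOINT can be expressed as a small validator. This is a sketch only: Kind, Rec, ScanResult and validate_scan() are hypothetical names, every space_id is assumed to be a non-predefined tablespace, and the handling of more than one MLOG_CHECKPOINT record is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

enum class Kind { FileName, FileDelete, Checkpoint, Page };

struct Rec {
    Kind kind;
    std::uint32_t space_id;  // unused for Kind::Checkpoint
};

enum class ScanResult { Ok, NoCheckpoint, MissingName };

ScanResult validate_scan(const std::vector<Rec>& log) {
    std::set<std::uint32_t> known;    // seen MLOG_FILE_NAME or MLOG_FILE_DELETE
    std::set<std::uint32_t> pending;  // page records seen, name not seen yet
    bool seen_checkpoint = false;
    for (const Rec& r : log) {
        switch (r.kind) {
        case Kind::FileName:
        case Kind::FileDelete:
            known.insert(r.space_id);
            pending.erase(r.space_id);
            break;
        case Kind::Page:
            if (known.count(r.space_id) != 0) break;
            // After the marker, the name must precede the page record.
            if (seen_checkpoint) return ScanResult::MissingName;
            // Before the marker, the name may still arrive later.
            pending.insert(r.space_id);
            break;
        case Kind::Checkpoint:
            // At the marker, nothing scanned so far may be unnamed.
            if (!pending.empty()) return ScanResult::MissingName;
            seen_checkpoint = true;
            break;
        }
    }
    // Without a marker, no redo log would be applied at all.
    return seen_checkpoint ? ScanResult::Ok : ScanResult::NoCheckpoint;
}
```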

Long term, InnoDB crash recovery will not apply any file-level redo log
records except for MLOG_FILE_RESIZE. MLOG_FILE_RENAME2 records
will temporarily be applied. Ultimately, the rollback of RENAME will be handled
via the DDL_LOG.

MySQL Enterprise Backup will continue to apply the MLOG_FILE_* records,
including MLOG_FILE_RENAME2. After apply-log has been executed on a
backup copy and MySQL is started on the copy, InnoDB recovery will continue
from PHASE 1 step 2 below. If a transaction that renamed tablespaces was rolled
back, the apply-log would have performed the renaming based on
MLOG_FILE_RENAME2 records, and InnoDB would roll back the renames based on
DDL_LOG.

Because there are combined DDL+DML transactions (such as CREATE TABLE…SELECT)
we will have to do the redo log apply in 3 phases.

PHASE 1: Recover the data dictionary and all undo logs.

STEP 0: Scan all redo log since the latest checkpoint.

This will construct a complete map of space_id↦filename that will
be consulted in subsequent redo log apply (STEP 1, STEP 4, STEP 6).

If there is no MLOG_CHECKPOINT marker, discard the redo log
(all tablespaces should be in clean state as of the checkpoint).

If there are missing or duplicate *.ibd files referred to by the redo log,
refuse startup. The DBA can delete or rename files, or force recovery
(causing redo log for missing tablespaces to be discarded).
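
A sketch of how STEP 0 could resolve one space_id against its candidate file names. The function resolve_space() and the files_on_disk map (file name to the space_id stored in the file) are hypothetical stand-ins for opening each file and reading its first page; forced recovery is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

enum class Resolve { Ok, Missing, Duplicate };

// `candidates` holds every file name seen for `space_id` in MLOG_FILE_NAME
// records since the latest checkpoint.
Resolve resolve_space(std::uint32_t space_id,
                      const std::set<std::string>& candidates,
                      const std::map<std::string, std::uint32_t>& files_on_disk,
                      std::string* found) {
    assert(found != nullptr);
    int matches = 0;
    for (const std::string& name : candidates) {
        auto it = files_on_disk.find(name);
        // A file that exists but carries another space_id is ignored.
        if (it != files_on_disk.end() && it->second == space_id) {
            if (matches++ == 0) *found = name;
        }
    }
    if (matches == 0) return Resolve::Missing;  // startup would be refused
    return matches == 1 ? Resolve::Ok : Resolve::Duplicate;
}
```

Only files named in the redo log are examined here, which is the point of the design: unrelated *.ibd files in the data directory never enter the candidate set.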

STEP 1: Apply the redo log on the system tablespaces (or all tablespaces).

If not all scanned redo log records fit in memory at once, the log will have
to be applied in batches, and it will have to be applied on all tablespaces.

NOTE: Currently, STEP 1 will recover all tablespaces, and there is no
STEP 3, STEP 4 or STEP 6.

STEP 2: Recover all incomplete transactions from undo logs.

PHASE 2: Roll back incomplete transactions that performed DDL.

STEP 3: (optional, to speed up STEP 4, STEP 6)
Drop to-be-dropped tablespaces, and discard the redo log for them.

NOTE: This is not implemented yet

We could execute this step if the server was killed right after
committing DROP TABLE, ALTER TABLE or TRUNCATE TABLE, before
the post-commit step that actually frees the space.
(For ALTER and TRUNCATE, this refers to the "old copy" of the table.)

STEP 4: Start applying the remaining redo log records.
Later, this can be in the background and can skip
pages that are only modified by DML-only transactions (not modified by any
DDL-only or DDL+DML transactions).

NOTE: STEP 4 is currently already performed as part of STEP 1.

STEP 5: Roll back any incomplete DDL-only or DDL+DML transactions.

NOTE: Before Global Data Dictionary, there are no DDL+DML transactions in
InnoDB. So, at STEP 5 we currently roll back an incomplete DDL transaction
(there can be at most one).

STEP 6: Complete the applying of any remaining redo log records.

NOTE: STEP 6 is currently already performed as part of STEP 1.

NOTE: We could cover all transactions in STEP 4 and STEP 5 above.
The reason why we defer the rollback of DML-only transactions is
because we can. An existing feature of InnoDB crash recovery
is to allow connections as soon as possible, rolling back incomplete
non-DDL transactions in a background thread.

STEP 7: Apply any operations from the DDL_LOG.

NOTE: As of now STEP 7 will drop orphan auxiliary tables for FULLTEXT INDEX, and
drop incomplete or delete-marked indexes (index name starting with
TEMP_INDEX_PREFIX).

PHASE 3: Start non-critical background processes.

STEP 8: Start the background tasks on transactions.
8.1 Start the rollback of incomplete DML-only transactions.
8.2 Start the purge of delete-marked records and undo logs.

NOTE: Rollback and purge will have to perform READ COMMITTED of the DD tables
in order to look up the table definitions by dd.tables.se_private_id.

[End of tasks that could be controlled by maintenance mode.]

STEP 9: Start accepting connections.

EXAMPLE: Recovering from a DDL operation when also "Crash-safe DDL with the
global data dictionary" is implemented.

	ALTER TABLE t ... ALGORITHM=COPY

will be internally implemented like

	BEGIN;
	CREATE TABLE #sql1;
	INSERT INTO #sql1 SELECT ... FROM t;
	RENAME TABLE t TO #sql2, #sql1 TO t;
	DROP TABLE #sql2;
	COMMIT;

In more detail, the file system operations and the DDL_LOG operations will be as follows:

BEGIN;
  -- "prepare" of CREATE TABLE #sql1
  BEGIN; -- InnoDB-internal subtransaction
    INSERT INTO ddl_log SET type='DELETE', old_file_name='#sql1.ibd';
  COMMIT;
  DELETE FROM ddl_log WHERE id=LAST_INSERT_ID();
  creat("#sql1.ibd");
  -- Finally at some point before the COMMIT below, update the data dictionary
  -- for the CREATE TABLE (no commit yet!)
  INSERT INTO DD.* VALUES ...;

  INSERT INTO #sql1 SELECT ... FROM t;

  -- RENAME TABLE t TO #sql2, #sql1 TO t;
  -- "backward" logging for the DDL rollback
  BEGIN; -- InnoDB-internal subtransaction
    INSERT INTO ddl_log SET type='RENAME',
    old_file_name='#sql2.ibd', new_file_name='t.ibd';
  COMMIT;
  DELETE FROM ddl_log WHERE id=LAST_INSERT_ID();
  write MLOG_FILE_NAME; mtr_commit();
  rename("t.ibd", "#sql2.ibd");
  -- "backward" logging for the DDL rollback
  BEGIN;
    INSERT INTO ddl_log SET type='RENAME',
    old_file_name='t.ibd', new_file_name='#sql1.ibd';
  COMMIT;
  DELETE FROM ddl_log WHERE id=LAST_INSERT_ID();
  write MLOG_FILE_NAME; mtr_commit();
  rename("#sql1.ibd", "t.ibd");
  UPDATE DD.* ...;

  -- "commit" of DROP TABLE #sql2
  INSERT INTO ddl_log SET type='DELETE', old_file_name='#sql2.ibd';
COMMIT /* marks the DDL operation committed */;

"post-commit" of DROP TABLE #sql2:
BEGIN;
  DELETE FROM ddl_log WHERE type='DELETE' AND ...;
  MLOG_FILE_DELETE(old_space_id, "#sql2.ibd")
  unlink("#sql2.ibd");
COMMIT;

In the redo log, this will generate the following redo log records and
file system operations for the old_space_id (t.ibd, which is renamed
to #sql2.ibd and then dropped) and new_space_id (#sql1.ibd, which is
renamed to t.ibd):

MLOG_FILE_NAME(new_space_id, "#sql1.ibd") -- for MEB tablespace discovery
... commit of the DDL_LOG.type='DELETE' write for undoing the below
creat("#sql1.ibd")
... optional: some redo log records for the bulk INSERT in new_space_id
... not needed (not even for page allocations) if we flush new_space_id
... before the main COMMIT of the DDL transaction
MLOG_FILE_NAME(old_space_id, "t.ibd") -- once after latest log checkpoint
MLOG_FILE_NAME(old_space_id, "#sql2.ibd") -- flushed before the rename!
... commit of the DDL_LOG.type='RENAME' write for undoing the below
rename("t.ibd", "#sql2.ibd")
MLOG_FILE_RENAME2(old_space_id,FROM t.ibd,TO #sql2.ibd);
MLOG_FILE_NAME(new_space_id, "#sql1.ibd") -- once after latest log checkpoint
MLOG_FILE_NAME(new_space_id, "t.ibd") -- flushed before the rename!
... commit of the DDL_LOG.type='RENAME' write for undoing the below
rename("#sql1.ibd", "t.ibd")
MLOG_FILE_RENAME2(new_space_id,FROM #sql1.ibd,TO t.ibd);
... commit of the DDL operation
... (deletes above DDL_LOG, writes type='DELETE' for removing old_space_id)
MLOG_FILE_DELETE(old_space_id,"#sql2.ibd")
unlink("#sql2.ibd")
... commit of the "post-commit" operation

Now, let us consider redo log apply (PHASE 1 step 1) with this example.

In the above example, InnoDB would first perform the file system
operation and then write redo log about it if it succeeded.

The Hot Backup in MySQL Enterprise Backup (MEB) must obviously replay all of
MLOG_FILE_DELETE
MLOG_FILE_RENAME2
because it is creating a copy of the ‘live’ file system.

The following records will no longer be written, because MEB will also use
MLOG_FILE_NAME for discovering ‘dirty’ tablespaces:
MLOG_FILE_CREATE
MLOG_FILE_CREATE2

If redo log apply sees a mtr_commit() that included a file operation,
it means that the file system operation would already have been performed
successfully.  So, it is not necessary to replay the file operations
in normal InnoDB recovery.

After DDL_LOG based recovery is in place, InnoDB redo log apply will scan
MLOG_FILE_DELETE and MLOG_FILE_RENAME2 records but will not replay them. (Until
then, InnoDB will keep replaying MLOG_FILE_RENAME2, which replaces
MLOG_FILE_RENAME.)

NOTE: Because the MLOG_FILE_DELETE and MLOG_FILE_NAME records
will be used for reconstructing the space_id→filename mapping, we must
emit and flush an MLOG_FILE_DELETE record before attempting to delete a file,
even if we do not know yet whether the deletion will succeed. This is in
preparation for the clean crash recovery semantics introduced in "Crash-safe DDL".

In the past, we would emit the MLOG_FILE_DELETE asynchronously some time
after deleting a file, and silently ignore redo log records that were emitted
for missing tablespace files.

If the server is killed before the mtr_commit() of a file operation gets
flushed to the redo log, the redo log scan might not see the operations, even
though they had been performed in the file system (depending on whether
file system recovery was needed, and how it works). Currently, InnoDB fails
to roll back a RENAME operation in the file system, and it can fail to delete
*.ibd files when recovering from a crash during ALTER TABLE.

The DDL_LOG based rollback would return the data directory to a
consistent state:

• In case of creating a file (MLOG_FILE_NAME), the insertion of a
DDL_LOG.type='DELETE' record for undoing the creation would already have been
committed, and the commit flushed to the redo log, before we
start creating the file.
• In case of MLOG_FILE_RENAME2, the commit of inserting a
DDL_LOG.type='RENAME' record for undoing the rename would already have
been flushed to the redo log, before we start renaming the file.
• In case of MLOG_FILE_DELETE, as noted above we will write out the
MLOG_FILE_DELETE record before actually deleting. Even if we did not do
this, the removal of the DDL_LOG.type='DELETE' record is not committed
until after we have deleted the file (and written out a
MLOG_FILE_DELETE record).
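
The ordering argument for file creation can be checked with a small simulation. This is a sketch under assumptions: Disk, create_then_crash() and recover() are hypothetical, only the durable state is modeled, and the four steps correspond to the "prepare" of CREATE TABLE #sql1 in the example above (commit of the DDL_LOG insert, uncommitted delete of that entry, creat(), main COMMIT).

```cpp
#include <cassert>
#include <set>
#include <string>

// Durable state only: what survives a crash.
struct Disk {
    std::set<std::string> files;           // files in the file system
    std::set<std::string> ddl_log_delete;  // committed DDL_LOG type='DELETE'
};

// Runs the first `steps_done` steps of creating `file`, then "crashes".
Disk create_then_crash(const std::string& file, int steps_done) {
    assert(steps_done >= 0 && steps_done <= 4);
    Disk d;
    if (steps_done >= 1) d.ddl_log_delete.insert(file);  // subtransaction commit
    // Step 2 deletes the DDL_LOG entry inside the main transaction. It is
    // not durable until step 4, so it leaves no trace in `d`.
    if (steps_done >= 3) d.files.insert(file);           // creat()
    if (steps_done >= 4) d.ddl_log_delete.erase(file);   // main COMMIT
    return d;
}

// DDL_LOG based recovery: undo every creation that was never committed.
void recover(Disk& d) {
    for (const std::string& f : d.ddl_log_delete) d.files.erase(f);
    d.ddl_log_delete.clear();
}
```

In this model, a crash after any of the first three steps leaves no file behind once recover() has run, and a crash after the main COMMIT leaves the file in place. The same reasoning would apply to the RENAME entries, with a rename back instead of a delete.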

After InnoDB crash recovery step 1 (redo log apply), because InnoDB
will be ignoring the MLOG_FILE_RENAME2 and MLOG_FILE_DELETE operations,
we will end up with a set of files that may be
subject to some ‘rollback’ or ‘roll-forward’ operations.

There could be up to one file system operation that was not covered by
a MLOG_FILE_RENAME2 record. The operation should be covered by a
flushed commit of inserting a corresponding DDL_LOG.type='RENAME', so
that the operation can be rolled back or rolled forward in recovery
PHASE 2 step 3 or 7.

If we are starting up MySQL on a restored hot backup after
--apply-log, we should have a situation that is similar to normal
InnoDB crash recovery step 1.

In summary, it should not make a difference if PHASE 1 step 1 was
executed by MySQL/InnoDB startup, or if the step was avoided because
the tablespace files were ‘cleaned’ by MEB --apply-log.  Either way,
all subsequent recovery steps will lead to a consistent result.

When the setting innodb_file_per_table=ON was introduced in MySQL 4.1,
InnoDB crash recovery was changed so that the directories will be
searched for *.ibd files if any redo needs to be applied.

The scanning and opening of all *.ibd files (including ones for which
no redo log needs to be applied) can be very slow, especially on
deployments that contain a large number of *.ibd files. Furthermore,
if we allow a more liberal placement of tablespace files in the file
system, we might have to extend the search to an even broader range of
directories.

This worklog eliminates the *.ibd file scan by guaranteeing the
following:

If there are redo log records for any non-predefined tablespace, there
will also be an MLOG_FILE_NAME record.

The InnoDB redo log format will be changed as follows:

MLOG_FILE_NAME(space_id, filename): A new redo log record.
Replaces MLOG_FILE_CREATE, MLOG_FILE_CREATE2.

MLOG_FILE_RENAME2(space_id, old, new): The names will be file names
(directory/databasename/tablename.ibd). Replaces MLOG_FILE_RENAME,
which used table names (databasename/tablename).

NOTE: We will write MLOG_FILE_NAME once since the latest redo log
checkpoint. Immediately after a checkpoint, the log may contain some
MLOG_FILE_NAME records that were "copied across the checkpoint" and a
MLOG_CHECKPOINT marker to signal the end of a checkpoint.

On redo log apply during crash recovery, we will scan the log up to
three times:

Recovery scan 1: Look for the first MLOG_CHECKPOINT marker since
the latest checkpoint.

If there is no MLOG_CHECKPOINT, we will skip the entire log. The
data files will correspond to the system state as of the checkpoint.

Recovery scan 2: Read the redo log since the latest checkpoint. Copy
scanned records to recv_sys->addr_hash, and construct a map of
recv_spaces, based on MLOG_FILE_NAME and MLOG_FILE_DELETE records.

Before applying the records from recv_sys->addr_hash, we will check if
any tablespace files are missing. If there are missing tablespaces, we
will refuse to start up, so that the DBA can intervene, for example to
manually rename files. This new safeguard of WL#7142 can be disabled
by setting innodb_force_recovery.

If not all redo log records fit in recv_sys->addr_hash, we will need a
third log scan:

Recovery scan 3: Read the redo log since the latest checkpoint. If
recv_sys->addr_hash fills up, apply the batch of log records and read
a new one.
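
The number of scans and apply batches implied by the three recovery scans can be summarized in a small function. This is a sketch: ScanPlan and plan_recovery() are hypothetical, and the capacity of recv_sys->addr_hash is modeled as a plain record count, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>

struct ScanPlan {
    int scans;            // how many times the log is read
    std::size_t batches;  // how many apply batches are needed
};

ScanPlan plan_recovery(bool checkpoint_marker_found, std::size_t n_records,
                       std::size_t hash_capacity) {
    assert(hash_capacity > 0);
    if (!checkpoint_marker_found) {
        // Scan 1 found no MLOG_CHECKPOINT: the entire log is skipped.
        return ScanPlan{1, 0};
    }
    // Round up: each batch applies at most `hash_capacity` records.
    const std::size_t batches =
        (n_records + hash_capacity - 1) / hash_capacity;
    // Scan 2 is enough if everything fits; otherwise scan 3 re-reads the
    // log and applies it batch by batch.
    return ScanPlan{n_records <= hash_capacity ? 2 : 3, batches};
}
```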

mlog_id_t: Remove MLOG_FILE_CREATE, MLOG_FILE_CREATE2, MLOG_FILE_RENAME.
Add MLOG_FILE_NAME, MLOG_FILE_RENAME2, MLOG_CHECKPOINT.

MLOG_FILE_FLAG_TEMP: Remove. This was a flag for MLOG_FILE_CREATE*.

enum dict_check_t: Remove DICT_CHECK_ALL_LOADED. Crash recovery no
longer loads all tablespaces.

mtr_t::m_named_space: Associates a tablespace with a
mini-transaction. A mini-transaction may be associated with up to one
non-predefined tablespace. It may also modify predefined tablespaces
for change buffering and undo logging.

mtr_t::set_named_space(ulint space): Sets m_named_space.
This must be called when a mini-transaction is going to modify a
non-predefined tablespace.

mtr_t::is_named_space(ulint space): Checks if the mini-transaction is
associated with a given tablespace.

mtr_write_log_t: Add a parameter for the number of bytes to append.
mtr_write_log_t::operator(): Stop appending when the limit is reached.

mtr_t::Command::prepare_write(): Write MLOG_FILE_NAME records if
needed. This is executed as part of mtr_commit().
To write MLOG_FILE_NAME, we will invoke fil_names_write() for
non-predefined persistent tablespaces. After log_mutex_enter(),
discard the data appended by fil_names_write() based on the result of
fil_names_dirty(). Return the number of bytes to append, instead of a
Boolean. 0 means that finish_write() should not be called.
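
The speculative write and conditional discard described for prepare_write() might look roughly like this. It is a sketch, not the InnoDB implementation: Space, Mtr and names_dirty() are simplified stand-ins, the log buffer is a std::string, and no mutex is actually taken.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct Space {
    std::string file;
    std::uint64_t max_lsn = 0;  // 0: no MLOG_FILE_NAME since the checkpoint
};

struct Mtr {
    std::string buf;               // log bytes of the mini-transaction
    Space* named_space = nullptr;  // at most one non-predefined tablespace
};

// Stand-in for fil_names_dirty(): returns true if the name must be kept.
bool names_dirty(Space& s, std::uint64_t current_lsn) {
    const bool was_clean = (s.max_lsn == 0);
    s.max_lsn = current_lsn;
    return was_clean;
}

// Returns the number of bytes to append; 0 means nothing to write.
std::size_t prepare_write(Mtr& mtr, std::uint64_t current_lsn) {
    assert(current_lsn != 0);
    std::size_t len = mtr.buf.size();
    if (mtr.named_space == nullptr) return len;
    // Speculative append, done before the log mutex would be acquired.
    mtr.buf += "MLOG_FILE_NAME(" + mtr.named_space->file + ")";
    // From here on the real code would hold the log mutex.
    if (names_dirty(*mtr.named_space, current_lsn)) {
        len = mtr.buf.size();  // first write since checkpoint: keep the name
    } else {
        mtr.buf.resize(len);   // already named: discard the speculative bytes
    }
    return len;
}
```

The design choice illustrated here is that building the name record does not need the log mutex, so only the cheap keep-or-discard decision happens while it is held.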

mtr_t::Command::finish_write(): Take the number of bytes to append as
a parameter.

mtr_t::commit_checkpoint(): A special method to emit redo log records
to the redo log buffer when the caller already invoked
log_mutex_enter(). This is only used by fil_names_clear().

fil_space_t::max_lsn: LSN of the most recent fil_names_write() call,
or 0 if the tablespace has not been dirtied since fil_names_clear().
Protected by log_sys->mutex or fil_system->mutex.

fil_space_t::named_spaces, fil_system_t::named_spaces: List of
tablespaces for which MLOG_FILE_NAME has been written since the latest
checkpoint. Protected by fil_system->mutex.

recv_sys_t::mlog_checkpoint_lsn: The LSN of the first scanned
MLOG_CHECKPOINT record, or 0 if none was read yet.

fil_space_create(): If a duplicate tablespace name is found, do
not silently free the existing tablespace, but instead return
an error.

fil_space_free(): Make this an externally callable function, to free a
tablespace from the cache when applying MLOG_FILE_DELETE.

fil_space_free_low(): Renamed from fil_space_free(). The new wrapper
fil_space_free() will acquire fil_system->mutex.

fil_op_log_parse_or_replay(): Change the order of parameters. Remove
log_flags, and rename parse_only to replay. We no longer attempt to
replay log records of a multi-item mini-transaction, unless the
MLOG_MULTI_REC_END was seen.

fil_delete_tablespace(): Write a MLOG_FILE_DELETE record before
attempting to delete the file.

fil_rename_tablespace(): Change the function signature. Take old_path,
new_name, new_path_in. MLOG_FILE_RENAME2 is logging file names, not
table names like MLOG_FILE_RENAME was. Also invoke fil_name_write().

enum fil_load_status: Outcomes of fil_load_single_table_tablespace().

fil_load_single_table_tablespace(): Do not exit on failure. Instead,
return a status value to the caller. Also, ignore *.isl files.
  
fil_load_single_table_tablespaces(): Remove. We no longer try to load
all *.ibd files.

fil_create_new_single_table_tablespace(): Do not write any
MLOG_FILE_CREATE or MLOG_FILE_CREATE2. Instead, invoke
fil_name_write() to write MLOG_FILE_NAME.

fil_mtr_rename_log(): Change the signature. Take dict_table_t instead
of names. Take a tmp_name.

fil_names_write_low(): Write MLOG_FILE_NAME record(s) for a
tablespace.
In fil_names_clear(), the fil_space_t will be protected by
fil_system->mutex.
In fil_names_write(), the fil_space_t will be protected by a buffer-fix
on some tablespace pages.

fil_names_write(): Look up a tablespace and write MLOG_FILE_NAME record(s).
This is speculatively called during mtr_commit() before log_mutex_enter().

fil_names_dirty(): Update space->max_lsn while only holding log_sys->mutex.
If max_lsn was 0, add the space to the named_spaces list, and tell the caller
not to discard the records that were appended by fil_names_write().

fil_names_clear(): Write MLOG_FILE_NAME records and MLOG_CHECKPOINT on
a log checkpoint or at system startup. If do_write=true, writes
MLOG_CHECKPOINT even if no MLOG_FILE_NAME was written.
Reset those fil_space_t::max_lsn for which fil_names_write() has not
been invoked after the checkpoint LSN. Return true to the caller if
any redo log was written.

fil_op_write_log(): Replace log_flags with first_page_no, and replace
table names with file paths. The parameter first_page_no is currently
being passed as 0, because we do not have non-predefined multi-file
tablespaces yet.

fil_name_write(): Write an MLOG_FILE_NAME record for a file.

Datafile::open_read_only(): Add the parameter bool strict.

Datafile::validate_for_recovery(), Datafile::validate_first_page():
Return DB_TABLESPACE_EXISTS on duplicate space_id.

Datafile::init(): Add a variant that takes ownership of the "name",
and allows filepath to be initialized.
  
Datafile::shutdown(): Remove a redundant check for m_name!=NULL.
free(NULL) is documented as no-op in the C standard.

is_predefined_tablespace(): Check if a tablespace is a predefined one
(system tablespace, undo tablespace or shared temporary tablespace).

enum recv_addr_state: Add RECV_DISCARDED, so that buffered redo log
records can be retroactively deleted if an MLOG_FILE_DELETE was
later recovered for a tablespace.
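
The retroactive discard could work along these lines. This is a sketch with hypothetical names (State, Entry, RecvHash); the real recv_sys->addr_hash is a hash table with more states, whereas this model uses an ordered map keyed by (space_id, page_no).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class State { NotProcessed, Discarded };

struct Entry {
    State state = State::NotProcessed;
    std::vector<std::string> recs;  // buffered redo records for one page
};

class RecvHash {
public:
    typedef std::pair<std::uint32_t, std::uint32_t> Key;  // (space_id, page_no)

    void add(std::uint32_t space, std::uint32_t page, const std::string& rec) {
        // Records that follow an MLOG_FILE_DELETE are allowed but ignored.
        if (deleted_.count(space) != 0) return;
        hash_[Key(space, page)].recs.push_back(rec);
    }

    // Called when MLOG_FILE_DELETE is scanned for `space`.
    void on_file_delete(std::uint32_t space) {
        deleted_.insert(space);
        // All pages of one space are adjacent in the ordered map.
        for (auto it = hash_.lower_bound(Key(space, 0));
             it != hash_.end() && it->first.first == space; ++it) {
            it->second.state = State::Discarded;
        }
    }

    // Number of pages that still have records to apply.
    std::size_t pages_to_apply() const {
        std::size_t n = 0;
        for (const auto& kv : hash_) {
            if (kv.second.state == State::NotProcessed) ++n;
        }
        return n;
    }

private:
    std::map<Key, Entry> hash_;
    std::set<std::uint32_t> deleted_;
};
```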

btr_free_but_not_root(), btr_free_root(): Call fsp_names_write().

btr_cur_ins_lock_and_undo(), btr_cur_optimistic_insert(),
btr_cur_pessimistic_insert(), btr_cur_update_in_place(),
btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
btr_cur_del_mark_set_clust_rec_log(),
btr_cur_del_mark_set_clust_rec(), btr_cur_optimistic_delete_func(),
btr_cur_pessimistic_delete(): Call fsp_names_write() after successful
locking and undo logging.

btr_store_big_rec_extern_fields(), btr_free_externally_stored_field(),
row_ins_index_entry_big_rec_func(): Call fsp_names_write().

dict_build_tablespace(), dict_create_index_tree_step(),
dict_recreate_index_tree(), fil_reinit_space_header(): Call
fsp_names_write().

page_cur_insert_rec_write_log(),
page_copy_rec_list_to_created_page_write(),
page_cur_delete_rec_write_log(), page_cur_delete_rec(), page_create():
Assert that fsp_names_write() has been called.

dict_table_rename_in_cache(): Pass old_path to
fil_rename_tablespace().

dict_check_tablespaces_and_store_max_id(): Remove the logic for
DICT_CHECK_ALL_LOADED. We could probably remove this entire function,
given that the maximum is also stored in the DICT_HDR page.

mlog_write_initial_log_record_low(): Replaces
mlog_write_initial_log_record_for_file_op(). If some page redo log is
being written, assert that fsp_names_write() has been called.

log_checkpoint(): Before invoking log_write_up_to(), invoke
fil_names_clear() to copy any MLOG_FILE_NAME records across the
checkpoint. Flush the log up to the MLOG_CHECKPOINT marker, instead of
only up to the checkpoint LSN. Without this step, the log between
oldest_lsn and log_sys->lsn would be essentially corrupted (missing
MLOG_FILE_NAME records on redo log apply). When the redo log scanner
sees the first MLOG_CHECKPOINT since the latest checkpoint, it knows
that there must be no missing MLOG_FILE_NAME record for any page
operation on a non-predefined tablespace. If the MLOG_CHECKPOINT
marker is missing, no redo log will be applied, and the system would
be at the state of the checkpoint.

log_reserve_and_write_fast(): Do not write MLOG_LSN after
a MLOG_CHECKPOINT marker, so that we will not get bogus
warnings about the data files being newer than the redo log.

fil_name_parse(): New function, to update the recv_spaces map based on
MLOG_FILE_NAME and MLOG_FILE_DELETE records during recovery.

recv_parse_or_apply_log_rec_body(), recv_parse_log_rec(): Add the
parameter "apply". Do not apply file-level redo log records unless the
entire mini-transaction has been recovered. Fail if an MLOG_FILE_NAME
record is missing for a page-level operation.

recv_recover_page_func(): Assert that no LSN is after the latest
scanned redo log LSN.

recv_parse_log_rec(): Check for some more log corruption.

recv_parse_log_recs(): Add a parameter "store_to_hash" to control
whether the records should be stored into recv_sys->addr_hash.
Add a parameter "apply" to specify whether log records should be applied
(apply=false during the first scan for MLOG_CHECKPOINT). Return true
if an MLOG_CHECKPOINT record was seen for the first time.
Improve DBUG_PRINT output, and detect some more log corruption.

recv_scan_log_recs(): Add a parameter "store_to_hash" to control
whether the records should be stored into recv_sys->addr_hash.

recv_group_scan_log_recs(): Initialize the variables and data
structures to begin reading redo log records. Add a parameter
"last_phase" that is set when a multi-pass recovery is needed and we
are scanning the redo log for a third time. In last_phase, we will
invoke recv_apply_hashed_log_recs() to empty recv_sys->addr_hash
between passes. If last_phase=false, we would stop filling
recv_sys->addr_hash, only processing file-level redo log records.

recv_init_crash_recovery(): Split some code into
recv_init_crash_recovery_spaces(), to be invoked after the first call
to recv_group_scan_log_recs().

recv_recovery_from_checkpoint_start(): Invoke
recv_group_scan_log_recs() up to 3 times if needed.
After processing all redo log, write an MLOG_CHECKPOINT marker
so that if we crash before making a checkpoint, the log
will be replayed by subsequent crash recovery.

checkpoint_now_set(): Avoid an infinite loop in case an MLOG_CHECKPOINT
marker is the only thing that was written since the latest checkpoint.