WL#6045: Improve Innochecksum

Affects: Server-5.7 — Status: Complete

Description
High Level Architecture
Low Level Design

This WL enhances the functionality of the innochecksum utility.
For details of the current behaviour, see
http://dev.mysql.com/doc/refman/5.6/en/innochecksum.html

WL's Requirements:
R01 - It shall be possible to specify the checksum algorithm to the
innochecksum utility.
R02 - innochecksum shall provide an option to rewrite the current checksum
using the specified algorithm.
R03 - innochecksum shall provide an option to rewrite the checksum even if the
current checksum is invalid.
R04 - innochecksum shall allow configuring the maximum number of checksum
mismatches allowed before terminating the program.
R05 - innochecksum shall operate on multiple tablespace files.
R06 - innochecksum shall operate on multiple files in the same tablespace.
R07 - innochecksum shall operate on files greater than 2GB.
R08 - innochecksum shall produce page type summaries for the pages in the
file, and shall dump the information to standard output (stdout) or standard
error (stderr) as it goes.
R09 - The debug option of the innochecksum tool shall change so that it works
in the same way as the mysqld debug option.

a1) Introduce a new innochecksum option named "strict-check" (-C for the short
option) that strictly verifies the checksum against the algorithm specified by
the user, chosen from the possible values in "innochecksum_algorithms".
Example:
     [1] ./bin/innochecksum --strict-check=innodb t1.ibd
     [2] ./bin/innochecksum -C crc32 t1.ibd
Both examples above verify strictly against the specified checksum algorithm.
     [3] ./bin/innochecksum t1.ibd
This accepts a match against any of the checksum algorithms (innodb, crc32 and
none).
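
The difference between examples [1]/[2] and example [3] can be sketched as
follows. This is only an illustration, not the innochecksum implementation:
checksum_matches(), the algo_t enum and the calc array are hypothetical names,
and the per-algorithm checksum values are assumed to be calculated elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm identifiers used only in this sketch. */
typedef enum {
  ALGO_CRC32 = 0,
  ALGO_INNODB = 1,
  ALGO_NONE = 2,
  ALGO_COUNT = 3
} algo_t;

/* Returns true if the checksum stored in the page is acceptable.
   stored: checksum read from the page
   calc:   checksums calculated for the page, indexed by algo_t
           (assumed to be computed elsewhere)
   strict: true when --strict-check was given
   algo:   algorithm named by --strict-check; ignored when !strict */
bool checksum_matches(uint32_t stored, const uint32_t calc[ALGO_COUNT],
                      bool strict, algo_t algo)
{
  assert(calc != NULL);

  if (strict) {
    /* Only the algorithm named by the user may match. */
    return stored == calc[algo];
  }

  /* Without --strict-check, any algorithm is allowed to match. */
  for (int i = 0; i < ALGO_COUNT; i++) {
    if (stored == calc[i]) {
      return true;
    }
  }
  return false;
}
```

A page whose stored checksum equals the "innodb" value therefore passes a
plain run and --strict-check=innodb, but fails --strict-check=crc32.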

a2) Introduce a new innochecksum option named "no-check" (-n for the short
option). It bypasses checksum verification. This option must only be accepted
when the --write option is also provided, so that an invalid checksum can be
fixed. If --write is not given along with --no-check, abort the program.
Example:
      [1] ./bin/innochecksum --no-check t1.ibd
In this case, terminate the program with an error message stating that
--no-check must be associated with the --write option.

      [2] ./bin/innochecksum --strict-check=innodb --no-check t1.ibd
This is a conflicting combination, so terminate the program with an error
message.

      [3] ./bin/innochecksum --no-check --write=innodb t1.ibd

Note:
[A] When no --strict-check option is specified, any of the checksum algorithms
is allowed to match.
[B] --strict-check and --no-check cannot be used together because they
conflict; in this case abort the program with the error message
"Error: --strict-check cannot be used together with --no-check."
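
The option rules in a1) and a2) can be sketched as a single validation step
run after option parsing. options_are_valid() is a hypothetical helper, and
the wording of the second message is an assumption; only the first message is
given in this WL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the option combination is
   acceptable. Otherwise stores the message to print in *err and
   returns false, so that the caller can abort. */
bool options_are_valid(bool strict_check, bool no_check, bool write_opt,
                       const char **err)
{
  assert(err != NULL);
  *err = NULL;

  if (strict_check && no_check) {
    /* Message text taken from note [B]. */
    *err = "Error: --strict-check cannot be used together with --no-check.";
    return false;
  }

  if (no_check && !write_opt) {
    /* Wording assumed for this sketch; the rule is from example [1]. */
    *err = "Error: --no-check must be associated with --write option.";
    return false;
  }

  return true;
}
```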

b) Introduce a new option named "write" (-w for the short option) that
rewrites the checksum using the algorithm specified by the user, chosen from
the possible values in "innochecksum_algorithms".
Note: If checksum verification fails and more mismatches have been found than
the --allow-mismatches option permits, innochecksum terminates without
rewriting the checksum requested with the "write" option.
Example:
./bin/innochecksum --strict-check=innodb --write=crc32 t1.ibd
./bin/innochecksum  -w crc32 t1.ibd

Note: The --write option must be given for the checksum to be rewritten.

c) Introduce the new options "--page-type-summary" (-S for the short option)
and "--page-type-dump" (-D for the short option).
"--page-type-summary" prints page type summaries for the pages in the file,
and "--page-type-dump" dumps the information for each page to stdout or stderr
as it goes.

d) Introduce a tolerance for the maximum number of checksum mismatches as
"--allow-mismatches" (-a for the short option). The default value is 0.
Use cases:
[1] If --allow-mismatches=N (where N>=0), up to N mismatches are allowed and
the program aborts at the (N+1)th.
See the example in the LLD below.

e) The file access calls currently used are simple and have problems. They
should be replaced with those in os0file.cc so that files larger than 2GB can
be read on 32-bit systems and so that innochecksum can take an advisory lock
on the file, as InnoDB does. That would prevent failures (reads of
half-written pages) when InnoDB is accessing the same file.
Example:
The C standard library functions ftell() and fseek() use the 'long' type for
file offsets. This limits the file size to 2^31 bytes on 32-bit systems. We
have to use something else, such as fgetpos() and fsetpos(), which do not have
this file size limit.
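
The 2^31 limit can be illustrated with the page offset arithmetic alone. The
sketch below is not taken from innochecksum: page_offset() and
fits_in_32bit_long() are hypothetical helpers, and the 16384-byte page size
used in the note after the code is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of a page within the file, computed in 64 bits so that the
   multiplication cannot wrap for large files. */
uint64_t page_offset(uint64_t page_no, uint32_t page_size)
{
  assert(page_size > 0);
  return page_no * (uint64_t) page_size;
}

/* True if the offset is representable in a 32-bit signed 'long', that is,
   reachable with fseek()/ftell() on a system where 'long' is 32 bits. */
bool fits_in_32bit_long(uint64_t offset)
{
  return offset <= (uint64_t) INT32_MAX;
}
```

With 16384-byte pages, page 131071 starts at offset 2147467264 and is still
addressable through a 32-bit 'long', while page 131072 starts at exactly 2^31
and is not.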

f) Read from stdin for the innochecksum tool.
Example:
    i) cat t1.ibd | ./bin/innochecksum  -
    Note: The "-" argument specifies that the input is read from standard
    input. If "-" is missing when a read from stdin is expected, innochecksum
    reports an error and prints its usage message.

    ii) cat t1.ibd | ./bin/innochecksum --write=crc32 -
    Note: The pages rewritten with the checksum algorithm given to --write go
    to stdout.
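
Handling of the "-" argument can be sketched as below. open_input() and
read_page() are hypothetical helpers written for this illustration; they are
not the functions used by innochecksum.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns stdin when the file argument is "-", otherwise opens the named
   file for binary reading. Returns NULL if fopen() fails. */
FILE *open_input(const char *name)
{
  assert(name != NULL);
  if (strcmp(name, "-") == 0) {
    return stdin;
  }
  return fopen(name, "rb");
}

/* Reads one page from the stream. Returns 1 if a full page was read,
   0 at end of input or on a short read. */
int read_page(FILE *in, unsigned char *buf, size_t page_size)
{
  assert(in != NULL && buf != NULL);
  return fread(buf, 1, page_size, in) == page_size;
}
```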

------------------------------------
The following behaviour changes from the old innochecksum tool:

a) Debug option: --debug ( -d for short option)

  1) This can be configured in the same way as the mysqld debug option.
  All output from innochecksum is now directed to the trace file (which is
  passed as part of the debug option).

  Default location of trace file:
  Unix-like systems:
  -----------------
  /tmp/innochecksum.trace
   Example: ./bin/innochecksum --debug t1.ibd
            ./bin/innochecksum --debug=d:o,/tmp/inno.ibd t1.ibd  

   Windows:
   --------
   [To be tested]

   2) If debug option is not passed, no trace file is created.

b) innochecksum shall take an advisory lock on the .ibd file. With the --write
   option an exclusive lock is obtained, and *without* the --write option a
   read lock is obtained.

   1) innochecksum can be run concurrently on the same .ibd file only if
      each instance obtains a *read* lock (i.e. runs in read-only mode).

   2) If innochecksum obtains an exclusive lock (i.e. with the --write
      option), other instances of the innochecksum tool on the same .ibd file
      will fail with the following error and terminate:
    "fcntl: Resource temporarily unavailable"

   3) If innochecksum holds any lock on the .ibd file and the server
      tries to acquire the exclusive lock on the same .ibd file,
      the server will crash because it cannot acquire the lock.

      The error message is:
       "[ERROR] InnoDB: Unable to lock .ibd, error: 11"

     Note: The server tries to acquire the lock on the .ibd file only when it
     accesses the file (any DDL or DML query). Until then, no lock is held
     on the .ibd file.

   4) If the server acquires the lock first, innochecksum will terminate with
      the following error:
     "fcntl: Resource temporarily unavailable"
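
The locking rule above can be sketched with POSIX fcntl() record locks. This
is a minimal illustration for Unix-like systems, not the code used by
innochecksum or os0file.cc; lock_type_for() and lock_whole_file() are
hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Lock type for the run mode: exclusive with --write, shared otherwise. */
short lock_type_for(bool write_mode)
{
  return write_mode ? F_WRLCK : F_RDLCK;
}

/* Tries to lock the whole file without blocking. Returns 0 on success and
   -1 on failure with errno set (EAGAIN or EACCES when another process
   holds a conflicting lock). The descriptor must be open for writing to
   take the exclusive lock and for reading to take the shared lock. */
int lock_whole_file(int fd, bool write_mode)
{
  struct flock lk = {0};

  assert(fd >= 0);
  lk.l_type = lock_type_for(write_mode);
  lk.l_whence = SEEK_SET;
  lk.l_start = 0;
  lk.l_len = 0; /* a length of 0 means "until end of file" */

  return fcntl(fd, F_SETLK, &lk);
}
```

F_SETLK returns immediately instead of waiting, which matches the behaviour
described in 2) and 4), where the second locker fails rather than blocks.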

1) Possible values for checksum verification with --strict-check are:
"innodb", "crc32", "none".
/** Possible values for "check_algorithms" for strict checksum verification. */
static const char *innochecksum_algorithms[]=
{
  "crc32",
  "crc32",
  "innodb",
  "innodb",
  "none",
  "none",
  NullS
};
The array is laid out this way to be compatible with srv_checksum_algorithm_t.
Note: For each specified algorithm, the innochecksum tool checks strictly
against that checksum algorithm when --strict-check is used.
With the "none" checksum algorithm, the magic checksum 0xdeadbeef is required;
checksum validation is not disabled.
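
One way to read the duplicated entries is sketched below, under the assumption
that srv_checksum_algorithm_t lists each algorithm as a non-strict value
immediately followed by its strict variant. find_algorithm() and
strict_algorithm_index() are hypothetical helpers, not server functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as innochecksum_algorithms[], with NULL instead of NullS. */
static const char *algorithm_names[] = {
  "crc32", "crc32", "innodb", "innodb", "none", "none", NULL
};

/* Returns the index of the first entry equal to name, or -1 if the name is
   not a known algorithm. */
int find_algorithm(const char *name)
{
  assert(name != NULL);
  for (int i = 0; algorithm_names[i] != NULL; i++) {
    if (strcmp(algorithm_names[i], name) == 0) {
      return i;
    }
  }
  return -1;
}

/* Index of the strict variant, assuming (for this sketch) that it directly
   follows the non-strict entry. Returns -1 for an unknown name. */
int strict_algorithm_index(const char *name)
{
  int i = find_algorithm(name);
  return i < 0 ? -1 : i + 1;
}
```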

2) Possible values for rewriting the checksum with --write are:
"innodb", "crc32", "none".
We will rewrite whole pages on disk, not only the checksum fields. If the write
is not going to change the on disk contents, then we should skip the write to
minimize IO. We should leave all-zero pages untouched.
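
The write-skipping rule can be sketched as a comparison between the page as
read from disk and the page after the checksum update. page_is_all_zero() and
page_needs_write() are hypothetical helpers for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if every byte of the page is zero. */
bool page_is_all_zero(const unsigned char *page, size_t size)
{
  assert(page != NULL);
  for (size_t i = 0; i < size; i++) {
    if (page[i] != 0) {
      return false;
    }
  }
  return true;
}

/* Decides whether the updated page has to be written back.
   on_disk: page contents as read from the file
   updated: page contents after the checksum was rewritten in memory */
bool page_needs_write(const unsigned char *on_disk,
                      const unsigned char *updated, size_t size)
{
  assert(updated != NULL);

  /* All-zero pages are left untouched. */
  if (page_is_all_zero(on_disk, size)) {
    return false;
  }

  /* Skip the write when it would not change the on-disk contents. */
  return memcmp(on_disk, updated, size) != 0;
}
```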

a) For checksum verification, use buf_page_is_corrupted() for both compressed
and uncompressed pages.
For compressed page verification, buf_page_is_corrupted() calls
page_zip_verify_checksum().

Modification:
In buf_page_is_corrupted() and page_zip_verify_checksum():
-- The innochecksum tool prints messages/information when the "debug" option
is enabled by the user, so add print statements to both functions that print
the status/information only for the innochecksum tool.
#ifdef UNIV_INNOCHECKSUM will be used to guard the printing of information
about the checksums.

b) Function prototype for rewriting the checksum for the "write" option.
/********************************************************************//**
Rewrite the checksum for the page. */
UNIV_INTERN
int
update_checksum(
/*=============*/
	byte*	page,			/*!< in/out: page */
	ulong	physical_page_size,	/*!< in: page size in bytes on disk */
	bool	iscompressed);		/*!< in: true for a compressed page,
					false for an uncompressed page */
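
For the "none" algorithm on an uncompressed page, the rewrite amounts to
storing the magic value in the checksum fields. The sketch below is not
update_checksum() itself: stamp_none_checksum() is a hypothetical name, and
the field positions (first 4 bytes of the page, and the first 4 bytes of the
8-byte page trailer, both big-endian) are assumptions about the page layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CHECKSUM_MAGIC      0xDEADBEEFu /* magic required for "none" */
#define HEADER_CHECKSUM_OFFSET 0           /* assumed: start of the page */
#define TRAILER_SIZE           8           /* assumed: last 8 bytes */

/* Stores a 32-bit value in big-endian byte order. */
static void write_be32(unsigned char *p, uint32_t v)
{
  p[0] = (unsigned char) (v >> 24);
  p[1] = (unsigned char) (v >> 16);
  p[2] = (unsigned char) (v >> 8);
  p[3] = (unsigned char) v;
}

/* Writes the "none" magic into the header checksum field and into the
   checksum field at the start of the page trailer. */
void stamp_none_checksum(unsigned char *page, size_t page_size)
{
  assert(page != NULL);
  assert(page_size >= 2 * TRAILER_SIZE);

  write_be32(page + HEADER_CHECKSUM_OFFSET, NO_CHECKSUM_MAGIC);
  write_be32(page + page_size - TRAILER_SIZE, NO_CHECKSUM_MAGIC);
}
```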
	

c) For page type summaries for each page in the file.
/********************************************************************//**
Parse the page and collect its page type for the summary. */
void
parse(
/*===*/
        const byte* page, /*!< in: buffer page */
        FILE* f);         /*!< in/out: output stream for diagnostics */


d) For the --page-type-summary option.
It prints the page types found in the input .ibd file.

The page types are:
=========================
FIL_PAGE_TYPE_ALLOCATED 
FIL_PAGE_INDEX 
FIL_PAGE_UNDO_LOG
FIL_PAGE_INODE  
FIL_PAGE_IBUF_FREE_LIST  
FIL_PAGE_IBUF_BITMAP
FIL_PAGE_TYPE_SYS 
FIL_PAGE_TYPE_TRX_SYS
FIL_PAGE_TYPE_FSP_HDR
FIL_PAGE_TYPE_XDES
FIL_PAGE_TYPE_BLOB
FIL_PAGE_TYPE_ZBLOB
=======================
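
Counting pages per type can be sketched as below. This is not the parse()
function: count_page_type() and page_type_counts are hypothetical, and the
numeric details (a 2-byte big-endian type field at byte offset 24, the value
17855 for FIL_PAGE_INDEX, and the values 0..11 for the remaining types) are
assumptions that should be checked against the InnoDB headers.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_TYPE_OFFSET 24    /* assumed offset of the page type field */
#define PAGE_TYPE_INDEX  17855 /* assumed value of FIL_PAGE_INDEX */
#define SMALL_TYPE_MAX   11    /* assumed: remaining types are 0..11 */

typedef struct {
  unsigned long index;                       /* index pages */
  unsigned long by_type[SMALL_TYPE_MAX + 1]; /* types 0..11 */
  unsigned long other;                       /* anything else */
} page_type_counts;

/* Reads the 2-byte big-endian page type from the page header. */
unsigned read_page_type(const unsigned char *page)
{
  assert(page != NULL);
  return ((unsigned) page[PAGE_TYPE_OFFSET] << 8)
         | page[PAGE_TYPE_OFFSET + 1];
}

/* Adds one page to the summary counters. */
void count_page_type(const unsigned char *page, page_type_counts *counts)
{
  unsigned type = read_page_type(page);

  assert(counts != NULL);
  if (type == PAGE_TYPE_INDEX) {
    counts->index++;
  } else if (type <= SMALL_TYPE_MAX) {
    counts->by_type[type]++;
  } else {
    counts->other++;
  }
}
```

Calling count_page_type() once per page and printing the counters at the end
gives a summary of the kind requested by --page-type-summary.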

*********************************************************************
[A] Example output format for --page-type-summary:
./bin/innochecksum --page-type-summary ./data/test/tab1.ibd

File::./data/test/tab1.ibd
================PAGE TYPE SUMMARY=====================
2	FIL_PAGE_INDEX
0	FIL_PAGE_UNDO_LOG
1	FIL_PAGE_INODE
0	FIL_PAGE_IBUF_FREE_LIST
2	FIL_PAGE_TYPE_ALLOCATED
1	FIL_PAGE_IBUF_BITMAP
0	FIL_PAGE_TYPE_SYS
0	FIL_PAGE_TYPE_TRX_SYS
1	FIL_PAGE_TYPE_FSP_HDR
0	FIL_PAGE_TYPE_XDES
0	FIL_PAGE_TYPE_BLOB
0	FIL_PAGE_TYPE_ZBLOB
0	other
undo type: 0 insert, 0 update, 0 other
undo state: 0 active, 0 cached, 0 to_free, 0 to_purge, 0 prepared, 0 other

**************************************************************************

[B] Example Output format for --page-type-dump
./bin/innochecksum --page-type-dump /tmp/a.txt ./data/test/tab1.ibd

cat /tmp/a.txt

Filename::./data/test/tab1.ibd
==============================================================================
        PAGE_NO |       PAGE_TYPE               |       EXTRA INFO
==============================================================================
#::       0     |       File Space Header       |       -       
#::       1     |       Insert Buffer Bitmap    |       -       
#::       2     |       Inode Page              |       -       
#::       3     |       Index page              |       index id=22, page level=0, No. of records=1, garbage=0
#::       4     |       Index page              |       index id=23, page level=0, No. of records=1, garbage=0
#::       5     |       Freshly allocated page  |       -       
#::       6     |       Freshly allocated page  |       -       

****************************************************************************

STEPS INVOLVED:

Phase 1 Implementation:
1) Open the data file, determine the logical and physical page size using
get_page_size(), and set the compressed flag if the page is compressed, as
determined by fsp_flags_get_zip_size().

2) If --no-check is not given, verify the checksum using
buf_page_is_corrupted() for both compressed and uncompressed pages.
Note: --no-check must be given along with the --write option; otherwise
terminate the program with a warning.
   
3) When the user specifies the --write option and no --allow-mismatches is
given:

The page checksums are converted sequentially, starting from page 0. If
checksum verification fails, program execution stops without checking or
rewriting the checksums of any subsequent pages, because the default value of
--allow-mismatches is 0 (meaning the program terminates on the first checksum
mismatch).
[a] Example: We have a file with 1000 pages. On page 600 there is a checksum
mismatch. Pages 0..599 are converted to the new checksum, and pages 600..999
are left as is.

[b] If --allow-mismatches=1:
Example:
If there is also a mismatch at page 700, the checksums of pages 0..599 and
601..699 are updated. The program then terminates at page 700 because the
mismatch count (2) exceeds --allow-mismatches, so pages 600 and 700..999 are
left as is.
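
Examples [a] and [b] can be reproduced with a small simulation of the rewrite
loop. rewrite_pages() is a hypothetical function written for this
illustration; it only counts pages and does no file I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Walks pages 0..n_pages-1 in order. corrupt[i] is true when page i fails
   checksum verification. A corrupt page is skipped and counted; the walk
   stops at the mismatch that exceeds allow_mismatches. Returns the number
   of pages whose checksum would have been rewritten. */
size_t rewrite_pages(const bool *corrupt, size_t n_pages,
                     unsigned long allow_mismatches)
{
  unsigned long mismatches = 0;
  size_t rewritten = 0;

  assert(corrupt != NULL);
  for (size_t i = 0; i < n_pages; i++) {
    if (corrupt[i]) {
      mismatches++;
      if (mismatches > allow_mismatches) {
        break; /* the (N+1)th mismatch: abort */
      }
      continue; /* tolerated mismatch: leave this page as is */
    }
    rewritten++;
  }
  return rewritten;
}
```

With 1000 pages and mismatches on pages 600 and 700, the function returns 600
for --allow-mismatches=0 (pages 0..599) and 699 for --allow-mismatches=1
(pages 0..599 and 601..699).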

Phase 2 Implementation:
a) Lock the .ibd data files while rewriting checksums, and use the file access
calls from os0file.cc.
b) Support multiple files (for both verification and checksum rewriting), as
innochecksum currently handles only one user-specified file.