WL#3570: Online backup: Default algorithm backup & restore (blocking)
Affects: Server-6.0
—
Status: Complete
The algorithm used if the engine does not provide a backup algorithm of its own. This is a blocking (i.e. not online) algorithm. The idea is to use handlerton methods to scan each table and get all its rows in binary format. These rows are stored in the backup image and, upon restore, are inserted into the tables using handlerton write methods. To avoid the issue of data changing while tables are scanned, all tables are blocked during the backup process.
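The image layout implied above (one stream of binary rows per table, replayed in order on restore) can be sketched with a small self-contained mock; `Row` and `Image` are invented names for illustration, not server classes:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of the backup image: one stream per table, each
// stream holding that table's rows in a single binary row format.
using Row = std::vector<uint8_t>;

struct Image {
  std::map<int, std::vector<Row>> streams;   // stream no -> row sequence

  // Backup side: append a row scanned from the table mapped to stream_no.
  void put(int stream_no, const Row &r) { streams[stream_no].push_back(r); }

  // Restore side: rows come back in the order they were stored, so a
  // table backed up in one pass can be restored in one pass.
  const std::vector<Row> &get(int stream_no) { return streams[stream_no]; }
};
```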
The default backup mechanism should be implemented as a regular backup engine providing backup and restore drivers and implementing the backup API. The backup image will consist of several streams corresponding to the tables. Each stream will be a sequence of table rows as returned by handlerton read methods.

Note: apparently there are two row formats used by storage engines, which differ in the way NULL field values are represented. For the backup image *only one format should be used*, so rows stored in a different format should be repacked accordingly. Mats knows about this issue.

Since in this version the default backup is blocking, it can pretend to be of the "at begin" type. Thus no data will be sent in the initial phase.

BACKUP API implementation
-------------------------

- init_size
  Should be zero (but can be anything, really).

- size
  One could use statistical info to estimate the number of rows in each table and thus approximate the size of the whole backup image (not in the first version, though).

- prepare
  The locking of the tables should be done here. No clear idea yet what is the best way of doing this.

- get_data
  The first call can return an empty block for stream 0, indicating the end of the initial phase. The following calls should return rows from the tables assigned to the corresponding streams.

- lock
  Do nothing.

- unlock
  Do nothing.

- cancel
  Cancel the process of creating the backup image: unlock tables, clean up engine state.

RESTORE API implementation
--------------------------

- prepare
  Do nothing (right now it is assumed that the backup kernel locks all tables to be restored).

- send_data
  Unpack table rows and insert them into the tables. One can assume that data blocks will be received in the same order in which they were sent when the backup image was created. Thus, if one decides to back up tables one by one, they can also be restored one by one.

- continue
  Not sure what needs to be done. After this call the engine should be ready for normal operation with all the newly inserted rows.

- cancel
  Interrupt the restore process. All rows already inserted into the tables will remain there.
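The send_data step of the restore driver can be sketched as follows. This is a self-contained mock under stated assumptions: `Mock_handler`, `Row` and `restore_send_data` are hypothetical stand-ins; the real driver would call `handler::ha_write_row()` on an opened TABLE:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a table row in the single backup row format.
using Row = std::vector<uint8_t>;

// Minimal mock of the part of the handler API the restore driver needs.
struct Mock_handler {
  std::vector<Row> rows;                 // the "table" contents
  int write_row(const Row &r) {          // mirrors the engine write method
    rows.push_back(r);
    return 0;                            // 0 == success
  }
};

// send_data(): unpack each row from a received data block and insert it.
// Blocks arrive in the order they were produced, so tables backed up
// one by one can be restored one by one.
int restore_send_data(Mock_handler &h, const std::vector<Row> &block) {
  for (const Row &r : block)
    if (int err = h.write_row(r))
      return err;                        // abort on first engine error
  return 0;
}
```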
Implement in:

namespace backup {

  // Default backup engine
  class Default_Engine: public Engine {...};

  // Backup driver for default engine
  class Default_Engine::Backup: public Engine::Backup {...};

  // Restore driver for default engine
  class Default_Engine::Restore: public Engine::Restore {...};

}

An example of how to open and scan a table using its handler can be found in sql_help.cc.

---------------------------------------

The following is a simplified segment of the default backup algorithm:

int default_backup(THD *thd, TABLE_LIST *t)
{
  t->table= open_ltable(thd, t, TL_READ_NO_INSERT);
  t->table->file->extra(HA_EXTRA_RETRIEVE_ALL_COLS);
  t->table->file->ha_rnd_init(1);
  while (!t->table->file->rnd_next(t->table->record[0]))
  {
    /* write record buffer to file */
  }
  t->table->file->ha_rnd_end();
  return 0;
}

This rough outline of the algorithm omits the usual locking and other error handling (it is there; I just omitted it for brevity). I am pretty sure all of the storage engines support these handlerton methods. The only one I am not sure about is NDB, but it appears to support them.
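The note above about repacking rows whose NULL representation differs from the backup image's single format can be illustrated with a self-contained sketch. Everything here is a hypothetical simplification: format A is assumed to store a leading NULL bitmap (bit i set when field i is NULL), and the assumed canonical backup format stores one 0/1 NULL marker byte before each field; field widths come from a schema vector. None of these names or layouts are the server's actual formats:

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>
#include <vector>

// Repack one row from a NULL-bitmap format into a per-field NULL-marker
// format. field_len[i] gives the fixed width of field i; NULL fields
// still occupy their slot in the source buffer and are copied verbatim.
std::vector<uint8_t> repack_row(const std::vector<uint8_t> &src,
                                const std::vector<size_t> &field_len) {
  size_t nfields = field_len.size();
  size_t bitmap_bytes = (nfields + 7) / 8;
  std::vector<uint8_t> dst;
  size_t pos = bitmap_bytes;             // field data follows the bitmap
  for (size_t i = 0; i < nfields; ++i) {
    bool is_null = src[i / 8] & (1u << (i % 8));
    dst.push_back(is_null ? 1 : 0);      // per-field NULL marker
    for (size_t j = 0; j < field_len[i]; ++j)
      dst.push_back(src[pos + j]);       // copy field bytes verbatim
    pos += field_len[i];
  }
  return dst;
}
```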
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.