WL#4618: RBR: extended table metadata in the binary log
Affects: Server-8.0
—
Status: Complete
EXECUTIVE SUMMARY ================= This worklog extends the table metadata that is written to the binary log in row based logging. The extended metadata serves two major purposes: * Slave needs the metadata to transfer data smoothly when its table structure is different from master's. * External software needs the metadata to decode row events and store the data into external databases(e.g. data warehouse). SCENARIOS ========= SCENARIO 1 ---------- * User Story As a DBA I want to setup MySQL to replicate successfully in ROW mode so that replication with master and replicas that have different character sets for the same column is possible. E.g.: master> CREATE TABLE t1 (a varchar(10) character set latin1); slave> CREATE TABLE t1 (a varchar(10) character set utf8); * Details MySQL currently supports statement based replication between tables with columns having slightly different data types. It includes columns with different character sets. If you create this table on master: CREATE TABLE t1 (a varchar(10) character set latin1); and this table on slave: CREATE TABLE t1 (a varchar(10) character set utf8); then statement based replication works fine. When slave re-parses a statement, it converts strings to utf8 from character_set_client of the event to utf8. Row based replication doesn't work because it operates with binary images. So the strings in binary log are represented in latin1 character set, and no conversion happens to utf8. Possible solution how to make it work: - Extend table map event to transfer character set ID together with other column meta data, like data type, length and so on. - If character set of the data in the image is not the same with character of the column, then apply character set conversion. SCENARIO 2 ---------- * User Story As a DBA I want to setup MySQL to replicate successfully in ROW mode so that replication with master and replicas that have different signedness for the same column is possible. E.g.: @@slave_type_conversions='ALL_NON_LOSSY' MASTER> CREATE TABLE t (c1 INT UNSIGNED); SLAVE> ALTER TABLE t CHANGE COLUMN c1 c1 BIGINT SIGNED; MASTER> INSERT INTO t(c1) values (3916586877); * Details If using different signed columns on master and slave, the slave may go out of sync because there is no information shipped with the table metadata regarding the signedness of the field. For example (if using @@slave_type_conversions='ALL_NON_LOSSY'): MASTER> CREATE TABLE t (c1 INT UNSIGNED); SLAVE> ALTER TABLE t CHANGE COLUMN c1 c1 BIGINT SIGNED; MASTER> INSERT INTO t(c1) values (3916586877); MASTER> SELECT * FROM t; ---> 3916586877 SLAVE> SELECT * FROM t; ---> -378380419 SCENARIO 3 ---------- * User Story As a developer I want a ROW based MySQL replication stream to contain all metadata so that I am able to write applications that mine and integrate the ROW based replication stream into third party tools or databases, without having to search for metadata elsewhere other than the binary log itself. * Details Some [3] have, for instance, created their own Table_metadata_log_event, which contains information on SIGNEDNESS, column names, column comments, etc. When decoding RBR events, the external programs need to know the metadata of them(e.g. ENUM string values, Column Names, PK etc.). It could get the metadata by querying the MySQL Server which generated the binary log. But that is invalid if the data structure is already updated. So it is better to log the metadata. SCENARIO 4 ---------- * User Story As a MySQL Replication developer, I want to be able to leverage the new metadata so that the ROW based replication applier is able to cope with different schemas on master and replicas. This enables is a stepping stone to better support rolling schemas upgrades when replicating using ROW based logs. * Details There is a online DDL solution, which first changes slave's data structure and then change master's data structure. So the table on slave may have more or less columns, different column types and different column orders. If required metadata is logged, then slave can support the online DDL solution better. However, making the applier aware of the additional metadata when applying ROW events is out of this worklog scope.
FUNCTIONAL REQUIREMENTS ======================= F1. SIGNED/UNSIGNED flags for numbers The binary log SHALL contain one flag per numerical column type stating whether it is signed or unsigned. The following types SHALL have an associated flag: - TINYINT/SMALLINT/MEDIUMINT/INT/BIGINT - DECIMAL/FLOAT/DOUBLE - TINYINT/SMALLINT/MEDIUMINT/INT/BIGINT - DECIMAL/FLOAT/DOUBLE F2. CHARACTER SET The binary log SHALL contain character set information of character columns. The following types SHALL have character set information. CHAR/VARCHAR/TEXT Note: JSON is a character type. But the requirement doesn't include JSON type, since it just uses utf8mb4. F3. Column Name The binary log SHALL contain column names of tables. F4. String value of SET The binary log SHALL contain the string values of SET columns. F5. String value of ENUM The binary log SHALL contain the string values of ENUM columns. F6. PRIMARY KEY The binary log SHALL contain primary key information of tables. Primary key information includes: - Fields in defined order - Prefix length used in the index defination. F7. SQL types of every columns The binary log SHALL contain information to describe SQL types of columns. The information SHALL be able describe the exact SQL types of columns. NOTE: Some SQL types are just aliases of other SQL types. Server just store them as the real SQL types. So the requirement doesn't include information of alias SQL types. Below are alias SQL types: - BOOL is just an alias of TINYINT. - NUMERIC is just an alias of DECIMAL. NON-FUNCTIONAL REQUIREMENTS =========================== NF1. Cross-version replication OLD -> NEW MUST work. OLD here means (5.5, 5.6, 5.7) and NEW means (5.8). NF2. There SHALL not be any new restrictions regarding replicating form NEW -> OLD due to this worklog, even when the new metadata is replicated. OLD here means (5.5, 5.6, 5.7) and NEW means (5.8). NF3. The design should support matadata expending in future. BACKGROUND ========== Below is some background information you may be interested. But you don't need to review it. Libing, here are a few notes I (Luis) have on the requirements: Specific requirements from one of our users: - Column metadata - length - sql type - scale - # of decimal digits in fractional part of decimal data type - character set - PK, UK, non-null UK, ... - column name - sql type name - column comment - table metadata - table identifier - # of columns Specific requirements from another of our users: - column metadata: High prio: - nullable or not - PK or not - signed/unsigned - sql type - max character lengths - max octect length - character set - column name Less prio: - enumeration string names (not just the native values - integers) - unique index or not Other requirements for ROLLING SCHEMA CHANGES and RBR: - Apply rows relying on column names rather than order of columns in the event - lift the requirement that columns are added or removed at the end of the table. - Fire triggers on the slave when applying RBR (out of this worklog scope). Schema change is roughly something like this: - create new table copy of original table on the slave - create log table on the slave - install triggers on original table - copy original table to new table - fire triggers on original table to log changes to log table - !! RBR does not fire triggers, SBR does though !! - apply log once copy original -> new is done - rename original to original_obsolete, new to original - drop original_obsolete and log (Sometimes a log table is not needed and triggers are fired to the new table directly. This can be done on the master or on the slaves. On the master takes capacity from the writes and generates a lot of IO.)
HIGH LEVEL SPECIFICATION ======================== - Extending Table_map_log_event Table_map_log_event already includes part of the metadata required by the requirements, so this design will extend Table_map_log_event to contain all the metadata described in the requirements. - Some metadata should be optional Some sorts of metadata(e.g. column names) which occupy farely big storage are not reqired in many application environments. They are designed as optional metadata. And users can decide the metadata binlog behavior through an system variable. - Metadata for requirements "F7. SQL types of every columns" and "F2. Character set" Currently, Table_map_log_event contains a binlog_type for each column. Most of them are one-to-one correspond to SQL types. Below is the map between binlog_type and SQL type: Binlogged Type SQL Type ---------------- ------------ MYSQL_TYPE_TINY TINYINT/BOOL MYSQL_TYPE_SHORT SMALLINT MYSQL_TYPE_INT24 MEDIUMINT MYSQL_TYPE_LONG INT MYSQL_TYPE_LONGLONG BIGINT MYSQL_TYPE_NEWDECIMAL DECIMAL/NUMERIC MYSQL_TYPE_FLOAT FLOAT MYSQL_TYPE_DOUBLE DOUBLE MYSQL_TYPE_BIT BIT MYSQL_TYPE_YEAR YEAR MYSQL_TYPE_NEWDATE DATE MYSQL_TYPE_TIME2 TIME MYSQL_TYPE_DATETIME2 DATETIME MYSQL_TYPE_TIMESTAMP2 TIMESTAMP MYSQL_TYPE_STRING CHAR/BINARY/SET/ENUM MYSQL_TYPE_VARCHAR VARCHAR/BINARY MYSQL_TYPE_BLOB TINYBLOB/BLOB/MEDIUMBLOB/LOGBLOB/ TINYTEXT/TEXT/MEDIUMTEXT/LONGTEXT MYSQL_TYPE_GEOMETRY GEOMETRY MYSQL_TYPE_JSON JSON The binlog types of SET and ENUM are MYSQL_TYPE_STRING, and their real type(MYSQL_TYPE_SET, MYSQL_TYPE_ENUM) is logged in metadata field. So they are determinable. With current Table_map_log_event metadata, CHAR and BINARY are not distinguishable, VARCHAR and VARBINARY are not distinguishable. BLOB and TEST are not distinguishable. Since the only different between CHAR/VARCHAR/TEXT and BINARY/VARCHAR/BLOB are character set. This design will use character set to distinguish them. "Requirement F2. Character set" just requires character set for CHAR/VARCHAR/TEXT. But in this design, Table_map_log_event will also contain character set of BINARY/VARBINARY/BLOB columns. All geometry types are binlogged as MYSQL_TYPE_GEOMETRY, the real types are not binlogged. we should binlog them. Real Geometry types includes: - GEOMETRY - POINT - LINESTRING - PONYGON - MULTIPOINT - MULTILINESTRING - MULTIPONYGON - GEOMETRYCOLLETION INTERFACE SPECIFICATION ======================= I1 - A New System Variable Variable Name : binlog_row_metadata Scope: GLOBAL Type: ENUM Dynamic: Yes Values: MINIMAL includes SIGNED/UNSIGED flags, column character set and geometry type metadata. FULL All metadata is included. Default: MINIMAL By default mimial metadata is logged. In MINIAL mode, only SIGNED flags , column character set and geometry type are logged. To change the behavior permanentely, users need to update configuration file. I2 - Table_map_log_event Format +------------------------------------------------+ | | | Original fields of Table_map_log_event | | | +------------------------------------------------+ | New field | ... | New field | +------------------------------------------------+ | New field | ... | New field | +------------------------------------------------+ Original fields are the fields implemented before this worklog, they are not changed. All new metadata is implemented as new fields and appended at the end of original Table_map_log_event fields. * Cross Version Compatibility It guarantees no cross version compatibility problem. OLD - the server before this worklog NEW - the server includes this worklog OLD --> NEW NEW server can read the Table_map_log_event generated by OLD server. NEW --> OLD OLD server just read the original fields. New fields will be ignored. OLD server already has the logic, so we don't need to patch it. - Format of New Fields All new field use below format: +------+----------+---------+ | Type | Length | Value | +------+----------+---------+ Type: 1 Byte, unsiged integer Length: 1-4 Bytes, packed length format. Value: It varies agains Type. Type Values: 1 - UNSIGNED flag of numeric columns 2 - Default character set of string columns 3 - Character set of string columns 4 - Column name 5 - String value of SET columns 6 - String value of ENUM columns 7 - Primary key without prefix 8 - Primary key with prefix 9 - Geometry type of geometry columns 2 and 3 don't appear together. They are just two ways to pack character set information. When binlogging, it just log character set in the way which occupy less storage. 7 and 8 don't appear together. 7 is for the primary keys which just use some prefix of columns. 8 is for the primary keys which use complete columns. * Order of Fields The order of new fields are not defined. So the new fields can be logged in any order. E.g. +---------------+----------------------+ | Character set | SIGNED/UNSIGNED flag | +---------------+----------------------+ or +----------------------+---------------+ | SIGNED/UNSIGNED flag | Character set | +----------------------+---------------+ - SIGNED/UNSIGNED Flag +-------------------------+ | Flags | +-------------------------+ Flags: (Numeric Column Count+7)/8) Bytes. Each bit describes one numeric column's SIGNED/UNSIGNED flag. 0 - signed 1 - unsigned - Default character set of string columns +--------------+---------+--------------+---------+----------+-------------+ | Default | Column1 | Column1 | ... | ColumnN | ColumnN | | Collation ID | Index | Collation ID | | Index ID | Collation ID| +--------------+---------+--------------+---------+----------+-------------+ Collation ID: 1-4 Bytes, packed length format. Column Index: 1-2 Bytes, packed length format. This field is logged when most of the columns have same character sets. - Character set of string columns +--------------+---------+--------------+ | Column1 | ... | ColumnN | | Collation ID | ... | Collation ID | +--------------+---------+--------------+ Collation ID: 1-4 Bytes, packed length format. This field is logged when most of the columns have different character sets. - Column Name +---------+---------+--------+---------+---------+ | Column1 | Column1 | ... | ColumnN | ColumnN | | Length | Name | | Length | Name | +---------+---------+--------+---------+---------+ Length : 1 Byte. unsiged integer Name : Length Bytes. String without '\0' - String Value of SET Columns - String Value of ENUM Columns +---------+---------+--------+---------+---------+ | Column1 | Column1 | ... | ColumnN | ColumnN | | Length | Values | | Length | Name | +---------+---------+--------+---------+---------+ Length : 1-4 Bytes. packed length Values : Length Bytes. Value's internal structure is: +--------+--------+---------+----------------+ | Value1 | Value1 | ... | ValueN | ValueN | | Length | | | Length | | +--------+--------+---------+----------------+ Length: 1-4 Bytes. packed length. Value : String value without "\n" - Primary Index Without Prefix +--------------+--------+----------------+ | Column1 Index | ... | ColumnN Index | +--------------+--------+----------------+ Column Index : 1-3 Bytes. Packed length format of column index 0 < Column Index < Column Count - Primary Index With Prefix +---------+---------+---------+---------+---------+ | Column1 | Column1 | ... | ColumnN | ColumnN | | Index | Prefix | | Index | Prefix | +---------+---------+---------+---------+---------+ Column Index : 1-3 Bytes. Packed length format of column index 0 < Column Index < Column Count Column Prefix: 1-3 Bytes. Packed length format of column prefix length. 0 mean no prefix. - Geometry Types +---------------+--------+----------------+ | Geometry Type | ... | Geometry Type | +---------------+--------+----------------+ Geometry Type: 1-3 Bytes. Packed length format of geometry type. I3 - Prints the extra metadata through mysqlbinlog In this worklog, the feature is just for testing purpose. * Option --print-table-metadata Type BOOL, values: ON, OFF. * Output Format It has two parts: column information and primary key information. Each of them will be printed as a SQL single line comment. Column information looks like: # Columns(c1 INT, c2 MEDIUMINT UNSIGNED, ...) if column names are not binloged, colum names will not be printed. # Columns( INT, MEDIUMINT UNSIGNED, ...) Character is printed in one of the below format: - Every character column is followed by a character set # Columns(c1 CHAR(10) CHARSET latin1 COLLATION latin1_swedish_ci, c2 TEXT CHARSET utf8 COLLATE utf8_unicode_520_binary) - Charset columns may or may not followed by a charset set and there is a default charset for the columns without charset. # Columns(c1 CHAR(10), c2 TEXT CHARSET utf8 COLLATE utf8_unicode_520_binary) DEFAULT CHARSET latin1 COLLATE latin1_swedish_ci NOTE: if a character set has binary charset it will be printed as BINARY/VARBINARY/TINYBLOB/MEDIUMBLOB/BLOB/LONGBLOB types. Primary Key is printed in below format: # PRIMARY KEY(c1, c3(10), c2)
No framework changes in this worklog, since it just extends Table_map_log_event. So the LLD is pretty simple. - Table_map_log_event::write_data_body() Expands it to binlog the new metadata field. - Table_map_log_event::print() Expands it to print metadata detail for mysqlbinlog.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.