WL#4618: RBR: extended table metadata in the binary log
Affects: Server-8.0
—
Status: Complete
EXECUTIVE SUMMARY
=================
This worklog extends the table metadata that is written to the binary
log in row based logging. The extended metadata serves two major purposes:
* Slave needs the metadata to transfer data smoothly when its table structure
is different from master's.
* External software needs the metadata to decode row events and store
the data into external databases(e.g. data warehouse).
SCENARIOS
=========
SCENARIO 1
----------
* User Story
As a DBA I want to setup MySQL to replicate successfully in ROW
mode so that replication with master and replicas that have
different character sets for the same column is possible. E.g.:
master> CREATE TABLE t1 (a varchar(10) character set latin1);
slave> CREATE TABLE t1 (a varchar(10) character set utf8);
* Details
MySQL currently supports statement based replication between
tables with columns having slightly different data types.
It includes columns with different character sets.
If you create this table on master:
CREATE TABLE t1 (a varchar(10) character set latin1);
and this table on slave:
CREATE TABLE t1 (a varchar(10) character set utf8);
then statement based replication works fine.
When slave re-parses a statement, it converts strings to utf8
from character_set_client of the event to utf8.
Row based replication doesn't work because
it operates with binary images. So the strings in
binary log are represented in latin1 character set,
and no conversion happens to utf8.
Possible solution how to make it work:
- Extend table map event to transfer character set ID together
with other column meta data, like data type, length and so on.
- If character set of the data in the image is not the same with
character of the column, then apply character set conversion.
SCENARIO 2
----------
* User Story
As a DBA I want to setup MySQL to replicate successfully in ROW
mode so that replication with master and replicas that have
different signedness for the same column is possible. E.g.:
@@slave_type_conversions='ALL_NON_LOSSY'
MASTER> CREATE TABLE t (c1 INT UNSIGNED);
SLAVE> ALTER TABLE t CHANGE COLUMN c1 c1 BIGINT SIGNED;
MASTER> INSERT INTO t(c1) values (3916586877);
* Details
If using different signed columns on master and slave, the slave
may go out of sync because there is no information shipped with
the table metadata regarding the signedness of the field.
For example (if using @@slave_type_conversions='ALL_NON_LOSSY'):
MASTER> CREATE TABLE t (c1 INT UNSIGNED);
SLAVE> ALTER TABLE t CHANGE COLUMN c1 c1 BIGINT SIGNED;
MASTER> INSERT INTO t(c1) values (3916586877);
MASTER> SELECT * FROM t; ---> 3916586877
SLAVE> SELECT * FROM t; ---> -378380419
SCENARIO 3
----------
* User Story
As a developer I want a ROW based MySQL replication stream to
contain all metadata so that I am able to write applications that
mine and integrate the ROW based replication stream into third
party tools or databases, without having to search for metadata
elsewhere other than the binary log itself.
* Details
Some [3] have, for instance, created their own Table_metadata_log_event,
which contains information on SIGNEDNESS, column names, column comments,
etc.
When decoding RBR events, the external programs need to know the metadata of
them(e.g. ENUM string values, Column Names, PK etc.). It could get the
metadata by querying the MySQL Server which generated the binary log. But
that is invalid if the data structure is already updated. So it is better to
log the metadata.
SCENARIO 4
----------
* User Story
As a MySQL Replication developer, I want to be able to leverage
the new metadata so that the ROW based replication applier is
able to cope with different schemas on master and replicas. This
enables is a stepping stone to better support rolling schemas
upgrades when replicating using ROW based logs.
* Details
There is a online DDL solution, which first changes slave's data structure
and then change master's data structure. So the table on slave may have
more or less columns, different column types and different column orders.
If required metadata is logged, then slave can support the online DDL
solution better. However, making the applier aware of the additional
metadata when applying ROW events is out of this worklog scope.
FUNCTIONAL REQUIREMENTS
=======================
F1. SIGNED/UNSIGNED flags for numbers
The binary log SHALL contain one flag per numerical column
type stating whether it is signed or unsigned.
The following types SHALL have an associated flag:
- TINYINT/SMALLINT/MEDIUMINT/INT/BIGINT
- DECIMAL/FLOAT/DOUBLE
- TINYINT/SMALLINT/MEDIUMINT/INT/BIGINT
- DECIMAL/FLOAT/DOUBLE
F2. CHARACTER SET
The binary log SHALL contain character set information of character
columns.
The following types SHALL have character set information.
CHAR/VARCHAR/TEXT
Note: JSON is a character type. But the requirement doesn't
include JSON type, since it just uses utf8mb4.
F3. Column Name
The binary log SHALL contain column names of tables.
F4. String value of SET
The binary log SHALL contain the string values of SET columns.
F5. String value of ENUM
The binary log SHALL contain the string values of ENUM columns.
F6. PRIMARY KEY
The binary log SHALL contain primary key information of tables.
Primary key information includes:
- Fields in defined order
- Prefix length used in the index defination.
F7. SQL types of every columns
The binary log SHALL contain information to describe SQL
types of columns. The information SHALL be able describe the exact
SQL types of columns.
NOTE: Some SQL types are just aliases of other SQL types. Server just
store them as the real SQL types. So the requirement doesn't
include information of alias SQL types.
Below are alias SQL types:
- BOOL is just an alias of TINYINT.
- NUMERIC is just an alias of DECIMAL.
NON-FUNCTIONAL REQUIREMENTS
===========================
NF1. Cross-version replication OLD -> NEW MUST work. OLD here means
(5.5, 5.6, 5.7) and NEW means (5.8).
NF2. There SHALL not be any new restrictions regarding replicating
form NEW -> OLD due to this worklog, even when the new
metadata is replicated. OLD here means (5.5, 5.6, 5.7) and
NEW means (5.8).
NF3. The design should support matadata expending in future.
BACKGROUND
==========
Below is some background information you may be interested. But you don't need
to review it.
Libing, here are a few notes I (Luis) have on the requirements:
Specific requirements from one of our users:
- Column metadata
- length
- sql type
- scale
- # of decimal digits in fractional part of decimal data type
- character set
- PK, UK, non-null UK, ...
- column name
- sql type name
- column comment
- table metadata
- table identifier
- # of columns
Specific requirements from another of our users:
- column metadata:
High prio:
- nullable or not
- PK or not
- signed/unsigned
- sql type
- max character lengths
- max octect length
- character set
- column name
Less prio:
- enumeration string names (not just the native values - integers)
- unique index or not
Other requirements for ROLLING SCHEMA CHANGES and RBR:
- Apply rows relying on column names rather than order of columns in
the event - lift the requirement that columns are added or removed
at the end of the table.
- Fire triggers on the slave when applying RBR (out of this worklog
scope).
Schema change is roughly something like this:
- create new table copy of original table on the slave
- create log table on the slave
- install triggers on original table
- copy original table to new table
- fire triggers on original table to log changes to log table
- !! RBR does not fire triggers, SBR does though !!
- apply log once copy original -> new is done
- rename original to original_obsolete, new to original
- drop original_obsolete and log
(Sometimes a log table is not needed and triggers are fired to the
new table directly. This can be done on the master or on the
slaves. On the master takes capacity from the writes and generates a
lot of IO.)
HIGH LEVEL SPECIFICATION
========================
- Extending Table_map_log_event
Table_map_log_event already includes part of the metadata required by
the requirements, so this design will extend Table_map_log_event to
contain all the metadata described in the requirements.
- Some metadata should be optional
Some sorts of metadata(e.g. column names) which occupy farely big
storage are not reqired in many application environments.
They are designed as optional metadata. And users can decide
the metadata binlog behavior through an system variable.
- Metadata for requirements "F7. SQL types of every columns" and
"F2. Character set"
Currently, Table_map_log_event contains a binlog_type for each
column. Most of them are one-to-one correspond to SQL types.
Below is the map between binlog_type and SQL type:
Binlogged Type SQL Type
---------------- ------------
MYSQL_TYPE_TINY TINYINT/BOOL
MYSQL_TYPE_SHORT SMALLINT
MYSQL_TYPE_INT24 MEDIUMINT
MYSQL_TYPE_LONG INT
MYSQL_TYPE_LONGLONG BIGINT
MYSQL_TYPE_NEWDECIMAL DECIMAL/NUMERIC
MYSQL_TYPE_FLOAT FLOAT
MYSQL_TYPE_DOUBLE DOUBLE
MYSQL_TYPE_BIT BIT
MYSQL_TYPE_YEAR YEAR
MYSQL_TYPE_NEWDATE DATE
MYSQL_TYPE_TIME2 TIME
MYSQL_TYPE_DATETIME2 DATETIME
MYSQL_TYPE_TIMESTAMP2 TIMESTAMP
MYSQL_TYPE_STRING CHAR/BINARY/SET/ENUM
MYSQL_TYPE_VARCHAR VARCHAR/BINARY
MYSQL_TYPE_BLOB TINYBLOB/BLOB/MEDIUMBLOB/LOGBLOB/
TINYTEXT/TEXT/MEDIUMTEXT/LONGTEXT
MYSQL_TYPE_GEOMETRY GEOMETRY
MYSQL_TYPE_JSON JSON
The binlog types of SET and ENUM are MYSQL_TYPE_STRING, and their
real type(MYSQL_TYPE_SET, MYSQL_TYPE_ENUM) is logged in metadata
field. So they are determinable.
With current Table_map_log_event metadata, CHAR and BINARY are not
distinguishable, VARCHAR and VARBINARY are not distinguishable.
BLOB and TEST are not distinguishable.
Since the only different between CHAR/VARCHAR/TEXT and
BINARY/VARCHAR/BLOB are character set. This design will use
character set to distinguish them.
"Requirement F2. Character set" just requires character set
for CHAR/VARCHAR/TEXT. But in this design, Table_map_log_event
will also contain character set of BINARY/VARBINARY/BLOB columns.
All geometry types are binlogged as MYSQL_TYPE_GEOMETRY, the real types
are not binlogged. we should binlog them. Real Geometry types includes:
- GEOMETRY
- POINT
- LINESTRING
- PONYGON
- MULTIPOINT
- MULTILINESTRING
- MULTIPONYGON
- GEOMETRYCOLLETION
INTERFACE SPECIFICATION
=======================
I1 - A New System Variable
Variable Name : binlog_row_metadata
Scope: GLOBAL
Type: ENUM
Dynamic: Yes
Values:
MINIMAL includes SIGNED/UNSIGED flags, column character set
and geometry type metadata.
FULL All metadata is included.
Default: MINIMAL
By default mimial metadata is logged. In MINIAL mode, only SIGNED flags
, column character set and geometry type are logged.
To change the behavior permanentely, users need to update
configuration file.
I2 - Table_map_log_event Format
+------------------------------------------------+
| |
| Original fields of Table_map_log_event |
| |
+------------------------------------------------+
| New field | ... | New field |
+------------------------------------------------+
| New field | ... | New field |
+------------------------------------------------+
Original fields are the fields implemented before this worklog, they are
not changed. All new metadata is implemented as new fields and appended at
the end of original Table_map_log_event fields.
* Cross Version Compatibility
It guarantees no cross version compatibility problem.
OLD - the server before this worklog
NEW - the server includes this worklog
OLD --> NEW
NEW server can read the Table_map_log_event generated by OLD server.
NEW --> OLD
OLD server just read the original fields. New fields will be ignored.
OLD server already has the logic, so we don't need to patch it.
- Format of New Fields
All new field use below format:
+------+----------+---------+
| Type | Length | Value |
+------+----------+---------+
Type: 1 Byte, unsiged integer
Length: 1-4 Bytes, packed length format.
Value: It varies agains Type.
Type Values:
1 - UNSIGNED flag of numeric columns
2 - Default character set of string columns
3 - Character set of string columns
4 - Column name
5 - String value of SET columns
6 - String value of ENUM columns
7 - Primary key without prefix
8 - Primary key with prefix
9 - Geometry type of geometry columns
2 and 3 don't appear together. They are just two ways to pack character set
information. When binlogging, it just log character set in the way which
occupy less storage.
7 and 8 don't appear together. 7 is for the primary keys which just use
some prefix of columns. 8 is for the primary keys which use complete
columns.
* Order of Fields
The order of new fields are not defined. So the new fields can be logged
in any order. E.g.
+---------------+----------------------+
| Character set | SIGNED/UNSIGNED flag |
+---------------+----------------------+
or
+----------------------+---------------+
| SIGNED/UNSIGNED flag | Character set |
+----------------------+---------------+
- SIGNED/UNSIGNED Flag
+-------------------------+
| Flags |
+-------------------------+
Flags: (Numeric Column Count+7)/8) Bytes.
Each bit describes one numeric column's SIGNED/UNSIGNED flag.
0 - signed
1 - unsigned
- Default character set of string columns
+--------------+---------+--------------+---------+----------+-------------+
| Default | Column1 | Column1 | ... | ColumnN | ColumnN |
| Collation ID | Index | Collation ID | | Index ID | Collation ID|
+--------------+---------+--------------+---------+----------+-------------+
Collation ID: 1-4 Bytes, packed length format.
Column Index: 1-2 Bytes, packed length format.
This field is logged when most of the columns have same character sets.
- Character set of string columns
+--------------+---------+--------------+
| Column1 | ... | ColumnN |
| Collation ID | ... | Collation ID |
+--------------+---------+--------------+
Collation ID: 1-4 Bytes, packed length format.
This field is logged when most of the columns have different character sets.
- Column Name
+---------+---------+--------+---------+---------+
| Column1 | Column1 | ... | ColumnN | ColumnN |
| Length | Name | | Length | Name |
+---------+---------+--------+---------+---------+
Length : 1 Byte. unsiged integer
Name : Length Bytes. String without '\0'
- String Value of SET Columns
- String Value of ENUM Columns
+---------+---------+--------+---------+---------+
| Column1 | Column1 | ... | ColumnN | ColumnN |
| Length | Values | | Length | Name |
+---------+---------+--------+---------+---------+
Length : 1-4 Bytes. packed length
Values : Length Bytes. Value's internal structure is:
+--------+--------+---------+----------------+
| Value1 | Value1 | ... | ValueN | ValueN |
| Length | | | Length | |
+--------+--------+---------+----------------+
Length: 1-4 Bytes. packed length.
Value : String value without "\n"
- Primary Index Without Prefix
+--------------+--------+----------------+
| Column1 Index | ... | ColumnN Index |
+--------------+--------+----------------+
Column Index : 1-3 Bytes. Packed length format of column index
0 < Column Index < Column Count
- Primary Index With Prefix
+---------+---------+---------+---------+---------+
| Column1 | Column1 | ... | ColumnN | ColumnN |
| Index | Prefix | | Index | Prefix |
+---------+---------+---------+---------+---------+
Column Index : 1-3 Bytes. Packed length format of column index
0 < Column Index < Column Count
Column Prefix: 1-3 Bytes. Packed length format of column prefix length.
0 mean no prefix.
- Geometry Types
+---------------+--------+----------------+
| Geometry Type | ... | Geometry Type |
+---------------+--------+----------------+
Geometry Type: 1-3 Bytes. Packed length format of geometry type.
I3 - Prints the extra metadata through mysqlbinlog
In this worklog, the feature is just for testing purpose.
* Option --print-table-metadata
Type BOOL, values: ON, OFF.
* Output Format
It has two parts: column information and primary key information.
Each of them will be printed as a SQL single line comment.
Column information looks like:
# Columns(c1 INT, c2 MEDIUMINT UNSIGNED, ...)
if column names are not binloged, colum names will not be printed.
# Columns( INT, MEDIUMINT UNSIGNED, ...)
Character is printed in one of the below format:
- Every character column is followed by a character set
# Columns(c1 CHAR(10) CHARSET latin1 COLLATION latin1_swedish_ci,
c2 TEXT CHARSET utf8 COLLATE utf8_unicode_520_binary)
- Charset columns may or may not followed by a charset set and
there is a default charset for the columns without charset.
# Columns(c1 CHAR(10), c2 TEXT CHARSET utf8 COLLATE
utf8_unicode_520_binary) DEFAULT CHARSET latin1 COLLATE
latin1_swedish_ci
NOTE: if a character set has binary charset it will be printed as
BINARY/VARBINARY/TINYBLOB/MEDIUMBLOB/BLOB/LONGBLOB types.
Primary Key is printed in below format:
# PRIMARY KEY(c1, c3(10), c2)
No framework changes in this worklog, since it just extends Table_map_log_event. So the LLD is pretty simple. - Table_map_log_event::write_data_body() Expands it to binlog the new metadata field. - Table_map_log_event::print() Expands it to print metadata detail for mysqlbinlog.
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.