Documentation Home
MySQL 5.7 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 35.6Mb
PDF (A4) - 35.6Mb
PDF (RPM) - 34.6Mb
EPUB - 8.7Mb
HTML Download (TGZ) - 8.4Mb
HTML Download (Zip) - 8.5Mb
HTML Download (RPM) - 7.3Mb
Eclipse Doc Plugin (TGZ) - 9.3Mb
Eclipse Doc Plugin (Zip) - 11.4Mb
Man Pages (TGZ) - 202.2Kb
Man Pages (Zip) - 307.5Kb
Info (Gzip) - 3.3Mb
Info (Zip) - 3.3Mb
Excerpts from this Manual

MySQL 5.7 Reference Manual  /  ...  /  The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding) The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)

The character set named utf8 uses a maximum of three bytes per character and contains only BMP characters. The utf8mb4 character set uses a maximum of four bytes per character supports supplemental characters:

  • For a BMP character, utf8 and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8 cannot store the character at all, whereas utf8mb4 requires four bytes to store it. Because utf8 cannot store the character at all, you have no supplementary characters in utf8 columns and need not worry about converting characters or losing data when upgrading utf8 data from older versions of MySQL.

utf8mb4 is a superset of utf8, so for an operation such as the following concatenation, the result has character set utf8mb4 and the collation of utf8mb4_col:

SELECT CONCAT(utf8_col, utf8mb4_col);

Similarly, the following comparison in the WHERE clause works according to the collation of utf8mb4_col:

SELECT * FROM utf8_tbl, utf8mb4_tbl
WHERE utf8_tbl.utf8_col = utf8mb4_tbl.utf8mb4_col;

Tip: To save space with utf8mb4, use VARCHAR instead of CHAR. Otherwise, MySQL must reserve four bytes for each character in a CHAR CHARACTER SET utf8mb4 column because that is the maximum possible length. For example, MySQL must reserve 40 bytes for a CHAR(10) CHARACTER SET utf8mb4 column.

User Comments
Sign Up Login You must be logged in to post a comment.