The utf8mb3
character set has these
characteristics:
Supports BMP characters only (no support for supplementary characters)
Requires a maximum of three bytes per multibyte character.
Applications that use UTF-8 data but require supplementary
character support should use utf8mb4
rather
than utf8mb3
(see
Section 1.9.1, “The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)”).
Exactly the same set of characters is available in
utf8mb3
and ucs2
. That is,
they have the same
repertoire.
utf8
is an alias for
utf8mb3
; the character limit is implicit,
rather than explicit in the name.
utf8mb3
can be used in CHARACTER
SET
clauses, and
utf8mb3_
in collation_substring
COLLATE
clauses, where
collation_substring
is
bin
, czech_ci
,
danish_ci
, esperanto_ci
,
estonian_ci
, and so forth. For example:
CREATE TABLE t (s1 CHAR(1)) CHARACTER SET utf8mb3;
SELECT * FROM t WHERE s1 COLLATE utf8mb3_general_ci = 'x';
DECLARE x VARCHAR(5) CHARACTER SET utf8mb3 COLLATE utf8mb3_danish_ci;
SELECT CAST('a' AS CHAR CHARACTER SET utf8) COLLATE utf8_czech_ci;
MySQL immediately converts instances of
utf8mb3
in statements to
utf8
, so in statements such as SHOW
CREATE TABLE
or SELECT CHARACTER_SET_NAME
FROM INFORMATION_SCHEMA.COLUMNS
or SELECT
COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS
, users
see the name utf8
or
utf8_
.
collation_substring
utf8mb3
is also valid in contexts other than
CHARACTER SET
clauses. For example:
mysqld --character-set-server=utf8mb3
SET NAMES 'utf8mb3'; /* and other SET statements that have similar effect */
SELECT _utf8mb3 'a';
For information about data type storage as it relates to multibyte character sets, see String Type Storage Requirements.