Every character string literal has a character set and a collation.
A character string literal may have an optional character set
COLLATE clause, to designate
it as a string that uses a particular character set and
Character set introducers and the
clause are implemented according to standard SQL
SELECT 'abc'; SELECT _latin1'abc'; SELECT _binary'abc'; SELECT _utf8'abc' COLLATE utf8_danish_ci;
expression is formally called an
introducer. It tells the parser,
“the string that follows uses character set
charset_name.” An introducer
does not change the string to the introducer character set
CONVERT() would do. It
does not change the string value, although padding may occur.
The introducer is just a signal.
MySQL determines the character set and collation of a character string literal in the following manner:
COLLATEare specified, character set
collation_namemust be a permitted collation for
_charset_nameis specified but
COLLATEis not specified, character set
charset_nameand its default collation are used. To see the default collation for each character set, use the
SHOW CHARACTER SETstatement.
_charset_nameis not specified but
COLLATEis specified, the connection default character set given by the
character_set_connectionsystem variable and collation
collation_namemust be a permitted collation for the connection default character set.
COLLATEis specified), the connection default character set and collation given by the
collation_connectionsystem variables are used.
A nonbinary string with
latin1character set and
SELECT _latin1'Müller' COLLATE latin1_german1_ci;
A nonbinary string with
utf8character set and its default collation (that is,
A binary string with
binarycharacter set and its default collation (that is,
A nonbinary string with the connection default character set and
utf8_general_cicollation (fails if the connection character set is not
SELECT 'Müller' COLLATE utf8_general_ci;
A string with the connection default character set and collation:
An introducer indicates the character set for the following
string, but does not change how the parser performs escape
processing within the string. Escapes are always interpreted
by the parser according to the character set given by
The following examples show that escape processing occurs
in the presence of an introducer. The examples use
SET NAMES (which changes
discussed in Section 11.1.4, “Connection Character Sets and Collations”), and
display the resulting strings using the
HEX() function so that the
exact string contents can be seen.
SET NAMES latin1;mysql>
SELECT HEX('à\n'), HEX(_sjis'à\n');+------------+-----------------+ | HEX('à\n') | HEX(_sjis'à\n') | +------------+-----------------+ | E00A | E00A | +------------+-----------------+
à (hexadecimal value
E0) is followed by
the escape sequence for newline. The escape sequence is
interpreted using the
latin1 to produce a literal
newline (hexadecimal value
happens even for the second string. That is, the
_sjis introducer does not affect the
parser's escape processing.
SET NAMES sjis;mysql>
SELECT HEX('à\n'), HEX(_latin1'à\n');+------------+-------------------+ | HEX('à\n') | HEX(_latin1'à\n') | +------------+-------------------+ | E05C6E | E05C6E | +------------+-------------------+
sjis, a character set in which the sequence
à followed by
\ (hexadecimal values
5C) is a valid multibyte character.
Hence, the first two bytes of the string are interpreted as a
sjis character, and the
\ is not interpreted as an escape
character. The following
6E) is not interpreted as part of an
escape sequence. This is true even for the second string; the
_latin1 introducer does not affect escape