Documentation Home
MySQL 5.6 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 31.4Mb
PDF (A4) - 31.5Mb
PDF (RPM) - 30.0Mb
EPUB - 7.7Mb
HTML Download (TGZ) - 7.5Mb
HTML Download (Zip) - 7.5Mb
HTML Download (RPM) - 6.5Mb
Eclipse Doc Plugin (TGZ) - 8.2Mb
Eclipse Doc Plugin (Zip) - 10.1Mb
Man Pages (TGZ) - 180.8Kb
Man Pages (Zip) - 291.9Kb
Info (Gzip) - 2.8Mb
Info (Zip) - 2.8Mb
Excerpts from this Manual

MySQL 5.6 Reference Manual  /  ...  /  Character String Literal Character Set and Collation

10.1.4.5 Character String Literal Character Set and Collation

Every character string literal has a character set and a collation.

A character string literal may have an optional character set introducer and COLLATE clause:

[_charset_name]'string' [COLLATE collation_name]

Examples:

SELECT 'string';
SELECT _latin1'string';
SELECT _latin1'string' COLLATE latin1_danish_ci;

For the simple statement SELECT 'string', the string has the character set and collation defined by the character_set_connection and collation_connection system variables.

The _charset_name expression is formally called an introducer. It tells the parser, the string that is about to follow uses character set X. Because this has confused people in the past, we emphasize that an introducer does not change the string to the introducer character set like CONVERT() would do. It does not change the string's value, although padding may occur. The introducer is just a signal. An introducer is also legal before standard hex literal and numeric hex literal notation (x'literal' and 0xnnnn), or before bit-field literal notation (b'literal' and 0bnnnn).

Examples:

SELECT _latin1 x'AABBCC';
SELECT _latin1 0xAABBCC;
SELECT _latin1 b'1100011';
SELECT _latin1 0b1100011;

MySQL determines a literal's character set and collation in the following manner:

  • If both _X and COLLATE Y are specified, character set X and collation Y are used.

  • If _X is specified but COLLATE is not specified, character set X and its default collation are used. To see the default collation for each character set, use the SHOW COLLATION statement.

  • Otherwise, the character set and collation given by the character_set_connection and collation_connection system variables are used.

Examples:

  • A string with latin1 character set and latin1_german1_ci collation:

    SELECT _latin1'Müller' COLLATE latin1_german1_ci;
    
  • A string with latin1 character set and its default collation (that is, latin1_swedish_ci):

    SELECT _latin1'Müller';
    
  • A string with the connection default character set and collation:

    SELECT 'Müller';
    

Character set introducers and the COLLATE clause are implemented according to standard SQL specifications.

An introducer indicates the character set for the following string, but does not change now how the parser performs escape processing within the string. Escapes are always interpreted by the parser according to the character set given by character_set_connection.

The following examples show that escape processing occurs using character_set_connection even in the presence of an introducer. The examples use SET NAMES (which changes character_set_connection, as discussed in Section 10.1.5, “Connection Character Sets and Collations”), and display the resulting strings using the HEX() function so that the exact string contents can be seen.

Example 1:

mysql> SET NAMES latin1;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT HEX('à\n'), HEX(_sjis'à\n');
+------------+-----------------+
| HEX('à\n') | HEX(_sjis'à\n') |
+------------+-----------------+
| E00A       | E00A            |
+------------+-----------------+
1 row in set (0.00 sec)

Here, à (hex value E0) is followed by \n, the escape sequence for newline. The escape sequence is interpreted using the character_set_connection value of latin1 to produce a literal newline (hex value 0A). This happens even for the second string. That is, the introducer of _sjis does not affect the parser's escape processing.

Example 2:

mysql> SET NAMES sjis;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT HEX('à\n'), HEX(_latin1'à\n');
+------------+-------------------+
| HEX('à\n') | HEX(_latin1'à\n') |
+------------+-------------------+
| E05C6E     | E05C6E            |
+------------+-------------------+
1 row in set (0.04 sec)

Here, character_set_connection is sjis, a character set in which the sequence of à followed by \ (hex values 05 and 5C) is a valid multibyte character. Hence, the first two bytes of the string are interpreted as a single sjis character, and the \ is not interpreted as an escape character. The following n (hex value 6E) is not interpreted as part of an escape sequence. This is true even for the second string; the introducer of _latin1 does not affect escape processing.


User Comments
  Posted by brusselsshrek on September 30, 2005
I was having the following error and this page seems to be the key to solving it:

Illegal mix of collations (utf8_bin,IMPLICIT) and (latin1_swedish_ci,COERCIBLE) for operation

When I looked at my table, I had specified the collation sequence as:

utf8_bin

yet in my SQL (in PHP code) I was simply specifying a literal (it was more fancy than "xyz" - arabic or chinese, but you will get the point):

SELECT * from my_table WHERE my_col = "xyz"

Now, the solution proved to be stating:

SELECT * from my_table WHERE my_col = _utf8"xyz"
  Posted by Greg Skluzacek on August 5, 2011
to the poster above, you may have also been able to solve your issue by doing a set names utf8 as I'm guessing your connection character set was set the default of latin1
Sign Up Login You must be logged in to post a comment.