Documentation Home
MySQL 5.7 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 35.0Mb
PDF (A4) - 35.1Mb
PDF (RPM) - 34.1Mb
EPUB - 8.6Mb
HTML Download (TGZ) - 8.3Mb
HTML Download (Zip) - 8.4Mb
HTML Download (RPM) - 7.2Mb
Eclipse Doc Plugin (TGZ) - 9.2Mb
Eclipse Doc Plugin (Zip) - 11.3Mb
Man Pages (TGZ) - 200.1Kb
Man Pages (Zip) - 305.3Kb
Info (Gzip) - 3.3Mb
Info (Zip) - 3.3Mb
Excerpts from this Manual

MySQL 5.7 Reference Manual  /  ...  /  Examples of the Effect of Collation

11.1.8.7 Examples of the Effect of Collation

Example 1: Sorting German Umlauts

Suppose that column X in table T has these latin1 column values:

Muffler
Müller
MX Systems
MySQL

Suppose also that the column values are retrieved using the following statement:

SELECT X FROM T ORDER BY X COLLATE collation_name;

The following table shows the resulting order of the values if we use ORDER BY with different collations.

latin1_swedish_cilatin1_german1_cilatin1_german2_ci
MufflerMufflerMüller
MX SystemsMüllerMuffler
MüllerMX SystemsMX Systems
MySQLMySQLMySQL

The character that causes the different sort orders in this example is the U with two dots over it (ü), which the Germans call U-umlaut.

  • The first column shows the result of the SELECT using the Swedish/Finnish collating rule, which says that U-umlaut sorts with Y.

  • The second column shows the result of the SELECT using the German DIN-1 rule, which says that U-umlaut sorts with U.

  • The third column shows the result of the SELECT using the German DIN-2 rule, which says that U-umlaut sorts with UE.

Example 2: Searching for German Umlauts

Suppose that you have three tables that differ only by the character set and collation used:

mysql> SET NAMES utf8;
mysql> CREATE TABLE german1 (
    ->   c CHAR(10)
    -> ) CHARACTER SET latin1 COLLATE latin1_german1_ci;
mysql> CREATE TABLE german2 (
    ->   c CHAR(10)
    -> ) CHARACTER SET latin1 COLLATE latin1_german2_ci;
mysql> CREATE TABLE germanutf8 (
    ->   c CHAR(10)
    -> ) CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Each table contains two records:

mysql> INSERT INTO german1 VALUES ('Bar'), ('Bär');
mysql> INSERT INTO german2 VALUES ('Bar'), ('Bär');
mysql> INSERT INTO germanutf8 VALUES ('Bar'), ('Bär');

Two of the above collations have an A = Ä equality, and one has no such equality (latin1_german2_ci). For that reason, you'll get these results in comparisons:

mysql> SELECT * FROM german1 WHERE c = 'Bär';
+------+
| c    |
+------+
| Bar  |
| Bär  |
+------+
mysql> SELECT * FROM german2 WHERE c = 'Bär';
+------+
| c    |
+------+
| Bär  |
+------+
mysql> SELECT * FROM germanutf8 WHERE c = 'Bär';
+------+
| c    |
+------+
| Bar  |
| Bär  |
+------+

This is not a bug but rather a consequence of the sorting properties of latin1_german1_ci and utf8_unicode_ci (the sorting shown is done according to the German DIN 5007 standard).


User Comments
Sign Up Login You must be logged in to post a comment.