MySQL collation names follow these conventions:
A collation name starts with the name of the character set with which it is associated, generally followed by one or more suffixes indicating other collation characteristics. For example, utf8_general_ci and latin1_swedish_ci are collations for the utf8 and latin1 character sets, respectively. The binary character set has a single collation, also named binary, with no suffixes.
A language-specific collation includes a language name. For example,
utf8_turkish_ci and utf8_hungarian_ci sort characters for the utf8 character set using the rules of Turkish and Hungarian, respectively.
Collation suffixes indicate whether a collation is case and accent sensitive, or binary. The following table shows the suffixes used to indicate these characteristics.
Table 10.1 Collation Case Sensitivity Suffixes
Suffix    Meaning
_ai       Accent insensitive
_as       Accent sensitive
_ci       Case insensitive
_cs       Case sensitive
_bin      Binary
For nonbinary collation names that do not specify accent sensitivity, it is determined by case sensitivity. If a collation name does not contain
_ai or _as, _ci in the name implies _ai and _cs in the name implies _as. For example, latin1_general_ci is explicitly case insensitive and implicitly accent insensitive, and latin1_general_cs is explicitly case sensitive and implicitly accent sensitive.
For the
binary collation of the binary character set, comparisons are based on numeric byte values. For the _bin collation of a nonbinary character set, comparisons are based on numeric character code values, which differ from byte values for multibyte characters. For more information, see Section 10.8.5, “The binary Collation Compared to _bin Collations”.
For Unicode character sets, collation names may include a version number to indicate the version of the Unicode Collation Algorithm (UCA) on which the collation is based. UCA-based collations without a version number in the name use the version-4.0.0 UCA weight keys. For example:
utf8_unicode_520_ci is based on UCA 5.2.0 weight keys (http://www.unicode.org/Public/UCA/5.2.0/allkeys.txt).
utf8_unicode_ci (with no version named) is based on UCA 4.0.0 weight keys (http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt).
For Unicode character sets, the xxx_general_mysql500_ci collations preserve the pre-5.1.24 ordering of the original xxx_general_ci collations and permit upgrades for tables created before MySQL 5.1.24 (Bug #27877).
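The naming conventions above can be illustrated with a small parser. The following Python sketch is not part of MySQL; the names parse_collation and CollationInfo are hypothetical. It assumes that character set names contain no underscore and that a purely numeric token in the name (such as 520) is the UCA version. It applies the suffix table and the rule that _ci implies _ai and _cs implies _as when accent sensitivity is not named.

```python
from typing import NamedTuple, Optional

# Suffixes listed in the collation suffix table.
SUFFIXES = {"ai", "as", "ci", "cs", "bin"}


class CollationInfo(NamedTuple):
    """Hypothetical container for the parts of a collation name."""
    charset: str
    middle: tuple                      # tokens between charset and suffixes
    version_token: Optional[str]       # raw numeric token such as "520", if any
    case_sensitive: Optional[bool]     # None when no _ci/_cs suffix is present
    accent_sensitive: Optional[bool]   # None when it is neither named nor implied
    binary: bool


def parse_collation(name: str) -> CollationInfo:
    # The binary character set has a single collation with no suffixes.
    if name == "binary":
        return CollationInfo("binary", (), None, None, None, True)

    # Assumption: the character set name is the first underscore-separated token.
    charset, *rest = name.split("_")

    # Strip recognized suffixes from the end of the name only, so that
    # tokens in the middle of the name are never mistaken for suffixes.
    suffixes = set()
    while rest and rest[-1] in SUFFIXES:
        suffixes.add(rest.pop())

    # Assumption: an all-digit token is the UCA version number.
    version = next((tok for tok in rest if tok.isdigit()), None)

    if "cs" in suffixes:
        case = True
    elif "ci" in suffixes:
        case = False
    else:
        case = None

    if "as" in suffixes:
        accent = True
    elif "ai" in suffixes:
        accent = False
    else:
        # Not named: accent sensitivity follows case sensitivity.
        accent = case

    return CollationInfo(charset, tuple(rest), version, case, accent,
                         "bin" in suffixes)


if __name__ == "__main__":
    for example in ("latin1_general_ci", "latin1_general_cs",
                    "utf8_unicode_520_ci", "utf8_bin", "binary"):
        print(example, parse_collation(example))
```

For a UCA-based collation such as utf8_unicode_ci, version_token is None, which corresponds to the version-4.0.0 weight keys described above. For _bin collations the sketch leaves case and accent sensitivity as None, because the suffix rules above describe only nonbinary collations.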