MySQL collation names follow these conventions:
- A collation name starts with the name of the character set with which it is associated, generally followed by one or more suffixes indicating other collation characteristics. For example, utf8mb4_0900_ai_ci and latin1_swedish_ci are collations for the utf8mb4 and latin1 character sets, respectively. The binary character set has a single collation, also named binary, with no suffixes.
- A language-specific collation includes a locale code or language name. For example, utf8mb4_tr_0900_ai_ci and utf8mb4_hu_0900_ai_ci sort characters for the utf8mb4 character set using the rules of Turkish and Hungarian, respectively. utf8mb4_turkish_ci and utf8mb4_hungarian_ci are similar but based on a less recent version of the Unicode Collation Algorithm.
- Collation suffixes indicate whether a collation is case-sensitive, accent-sensitive, or kana-sensitive (or some combination thereof), or binary. The following table shows the suffixes used to indicate these characteristics.

  Table 1.1 Collation Suffix Meanings

  | Suffix | Meaning            |
  |--------|--------------------|
  | _ai    | Accent-insensitive |
  | _as    | Accent-sensitive   |
  | _ci    | Case-insensitive   |
  | _cs    | Case-sensitive     |
  | _ks    | Kana-sensitive     |
  | _bin   | Binary             |

  For nonbinary collation names that do not specify accent sensitivity, it is determined by case sensitivity. If a collation name does not contain _ai or _as, _ci in the name implies _ai and _cs in the name implies _as. For example, latin1_general_ci is explicitly case-insensitive and implicitly accent-insensitive, latin1_general_cs is explicitly case-sensitive and implicitly accent-sensitive, and utf8mb4_0900_ai_ci is explicitly case-insensitive and accent-insensitive.

  For Japanese collations, the _ks suffix indicates that a collation is kana-sensitive; that is, it distinguishes Katakana characters from Hiragana characters. Japanese collations without the _ks suffix are not kana-sensitive and treat Katakana and Hiragana characters as equal for sorting.

  For the binary collation of the binary character set, comparisons are based on numeric byte values. For the _bin collation of a nonbinary character set, comparisons are based on numeric character code values, which differ from byte values for multibyte characters. For information about the differences between the binary collation of the binary character set and the _bin collations of nonbinary character sets, see Section 1.8.5, “The binary Collation Compared to _bin Collations”.
- Collation names for Unicode character sets may include a version number to indicate the version of the Unicode Collation Algorithm (UCA) on which the collation is based. UCA-based collations without a version number in the name use the version-4.0.0 UCA weight keys. For example:
  - utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys (http://www.unicode.org/Public/UCA/9.0.0/allkeys.txt).
  - utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys (http://www.unicode.org/Public/UCA/5.2.0/allkeys.txt).
  - utf8mb4_unicode_ci (with no version named) is based on UCA 4.0.0 weight keys (http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt).
 
- For Unicode character sets, the - xxx_general_mysql500_ci- xxx_general_ci
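
The naming conventions above can be illustrated with a small helper that decodes a collation name into its parts. This is only a sketch and not part of MySQL: `parse_collation_name`, `CollationInfo`, and the two lookup tables are hypothetical names introduced here. It assumes that character set names contain no underscore, and it recognizes only the UCA version tokens mentioned in this section (`0900` and `520`). A name alone cannot tell whether a collation is UCA-based, so the version is reported as unknown unless the name carries a version token or the `unicode` token.

```python
from typing import NamedTuple, Optional

# Suffixes from Table 1.1 (written without the leading underscore).
SENSITIVITY_SUFFIXES = {"ai", "as", "ci", "cs", "ks", "bin"}

# Only the version tokens named in this section; other tokens are not decoded.
UCA_VERSION_TOKENS = {"0900": "9.0.0", "520": "5.2.0"}


class CollationInfo(NamedTuple):
    charset: str
    uca_version: Optional[str]        # None: cannot be told from the name
    case_sensitive: Optional[bool]    # None for binary collations
    accent_sensitive: Optional[bool]  # None for binary collations
    kana_sensitive: bool
    binary: bool


def parse_collation_name(name: str) -> CollationInfo:
    """Decode a collation name using the conventions described above."""
    tokens = name.lower().split("_")
    # Assumption: the character set name is the first token.
    charset, rest = tokens[0], tokens[1:]

    # Collect sensitivity suffixes from the end of the name only, so that a
    # locale code such as "cs" (Czech) in the middle is not read as a suffix.
    flags = set()
    for token in reversed(rest):
        if token not in SENSITIVITY_SUFFIXES:
            break
        flags.add(token)

    binary = name.lower() == "binary" or "bin" in flags

    # Explicit version token, else 4.0.0 for "unicode" names with no version.
    uca_version = next(
        (UCA_VERSION_TOKENS[t] for t in rest if t in UCA_VERSION_TOKENS), None
    )
    if uca_version is None and "unicode" in rest:
        uca_version = "4.0.0"

    case_sensitive = accent_sensitive = None
    if not binary:
        if "cs" in flags:
            case_sensitive = True
        elif "ci" in flags:
            case_sensitive = False
        if "as" in flags:
            accent_sensitive = True
        elif "ai" in flags:
            accent_sensitive = False
        else:
            # No _ai/_as in the name: _ci implies _ai, _cs implies _as.
            accent_sensitive = case_sensitive

    return CollationInfo(
        charset, uca_version, case_sensitive, accent_sensitive,
        "ks" in flags, binary,
    )


if __name__ == "__main__":
    for example in ("utf8mb4_0900_ai_ci", "latin1_general_cs",
                    "utf8mb4_unicode_ci", "binary"):
        print(example, parse_collation_name(example))
```

The helper only interprets names. To find out which collations a given server actually provides, query the server itself rather than relying on the name.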