A collation is a set of rules that defines how to compare and sort character strings. Each collation in MySQL belongs to a single character set. Every character set has at least one collation, and most have two or more collations.
A collation orders characters based on weights. Each character in a character set maps to a weight. Characters with equal weights compare as equal, and characters with unequal weights compare according to the relative magnitude of their weights.
MySQL supports several collation implementations, as discussed in Section 10.4.1, “Collation Implementation Types”. Some of these can be added to MySQL without recompiling:
Simple collations for 8-bit character sets
UCA-based collations for Unicode character sets
The following sections describe how to add collations of the first two types to existing character sets. All existing character sets already have a binary collation, so there is no need here to describe how to add one.
Summary of the procedure for adding a new collation:
Choose a collation ID
Add configuration information that names the collation and describes the character-ordering rules
Restart the server
Verify that the collation is present
The instructions here cover only collations that can be added without recompiling MySQL. To add a collation that does require recompiling (as implemented by means of functions in a C source file), use the instructions in Section 10.3, “Adding a Character Set”. However, instead of adding all the information required for a complete character set, just modify the appropriate files for an existing character set. That is, based on what is already present for the character set's current collations, add data structures, functions, and configuration information for the new collation.
If you modify an existing collation, that may affect the ordering of rows for indexes on columns that use the collation. In this case, rebuild any such indexes to avoid problems such as incorrect query results. For further information, see Section 2.19.3, “Checking Whether Tables or Indexes Must Be Rebuilt”.
The Unicode Collation Algorithm (UCA) specification: http://www.unicode.org/reports/tr10/
The Locale Data Markup Language (LDML) specification: http://www.unicode.org/reports/tr35/
MySQL Blog article “Instructions for adding a new Unicode collation”: http://blogs.mysql.com/peterg/2008/05/19/instructions-for-adding-a-new-unicode-collation/