This section describes the LDML syntax that MySQL recognizes. This is a subset of the syntax described in the LDML specification available at http://www.unicode.org/reports/tr35/, which should be consulted for further information. The rules described here are all supported except that character sorting occurs only at the primary level. Rules that specify differences at secondary or higher sort levels are recognized (and thus can be included in collation definitions) but are treated as equality at the primary level.
Characters named in LDML rules can be written in \unnnn format, where nnnn is the hexadecimal Unicode code point value. Within hexadecimal values, the digits A through F are not case sensitive; \u00E1 and \u00e1 are equivalent. Basic Latin letters A-Z and a-z can also be written literally (this is a MySQL limitation; the LDML specification permits literal non-Latin1 characters in the rules). Only characters in the Basic Multilingual Plane can be specified. This notation does not apply to characters outside the BMP range of 0000 to FFFF. The Index.xml file itself should be written using ASCII encoding.
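As a brief illustration of the notation, the fragment below names the same character twice, differing only in the case of the hexadecimal digits. It borrows the <reset> element described later in this section, and the code point U+00E1 simply follows the example in the text.

```xml
<!-- Both rules name U+00E1; hexadecimal digits are not case sensitive -->
<reset>\u00E1</reset>
<reset>\u00e1</reset>
```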
LDML has reset rules and shift rules to specify character ordering. Orderings are given as a set of rules that begin with a reset rule that establishes an anchor point, followed by shift rules that indicate how characters sort relative to the anchor point.
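The overall shape of such a rule set might look like the following minimal sketch. The collation name and id are hypothetical placeholders (a real id must be one that is valid and unused on the server in question), and the letters are chosen only for illustration.

```xml
<!-- Hypothetical collation entry for Index.xml; name and id are placeholders -->
<charset name="utf8">
  <collation name="utf8_example_ci" id="1000">
    <rules>
      <!-- Reset rule: establishes A as the anchor point -->
      <reset>A</reset>
      <!-- Shift rules: each is taken in relation to the anchor above -->
      <p>B</p>
      <p>C</p>
    </rules>
  </collation>
</charset>
```

The point of the sketch is the ordering of elements: one reset rule first, then the shift rules that depend on it.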
A <reset> rule does not specify any ordering in and of itself. Instead, it “resets” the ordering for subsequent shift rules so that they are taken in relation to a given character. The anchor letter can be written in either of two equivalent ways, literally or in \unnnn notation.
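To illustrate the two equivalent forms of a reset rule, the pair below uses the letter A (U+0041) as an assumed anchor. Either line on its own would serve as the reset rule.

```xml
<!-- Literal form (assumed anchor letter: A) -->
<reset>A</reset>
<!-- Equivalent \unnnn form; U+0041 is the code point of A -->
<reset>\u0041</reset>
```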
The <p>, <s>, and <t> shift rules define primary, secondary, and tertiary differences of a character from another character:
Use primary differences to distinguish separate letters.
Use secondary differences to distinguish accent variations.
Use tertiary differences to distinguish lettercase variations.
Either of these rules specifies a primary shift rule for