Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 26.9Mb
PDF (A4) - 26.9Mb
PDF (RPM) - 25.3Mb
HTML Download (TGZ) - 6.3Mb
HTML Download (Zip) - 6.3Mb
HTML Download (RPM) - 5.3Mb
Man Pages (TGZ) - 157.6Kb
Man Pages (Zip) - 260.8Kb
Info (Gzip) - 2.6Mb
Info (Zip) - 2.6Mb
Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  Adding a Collation for Full-Text Indexing

12.9.7 Adding a Collation for Full-Text Indexing

This section describes how to add a new collation for full-text searches. The sample collation is like latin1_swedish_ci but treats the '-' character as a letter rather than as a punctuation character so that it can be indexed as a word character. General information about adding collations is given in Section 10.13, “Adding a Collation to a Character Set”; it is assumed that you have read it and are familiar with the files involved.

To add a collation for full-text indexing, use the following procedure. The instructions here add a collation for a simple character set, which as discussed in Section 10.13, “Adding a Collation to a Character Set”, can be created using a configuration file that describes the character set properties. For a complex character set such as Unicode, create collations using C source files that describe the character set properties.

  1. Add a collation to the Index.xml file. The collation ID must be unused, so choose a value different from 62 if that ID is already taken on your system.

    <charset name="latin1">
    ...
    <collation name="latin1_fulltext_ci" id="62"/>
    </charset>
  2. Declare the sort order for the collation in the latin1.xml file. In this case, the order can be copied from latin1_swedish_ci:

    <collation name="latin1_fulltext_ci">
    <map>
    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
    10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
    20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
    30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
    40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F
    60 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F
    50 51 52 53 54 55 56 57 58 59 5A 7B 7C 7D 7E 7F
    80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
    90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F
    A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF
    B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D D7 D8 55 55 55 59 59 DE DF
    41 41 41 41 5C 5B 5C 43 45 45 45 45 49 49 49 49
    44 4E 4F 4F 4F 4F 5D F7 D8 55 55 55 59 59 DE FF
    </map>
    </collation>
  3. Modify the ctype array in latin1.xml. Change the value corresponding to 0x2D (which is the code for the '-' character) from 10 (punctuation) to 01 (small letter). In the following array, this is the element in the fourth row down, third value from the end.

    <ctype>
    <map>
    00
    20 20 20 20 20 20 20 20 20 28 28 28 28 28 20 20
    20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
    48 10 10 10 10 10 10 10 10 10 10 10 10 01 10 10
    84 84 84 84 84 84 84 84 84 84 10 10 10 10 10 10
    10 81 81 81 81 81 81 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 01 01 01 01 10 10 10 10 10
    10 82 82 82 82 82 82 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 02 02 02 02 10 10 10 10 20
    10 00 10 02 10 10 10 10 10 10 01 10 01 00 01 00
    00 10 10 10 10 10 10 10 10 10 02 10 02 00 02 01
    48 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
    01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
    01 01 01 01 01 01 01 10 01 01 01 01 01 01 01 02
    02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
    02 02 02 02 02 02 02 10 02 02 02 02 02 02 02 02
    </map>
    </ctype>
  4. Restart the server.

  5. To employ the new collation, include it in the definition of columns that are to use it:

    mysql> DROP TABLE IF EXISTS t1;
    Query OK, 0 rows affected (0.13 sec)
    
    mysql> CREATE TABLE t1 (
        -> a TEXT CHARACTER SET latin1 COLLATE latin1_fulltext_ci,
        -> FULLTEXT INDEX(a)
        -> ) ENGINE=MyISAM;
    Query OK, 0 rows affected (0.47 sec)
  6. Test the collation to verify that hyphen is considered as a word character:

    mysql> INSERT INTO t1 VALUEs ('----'),('....'),('abcd');
    Query OK, 3 rows affected (0.22 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> SELECT * FROM t1 WHERE MATCH a AGAINST ('----' IN BOOLEAN MODE);
    +------+
    | a    |
    +------+
    | ---- |
    +------+
    1 row in set (0.00 sec)

User Comments
User comments in this section are, as the name implies, provided by MySQL users. The MySQL documentation team is not responsible for, nor do they endorse, any of the information provided here.
  Posted by jacqueline turalba on February 21, 2013
This page is useful fixing bug.
I think you need to include the possible path of each file, not just its name. So anyone who is new to an OS can actually go to the folder where a file mentioned in the page is located.

Sign Up Login You must be logged in to post a comment.