Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 27.2Mb
PDF (A4) - 27.2Mb
PDF (RPM) - 25.8Mb
HTML Download (TGZ) - 6.5Mb
HTML Download (Zip) - 6.6Mb
HTML Download (RPM) - 5.6Mb
Man Pages (TGZ) - 158.5Kb
Man Pages (Zip) - 262.0Kb
Info (Gzip) - 2.6Mb
Info (Zip) - 2.6Mb
Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  Boolean Full-Text Searches

12.9.2 Boolean Full-Text Searches

MySQL can perform boolean full-text searches using the IN BOOLEAN MODE modifier. With this modifier, certain characters have special meaning at the beginning or end of words in the search string. In the following query, the + and - operators indicate that a word is required to be present or absent, respectively, for a match to occur. Thus, the query retrieves all the rows that contain the word MySQL but that do not contain the word YourSQL:

mysql> SELECT * FROM articles WHERE MATCH (title,body)
    -> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title                 | body                                |
+----+-----------------------+-------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Well | After you went through a ...        |
|  3 | Optimizing MySQL      | In this tutorial we will show ...   |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security        | When configured properly, MySQL ... |
+----+-----------------------+-------------------------------------+
Note

In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which

  • + stands for AND

  • - stands for NOT

  • [no operator] implies OR

Boolean full-text searches have these characteristics:

  • They do not use the 50% threshold.

  • They do not automatically sort rows in order of decreasing relevance. You can see this from the preceding query result: The row with the highest relevance is the one that contains MySQL twice, but it is listed last, not first.

  • They can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.

  • The minimum and maximum word length full-text parameters apply.

  • The stopword list applies.

The boolean full-text search capability supports the following operators:

  • +

    A leading plus sign indicates that this word must be present in each row that is returned.

  • -

    A leading minus sign indicates that this word must not be present in any of the rows that are returned.

    Note: The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return all rows except those containing any of the excluded terms.

  • (no operator)

    By default (when neither + nor - is specified) the word is optional, but the rows that contain it are rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.

  • > <

    These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it. See the example following this list.

  • ( )

    Parentheses group words into subexpressions. Parenthesized groups can be nested.

  • ~

    A leading tilde acts as a negation operator, causing the word's contribution to the row's relevance to be negative. This is useful for marking noise words. A row containing such a word is rated lower than others, but is not excluded altogether, as it would be with the - operator.

  • *

    The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it should be appended to the word to be affected. Words match if they begin with the word preceding the * operator.

    If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix. Suppose that ft_min_word_len=4. Then a search for '+word +the*' will likely return fewer rows than a search for '+word +the':

    • The former query remains as is and requires both word and the* (a word starting with the) to be present in the document.

    • The latter query is transformed to +word (requiring only word to be present). the is both too short and a stopword, and either condition is enough to cause it to be ignored.

  • "

    A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".

    If the phrase contains no words that are in the index, the result is empty. For example, if all words are either stopwords or shorter than the minimum length of indexed words, the result is empty.

The following examples demonstrate some search strings that use boolean full-text operators:

  • 'apple banana'

    Find rows that contain at least one of the two words.

  • '+apple +juice'

    Find rows that contain both words.

  • '+apple macintosh'

    Find rows that contain the word apple, but rank rows higher if they also contain macintosh.

  • '+apple -macintosh'

    Find rows that contain the word apple but not macintosh.

  • '+apple ~macintosh'

    Find rows that contain the word apple, but if the row also contains the word macintosh, rate it lower than if row does not. This is softer than a search for '+apple -macintosh', for which the presence of macintosh causes the row not to be returned at all.

  • '+apple +(>turnover <strudel)'

    Find rows that contain the words apple and turnover, or apple and strudel (in any order), but rank apple turnover higher than apple strudel.

  • 'apple*'

    Find rows that contain words such as apple, apples, applesauce, or applet.

  • '"some words"'

    Find rows that contain the exact phrase some words (for example, rows that contain some words of wisdom but not some noise words). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotation marks that enclose the search string itself.


User Comments
User comments in this section are, as the name implies, provided by MySQL users. The MySQL documentation team is not responsible for, nor do they endorse, any of the information provided here.
  Posted by Roman Partyka on March 18, 2011
I have noticed strange behavior. I have a table 'mytable' with two columns: id (int), article (text). Column 'article' has FULLTEXT index on it. I want to find articles containing both 'word1' and 'word2'. 'word1' is very common (90% of the articles contain it).

Straightforward query:
SELECT id, article FROM mytable WHERE MATCH(article) AGAINST ('+word1 +word2' IN BOOLEAN MODE)
does return result, but is very slow.

At the same time, another query
SELECT id, article FROM mytable WHERE MATCH(article) AGAINST ('+word1' IN BOOLEAN MODE) AND MATCH(article) AGAINST ('+word2' IN BOOLEAN MODE)
returns the same result, but could be 100 times faster...

This is true for version 4.1

  Posted by Micah Stevenson on October 12, 2012
Another option for using full text boolean searches w/php and getting "exact phrases" working is to use the native php function html_entity_decode() to reverse the html entities on the user input.

Or, you can use Markus' example. Don't forget to sanitize user input :-)
Sign Up Login You must be logged in to post a comment.