MySQL 5.7 Reference Manual
Full-Text Parser Plugins

MySQL has a built-in parser that it uses by default for full-text operations (parsing text to be indexed, or parsing a query string to determine the terms to be used for a search). The built-in full-text parser is supported with InnoDB and MyISAM tables.

MySQL 5.7.6 introduced two additional parsers for use with InnoDB and MyISAM tables: a character-based ngram full-text parser that supports Chinese, Japanese, and Korean (CJK), and a word-based MeCab parser plugin that supports Japanese.

For full-text processing, parsing means extracting words (or tokens, in the case of an n-gram character-based parser) from text or a query string based on rules that define which character sequences make up a word and where word boundaries lie.

When parsing for indexing purposes, the parser passes each word to the server, which adds it to a full-text index. When parsing a query string, the parser passes each word to the server, which accumulates the words for use in a search.
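
To see which words the parser handed to the server for indexing, the InnoDB full-text INFORMATION_SCHEMA tables can be queried. The following is a minimal sketch, assuming an InnoDB table with a FULLTEXT index; the schema name test, the table name notes, and the index name ft_body are hypothetical, and the account needs sufficient privileges to set a global variable and read the INFORMATION_SCHEMA InnoDB tables.

```sql
-- Hypothetical schema and table, used only for illustration.
CREATE TABLE test.notes (
  id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT INDEX ft_body (body)   -- uses the built-in parser
) ENGINE=InnoDB;

INSERT INTO test.notes (body)
VALUES ('parser plugins split text into words');

-- Point the InnoDB full-text INFORMATION_SCHEMA tables at test.notes.
SET GLOBAL innodb_ft_aux_table = 'test/notes';

-- List the tokens recorded for newly inserted rows.
SELECT WORD, DOC_COUNT
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY WORD;
```

Which words appear depends on the token-size settings and the stopword list in effect on the server.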

The parsing properties of the built-in full-text parser are described in Section 12.9, “Full-Text Search Functions”. These properties include rules for determining how to extract words from text. The parser is influenced by certain system variables that cause words shorter or longer than configured lengths to be excluded, and by the stopword list that identifies common words to be ignored. For more information, see Section 12.9.4, “Full-Text Stopwords”, and Section 12.9.6, “Fine-Tuning MySQL Full-Text Search”.
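
The settings that influence the built-in parser can be inspected from any client session. This is a short sketch rather than a complete list; it shows the length-related variables for InnoDB and MyISAM and the default InnoDB stopword list.

```sql
-- InnoDB: innodb_ft_min_token_size and innodb_ft_max_token_size
SHOW VARIABLES LIKE 'innodb_ft_%token_size';

-- MyISAM: ft_min_word_len and ft_max_word_len
SHOW VARIABLES LIKE 'ft_%word_len';

-- Default stopword list used by InnoDB full-text indexes
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
```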

The plugin API enables you to use a full-text parser other than the default built-in full-text parser. For example, if you are working with Japanese, you may choose to use the MeCab full-text parser. The plugin API also enables you to provide a full-text parser of your own so that you have control over the basic duties of a parser. A parser plugin can operate in either of two roles:

  • The plugin can replace the built-in parser. In this role, the plugin reads the input to be parsed, splits it up into words, and passes the words to the server (either for indexing or for token accumulation). The ngram and MeCab parsers introduced in MySQL 5.7.6 operate as replacements for the built-in full-text parser.

    You may choose to provide your own full-text parser if you need to use different rules from those of the built-in parser for determining how to split up input into words. For example, the built-in parser considers the text “case-sensitive” to consist of two words, “case” and “sensitive”, whereas an application might need to treat the text as a single word.

  • The plugin can act in conjunction with the built-in parser by serving as a front end for it. In this role, the plugin extracts text from the input and passes the text to the built-in parser, which splits up the text into words using its normal parsing rules. This parsing is affected by the innodb_ft_xxx or ft_xxx system variables and the stopword list.

    One reason to use a parser this way is that you need to index content such as PDF documents, XML documents, or .doc files. The built-in parser is not intended for those types of input but a plugin can pull out the text from these input sources and pass it to the built-in parser.

It is also possible for a parser plugin to operate in both roles. That is, it could extract text from non-cleartext input (the front end role), and also parse the text into words (thus replacing the built-in parser).

A full-text plugin is associated with full-text indexes on a per-index basis. Installing a parser plugin does not by itself cause it to be used for any full-text operations; it simply becomes available. For example, a full-text parser plugin becomes available to be named in a WITH PARSER clause when creating individual FULLTEXT indexes. To create such an index at table-creation time, do this:

  CREATE TABLE t (  -- t is a placeholder table name
    doc CHAR(255),
    FULLTEXT INDEX (doc) WITH PARSER parser_name
  );

Or you can add the index after the table has been created:
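
A minimal sketch, assuming the table is named t (a placeholder name) and already has the doc column, with parser_name standing for an installed parser plugin:

```sql
-- t and parser_name are placeholders for illustration.
ALTER TABLE t ADD FULLTEXT INDEX (doc) WITH PARSER parser_name;
```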


The only SQL change for associating the parser with the index is the WITH PARSER clause. Searches are specified as before, with no changes needed for queries.
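
As an illustration, the following sketch uses the ngram parser, which is available in MySQL 5.7.6 and later. The table name articles, the index name ft_idx, and the sample row are assumptions made for this example. The search itself is an ordinary MATCH ... AGAINST query with no parser-specific syntax.

```sql
-- Make sure the client connection uses a character set that can carry CJK text.
SET NAMES utf8mb4;

-- Hypothetical table; the only parser-specific part is WITH PARSER ngram.
CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT,
  FULLTEXT INDEX ft_idx (title, body) WITH PARSER ngram
) ENGINE=InnoDB CHARACTER SET utf8mb4;

INSERT INTO articles (title, body)
VALUES ('数据库管理', '在本教程中我将向你展示如何管理数据库');

-- The query is written exactly as it would be for the built-in parser.
SELECT id, title
FROM articles
WHERE MATCH (title, body) AGAINST ('数据库' IN NATURAL LANGUAGE MODE);
```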

When you associate a parser plugin with a FULLTEXT index, the plugin is required for using the index. If the parser plugin is dropped, any index associated with it becomes unusable. Any attempt to use a table for which the required plugin is not available results in an error, although DROP TABLE is still possible.
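
Before uninstalling a parser plugin, it can help to check which parser plugins the server knows about and which indexes name them. The statements below are a sketch under the assumption that a table named t (a placeholder) carries the index in question.

```sql
-- List the full-text parser plugins known to the server and their status.
SELECT PLUGIN_NAME, PLUGIN_STATUS, PLUGIN_LIBRARY
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_TYPE = 'FTPARSER';

-- Show the table definition; a FULLTEXT index tied to a parser plugin
-- is displayed together with its WITH PARSER clause.
SHOW CREATE TABLE t;
```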

For more information about full-text plugins, see “Writing Full-Text Parser Plugins”. MySQL 5.7 supports full-text plugins with MyISAM and InnoDB. InnoDB support for full-text plugins was added in MySQL 5.7.3.
