The built-in MySQL full-text parser uses the white space between
        words as a delimiter to determine where words begin and end,
        which is a limitation when working with ideographic languages
        that do not use word delimiters. To address this limitation,
        MySQL provides an ngram full-text parser that supports Chinese,
        Japanese, and Korean (CJK). The ngram full-text parser is
        supported for use with InnoDB and
        MyISAM.
MySQL also provides a MeCab full-text parser plugin for Japanese, which tokenizes documents into meaningful words. For more information, see Section 14.9.9, “MeCab Full-Text Parser Plugin”.
        An ngram is a contiguous sequence of
        n characters from a given sequence of
        text. The ngram parser tokenizes a sequence of text into a
        contiguous sequence of n characters.
        For example, you can tokenize “abcd” for different
        values of n using the ngram full-text
        parser.
      
n=1: 'a', 'b', 'c', 'd'
n=2: 'ab', 'bc', 'cd'
n=3: 'abc', 'bcd'
n=4: 'abcd'The ngram full-text parser is a built-in server plugin. As with other built-in server plugins, it is automatically loaded when the server is started.
        The full-text search syntax described in
        Section 14.9, “Full-Text Search Functions” applies to the ngram parser
        plugin. Differences in parsing behavior are described in this
        section. Full-text-related configuration options, except for
        minimum and maximum word length options
        (innodb_ft_min_token_size,
        innodb_ft_max_token_size,
        ft_min_word_len,
        ft_max_word_len) are also
        applicable.
Configuring ngram Token Size
The ngram parser has a default ngram token size of 2 (bigram). For example, with a token size of 2, the ngram parser parses the string “abc def” into four tokens: “ab”, “bc”, “de” and “ef”.
        ngram token size is configurable using the
        ngram_token_size configuration
        option, which has a minimum value of 1 and maximum value of 10.
      
        Typically, ngram_token_size is
        set to the size of the largest token that you want to search
        for. If you only intend to search for single characters, set
        ngram_token_size to 1. A
        smaller token size produces a smaller full-text search index,
        and faster searches. If you need to search for words comprised
        of more than one character, set
        ngram_token_size accordingly.
        For example, “Happy Birthday” is
        “生日快乐” in
        simplified Chinese, where
        “生日” is
        “birthday”, and
        “快乐” translates
        as “happy”. To search on two-character words such
        as these, set ngram_token_size
        to a value of 2 or higher.
      
        As a read-only variable,
        ngram_token_size may only be
        set as part of a startup string or in a configuration file:
- Startup string: - mysqld --ngram_token_size=2
- Configuration file: - [mysqld] ngram_token_size=2
          The following minimum and maximum word length configuration
          options are ignored for FULLTEXT indexes
          that use the ngram parser:
          innodb_ft_min_token_size,
          innodb_ft_max_token_size,
          ft_min_word_len, and
          ft_max_word_len.
Creating a FULLTEXT Index that Uses the ngram Parser
        To create a FULLTEXT index that uses the
        ngram parser, specify WITH PARSER ngram with
        CREATE TABLE,
        ALTER TABLE, or
        CREATE INDEX.
      
        The following example demonstrates creating a table with an
        ngram FULLTEXT index,
        inserting sample data (Simplified Chinese text), and viewing
        tokenized data in the Information Schema
        INNODB_FT_INDEX_CACHE table.
      
mysql> USE test;
mysql> CREATE TABLE articles (
      id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
      title VARCHAR(200),
      body TEXT,
      FULLTEXT (title,body) WITH PARSER ngram
    ) ENGINE=InnoDB CHARACTER SET utf8mb4;
mysql> SET NAMES utf8mb4;
INSERT INTO articles (title,body) VALUES
    ('数据库管理','在本教程中我将向你展示如何管理数据库'),
    ('数据库应用开发','学习开发数据库应用程序');
mysql> SET GLOBAL innodb_ft_aux_table="test/articles";
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE ORDER BY doc_id, position;
        To add a FULLTEXT index to an existing table,
        you can use ALTER TABLE or
        CREATE INDEX. For example:
      
CREATE TABLE articles (
      id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
      title VARCHAR(200),
      body TEXT
     ) ENGINE=InnoDB CHARACTER SET utf8mb4;
ALTER TABLE articles ADD FULLTEXT INDEX ft_index (title,body) WITH PARSER ngram;
# Or:
CREATE FULLTEXT INDEX ft_index ON articles (title,body) WITH PARSER ngram;ngram Parser Space Handling
The ngram parser eliminates spaces when parsing. For example:
- “ab cd” is parsed to “ab”, “cd” 
- “a bc” is parsed to “bc” 
ngram Parser Stopword Handling
        The built-in MySQL full-text parser compares words to entries in
        the stopword list. If a word is equal to an entry in the
        stopword list, the word is excluded from the index. For the
        ngram parser, stopword handling is performed differently.
        Instead of excluding tokens that are equal to entries in the
        stopword list, the ngram parser excludes tokens that
        contain stopwords. For example, assuming
        ngram_token_size=2, a document
        that contains “a,b” is parsed to “a,”
        and “,b”. If a comma (“,”) is defined
        as a stopword, both “a,” and “,b” are
        excluded from the index because they contain a comma.
      
By default, the ngram parser uses the default stopword list, which contains a list of English stopwords. For a stopword list applicable to Chinese, Japanese, or Korean, you must create your own. For information about creating a stopword list, see Section 14.9.4, “Full-Text Stopwords”.
        Stopwords greater in length than
        ngram_token_size are ignored.
ngram Parser Term Search
        For natural language mode search, the
        search term is converted to a union of ngram terms. For example,
        the string “abc” (assuming
        ngram_token_size=2) is
        converted to “ab bc”. Given two documents, one
        containing “ab” and the other containing
        “abc”, the search term “ab bc” matches
        both documents.
      
        For boolean mode search, the search term is
        converted to an ngram phrase search. For example, the string
        'abc' (assuming
        ngram_token_size=2) is
        converted to '“ab bc”'. Given two documents, one
        containing 'ab' and the other containing 'abc', the search
        phrase '“ab bc”' only matches the document
        containing 'abc'.
ngram Parser Wildcard Search
        Because an ngram FULLTEXT index contains only
        ngrams, and does not contain information about the beginning of
        terms, wildcard searches may return unexpected results. The
        following behaviors apply to wildcard searches using ngram
        FULLTEXT search indexes:
- If the prefix term of a wildcard search is shorter than ngram token size, the query returns all indexed rows that contain ngram tokens starting with the prefix term. For example, assuming - ngram_token_size=2, a search on “a*” returns all rows starting with “a”.
- If the prefix term of a wildcard search is longer than ngram token size, the prefix term is converted to an ngram phrase and the wildcard operator is ignored. For example, assuming - ngram_token_size=2, an “abc*” wildcard search is converted to “ab bc”.
ngram Parser Phrase Search
Phrase searches are converted to ngram phrase searches. For example, The search phrase “abc” is converted to “ab bc”, which returns documents containing “abc” and “ab bc”.
The search phrase “abc def” is converted to “ab bc de ef”, which returns documents containing “abc def” and “ab bc de ef”. A document that contains “abcdef” is not returned.