WL#2559: Enable fulltext search for non-MyISAM engines
Affects: Server-7.0
Status: Assigned
RATIONALE

A framework is needed so that every storage engine that implements native full-text indexes can support full-text search.

SUMMARY

Most of the full-text search functionality should be moved from MyISAM one level up and made storage-engine independent. This task will also make it possible to use boolean search on tables that don't support full-text indexes, and on columns from different tables.
Fulltext search code will be split into two parts: parsing and searching.

=============================

The parser will be pure C and will live in mysys/ft_parser.c.

To parse a row for indexing, a storage engine will need to do:

  ft_parse_init();
  for_every_keyseg {
    ft_parse();
  }
  ft_parse_end();

This is basically what _mi_ft_parserecord does now. ft_parse_end() is the new name of ft_linearize(); apart from the renaming, the API will be almost the same as it is now.

=============================

Search will be C++, via the handler interface.

handler::ft_count(word)
  returns the number of rows this word is present in
  (MyISAM returns a wild guess if it does not know)

handler::ft_search_word(word)
handler::ft_search_next()
  search by a word or a prefix and return a position (old name: docid)
  and a weight, but NOT a record.
  If word==NULL, the last used word is used (an optimization for when
  ft_search_word is called right after ft_count).

handler::rnd_pos()
  used to read a record by position, as usual.

The methods of records.cc could be reused; rr_from_cache is especially interesting.

=============================

ft2, as a MyISAM-specific optimization, stays in MyISAM.

The special code for concurrent inserts stays in MyISAM.

We assume that ft_search_word()/ft_search_next() return matches ordered by position; the "smarter index-merge" optimization is then universally applicable.

Stopwords are removed from ft_static.c and put in a plain-text file somewhere in share/.

=============================

Directory structure:

mysys/ft_parser.c
mysys/ft_stopwords.c
share/english/stopwords.txt
scripts/ftbench/     - moved verbatim (other suggestions on where to put ftbench are welcome)
sql/item_ftfunc.*    - relevant Items from Item_func.* and ft_nlq_search.c/ft_boolean_search.c
include/fulltext.h   - MyISAM-independent part of old ftdefs.h
myisam/ft_eval.*     - deleted, as obsolete
myisam/ft_test1.*    - deleted, as obsolete
myisam/ft_stem.c     - deleted
myisam/ftdefs.h      - MyISAM-specific part of old ftdefs.h
myisam/fulltext.h    - merged with what remained from myisam/ftdefs.h
myisam/ft_update.c   - not moved; MyISAM-specific code
myisam/ft_static.c   - not moved; MyISAM-specific code (no stopwords there)
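
=============================

Illustration (not part of the design): a minimal sketch of how a caller might use ft_count(), ft_search_word() and ft_search_next() when matches come back ordered by position. ToyFtHandler, FtMatch and ft_match_both are hypothetical names, the signatures are assumptions (the worklog does not fix them), and the in-memory index only stands in for a storage engine. The merge shown is a plain intersection of two sorted scans, which may be simpler than the "smarter index-merge" the worklog has in mind.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of one search step: a row position and a weight,
// but not the record itself.
struct FtMatch {
  unsigned long position;  // row position (old name: docid)
  double weight;           // weight of the word in that row
};

// Toy stand-in for the handler full-text methods. Assumed signatures.
class ToyFtHandler {
 public:
  // index maps word -> matches; each list must be sorted by position.
  explicit ToyFtHandler(std::map<std::string, std::vector<FtMatch> > index)
      : index_(std::move(index)) {}
  ToyFtHandler(const ToyFtHandler &) = delete;
  ToyFtHandler &operator=(const ToyFtHandler &) = delete;

  // Number of rows containing the word. Remembers the word so that a
  // following ft_search_word(nullptr) can reuse it.
  std::size_t ft_count(const char *word) {
    select(word);
    return current_ ? current_->size() : 0;
  }

  // Starts a scan. A null word means "the last used word".
  void ft_search_word(const char *word) {
    if (word != nullptr) select(word);
    next_ = 0;
  }

  // Fetches the next match in position order; false when exhausted.
  bool ft_search_next(FtMatch *out) {
    if (current_ == nullptr || next_ >= current_->size()) return false;
    *out = (*current_)[next_++];
    return true;
  }

 private:
  void select(const char *word) {
    assert(word != nullptr);
    std::map<std::string, std::vector<FtMatch> >::const_iterator it =
        index_.find(word);
    current_ = (it == index_.end()) ? nullptr : &it->second;
    next_ = 0;
  }

  std::map<std::string, std::vector<FtMatch> > index_;
  const std::vector<FtMatch> *current_ = nullptr;
  std::size_t next_ = 0;
};

// Rows containing both words, found by advancing two position-ordered
// scans in lockstep. One handler object is used per scan. Summing the
// weights is an arbitrary choice made for this illustration.
std::vector<FtMatch> ft_match_both(ToyFtHandler &a, const char *word_a,
                                   ToyFtHandler &b, const char *word_b) {
  std::vector<FtMatch> out;
  if (a.ft_count(word_a) == 0 || b.ft_count(word_b) == 0) return out;
  a.ft_search_word(nullptr);  // reuse the word just given to ft_count
  b.ft_search_word(nullptr);
  FtMatch x, y;
  bool has_x = a.ft_search_next(&x);
  bool has_y = b.ft_search_next(&y);
  while (has_x && has_y) {
    if (x.position < y.position) {
      has_x = a.ft_search_next(&x);
    } else if (y.position < x.position) {
      has_y = b.ft_search_next(&y);
    } else {
      out.push_back({x.position, x.weight + y.weight});
      has_x = a.ft_search_next(&x);
      has_y = b.ft_search_next(&y);
    }
  }
  return out;
}
```

A caller would build two ToyFtHandler objects over the same index and call ft_match_both(a, "full", b, "text"); the returned positions would then be passed to something like rnd_pos() to fetch the actual records.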
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.