WL#826: Alternative collation in indexes
Affects: Server-7.1 — Status: Un-Assigned
Add a possibility to create an index with collation different from the collation in the column description. For example: CREATE TABLE t1 (f1 CHAR(5) CHARACTER SET latin1 COLLATE latin1_general_ci); CREATE INDEX k1 ON t1 (f1 COLLATE latin1_general_ci); CREATE INDEX k1 ON t1 (f1 COLLATE latin1_german2_ci); CREATE INDEX k1 ON t1 (f1 COLLATE latin1_bin); This can be very useful with Unicode columns, as we have many collations for UCS2/UTF8. However, having for example, 15 language specific indexes on the same column is an enormous waste of space. Another approach would be to be able to reuse for example an index for language A for searches in language B. It seems to be possible in Unicode, at least with restrictions, because all Unicode collations, unlike MySQL 8bit collations, are "almost the same". I.e. they are the same for almost 64K characters, and slightly different in some, up to only several dozen, language specific characters. So we could reuse indexes with different collations by means of extending a search range, applying language ordering difference. This needs to be additionally studied. The simplest case is to reuse a non-binary index for binary searches. Another use case came up in discussion of WL#896 Primary, Secondary and Tertiary Sorts. The question was: if there is a _ci index and a _w123 index, will the optimizer give priority to the _w123 index? Bar replied: If we have a _ci and _w123 index on the same column, then _w123 index should be picked up in the optimizer for ORDER BY, but a _ci index for GROUP/DISTINCT/WHERE optimization. What do you think?
Copyright (c) 2000, 2021, Oracle Corporation and/or its affiliates. All rights reserved.