WL#2048: Add function for Unicode normalization
Affects: Server-7.1 — Status: Un-Assigned — Priority: Medium
To compare Unicode strings safely and efficiently, they must first be normalized so that "equivalent text (canonical or compatibility) will have identical binary representations". Unicode Standard Annex #15 (http://www.unicode.org/reports/tr15/) describes the four Unicode normalization forms.

I suggest adding a function NORMALIZE(string, form) that normalizes a Unicode string (UTF-8 or UCS2) to the specified form.

For comparison with other products: Sybase ASE has a setting to normalize its Unicode data types, but does not expose normalization via a function. In Mimer, Unicode data is automatically transformed to NFC.

The SQL standard defines a normalization function. An excerpt from sql2009-nov:

<normalize function> ::=
    NORMALIZE <left paren> <character value expression>
    [ <comma> <normal form> [ <comma> <normalize function result length> ] ]
    <right paren>

<normal form> ::= NFC | NFD | NFKC | NFKD

We will also need the NORMALIZED predicate, to check whether a string is already in a given normal form:

<normalized predicate> ::= <row value predicand> <normalized predicate part 2>

<normalized predicate part 2> ::= IS [ NOT ] [ <normal form> ] NORMALIZED
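
To illustrate the intended semantics of NORMALIZE and the NORMALIZED predicate, here is a minimal sketch in Python using the standard library's unicodedata module. It is not a proposed server implementation. The helper names normalize and is_normalized are hypothetical, and defaulting to NFC when no form is given is an assumption of this sketch.

```python
import unicodedata

# The four normalization forms described in Unicode Standard Annex #15.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize(text, form="NFC"):
    """Hypothetical analogue of NORMALIZE(string, form)."""
    if form not in FORMS:
        raise ValueError("unknown normal form: " + form)
    return unicodedata.normalize(form, text)


def is_normalized(text, form="NFC"):
    """Hypothetical analogue of: text IS [form] NORMALIZED."""
    # A string is in a given form if normalizing it changes nothing.
    return normalize(text, form) == text


# Two canonically equivalent spellings of "e with acute accent".
composed = "\u00e9"      # single precomposed code point
decomposed = "e\u0301"   # letter e followed by a combining acute accent

# Without normalization, the binary (UTF-8) representations differ.
assert composed.encode("utf-8") != decomposed.encode("utf-8")

# After normalizing both to the same form, they compare equal.
assert normalize(composed, "NFC") == normalize(decomposed, "NFC")
assert normalize(composed, "NFD") == normalize(decomposed, "NFD")

# Compatibility forms also fold compatibility characters such as
# the "fi" ligature (U+FB01); canonical forms leave it untouched.
assert normalize("\ufb01", "NFKC") == "fi"
assert normalize("\ufb01", "NFC") == "\ufb01"
```

The sketch shows why a binary comparison is only safe once both operands are in the same normal form, and why the choice between canonical (NFC, NFD) and compatibility (NFKC, NFKD) forms affects which strings are treated as equal.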
Copyright (c) 2000, 2015, Oracle Corporation and/or its affiliates. All rights reserved.