WL#4616: Implement UTF-16LE

Affects: Server-5.6 — Status: Complete

Description
High Level Architecture

As of version 5.5, the only UTF-16 character set MySQL supports is utf16
(i.e. big-endian). This task adds UTF-16LE (i.e. little-endian).
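
For illustration, the difference between the two byte orders can be
sketched in a few lines of Python. This uses Python's built-in
"utf-16-be" and "utf-16-le" codecs as stand-ins for the MySQL utf16 and
utf16le character sets; it is not MySQL code, and the sample string is
an arbitrary choice.

```python
# Illustration only: Python codecs stand in for the MySQL character sets.
# 'A', EURO SIGN (U+20AC), and a supplementary character (U+10400).
text = "A\u20ac\U00010400"

be = text.encode("utf-16-be")  # byte order of the existing utf16 charset
le = text.encode("utf-16-le")  # byte order of the new utf16le charset

print(be.hex(" ", 2))  # 0041 20ac d801 dc00
print(le.hex(" ", 2))  # 4100 ac20 01d8 00dc
```

Each 16-bit code unit, including both halves of the surrogate pair, has
its two bytes swapped; the order of the code units themselves does not
change.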

Rationale
=========
- We need UTF-16LE as a prerequisite for
  "WL#4616 Support Unicode for Windows command line client",
  as the Windows console API functions all use UTF-16LE.

- Sun Globalization rules require us to support UTF-16LE.

- UTF-16LE can be useful for other Unicode applications,
  especially on Windows.

Jeff Balint (connectors) wrote:
>We are currently doing character set conversions in the driver
>(character_set_results=null). We convert it to UTF8, and have our own
>code to convert it to UTF16. If UTF-16LE support is added to the server
>and driver, it should improve performance for apps using many/large
>unicode strings.

Johannes Schlüter wrote:
>PHP 6, which will, probably, be released in 1.5 - 2 years, will use
>Utf-16 internally. In the current development tree of the PHP connectors
>we're converting to UTF-8. So from that side it'd be nice to save that
>conversion.
>...
>PHP 6's Unicode implementation is based on IBM's ICU library which uses
>system dependent endianness. So being able to pass through Utf-16 BE and
>LE would, again, save the conversion on our side.

Peter Laursen (Basic Quality Contributor) writes in BUG#52494:
> Server side support for UTF16 (LE) would be
> very nice and would solve quite a lot of problems/issues for Windows
> users working in a multi-lingual environment.

Character set name
==================
The MySQL character set name will be utf16le.


Built-in collation names
========================
As of WL#4616, utf16le will be used mostly for conversion purposes.
We will NOT implement the full set of language-specific collations for
utf16le. We can add all UCA-based utf16le collations later,
when InnoDB supports 2-byte collation IDs.

WL#4616 will add two built-in collations:
- utf16le_general_ci, the default collation, case insensitive (similar to
  utf16_general_ci)
- utf16le_bin, a case-sensitive collation that compares strings
  code point by code point

Note: in MySQL-5.5.8 we reverted the part of the patch for
BUG#55980 "Character sets: supplementary character _bin ordering is wrong"
that made utf16_bin sort in byte-by-byte order.
utf16_bin now implements code point order, so utf16le_bin and utf16_bin
will produce exactly the same character order as each other, and as
utf32_bin/utf8mb4_bin.
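
The difference between code point order and byte-by-byte order only
shows up when supplementary characters are involved. The Python sketch
below is an illustration under stated assumptions, not MySQL code:
sorting by ord() stands in for a code point ordered _bin collation,
sorting by the encoded bytes stands in for a byte-by-byte comparison,
and the three sample characters are arbitrary.

```python
# Illustration only: Python sorting stands in for collation behaviour.
# A supplementary character, a BMP character above the surrogate range,
# and an ASCII letter.
chars = ["\U00010000", "\uff5e", "A"]

def show(seq):
    """Render characters as U+XXXX labels for readability."""
    return ["U+%04X" % ord(c) for c in seq]

# Code point order: the order described above for the _bin collations.
by_code_point = sorted(chars, key=ord)

# Byte-by-byte order over each UTF-16 encoding, for comparison.
by_be_bytes = sorted(chars, key=lambda c: c.encode("utf-16-be"))
by_le_bytes = sorted(chars, key=lambda c: c.encode("utf-16-le"))

print(show(by_code_point))  # ['U+0041', 'U+FF5E', 'U+10000']
print(show(by_be_bytes))    # ['U+0041', 'U+10000', 'U+FF5E']
print(show(by_le_bytes))    # ['U+10000', 'U+0041', 'U+FF5E']
```

Comparing raw bytes gives a different order for each byte order, and
neither matches code point order, because surrogate code units
(D800..DFFF) sort below BMP characters such as U+FF5E. Comparing by
code point keeps the result independent of the encoding.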


User-defined collations
=======================
We will not add utf16le_unicode_ci at this point.
That means adding user-defined collations through the character
set definition file Index.xml will not be possible for utf16le.
We'll add utf16le_unicode_ci (together with the ability to
define user-defined collations) later, when InnoDB supports 2-byte
collation IDs.