WL#2598: Make field->max_length work better with unicode
Affects: Benchmarks-3.0
—
Status: Un-Assigned
In the MySQL C API, you can access field->max_length to get the maximum length for a particular column in a result set. The problem is that with Unicode strings there are really three kinds of length:

1) the length in bytes
2) the length in characters
3) the display width

(3) differs from (2) because some characters are zero-width and some are double-width.

Existing code often uses max_length for purpose (3). For example, the mysql command-line client reads field->max_length to decide how wide each column should be when it formats the pretty tables in its output.

If we ever want the mysql command-line client to work well with Unicode strings, then we have to either

(A) rearchitect it (mysql_store_result() vs. mysql_use_result()) and do some inefficient looping over the result set, or
(B) somehow provide support in the client protocol and in the C API to get the "display width" analogue of max_length.

I am not sure whether A or B is the better solution, but here I am proposing B.

[notes from mark]

The JDBC driver already handles this in one way: when someone asks for the display length, it issues a 'SHOW CHARACTER SET' for the charset of the field in question, then caches the value connection-wide so that it can calculate length-in-chars. This works okay for everything but utf-8, which is variable-length. I think one would _have_ to scan the result set to get length-in-chars for utf-8. This works okay for the JDBC driver, as the display-length-in-chars method isn't called very often.
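To make the three lengths concrete, here is a minimal sketch in C that measures a single UTF-8 value in bytes, characters, and display columns. Nothing in it is part of the MySQL C API: struct str_lengths, decode_utf8(), toy_display_width(), and measure_utf8() are hypothetical names, and the width table is a deliberately tiny assumption (a few zero-width and double-width ranges), not a complete Unicode width implementation. The decoder also skips checks such as overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three lengths of one UTF-8 encoded value (hypothetical helper type). */
struct str_lengths {
    size_t bytes;   /* (1) length in bytes */
    size_t chars;   /* (2) length in characters (code points) */
    size_t columns; /* (3) display width in terminal columns */
};

/* Decode one code point from s, where n bytes remain (n >= 1).
 * Returns the number of bytes consumed. Malformed or truncated
 * sequences consume one byte and yield U+FFFD. */
static size_t decode_utf8(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; *cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; *cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; *cp = s[0] & 0x07; }
    else { *cp = 0xFFFD; return 1; }

    if (len > n) { *cp = 0xFFFD; return 1; }
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        *cp = (*cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    return len;
}

/* ASSUMPTION: a toy width table covering only a few well-known ranges.
 * A real client would need full Unicode width data. */
static int toy_display_width(uint32_t cp)
{
    if ((cp >= 0x0300 && cp <= 0x036F) || /* combining diacritical marks */
        cp == 0x200B)                     /* zero width space */
        return 0;
    if ((cp >= 0x4E00 && cp <= 0x9FFF) || /* CJK unified ideographs */
        (cp >= 0xAC00 && cp <= 0xD7A3) || /* Hangul syllables */
        (cp >= 0xFF01 && cp <= 0xFF60))   /* fullwidth forms */
        return 2;
    return 1;
}

/* Measure all three lengths of a UTF-8 value of nbytes bytes. */
static struct str_lengths measure_utf8(const char *s, size_t nbytes)
{
    struct str_lengths r = { nbytes, 0, 0 };
    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0;

    assert(s != NULL);
    while (i < nbytes) {
        uint32_t cp;
        i += decode_utf8(p + i, nbytes - i, &cp);
        r.chars++;
        r.columns += (size_t)toy_display_width(cp);
    }
    return r;
}
```

Under approach (A), a client would presumably call something like measure_utf8() on every value of every row after mysql_store_result(), using the byte lengths from mysql_fetch_lengths(), and keep the per-column maximum of columns. That per-row loop is the inefficiency mentioned above; approach (B) would move the equivalent work behind the protocol and the C API.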
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.