All strings sent from the JDBC driver to the server are converted automatically from native Java Unicode form to the connection's character encoding, including all queries sent using Statement.execute(), Statement.executeUpdate(), and Statement.executeQuery(), as well as all PreparedStatement and CallableStatement parameters, excluding parameters set using the following methods:
- setBlob()
- setBytes()
- setClob()
- setNClob()
- setAsciiStream()
- setBinaryStream()
- setCharacterStream()
- setNCharacterStream()
- setUnicodeStream()
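To illustrate what this conversion means at the byte level, the following sketch encodes the same Java string under two character sets using only the standard library. It does not call Connector/J: the class EncodingDemo and the helper encodedLength are hypothetical names, and ISO-8859-1 merely stands in for a single-byte connection encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class (not part of Connector/J): shows that one Java
// string yields different byte sequences depending on the target encoding.
class EncodingDemo {

    // Returns the number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        // The accented letter takes two bytes in UTF-8 ...
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));
        // ... and one byte in a single-byte encoding such as ISO-8859-1.
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1));
    }
}
```

By contrast, a value passed through one of the excluded methods, such as setBytes(), is not converted by the driver, so the application would be responsible for any encoding of that data.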
Number of Encodings Per Connection
Connector/J supports a single character encoding between the client and the server, and any number of character encodings for data returned by the server to the client in ResultSets.
Setting the Character Encoding
For Connector/J 8.0.25 and earlier: The character encoding between the client and the server is automatically detected upon connection (provided that the Connector/J connection properties characterEncoding and connectionCollation are not set). The encoding on the server is specified using the system variable character_set_server (for more information, see Server Character Set and Collation), and the driver automatically uses the encoding. For example, to use the 4-byte UTF-8 character set with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding and connectionCollation out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting. To override the automatically detected encoding on the client side, use the characterEncoding property in the connection URL to the server.
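As a minimal sketch of the client-side override, the helper below assembles a connection URL that carries the characterEncoding property. The class EncodingUrl, the method withCharacterEncoding, and the host, port, and database values are placeholders chosen for illustration; only the jdbc:mysql:// URL scheme and the characterEncoding property name come from Connector/J.

```java
// Hypothetical helper (not part of Connector/J): builds a connection URL
// that overrides the detected encoding via the characterEncoding property.
class EncodingUrl {

    static String withCharacterEncoding(String host, int port,
                                        String database, String javaEncoding) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?characterEncoding=" + javaEncoding;
    }

    public static void main(String[] args) {
        // Host, port, and database name are placeholders.
        System.out.println(withCharacterEncoding("localhost", 3306, "test", "UTF-8"));
    }
}
```

The resulting string could then be passed to DriverManager.getConnection() together with the credentials. Note that the encoding is given by its Java-style name, not its MySQL name.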
For Connector/J 8.0.26 and later: There are two phases during the connection initialization in which the character encoding and collation are set.
- Pre-Authentication Phase: In this phase, the character encoding between the client and the server is determined by the settings of the Connector/J connection properties, in the following order of priority:
  - passwordCharacterEncoding
  - connectionCollation
  - characterEncoding
  - Set to UTF8 (corresponds to utf8mb4 on MySQL servers), if none of the properties above is set
- Post-Authentication Phase: In this phase, the character encoding between the client and the server for the rest of the session is determined by the settings of the Connector/J connection properties, in the following order of priority:
  - connectionCollation
  - characterEncoding
  - Set to UTF8 (corresponds to utf8mb4 on MySQL servers), if none of the properties above is set
This means Connector/J needs to issue a SET NAMES statement to change the character set and collation that were established in the pre-authentication phase only if passwordCharacterEncoding is set, but its setting is different from that of connectionCollation, or different from that of characterEncoding (when connectionCollation is not set), or different from utf8mb4 (when both connectionCollation and characterEncoding are not set).
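The two-phase rule can be sketched as a small resolution function. This is an illustration only, not Connector/J's implementation. It assumes the pre-authentication order passwordCharacterEncoding, connectionCollation, characterEncoding and the post-authentication order connectionCollation, characterEncoding, each falling back to utf8mb4. It also assumes that all inputs have already been translated to MySQL character set names (null meaning "property not set"), and it compares character sets only, ignoring collation differences. The class and method names are hypothetical.

```java
// Sketch of the priority rules described above; NOT Connector/J's code.
// Every argument is a MySQL character set name, or null when the
// corresponding connection property is not set. collationCharset is the
// character set that the connectionCollation value belongs to.
class EncodingPhases {

    static final String DEFAULT_CHARSET = "utf8mb4";

    // Returns the first non-null candidate, or the default if all are null.
    private static String firstNonNull(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null) {
                return candidate;
            }
        }
        return DEFAULT_CHARSET;
    }

    // Character set used while authenticating.
    static String preAuth(String passwordCharset, String collationCharset,
                          String characterEncoding) {
        return firstNonNull(passwordCharset, collationCharset, characterEncoding);
    }

    // Character set used for the rest of the session.
    static String postAuth(String collationCharset, String characterEncoding) {
        return firstNonNull(collationCharset, characterEncoding);
    }

    // A SET NAMES round trip is needed only when the two phases disagree.
    static boolean needsSetNames(String passwordCharset, String collationCharset,
                                 String characterEncoding) {
        return !preAuth(passwordCharset, collationCharset, characterEncoding)
                .equals(postAuth(collationCharset, characterEncoding));
    }

    public static void main(String[] args) {
        System.out.println(needsSetNames("latin1", null, null));  // phases differ
        System.out.println(needsSetNames(null, "latin1", null));  // phases agree
    }
}
```

When passwordCharacterEncoding is not set, both phases resolve to the same value in this sketch, which matches the statement that no SET NAMES is required in that case.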
Custom Character Sets and Collations
To support the use of custom character sets and collations on the server, set the Connector/J connection property detectCustomCollations to true, and provide the mapping between the custom character sets and the Java character encodings by supplying the customCharsetMapping connection property with a comma-delimited list of custom_charset:java_encoding pairs (for example: customCharsetMapping=charset1:UTF-8,charset2:Cp1252).
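To make the expected shape of the customCharsetMapping value concrete, the sketch below splits such a value into its custom_charset:java_encoding pairs. It is a hypothetical helper for validating a value before placing it in a connection string, not the parser that Connector/J itself uses; the class and method names are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical validator for a customCharsetMapping value; it does not
// reproduce Connector/J's own parsing.
class CustomCharsetMappingSketch {

    // Splits "cs1:Enc1,cs2:Enc2" into an ordered map of cs -> Enc.
    static Map<String, String> parse(String value) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String pair : value.split(",")) {
            String[] parts = pair.trim().split(":", 2);
            if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
                throw new IllegalArgumentException(
                        "Expected custom_charset:java_encoding, got: " + pair);
            }
            mapping.put(parts[0], parts[1]);
        }
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(parse("charset1:UTF-8,charset2:Cp1252"));
    }
}
```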
MySQL to Java Encoding Name Translations
Use Java-style names when specifying character encodings. The following table lists MySQL character set names and their corresponding Java-style names:
Table 6.21 MySQL to Java Encoding Name Translations
MySQL Character Set Name | Java-Style Character Encoding Name |
---|---|
ascii | US-ASCII |
big5 | Big5 |
gbk | GBK |
sjis | SJIS or Cp932 |
cp932 | Cp932 or MS932 |
gb2312 | EUC_CN |
ujis | EUC_JP |
euckr | EUC_KR |
latin1 | Cp1252 |
latin2 | ISO8859_2 |
greek | ISO8859_7 |
hebrew | ISO8859_8 |
cp866 | Cp866 |
tis620 | TIS620 |
cp1250 | Cp1250 |
cp1251 | Cp1251 |
cp1257 | Cp1257 |
macroman | MacRoman |
macce | MacCentralEurope |
utf8mb4 | UTF-8 |
ucs2 | UnicodeBig |
- When UTF-8 is used for characterEncoding in the connection string, it maps to the MySQL character set name utf8mb4.
- If the connection option connectionCollation is also set alongside characterEncoding and is incompatible with it, characterEncoding will be overridden with the encoding corresponding to connectionCollation.
- Because there is no Java-style character set name for utf8mb3 that you can use with the connection option characterEncoding, the only way to use utf8mb3 as your connection character set is to use a utf8mb3 collation (for example, utf8_general_ci) for the connection option connectionCollation, which forces a utf8mb3 character set to be used, as explained in the previous bullet.
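One way to request utf8mb3 along these lines is to set connectionCollation and leave characterEncoding unset. The sketch below prepares such a set of connection properties. The class name, method name, and credential values are placeholders; connectionCollation and the collation utf8_general_ci are taken from the text above.

```java
import java.util.Properties;

// Hypothetical helper: prepares connection properties that select a
// utf8mb3 collation and deliberately leave characterEncoding unset.
class Utf8mb3ConnectionProps {

    static Properties build(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // A utf8mb3 collation, which in turn selects the utf8mb3 character set.
        props.setProperty("connectionCollation", "utf8_general_ci");
        return props;
    }

    public static void main(String[] args) {
        // Credentials are placeholders.
        Properties props = build("app_user", "app_password");
        System.out.println(props.getProperty("connectionCollation"));
    }
}
```

Such a Properties object could be handed to DriverManager.getConnection(url, props) in place of URL parameters.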
Do not issue the query SET NAMES with Connector/J, as the driver will not detect that the character set has been changed by the query, and will continue to use the character set configured when the connection was first set up.