Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  Configuring the Character Set and Collation for Applications

10.1.5 Configuring the Character Set and Collation for Applications

For applications that store data using the default MySQL character set and collation (latin1, latin1_swedish_ci), no special configuration should be needed. If applications require data storage using a different character set or collation, you can configure character set information several ways:

  • Specify character settings per database. For example, applications that use one database might require utf8, whereas applications that use another database might require sjis.

  • Specify character settings at server startup. This causes the server to use the given settings for all applications that do not make other arrangements.

  • Specify character settings at configuration time, if you build MySQL from source. This causes the server to use the given settings for all applications, without having to specify them at server startup.

When different applications require different character settings, the per-database technique provides a good deal of flexibility. If most or all applications use the same character set, specifying character settings at server startup or configuration time may be most convenient.

For the per-database or server-startup techniques, the settings control the character set for data storage. Applications must also tell the server which character set to use for client/server communications, as described in the following instructions.

The examples shown here assume use of the utf8 character set and utf8_general_ci collation.

Specify character settings per database. To create a database such that its tables will use a given default character set and collation for data storage, use a CREATE DATABASE statement like this:

  DEFAULT COLLATE utf8_general_ci;

Tables created in the database will use utf8 and utf8_general_ci by default for any character columns.

Applications that use the database should also configure their connection to the server each time they connect. This can be done by executing a SET NAMES 'utf8' statement after connecting. The statement can be used regardless of connection method: The mysql client, PHP scripts, and so forth.

In some cases, it may be possible to configure the connection to use the desired character set some other way. For example, for connections made using mysql, you can specify the --default-character-set=utf8 command-line option to achieve the same effect as SET NAMES 'utf8'.

For more information about configuring client connections, see Section 10.1.4, “Connection Character Sets and Collations”.

If you change the default character set or collation for a database, stored routines that use the database defaults must be dropped and recreated so that they use the new defaults. (In a stored routine, variables with character data types use the database defaults if the character set or collation are not specified explicitly. See Section 13.1.15, “CREATE PROCEDURE and CREATE FUNCTION Syntax”.)

Specify character settings at server startup. To select a character set and collation at server startup, use the --character-set-server and --collation-server options. For example, to specify the options in an option file, include these lines:


These settings apply server-wide and apply as the defaults for databases created by any application, and for tables created in those databases.

It is still necessary for applications to configure their connection using SET NAMES or equivalent after they connect, as described previously. You might be tempted to start the server with the --init_connect="SET NAMES 'utf8'" option to cause SET NAMES to be executed automatically for each client that connects. However, this will yield inconsistent results because the init_connect value is not executed for users who have the SUPER privilege.

Specify character settings at MySQL configuration time. To select a character set and collation when you configure and build MySQL from source, use the DEFAULT_CHARSET and DEFAULT_COLLATION options for CMake:

shell> cmake . -DDEFAULT_CHARSET=utf8 \

The resulting server uses utf8 and utf8_general_ci as the default for databases and tables and for client connections. It is unnecessary to use --character-set-server and --collation-server to specify those defaults at server startup. It is also unnecessary for applications to configure their connection using SET NAMES or equivalent after they connect to the server.

Regardless of how you configure the MySQL character set for application use, you must also consider the environment within which those applications execute. If you will send statements using UTF-8 text taken from a file that you create in an editor, you should edit the file with the locale of your environment set to UTF-8 so that the file encoding is correct and so that the operating system handles it correctly. If you use the mysql client from within a terminal window, the window must be configured to use UTF-8 or characters may not display properly. For a script that executes in a Web environment, the script must handle character encoding properly for its interaction with the MySQL server, and it must generate pages that correctly indicate the encoding so that browsers know how to display the content of the pages. For example, you can include this <meta> tag within your <head> element:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Download this Manual
User Comments
  Posted by Ilya Cheburaev on April 22, 2009
If you use PHP with mysql you may want to use
in my.ini:



it will allow you to use right charset without sending
'set NAMES'
with each connection to DB so it will speed up your application
  Posted by time e.less on October 14, 2009
Please add the information about skip-character-set-client-handshake into every page that references server character set and collation.

I'm amazed this startup option is hidden so deep in the documentation, since it seems like many administrators who specify alternate character sets on startup would want the clients to use the server character sets when possible.

  Posted by Gábor Lénárt on October 1, 2010
Wow, skip-character-set-client-handshake saved my day! Old MySQL4.1 setup users had the habit to do "odd" things (having latin2 table but storing utf8 in it, not specifying charset encoding from PHP on the connection etc). I had long hours to solve the problem, since they wanted it to work without extra effort "since it worked before this way" ... Thanks a lot! I think this option should be "advertised" more :) even if it's not the best solution anyway in a clean/new system ...
Sign Up Login You must be logged in to post a comment.