To convert a binary or nonbinary string column to use a particular
character set, use ALTER TABLE
. For
successful conversion to occur, one of the following conditions
must apply:
If the column has a binary data type (
BINARY
,VARBINARY
,BLOB
), all the values that it contains must be encoded using a single character set (the character set you're converting the column to). If you use a binary column to store information in multiple character sets, MySQL has no way to know which values use which character set and cannot convert the data properly.If the column has a nonbinary data type (
CHAR
,VARCHAR
,TEXT
), its contents should be encoded in the column character set, not some other character set. If the contents are encoded in a different character set, you can convert the column to use a binary data type first, and then to a nonbinary column with the desired character set.
Suppose that a table t
has a binary column
named col1
defined as
VARBINARY(50)
. Assuming that the information in
the column is encoded using a single character set, you can
convert it to a nonbinary column that has that character set. For
example, if col1
contains binary data
representing characters in the greek
character
set, you can convert it as follows:
ALTER TABLE t MODIFY col1 VARCHAR(50) CHARACTER SET greek;
If your original column has a type of
BINARY(50)
, you could convert it to
CHAR(50)
, but the resulting values will be
padded with 0x00
bytes at the end, which may be
undesirable. To remove these bytes, use the
TRIM()
function:
UPDATE t SET col1 = TRIM(TRAILING 0x00 FROM col1);
Suppose that table t
has a nonbinary column
named col1
defined as CHAR(50)
CHARACTER SET latin1
but you want to convert it to use
utf8
so that you can store values from many
languages. The following statement accomplishes this:
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET utf8;
Conversion may be lossy if the column contains characters that are not in both character sets.
A special case occurs if you have old tables from before MySQL 4.1
where a nonbinary column contains values that actually are encoded
in a character set different from the server's default character
set. For example, an application might have stored
sjis
values in a column, even though MySQL's
default character set was latin1
. It is
possible to convert the column to use the proper character set but
an additional step is required. Suppose that the server's default
character set was latin1
and
col1
is defined as CHAR(50)
but its contents are sjis
values. The first
step is to convert the column to a binary data type, which removes
the existing character set information without performing any
character conversion:
ALTER TABLE t MODIFY col1 BLOB;
The next step is to convert the column to a nonbinary data type with the proper character set:
ALTER TABLE t MODIFY col1 CHAR(50) CHARACTER SET sjis;
This procedure requires that the table not have been modified
already with statements such as
INSERT
or
UPDATE
after an upgrade to MySQL
4.1 or later. In that case, MySQL would store new values in the
column using latin1
, and the column will
contain a mix of sjis
and
latin1
values and cannot be converted properly.
If you specified attributes when creating a column initially, you
should also specify them when altering the table with
ALTER TABLE
. For example, if you
specified NOT NULL
and an explicit
DEFAULT
value, you should also provide them in
the ALTER TABLE
statement.
Otherwise, the resulting column definition will not include those
attributes.
To convert all character columns in a table, the ALTER
TABLE ... CONVERT TO CHARACTER SET
statement may be
useful. See Section 13.1.8, “ALTER TABLE Syntax”.
charset
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1251<? //remember to change it if needed ?>" />
<xmp>
</xmp>
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 VARCHAR(100) CHARACTER SET utf8;
like said on page:
http://dev.mysql.com/doc/mysql/en/alter-table.html
The FULLTEXT limitation can be avoided with:
ALTER TABLE `file_section` DISABLE/ENABLE KEYS between every table changes.
You can change all databases' charset or only selected. You can select databases to skip. You can also just print queries and execute them via PHPMyAdmin for example.
This applies to tables which have rows with different character sets. While the above statement is probably true for the sjis and latin1 character sets, it turns out that if you have a mix of latin1 and utf-8 in a single table there's probably a "clean" way to fix this. We had to do this recently for a new customer of ours, and since the process was somewhat of a pain to come up with, I wrote about it here (in the hope that I can save someone else out there some time): http://www.blueboxgrp.com/news/2009/07/mysql_encoding
In my exemple, I ALTER to UTF8 CHARSET. TO RUN SCRIPT:
./altera_colation.sh NAME_DB
----- HERE ----------
#!/bin/bash
USER='admin';
PASS='admin';
BANCO=$1;
QUERY="SELECT table_name FROM information_schema.TABLES WHERE table_schema = '$BANCO';";
TABELAS=$(mysql -u $USER --password=$PASS $BANCO --execute="$QUERY" | sed 's/|//g' | tail -n +2)
var=0;
for tables in $TABELAS; do
var=`expr $var + 1`
echo "ALTER TABLE $tables ......"
mysql -u $USER --password=$PASS $BANCO -e "ALTER TABLE $tables CONVERT TO CHARSET utf8 COLLATE utf8_unicode_ci"
done
------------ END ------------