Documentation Home
MySQL Globalization
Related Documentation Download this Excerpt
PDF (US Ltr) - 479.0Kb
PDF (A4) - 477.7Kb


MySQL Globalization  /  ...  /  Using Collation in INFORMATION_SCHEMA Searches

1.8.7 Using Collation in INFORMATION_SCHEMA Searches

String columns in INFORMATION_SCHEMA tables have a collation of utf8_general_ci, which is case-insensitive. However, for values that correspond to objects that are represented in the file system, such as databases and tables, searches in INFORMATION_SCHEMA string columns can be case-sensitive or case-insensitive, depending on the characteristics of the underlying file system and the value of the lower_case_table_names system variable. For example, searches may be case-sensitive if the file system is case-sensitive. This section describes this behavior and how to modify it if necessary; see also Bug #34921.

Suppose that a query searches the SCHEMATA.SCHEMA_NAME column for the test database. On Linux, file systems are case-sensitive, so comparisons of SCHEMATA.SCHEMA_NAME with 'test' match, but comparisons with 'TEST' do not:

mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'test';
+-------------+
| SCHEMA_NAME |
+-------------+
| test        |
+-------------+
mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'TEST';
Empty set (0.00 sec)

These results occur with the lower_case_table_names system variable set to 0. Changing the value of lower_case_table_names to 1 or 2 causes the second query to return the same (nonempty) result as the first query.

On Windows or macOS, file systems are not case-sensitive, so comparisons match both 'test' and 'TEST':

mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'test';
+-------------+
| SCHEMA_NAME |
+-------------+
| test        |
+-------------+
mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'TEST';
+-------------+
| SCHEMA_NAME |
+-------------+
| TEST        |
+-------------+

The value of lower_case_table_names makes no difference in this context.

The preceding behavior occurs because the utf8_general_ci collation is not used for INFORMATION_SCHEMA queries when searching for values that correspond to objects represented in the file system. It is a result of file system-scanning optimizations implemented for INFORMATION_SCHEMA searches. For information about these optimizations, see Optimizing INFORMATION_SCHEMA Queries.

If the result of a string operation on an INFORMATION_SCHEMA column differs from expectations, a workaround is to use an explicit COLLATE clause to force a suitable collation (see Section 1.8.1, “Using COLLATE in SQL Statements”). For example, to perform a case-insensitive search, use COLLATE with the INFORMATION_SCHEMA column name:

mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME COLLATE utf8_general_ci = 'test';
+-------------+
| SCHEMA_NAME |
+-------------+
| test        |
+-------------+
mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME COLLATE utf8_general_ci = 'TEST';
+-------------+
| SCHEMA_NAME |
+-------------+
| test        |
+-------------+

In the preceding queries, it is important to apply the COLLATE clause to the INFORMATION_SCHEMA column name. Applying COLLATE to the comparison value has no effect.

You can also use the UPPER() or LOWER() function:

WHERE UPPER(SCHEMA_NAME) = 'TEST'
WHERE LOWER(SCHEMA_NAME) = 'test'

Although a case-insensitive comparison can be performed even on platforms with case-sensitive file systems, as just shown, it is not necessarily always the right thing to do. On such platforms, it is possible to have multiple objects with names that differ only in lettercase. For example, tables named city, CITY, and City can all exist simultaneously. Consider whether a search should match all such names or just one and write queries accordingly. The first of the following comparisons (with utf8_bin) is case-sensitive; the others are not:

WHERE TABLE_NAME COLLATE utf8_bin = 'City'
WHERE TABLE_NAME COLLATE utf8_general_ci = 'city'
WHERE UPPER(TABLE_NAME) = 'CITY'
WHERE LOWER(TABLE_NAME) = 'city'

Searches in INFORMATION_SCHEMA string columns for values that refer to INFORMATION_SCHEMA itself do use the utf8_general_ci collation because INFORMATION_SCHEMA is a virtual database not represented in the file system. For example, comparisons with SCHEMATA.SCHEMA_NAME match 'information_schema' or 'INFORMATION_SCHEMA' regardless of platform:

mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'information_schema';
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| information_schema |
+--------------------+
mysql> SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA
       WHERE SCHEMA_NAME = 'INFORMATION_SCHEMA';
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| information_schema |
+--------------------+