One of the more frequently asked questions with MySQL 5.7 is “How can I upgrade my JSON data from using TEXT in an earlier version of MySQL to use the native JSON data type?”. Today I wanted to show an example of how to do so, using sample data from SF OpenData.
Step 1: Preflight Checks
Since the JSON data type converts data to a native format for storage, it requires that all data inserted into it be valid. This check can be done before running the ALTER TABLE command by using the function JSON_VALID():
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
mysql> SHOW CREATE TABLE features\G *************************** 1. row *************************** Table: features Create Table: CREATE TABLE `features` ( `id` int(11) NOT NULL AUTO_INCREMENT, `feature` longtext NOT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=262141 DEFAULT CHARSET=latin1 1 row in set (0.01 sec) mysql> SELECT * FROM features WHERE JSON_VALID(feature)=0\G *************************** 1. row *************************** id: 43253 feature: {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[-122.45004426230204, 37.77462573231635, 0], [-122.45005814149769, 37.77469350183866, 0], [-122.44969515193891, 37.774739669796645, 0], [-122.44968127306583, 37.774671900232214, 0], [-122.45004426230204, 37.77462573231635, 0]]]}, "properties": {"TO_ST": "51", "BLKLOT": "1189002B", "STREET": "CLAYTON", "FROM_ST": "49", "LOT_NUM": "002B", "ST_TYPE": "ST", "ODD_EVEN": "O", "BLOCK_NUM": "1189", "MAPB 1 row in set (1.42 sec) |
Searching by JSON_VALID(feature)=0 will return all invalid JSON documents, and I have intentionally corrupted one row to show an example. I will be required to manually fix this before changing the data type to JSON:
1
2
3
4
5
|
mysql> UPDATE features SET feature = '{"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [[[-122.45004426230204, 37.77462573231635, 0], [-122.45005814149769, 37.77469350183866, 0], [-122.44969515193891, 37.774739669796645, 0], [-122.44968127306583, 37.774671900232214, 0], [-122.45004426230204, 37.77462573231635, 0]]]}, "properties": {"TO_ST": "51", "BLKLOT": "1189002B", "STREET": "CLAYTON", "FROM_ST": "49", "LOT_NUM": "002B", "ST_TYPE": "ST", "ODD_EVEN": "O", "BLOCK_NUM": "1189", "MAPBLKLOT": "1189002B"}}' WHERE id = 43253; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> SELECT * FROM features WHERE JSON_VALID(feature)=0\G Empty set (1.36 sec) |
Step 2: ALTER TABLE
Provided the data is all valid, it is now time to change the column definition:
1
2
3
|
mysql> ALTER TABLE features CHANGE feature feature JSON NOT NULL; Query OK, 206560 rows affected (10.84 sec) Records: 206560 Duplicates: 0 Warnings: 0 |
As expected, whitespace information is lost in the conversion to a native type. I should note however that since JSON does not support comments, no other information will be lost*
(* Steven Roussey brings up a good point in the comments, that character set may change).
The ALTER TABLE operation here will need to rebuild the primary key internally, and change data type is one of the few operations currently not covered by Online DDL. For large tables you may consider performing this operation first on a slave, or emulating online ddl via triggers.
But, that’s it! You are now using the native JSON data type 🙂