Correctness of data comes in different forms. One is referential integrity, also known as foreign keys. Another is CHECK constraints. While MySQL supports foreign keys, it doesn’t support CHECK constraints. Today we will look at three ways to emulate them:
- triggers
- views
- generated columns (new in MySQL 5.7)
This is also relevant to another SQL feature: DOMAIN, which, in short, is a user-defined type composed of a base type (INT, CHAR, …), a default value, and a constraint on acceptable values – the latter being some simple sort of CHECK constraint. If we can emulate CHECK, we can have some functionality of DOMAIN.
Let’s take a table of cars, with a column supposed to contain registration numbers of cars in France. Those numbers have a defined syntax: two letters, a dash, three digits, a dash, two letters. Of course, my example could also apply to storing UUIDs, telephone numbers, IP addresses, zip codes… anything with a defined syntax.
1 2 |
create table cars (number char(9)); insert into cars values('AS-229-ZT'); # Valid |
What we want to do is to make MySQL prevent invalid data like:
1 2 |
mysql> insert into cars values('AS-2X9-ZT'); # "X" in digit zone! Query OK, 1 row affected (0,01 sec) # Ouch! Should not be accepted! |
Using a trigger
1 2 3 4 5 6 7 8 9 10 |
delimiter $ create trigger cars_number_validate before insert on cars for each row begin if new.number not rlike '^[[:alpha:]]{2}-[[:digit:]]{3}-[[:alpha:]]{2}$' then signal sqlstate '45000' set message_text = 'bad number'; end if; end$ delimiter ; |
SIGNAL sends an error if the to-be-inserted number doesn’t match the regular expression of the required syntax. Results:
1 2 3 4 5 |
mysql> delete from cars; mysql> insert into cars values('AS-2X9-ZT'); # "X" in digit zone! ERROR 1644 (45000): bad number mysql> insert into cars values('AS-229-ZT'); # Valid Query OK, 1 row affected (0,01 sec) |
Very nice. However, while it prevents inserting an invalid value, it doesn’t prevent updating a valid value to an invalid one:
1 2 3 |
mysql> update cars set number='AS-2X9-ZT'; Query OK, 1 rows affected (0,00 sec) Rows matched: 1 Changed: 1 Warnings: 0 |
To fix that, we need an additional trigger, a BEFORE UPDATE one:
1 2 3 4 5 6 7 8 9 10 |
delimiter $ create trigger cars_number_validate_2 before update on cars for each row begin if new.number not rlike '^[[:alpha:]]{2}-[[:digit:]]{3}-[[:alpha:]]{2}$' then signal sqlstate '45000' set message_text = 'bad number'; end if; end$ delimiter ; |
And now the UPDATE properly fails:
1 2 |
mysql> update cars set number='AS-2X9-ZT'; ERROR 1644 (45000): bad number |
The trigger-based solution certainly works well. The trigger could even do more complex checks (like, checking that the number is related to something in another table, by using a subquery). However, to create the triggers, we had to:
- write 10 lines of procedural code, two times
- have the TRIGGER privilege.
Using a view WITH CHECK OPTION
Let’s drop the triggers, empty the table and look at a different solution. We create a view cars_checked over the cars table; this view has a WHERE clause to show only valid rows. It has WITH CHECK OPTION so any insert/update done against the view will be rejected if it would cause the new row to not appear in the view, i.e. if it would cause the new row to be invalid.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
mysql> drop trigger cars_number_validate_2; mysql> drop trigger cars_number_validate; mysql> delete from cars; mysql> create view cars_checked as select * from cars where number rlike '^[[:alpha:]]{2}-[[:digit:]]{3}-[[:alpha:]]{2}$' with check option; mysql> insert into cars_checked values('AS-2X9-ZT'); # "X" in digit zone! ERROR 1369 (HY000): CHECK OPTION failed 'test.cars_checked' mysql> insert into cars_checked values('AS-229-ZT'); # Valid Query OK, 1 row affected (0,00 sec) mysql> update cars_checked set number='AS-2X9-ZT'; ERROR 1369 (HY000): CHECK OPTION failed 'test.cars_checked' |
In a real setup, we would revoke write access to cars from users, as such access allows them to bypass our check; we would rather give them write access to cars_checked only.
Like triggers, this solution works perfectly and is versatile (it is possible to use a subquery in the WHERE clause of the view). But it requires:
- having the CREATE VIEW privilege
- revoking privileges from users and granting them other privileges
- possibly modifying our applications to make them target cars_checked instead of cars.
Edit: instead of naming the view cars_checked, it’s also possible to name it cars, after renaming the base table cars to cars_data. Then there is no need to change privileges, nor to change existing DMLs. However there is need to change some administrative DDLs which do not work against views. Thanks to Jörg Brühe for suggesting this; he provides more details in the thoughts section further down.
Using a generated column
Let’s try a new method, available since MySQL 5.7:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
mysql> drop view cars_checked; mysql> alter table cars add column number_validate char(0) as (case when number rlike '^[[:alpha:]]{2}-[[:digit:]]{3}-[[:alpha:]]{2}$' then '' end) virtual not null; mysql> insert into cars values('AS-229-ZT'); # Valid ERROR 1136 (21S01): Column count doesn't match value count at row 1 mysql> insert into cars (number) values('AS-229-ZT'); # Valid Query OK, 1 row affected (0,00 sec) mysql> insert into cars (number) values('AS-2X9-ZT'); # "X" in digit zone! ERROR 1048 (23000): Column 'number_validate' cannot be null mysql> update cars set number='AS-2X9-ZT'; ERROR 1048 (23000): Column 'number_validate' cannot be null |
We add a generated column number_validate, it’s of type CHAR(0) NOT NULL so it can only be the empty string and takes no space in the record. It’s virtual which is another reason why it takes no space. If the number is valid our new column evaluates to the empty string (”), which fits into the column’s type, so the insert/update succeeds; if the number is invalid our new column evaluates to NULL (there’s an implicit “ELSE NULL” in the CASE), which is rejected because of NOT NULL.
By the way, notice how my original insert into cars values(‘AS-229-ZT’) started failing: because my row has two columns now; so I have to either specify which column I’m inserting into (insert into cars (number) as above), or use default for the generated column:
1 2 |
mysql> insert into cars values('AS-229-ZT', default); # Valid Query OK, 1 row affected (0,00 sec) |
Let’s compare our three solutions for emulating CHECK constraints:
- the trigger/view solutions are powerful though complicated to deploy.
- the generated-column solution is less powerful, as a generated column’s expression cannot contain any subquery; but it is easy to deploy, as it is attached directly to the table and takes just one line of SQL to create. For doing simple validity checks like the ones in my example, it is, in my opinion, the winner.
I hope some of these solutions will help you implement data validation in your MySQL tables. Thank you for using MySQL!