MySQL Blog Archive
For the latest blogs go to blogs.oracle.com/mysql
New and old ways to emulate CHECK constraints, DOMAIN

Correctness of data comes in different forms. One is referential integrity, also known as foreign keys. Another is CHECK constraints. While MySQL supports foreign keys, it doesn’t support CHECK constraints. Today we will look at three ways to emulate them:

  • triggers
  • views
  • generated columns (new in MySQL 5.7)

This is also relevant to another SQL feature: DOMAIN, which, in short, is a user-defined type composed of a base type (INT, CHAR, …), a default value, and a constraint on acceptable values – the latter being some simple sort of CHECK constraint. If we can emulate CHECK, we can have some functionality of DOMAIN.

Let’s take a table of cars, with a column supposed to contain registration numbers of cars in France. Those numbers have a defined syntax: two letters, a dash, three digits, a dash, two letters. Of course, my example could also apply to storing UUIDs, telephone numbers, IP addresses, zip codes… anything with a defined syntax.

What we want to do is to make MySQL prevent invalid data like:

Using a trigger

SIGNAL sends an error if the to-be-inserted number doesn’t match the regular expression of the required syntax. Results:

Very nice. However, while it prevents inserting an invalid value, it doesn’t prevent updating a valid value to an invalid one:

To fix that, we need an additional trigger, a BEFORE UPDATE one:

And now the UPDATE properly fails:

The trigger-based solution certainly works well. The trigger could even do more complex checks (like, checking that the number is related to something in another table, by using a subquery). However, to create the triggers, we had to:

  • write 10 lines of procedural code, two times
  • have the TRIGGER privilege.

Using a view WITH CHECK OPTION

Let’s drop the triggers, empty the table and look at a different solution. We create a view cars_checked over the cars table; this view has a WHERE clause to show only valid rows. It has WITH CHECK OPTION so any insert/update done against the view will be rejected if it would cause the new row to not appear in the view, i.e. if it would cause the new row to be invalid.

In a real setup, we would revoke write access to cars from users, as such access allows them to bypass our check; we would rather give them write access to cars_checked only.

Like triggers, this solution works perfectly and is versatile (it is possible to use a subquery in the WHERE clause of the view). But it requires:

  • having the CREATE VIEW privilege
  • revoking privileges from users and granting them other privileges
  • possibly modifying our applications to make them target cars_checked instead of cars.

Edit: instead of naming the view cars_checked, it’s also possible to name it cars, after renaming the base table cars to cars_data. Then there is no need to change privileges, nor to change existing DMLs. However there is need to change some administrative DDLs which do not work against views. Thanks to Jörg Brühe for suggesting this; he provides more details in the thoughts section further down.

Using a generated column

Let’s try a new method, available since MySQL 5.7:

We add a generated column number_validate, it’s of type CHAR(0) NOT NULL so it can only be the empty string and takes no space in the record. It’s virtual which is another reason why it takes no space. If the number is valid our new column evaluates to the empty string (”), which fits into the column’s type, so the insert/update succeeds; if the number is invalid our new column evaluates to NULL (there’s an implicit “ELSE NULL” in the CASE), which is rejected because of NOT NULL.

By the way, notice how my original insert into cars values(‘AS-229-ZT’) started failing: because my row has two columns now; so I have to either specify which column I’m inserting into (insert into cars (number) as above), or use default for the generated column:

Let’s compare our three solutions for emulating CHECK constraints:

  • the trigger/view solutions are powerful though complicated to deploy.
  • the generated-column solution is less powerful, as a generated column’s expression cannot contain any subquery; but it is easy to deploy, as it is attached directly to the table and takes just one line of SQL to create. For doing simple validity checks like the ones in my example, it is, in my opinion, the winner.

I hope some of these solutions will help you implement data validation in your MySQL tables. Thank you for using MySQL!