WL#831: Implement INTERVAL types

Affects: Server-7.1   —   Status: Assigned

Implement the interval types required by ANSI SQL.

(The description used to say "in MySQL 5.0", that's no longer true.)


An interval is a difference between any two
datetime values. It contains fields similar
to datetime fields. It is specified in ANSI
SQL, it is supported by Oracle + Informix,
it is partially supported (with a very limited
syntax subset) by PostgreSQL, it is not
supported by SQL Server + DB2.

Syntax for data type definition
-------------------------------

INTERVAL interval-qualifier

Interval-qualifier is one of:

YEAR                  e.g. '1999'
YEAR TO MONTH         e.g. '1999-01'
MONTH                 e.g. '01'

DAY                   e.g. '31'
DAY TO HOUR           e.g. '31 05'
DAY TO MINUTE         e.g. '31 05:59'
DAY TO SECOND         e.g. '31 05:59:07'
HOUR                  e.g. '18'
HOUR TO MINUTE        e.g. '18:19'
HOUR TO SECOND        e.g. '18:19:20'
MINUTE                e.g. '20'
MINUTE TO SECOND      e.g. '20:21'
SECOND                e.g. '21'

In the above, the elements (YEAR MONTH DAY
etc.) are "fields", and when there is more
than one field the term "multi-field" will
be used.

There are no intervals like "MONTH DAY" because
it's hard to interpret the meaning of such a
value, given that months have varying numbers
of days. Therefore the year-month intervals
(YEAR, YEAR TO MONTH, DAY) are incompatible with
the day-second intervals (all the rest), for
some operations that will be described later.

Each field has a default precision (the number
of digits it can contain). The first field of
a multi-field interval, or any single-field
interval, may have a higher precision. For
example: INTERVAL DAY precision is default
precision 2 (up to 99), but you may increase
it to 4 by specifying INTERVAL DAY (4).
For each field, here is the default precision
and the maximum precision you could specify:

Field     Default        Maximum specifiable
-----     -------        -------------------
YEAR            4        6
MONTH           2        4
DAY             2        4
HOUR            2        4
MINUTE          2        4
SECOND          2        4
(Peter just made the maximum numbers up, if they
cause trouble then say so.)

There may be a (leading precision) on the first
field (described later), and a (fractional
seconds precision) on a SECOND field (described
later).

Any column or declared variable may have
an interval data type.

DAY, HOUR, INTERVAL, MINUTE, MONTH, SECOND, YEAR
become new reserved words.

Syntax for literal
------------------

The standard syntax is:
INTERVAL '...' interval-type

Within the ''s, the value should be a number or
a series of numbers. Depending on interval-type,
the separator between the numbers may be ' ' or
':' just as within a datetime literal. Leading
zeros are optional.

An interval may contain a sign, either inside
or outside the ''s. For example:
INTERVAL '+15' HOUR or INTERVAL -'17:18' HOUR TO MINUTE.
It is smarter to put the symbol inside the ''s,
because MySQL will allow replacing the '...'
with a variable, as in
SET @a='+15';
SELECT CURRENT_DATE + INTERVAL @a HOUR;

MySQL Extension #1. Support syntax without
lead word INTERVAL. For example:
'05:06' HOUR TO MINUTE

MySQL Extension #2. Allow removal of ''s when
there is only one field. For example:
INTERVAL 5 MINUTE

MySQL Extension #3. Allow removal of interval-type
when it's clear from the format. For example:
INTERVAL '01:01:01' /* must be hour:minute:second */
INTERVAL '1999-04'  /* must be year-month */

Literals which don't follow the prescribed format,
or contain too-large integers, will cause errors
or warnings depending on sql_mode=strict.

Datetime/Datetime or Datetime/Interval arithmetic
-------------------------------------------------

When date arithmetic takes place, the result is an
interval. One may specify the interval-type so that
it's clear what specific kind of interval. That is:

(datetime - datetime) interval-type
(datetime + interval) interval-type
(datetime - interval) interval-type

The effect is like "casting". For example:
(DATE '1999-04-01' - DATE '1998-04-01') YEAR

The casting is necessary in standard SQL, but
MySQL already allows "date - date", returning
days, and it's okay to preserve that extension.

Interval/Interval arithmetic
----------------------------

It is also possible to add or subtract two intervals.
The intervals must be of the same general type,
a year-month interval can't go with a day-second
interval. If the intervals are of the same general type,
but have different fields, the result data type
is an interval with all the fields that are in
either of the original values. For example:

INTERVAL '1' DAY + INTERVAL '1' YEAR is illegal.

INTERVAL '1' DAY + INTERVAL '1' DAY results in
INTERVAL '2' DAY.

INTERVAL '1 5' DAY TO HOUR + INTERVAL '1' SECOND
results in INTERVAL '1 5:0:1' DAY TO SECOND.

One may specify the interval-type. For example:
(INTERVAL '1' DAY - INTERVAL '1' DAY) DAY.

Datetime/Interval/Interval arithmetic
-------------------------------------

Since year-month intervals won't go with day-second
intervals, there can be some unnecessary difficulty.
Melton gives this example:

(DATE '1999-12-01' + INTERVAL '01' MONTH) + INTERVAL '1' DAY
is legal.

DATE '1999-12-01' + (INTERVAL '01' MONTH + INTERVAL '1' DAY)
is illegal.

Therefore the DBMS must observe parentheses, and if
there are no parentheses, must handle expressions in
the standard order -- left to right.

Interval/Integer arithmetic
---------------------------

It is legal to multiple an interval by an integer.
It is legal to divide an interval by an integer.
For example:

INTERVAL '2-1' YEAR TO MONTH * 2

The result should be INTERVAL '4-2' YEAR TO MONTH.
Or is it? See next section.

Normal Interpretation for Multi-Field Intervals
-----------------------------------------------

Sometimes a multi-field interval can be expressed with
everything in the smallest field ("'00 61' YEAR TO MONTH)
or with some in the larger field ("'05 01' YEAR TO MONTH).

The normal interpretation would be based on the fact
that the user said YEAR TO MONTH rather than MONTH.
So there's a desire to see the years, and '05 01'
is right after a calculation. So "normalization"
would work thus:

If seconds > 59, carry to minutes.
If minutes > 59, carry to hours.
If hours > 23, carry to days.
If months > 11, carry to years.

Even if users deliberately enter '00 61' into an
INTERVAL YEAR TO MONTH column, do not preserve it.
Normalize it to '05 01'. Someday they'll thank you.

Functions
---------

The following function will work with interval
values:
EXTRACT (field FROM interval value)
For example,
EXTRACT (DAY FROM INTERVAL '01 02' DAY TO HOUR)
returns 01.

The following functions, although they are
for datetime arithmetic, probably will not
work with interval values. The expectation
is that nobody cares.
ADDDATE(), SUBDATE()
ADDTIME(), SUBTIME()
DATEDIFF(), TIMEDIFF()
DATE_ADD, DATE_SUB()
TIMESTAMPADD(), TIMESTAMPDIFF()
DATE_FORMAT(), GET_FORMAT()
FROM_DAYS
MICROSECOND(), MINUTE(), HOUR(), DAY(), etc.
PERIOD_ADD(), PERIOD_DIFF()

Data type for connectors
------------------------

Properly, an INTERVAL value should be passed to a
connector or mysql client as an INTERVAL. In fact
MySQL will just pass it as a string.

Coming the other way (from client to server),
MySQL will accept either any string for an
interval, provided it's formatted as described
in earlier section 'Syntax for literal'.

Fractional seconds
------------------

An interval may include fractional seconds, after the
completion of
WL#946 TIME/TIMESTAMP/DATETIME with fractional seconds

Then an interval literal may include a non-integer SECONDS
component. For example:
INTERVAL 1.5 SECOND

Then an interval data type definition may include (n)
where n is an integer between 0 and 6. For example:
INTERVAL SECOND (6)

Monty's comment:
"For the first implementation, we should not support
[fractional seconds] precisions as we can't 
yet store this in the .frm file. (To be done in 5.1)"

Storage
-------

Monty says:
"The simple types like 'interval year' can be stored as a tinyint
The others should be stored as 2-4 byte ints, depending on circumstances
For example year-to-month should be stored as 'number of months' and
2 bytes would probably sufficient for this."

Peter says:
"Every interval is really a DECIMAL(26).
The digits of an interval are as follows:
YEAR                     6 digits
MONTH                    4 digits
DAY                      4 digits
HOUR                     4 digits
MINUTE                   4 digits
SECOND                   4 digits
The main advantage is that you can convert
easily to a number rather than an array of
numbers. Arithmetic is tricky because a
carry isn't always the same. Comparison
is not tricky because MySQL forces numbers
to be normal.

Arithmetic
----------

As well as the type, we need to support the arithmetic.

Currently, MySQL does allow what looks like an INTERVAL
expression, but this is "pseudo-interval support", it
ignores essentials. The following does not work in a
standard way:

CREATE TABLE F052B (S1 TIME)
INSERT INTO F052B VALUES ('00:00:00')
INSERT INTO F052B VALUES ('23:59:59')
INSERT INTO F052B VALUES ('01:02:03')
SELECT S1, S1 + INTERVAL '1' SECOND FROM F052B 

(Monty has explained that this occurs because MySQL will 
attempt to interpret the TIME as a DATE.)

At the Budapest conference, Monty suggested that correct 
interval arithmetic is only priority = 50. So Peter believes 
that interval arithmetic has a lower priority than interval
data type support.

INFORMATION_SCHEMA
------------------

In all cases where there is currently a data type
descriptor, there must be an indication of the
interval-type and interval-precision. (These are
not simply part of the data type name.) For example,
INFORMATION_SCHEMA.COLUMNS requires two new columns:

INTERVAL_TYPE          VARCHAR() CHARACTER SET UTF8

... here INTERVAL_TYPE is one of:
'YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND',
'YEAR TO MONTH', 'DAY TO HOUR', 'DAY TO MINUTE',
'DAY TO SECOND', 'HOUR TO MINUTE',
'HOUR TO SECOND', or 'MINUTE TO SECOND'.

INTERVAL_PRECISION     INTEGER

... where INTERVAL_PREVISION is not fractional-seconds
precision (that's taken care of by another column
that's already in the data type descriptor), but the
precision of the leading field. If you said DAY(5)
then INTERVAL_PRECISION = 5.

Errors
------

The standard defines these errors:

22015       Interval field overflow
2200P       Interval value out of range

Use '22015' for everything except aggregate functions.

Non-standard expressions
------------------------

The MySQL Reference Manual, 12.5. Date and Time Functions,
describes non-standard keywords for DATE_ADD and DATE_SUB:

MICROSECOND            MICROSECONDS
WEEK                   WEEKS
QUARTER                QUARTERS
SECOND_MICROSECOND    'SECONDS.MICROSECONDS'
MINUTE_MICROSECOND    'MINUTES.MICROSECONDS'
MINUTE_SECOND         'MINUTES:SECONDS'
HOUR_MICROSECOND      'HOURS.MICROSECONDS'
HOUR_SECOND           'HOURS:MINUTES:SECONDS'
HOUR_MINUTE           'HOURS:MINUTES'
DAY_MICROSECOND       'DAYS.MICROSECONDS'
DAY_SECOND            'DAYS HOURS:MINUTES:SECONDS'
DAY_MINUTE            'DAYS HOURS:MINUTES'
DAY_HOUR              'DAYS HOURS'
YEAR_MONTH            'YEARS-MONTHS'

Unfortunately these non-standard keywords have migrated
to the pseudo-interval support. MySQL currently allows:

mysql> select s1 +  interval '1' year_month from td;
+-------------------------------+
| s1 +  interval '1' year_month |
+-------------------------------+
| 2007-02-25                    |
+-------------------------------+
1 row in set (0.00 sec)

A similar extension exists for CREATE EVENT. Peter
asked for it to be removed, without success.

The recommendation now is: remove support for all
the non-standard keywords.

References
----------

SQL-99 Complete, Really pages 184ff.

crash-me test results (all types are tested in crash-me)

Feature requests: BUG#20844 interval datatype