WL#3744: TIME/TIMESTAMP/DATETIME with time zones

Affects: Server-7.1   —   Status: Un-Assigned   —   Priority: Very High

Allow specification of the data types
TIME WITH TIME ZONE
TIMESTAMP WITH TIME ZONE
DATETIME WITH TIME ZONE
The objective is conformance with SQL:2003 standard feature
F411, "Time zone specification". Oracle extensions are desirable,
they'll be listed in a separate section after the standard
descriptions.

Syntax for data type definition
-------------------------------

TIME [ { WITH | WITHOUT } TIME ZONE ]
TIMESTAMP [ { WITH | WITHOUT } TIME ZONE ]
DATETIME [ { WITH | WITHOUT } TIME ZONE ]

The default is WITHOUT TIME ZONE, that is,
"TIME" and "TIME WITHOUT TIME ZONE" are the same thing.

If WL#946 "TIME/TIMESTAMP/DATETIME with fractional seconds"
is implemented first, then the precision comes before
{WITH | WITHOUT}, e.g.
TIMESTAMP [(p)] [ { WITH | WITHOUT } TIME ZONE ]

This syntax may appear wherever a data type specification
may appear: in a column definition, in a stored-routine
for parameter or declared variable or returns, in CAST.

Example:
CREATE TABLE t (s1 TIMESTAMP WITH TIME ZONE);

The with-time-zone data types are separate from the
without-time-zone data types, but comparisons and
assignments are always possible.

Syntax for literal
------------------

A with-time-zone literal ends with a "displacement",
a term that we will prefer, rather than "offset".

The displacement comes right after the time and
looks like: {+|-}hh:mm, where hh and mm are
unsigned integers. Displacement is in fact an INTERVAL
value between '-12:59' and '+14:00'. MySQL allows
SET TIME_ZONE = '-12:59'; /* etc. */
and the same rules will apply for displacement
as for @@time_zone. Spaces are illegal.

Sometimes you'll see this sort of thing on emails:
"> On Fri, 2006-12-08 at 19:44 +0100, Stefan Hinz wrote:"
Notice the space after '19:44'. MySQL will allow that.
Notice the displacement has no ':'. MySQL will allow that.

For datetimes, there's what MySQL calls a "relaxed" syntax.
For example, '19970523091528' is a valid TIMESTAMP. So
'19970523091528+00:00' is a valid TIMESTAMP WITH TIME ZONE,
and so is '19970523091528+0000'. But the '+' or '-' is
compulsory, and it is no longer true that (quoting the
manual) "Any punctuation character may be used as the
delimiter between date parts or time parts" -- at the
very least MySQL should disallow "+" and "-", and for
WL#946 purposes MySQL should also disallow ".", as
delimiters.

In strict mode, an illegal displacement (for example
'2001-01-01 00:00:00+99:7A') will cause an error,
SQLSTATE='22009', message = "Data Exception: Invalid
Time Zone Displacement Value". In non-strict mode,
the illegal displacement will be converted to UTC, or
the illegal displacement will be converted to @@time_zone,
or the whole string will be converted to
'0000-00-00 00:00:00+00:00', nobody cares which.

Examples:
TIMESTAMP '1997-01-01 22:22:22+00:30'
TIME '1:1:1 +4:0'

In the example, TIME and TIMESTAMP are not compulsory.
But they're desirable, for reasons given in the next
section.

Using TIME and TIMESTAMP for literals
-------------------------------------

MySQL allows use of TIME or TIMESTAMP for literals,
for example TIME '00:00:00' and TIMESTAMP '1999-01-01 00:00:00'.
But it ignores the words -- TIME '00:00:00' is treated as
VARCHAR(8). MySQL should stop ignoring the words.

Consider:

Moscow's usual displacement is '+03:00'.
Berlin's usual displacement is '+01:00'.

So if it's 3:00 in Moscow, it's 1:00 in Berlin.

Therefore this expression should be true:
TIME '03:00:00+03:00' = TIME '01:00:00+01:00'

But it will be false if MySQL thinks the two
values are strings and compares byte-by-byte.

Notice, in fact, that TIME '03:00:00+03:00' is a
TIME WITH TIME ZONE rather than a TIME. There is
no requirement in the standard for the syntax
TIME WITH TIME ZONE '03:00:00+03:00'. It's not
necessary, MySQL can figure out that it's a
with-time-zone data type by looking for + or -.

Storing as UTC component + Displacement
---------------------------------------

Suppose I'm in Edmonton, 7 hours west of ("behind")
Greenwich, and the clock on my wall says "10:30".
I can tell the time two ways:
(1) "It's 10:30 here and I'm -7 hours from Greenwich", or
(2) "It's 17:30 in Greenwich and I'm -7 hours from Greenwich".
Either way, I'm conveying the same amount of information.
If UTC = Greenwich today (it often is), I'd say this briefly as:
Way 1: TIMESTAMP '2007-02-22 10:30:00-07:00'
Way 2: TIMESTAMP '2007-02-22 17:30:00-07:00'

Way 1 is what I want to see because it's "normal",
it's what ISO 8601 says, it's what I want to input/output.

Way 2 is far better for comparisons, though. If I want
to ask "is 10:30 in Edmonton the same as 00:30 in Moscow?"
If I express the times in Way 1, I'm asking
'10:30:00-07:00' = '00:00:00+03:00'?
If I express the times in Way 2, I'm asking
'17:30:00-07:00' = '17:30:00+03:00'?
Way 2 is better because any left-to-right string-comparison
of the main part -- the "UTC component" -- gives the answer
immediately.

The other reason that I have to know the Way 1 / Way 2
difference is: the SQL standard document often assumes "Way 2".
For example it says:
"In comparing values of a data type WITH TIME ZONE, the value
of the time zone displacement is disregarded."
and
"NOTE 74 — If <time zone interval> is specified [in a literal],
then a <time literal> or <timestamp literal> is interpreted as
local time with the specified time zone displacement. However,
it is effectively converted to UTC while retaining the original
time zone displacement."

Therefore, when you read the words "UTC component" in an SQL
standards document, and you're thinking of
TIMESTAMP '10:30:00-07:00', think of it this way:
the UTC component is '17:30:00' (not '10:30:00'!), and the
displacement is '-07:00'. Since that makes comparison easy,
it's reasonable to store as UTC component + displacement.
(Dmitri Lenev disagrees with this interpretation.)

But the end user won't see UTC components, the end user sees
things as "Way 1".

Difficulties due to MySQL's non-standard TIMESTAMP
--------------------------------------------------

MySQL TIMESTAMP minimum is '1970-01-01 00:00:00'. That's UTC.
Therefore the TIMESTAMP WITH TIME ZONE minimum is slightly
higher if the displacement is greater than '+00:00'.

MySQL TIMESTAMP has an "automatic" value = NOW() if it's
the first defined column. Should this apply for TIMESTAMP
WITH TIME ZONE too? The proposal is: No.

Determinism
-----------

Comparisons or assignments that combine with-time-zone and
without-time-zone values are dependent on session time zone.
Therefore they are non-deterministic.

Maybe that's why, in standard SQL, it is illegal to combine
in a reference clause, as in:
CREATE TABLE t1
 (s1 TIME WITHOUT TIME ZONE, PRIMARY KEY (s1));
CREATE TABLE t2
 (s1 TIME WITH TIME ZONE, FOREIGN KEY (s1) REFERENCES t1 (s1));

Functions and Operators
-----------------------

Compare this list to the list in WL#946.

The feature will be tested with all functions and operators
that accept or return temporal values. Specifically:

= > < >= <= + -.
As in "WHERE TIME '12:12:12+00:00' > TIME '12:12:12+00:00'
If the UTC components match, then the two values are "equal",
a difference of displacements is a "secondary difference"
and will be ignored in all comparisons, even for ORDER BY.
If a user must know whether two with-time-zone values are
really identical as well as equal, he/she will have to
EXTRACT(TIMEZONE_HOUR ...) and EXTRACT(TIMEZONE_MINUTE ...).

ADDTIME().
As in ADDTIME(time_with_time_zone_col, 1); 
No problems expected. Addition is to the primary
hour/minute/second fields, ADDTIME can't affect displacement.

AT (new operator).
After any datetime expression, it's legal to say
"AT LOCAL" or "AT TIME ZONE displacement". This
converts the expression to the local (i.e. the
@@time_zone) or the specified displacement. For example:
'1999-12-12 10:30:00' AT TIME ZONE '-07:00' becomes
'1999-12-12 10:30:00-07:00'.

CURTIME(), CURRENT_TIME(), CURRENT_TIME, UTC_TIMESTAMP(),
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP, NOW(),
LOCALTIME, LOCALTIME(), LOCALTIMESTAMP, LOCALTIMESTAMP()
Currently CURRENT_TIME and CURRENT_TIMESTAMP are synonyms
for LOCALTIME and LOCALTIMESTAMP. The standard requirement
is that CURRENT_TIME should include a displacement equal
to the local time zone. For example, if it's 12:00 noon:
and I've said "SET TIME_ZONE = '+03:00';" then
"SELECT CURRENT_TIME;" should return '12:00:00+03:00' but
"SELECT LOCALTIME;"    should return '12:00:00'. This is
unfortunate, it means that data type of CURRENT_TIME is
different (TIME WITH TIME ZONE instead of TIME).

CAST(), including implicit cast for assignment.
Allow CAST(expr AS TIME|TIMESTAMP|DATETIME WITH TIME ZONE)".
To convert with-time-zone to without-time-zone, take the UTC
component of the with-time-zone value and add the displacement.
For example, after SET TIME_ZONE = '-07:00':
'1999-12-12 10:30:00-06:00' =
'1999-12-12 16:30:00 + INTERVAL '-07:00' HOUR TO MINUTE =
'1999-12-12 09:30'.
To convert without-time-zone to with-time-zone, concatenate
the session timezone as the displacement. For example,
after SET TIME_ZONE = '-07:00':
'1999-12-12 09:30' = '1999-12-12 09:30-07:00'.

CONVERT_TZ().
This would be pointless with with-time-zone datetimes,
so it will be illegal.

DATE_FORMAT().
Following Oracle's ideas for "Format Models",
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements004.htm#i34510
these additional specifiers would be possible:
TZH = time zone hour
TZM = time zone minute
TZR = time zone region (if "Oracle Extension" accepted)
TZD = time zone daylight (if "Oracle Extension" accepted)
But MySQL will not quite do it Oracle's way. See TIME_FORMAT().

EXTRACT(TIMEZONE_HOUR ...) and EXTRACT(TIMEZONE_MINUTE ...)
For the value '1999-12-12 12:12:12+01:02',
EXTRACT(TIMEZONE_HOUR ...) = 1, EXTRACT(TIMEZONE_MINUTE ...) = 2.
For the value '1999-12-12 12:12:12-01:02',
EXTRACT(TIMEZONE_HOUR ...) =-1, EXTRACT(TIMEZONE_MINUTE ...) =-2.
An attempt to extract these fields from a without-time-zone
value should cause an error in strict mode, should result in
NULLs in non-strict mode, or maybe zeros, nobody cares.

This must work with real times and timestamps, not just literals.
GET_FORMAT().
There will be no GET_FORMAT() return value which includes
formatting specification for the displacement.

MAKETIME().
This won't work with displacement arguments. Forget it.

MICROSECOND().
No problems expected.

SEC_TO_TIME(seconds).
No problem expected. This returns TIME, not TIME WITH TIME ZONE. 

SUBTIME().
No problems expected.

TIME(expr).
If the expr contains a displacement, return it, and the
data type of the result will be TIME WITH TIME ZONE rather
than TIME.

TIMEDIFF().
No problems expected.

TIMESTAMP(expr).
If the expr contains a displacement, return it, and the
data type of the result will be TIMESTAMP WITH TIME ZONE
rather than TIMESTAMP.

TIMESTAMPADD().
No problems expected.

TIMESTAMPDIFF().
No problems expected.

TIME_FORMAT().
The usual locale specifiers are:
%Z  equivalent to Oracle's TZD. Example: 'PST'.
%z  displacement, without colon. Example: '+0000'.
MySQL would only need %Z if it accepts "Oracle extensions".
MySQL should include a colon for %z, probably.

Connectors and API
------------------

We will not ask the connector team whether
they can handle with-time-zone data types.
If they can't, that will be reported as a connector bug.

For C API prepared statement data types, in MYSQL_TIME
structure, we will need timezone_hour and timezone_minute.

New Reserved Words
------------------

LOCAL
TIMEZONE_HOUR
TIMEZONE_MINUTE

INFORMATION_SCHEMA
------------------

All data type columns in information_schema, for
example information_schema.columns_data_type, may
contain 'TIME WITH TIME ZONE', 'TIMESTAMP WITH
TIME ZONE', or DATETIME WITH TIME ZONE'.

Oracle Extension: Named Zones
-----------------------------

In standard SQL, all with-time-zone values end with
a displacement, e.g. '-07:00'. Oracle allows names,
e.g. 'PST' or 'Canada/Mountain'. Since MySQL allows
named time zones too (see the MySQL Reference Manual,
"5.11.8. MySQL Server Time Zone Support",
http://dev.mysql.com/doc/refman/5.1/en/time-zone-support.html)
there will naturally be demand for MySQL to support
'1999-12-12 00:00:00 Canada/Mountain' etc.

Problem #1: the zone must be actually stored. If one says
INSERT INTO t (timestamp_with_time_zone_column)
VALUES ('1999-12-12 00:00:00 Canada/Mountain');
SELECT * FROM t;
one gets '1999-12-12 00:00:00 Canada/Mountain'!
Oracle didn't store the value converted to a displacement.
Therefore the proposed MySQL storage for with-time-zone is:
  UTC component
  displacement
  ID of zone
The storage wouldn't really be the string naming the zone,
it would be the ID of the zone, and MySQL would look up the
name when retrieving, from the time zone tables. If for some
reason the time zone tables disappeared, the retrieved value
would be '1999-12-12 00:00:00 TIME_ZONE_ID=999'.
The storage would have to include both the displacement and
the zone. You can't find the displacement given the zone,
even if you use the time zone tables, because the tables
can change.

Problem #2: arithmetic can be complex. It's difficult to
say what "TIMESTAMP '2007-03-11 01:00:00 America/Edmonton'
+ INTERVAL '3' HOUR should be, due to possible daylight
saving time changes. Yes, one can look it up in
the time zone tables, but that's non-deterministic and can
take a long time. The benefit is small.

Despite the problems, some people want this because they
hope it will make handling of CREATE EVENT easier.

Oracle Extension: LOCAL TIME ZONE
---------------------------------

TIME/TIMESTAMP/DATETIME WITH LOCAL TIME ZONE would be
simpler data types than the WITH TIME ZONE data types.
One can enter a with-time-zone literal containing a
displacement or a named zone, but it would get converted
to UTC and stored as UTC -- with no storing of displacement.
Then, upon retrieval, MySQL would convert from UTC to the
local time zone.

This is approximately what PostgreSQL is doing with their
WITH TIME ZONE data type. That is, PostgreSQL WITH TIME ZONE
is like Oracle WITH LOCAL TIME ZONE. See
http://www.postgresql.org/docs/current/static/datatype-datetime.html

Long ago Dmitri Lenev said that Peter Zaitsev said that
there was a potential customer for this LOCAL TIME ZONE.
Dmitri also proposed to align MySQL's TIME ZONE better
with LOCAL TIME ZONE requirements. Peter Gulutzan didn't
check what happened as a result.

References
----------

"Timestamps" article by Trudy Pelzer
http://www.dbazine.com/db2/db2-disarticles/pelzer2

"Working with Time Zones"
http://www.w3.org/TR/2005/NOTE-timezone-20051013/

"Write Time Zone Aware Code in Oracle"
http://www.devx.com/dbzone/Article/30501

5.10.8. MySQL Server Time Zone Support
http://dev.mysql.com/doc/refman/5.1/en/time-zone-support.html

BUG#19945 no DATE_FORMAT(date, %Z)

Emails between pgulutzan and dlenev:
"Re: Time Zones and SQL Standards" August 2003
"Re: [Fwd: Thoughts about timezone support (proposal for discussion)]"
 August 2003
"Re: [Fwd: Re: UTC_TIME and UTC_TIMESTAMP]" August 2003

SQL Server 2008 DATETIMEOFFSET
http://msdn.microsoft.com/en-us/library/bb630289.aspx

BUG#6742 No way to store Timezone toogether with datetime?