The following discussion provides several examples of the ways in which MySQL detects functional dependencies. The examples use this notation:
Press CTRL+C to copy{X} -> {Y}
Understand this as “X
uniquely
determines Y
,” which also
means that Y
is functionally
dependent on X
.
The examples use the world
database, which
can be downloaded from
https://dev.mysql.com/doc/index-other.html. You can find details
on how to install the database on the same page.
The following query selects, for each country, a count of spoken languages:
Press CTRL+C to copySELECT co.Name, COUNT(*) FROM countrylanguage cl, country co WHERE cl.CountryCode = co.Code GROUP BY co.Code;
co.Code
is a primary key of
co
, so all columns of co
are functionally dependent on it, as expressed using this
notation:
Press CTRL+C to copy{co.Code} -> {co.*}
Thus, co.name
is functionally dependent on
GROUP BY
columns and the query is valid.
A UNIQUE
index over a NOT
NULL
column could be used instead of a primary key
and the same functional dependence would apply. (This is not
true for a UNIQUE
index that permits
NULL
values because it permits multiple
NULL
values and in that case uniqueness is
lost.)
This query selects, for each country, a list of all spoken languages and how many people speak them:
Press CTRL+C to copySELECT co.Name, cl.Language, cl.Percentage * co.Population / 100.0 AS SpokenBy FROM countrylanguage cl, country co WHERE cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
The pair (cl.CountryCode
,
cl.Language
) is a two-column composite
primary key of cl
, so that column pair
uniquely determines all columns of cl
:
Press CTRL+C to copy{cl.CountryCode, cl.Language} -> {cl.*}
Moreover, because of the equality in the
WHERE
clause:
Press CTRL+C to copy{cl.CountryCode} -> {co.Code}
And, because co.Code
is primary key of
co
:
Press CTRL+C to copy{co.Code} -> {co.*}
“Uniquely determines” relationships are transitive, therefore:
Press CTRL+C to copy{cl.CountryCode, cl.Language} -> {cl.*,co.*}
As a result, the query is valid.
As with the previous example, a UNIQUE
key
over NOT NULL
columns could be used instead
of a primary key.
An INNER JOIN
condition can be used instead
of WHERE
. The same functional dependencies
apply:
Press CTRL+C to copySELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM countrylanguage cl INNER JOIN country co ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
Whereas an equality test in a WHERE
condition or INNER JOIN
condition is
symmetric, an equality test in an outer join condition is not,
because tables play different roles.
Assume that referential integrity has been accidentally broken
and there exists a row of countrylanguage
without a corresponding row in country
.
Consider the same query as in the previous example, but with a
LEFT JOIN
:
Press CTRL+C to copySELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM countrylanguage cl LEFT JOIN country co ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
For a given value of cl.CountryCode
, the
value of co.Code
in the join result is
either found in a matching row (determined by
cl.CountryCode
) or is
NULL
-complemented if there is no match
(also determined by cl.CountryCode
). In
each case, this relationship applies:
Press CTRL+C to copy{cl.CountryCode} -> {co.Code}
cl.CountryCode
is itself functionally
dependent on {cl.CountryCode
,
cl.Language
} which is a primary key.
If in the join result co.Code
is
NULL
-complemented,
co.Name
is as well. If
co.Code
is not
NULL
-complemented, then because
co.Code
is a primary key, it determines
co.Name
. Therefore, in all cases:
Press CTRL+C to copy{co.Code} -> {co.Name}
Which yields:
Press CTRL+C to copy{cl.CountryCode, cl.Language} -> {cl.*,co.*}
As a result, the query is valid.
However, suppose that the tables are swapped, as in this query:
Press CTRL+C to copySELECT co.Name, cl.Language, cl.Percentage * co.Population/100.0 AS SpokenBy FROM country co LEFT JOIN countrylanguage cl ON cl.CountryCode = co.Code GROUP BY cl.CountryCode, cl.Language;
Now this relationship does not apply:
Press CTRL+C to copy{cl.CountryCode, cl.Language} -> {cl.*,co.*}
Indeed, all NULL
-complemented rows made for
cl
is put into a single group (they have
both GROUP BY
columns equal to
NULL
), and inside this group the value of
co.Name
can vary. The query is invalid and
MySQL rejects it.
Functional dependence in outer joins is thus linked to whether
determinant columns belong to the left or right side of the
LEFT JOIN
. Determination of functional
dependence becomes more complex if there are nested outer
joins or the join condition does not consist entirely of
equality comparisons.
Suppose that a view on countries produces their code, their name in uppercase, and how many different official languages they have:
Press CTRL+C to copyCREATE VIEW country2 AS SELECT co.Code, UPPER(co.Name) AS UpperName, COUNT(cl.Language) AS OfficialLanguages FROM country AS co JOIN countrylanguage AS cl ON cl.CountryCode = co.Code WHERE cl.isOfficial = 'T' GROUP BY co.Code;
This definition is valid because:
Press CTRL+C to copy{co.Code} -> {co.*}
In the view result, the first selected column is
co.Code
, which is also the group column and
thus determines all other selected expressions:
Press CTRL+C to copy{country2.Code} -> {country2.*}
MySQL understands this and uses this information, as described following.
This query displays countries, how many different official
languages they have, and how many cities they have, by joining
the view with the city
table:
Press CTRL+C to copySELECT co2.Code, co2.UpperName, co2.OfficialLanguages, COUNT(*) AS Cities FROM country2 AS co2 JOIN city ci ON ci.CountryCode = co2.Code GROUP BY co2.Code;
This query is valid because, as seen previously:
Press CTRL+C to copy{co2.Code} -> {co2.*}
MySQL is able to discover a functional dependency in the
result of a view and use that to validate a query which uses
the view. The same would be true if
country2
were a derived table (or common
table expression), as in:
Press CTRL+C to copySELECT co2.Code, co2.UpperName, co2.OfficialLanguages, COUNT(*) AS Cities FROM ( SELECT co.Code, UPPER(co.Name) AS UpperName, COUNT(cl.Language) AS OfficialLanguages FROM country AS co JOIN countrylanguage AS cl ON cl.CountryCode=co.Code WHERE cl.isOfficial='T' GROUP BY co.Code ) AS co2 JOIN city ci ON ci.CountryCode = co2.Code GROUP BY co2.Code;