Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 26.9Mb
PDF (A4) - 26.9Mb
PDF (RPM) - 25.3Mb
HTML Download (TGZ) - 6.3Mb
HTML Download (Zip) - 6.3Mb
HTML Download (RPM) - 5.4Mb
Man Pages (TGZ) - 157.6Kb
Man Pages (Zip) - 260.9Kb
Info (Gzip) - 2.6Mb
Info (Zip) - 2.6Mb
Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  Aggregate (GROUP BY) Function Descriptions

12.16.1 Aggregate (GROUP BY) Function Descriptions

This section describes group (aggregate) functions that operate on sets of values.

Table 12.20 Aggregate (GROUP BY) Functions

Name Description
AVG() Return the average value of the argument
BIT_AND() Return bitwise AND
BIT_OR() Return bitwise OR
BIT_XOR() Return bitwise XOR
COUNT() Return a count of the number of rows returned
COUNT(DISTINCT) Return the count of a number of different values
GROUP_CONCAT() Return a concatenated string
MAX() Return the maximum value
MIN() Return the minimum value
STD() Return the population standard deviation
STDDEV() Return the population standard deviation
STDDEV_POP() Return the population standard deviation
STDDEV_SAMP() Return the sample standard deviation
SUM() Return the sum
VAR_POP() Return the population standard variance
VAR_SAMP() Return the sample variance
VARIANCE() Return the population standard variance

Unless otherwise stated, group functions ignore NULL values.

If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows. For more information, see Section 12.16.3, “MySQL Handling of GROUP BY”.

For numeric arguments, the variance and standard deviation functions return a DOUBLE value. The SUM() and AVG() functions return a DECIMAL value for exact-value arguments (integer or DECIMAL), and a DOUBLE value for approximate-value arguments (FLOAT or DOUBLE).

The SUM() and AVG() aggregate functions do not work with temporal values. (They convert the values to numbers, losing everything after the first nonnumeric character.) To work around this problem, convert to numeric units, perform the aggregate operation, and convert back to a temporal value. Examples:

SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(time_col))) FROM tbl_name;
SELECT FROM_DAYS(SUM(TO_DAYS(date_col))) FROM tbl_name;

Functions such as SUM() or AVG() that expect a numeric argument cast the argument to a number if necessary. For SET or ENUM values, the cast operation causes the underlying numeric value to be used.

The BIT_AND(), BIT_OR(), and BIT_XOR() aggregate functions perform bit operations. They require BIGINT (64-bit integer) arguments and return BIGINT values. Arguments of other types are converted to BIGINT and truncation might occur.

  • AVG([DISTINCT] expr)

    Returns the average value of expr. The DISTINCT option can be used to return the average of the distinct values of expr.

    If there are no matching rows, AVG() returns NULL.

    mysql> SELECT student_name, AVG(test_score)
           FROM student
           GROUP BY student_name;
  • BIT_AND(expr)

    Returns the bitwise AND of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    If there are no matching rows, BIT_AND() returns a neutral value (all bits set to 1).

  • BIT_OR(expr)

    Returns the bitwise OR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    If there are no matching rows, BIT_OR() returns a neutral value (all bits set to 0).

  • BIT_XOR(expr)

    Returns the bitwise XOR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision.

    If there are no matching rows, BIT_XOR() returns a neutral value (all bits set to 0).

  • COUNT(expr)

    Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement. The result is a BIGINT value.

    If there are no matching rows, COUNT() returns 0.

    mysql> SELECT student.student_name,COUNT(*)
           FROM student,course
           WHERE student.student_id=course.student_id
           GROUP BY student_name;

    COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values.

    For transactional storage engines such as InnoDB, storing an exact row count is problematic. Multiple transactions may be occurring at the same time, each of which may affect the count.

    InnoDB does not keep an internal count of rows in a table because concurrent transactions might see different numbers of rows at the same time. Consequently, SELECT COUNT(*) statements only count rows visible to the current transaction.

    To process a SELECT COUNT(*) statement, InnoDB scans an index of the table, which takes some time if the index is not entirely in the buffer pool. For a faster count, create a counter table and let your application update it according to the inserts and deletes it does. However, this method may not scale well in situations where thousands of concurrent transactions are initiating updates to the same counter table. If an approximate row count is sufficient, use SHOW TABLE STATUS.

    InnoDB handles SELECT COUNT(*) and SELECT COUNT(1) operations in the same way. There is no performance difference.

    For MyISAM tables, COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause. For example:

    mysql> SELECT COUNT(*) FROM student;

    This optimization only applies to MyISAM tables, because an exact row count is stored for this storage engine and can be accessed very quickly. COUNT(1) is only subject to the same optimization if the first column is defined as NOT NULL.

  • COUNT(DISTINCT expr,[expr...])

    Returns a count of the number of rows with different non-NULL expr values.

    If there are no matching rows, COUNT(DISTINCT) returns 0.

    mysql> SELECT COUNT(DISTINCT results) FROM student;

    In MySQL, you can obtain the number of distinct expression combinations that do not contain NULL by giving a list of expressions. In standard SQL, you would have to do a concatenation of all expressions inside COUNT(DISTINCT ...).

  • GROUP_CONCAT(expr)

    This function returns a string result with the concatenated non-NULL values from a group. It returns NULL if there are no non-NULL values. The full syntax is as follows:

    GROUP_CONCAT([DISTINCT] expr [,expr ...]
                 [ORDER BY {unsigned_integer | col_name | expr}
                     [ASC | DESC] [,col_name ...]]
                 [SEPARATOR str_val])
    mysql> SELECT student_name,
             GROUP_CONCAT(test_score)
           FROM student
           GROUP BY student_name;

    Or:

    mysql> SELECT student_name,
             GROUP_CONCAT(DISTINCT test_score
                          ORDER BY test_score DESC SEPARATOR ' ')
           FROM student
           GROUP BY student_name;

    In MySQL, you can get the concatenated values of expression combinations. To eliminate duplicate values, use the DISTINCT clause. To sort values in the result, use the ORDER BY clause. To sort in reverse order, add the DESC (descending) keyword to the name of the column you are sorting by in the ORDER BY clause. The default is ascending order; this may be specified explicitly using the ASC keyword. The default separator between values in a group is comma (,). To specify a separator explicitly, use SEPARATOR followed by the string literal value that should be inserted between group values. To eliminate the separator altogether, specify SEPARATOR ''.

    The result is truncated to the maximum length that is given by the group_concat_max_len system variable, which has a default value of 1024. The value can be set higher, although the effective maximum length of the return value is constrained by the value of max_allowed_packet. The syntax to change the value of group_concat_max_len at runtime is as follows, where val is an unsigned integer:

    SET [GLOBAL | SESSION] group_concat_max_len = val;

    The return value is a nonbinary or binary string, depending on whether the arguments are nonbinary or binary strings. The result type is TEXT or BLOB unless group_concat_max_len is less than or equal to 512, in which case the result type is VARCHAR or VARBINARY.

    See also CONCAT() and CONCAT_WS(): Section 12.5, “String Functions”.

  • MAX([DISTINCT] expr)

    Returns the maximum value of expr. MAX() may take a string argument; in such cases, it returns the maximum string value. See Section 8.3.1, “How MySQL Uses Indexes”. The DISTINCT keyword can be used to find the maximum of the distinct values of expr, however, this produces the same result as omitting DISTINCT.

    If there are no matching rows, MAX() returns NULL.

    mysql> SELECT student_name, MIN(test_score), MAX(test_score)
           FROM student
           GROUP BY student_name;

    For MAX(), MySQL currently compares ENUM and SET columns by their string value rather than by the string's relative position in the set. This differs from how ORDER BY compares them.

  • MIN([DISTINCT] expr)

    Returns the minimum value of expr. MIN() may take a string argument; in such cases, it returns the minimum string value. See Section 8.3.1, “How MySQL Uses Indexes”. The DISTINCT keyword can be used to find the minimum of the distinct values of expr, however, this produces the same result as omitting DISTINCT.

    If there are no matching rows, MIN() returns NULL.

    mysql> SELECT student_name, MIN(test_score), MAX(test_score)
           FROM student
           GROUP BY student_name;

    For MIN(), MySQL currently compares ENUM and SET columns by their string value rather than by the string's relative position in the set. This differs from how ORDER BY compares them.

  • STD(expr)

    Returns the population standard deviation of expr. STD() is a synonym for the standard SQL function STDDEV_POP(), provided as a MySQL extension.

    If there are no matching rows, STD() returns NULL.

  • STDDEV(expr)

    Returns the population standard deviation of expr. STDDEV() is a synonym for the standard SQL function STDDEV_POP(), provided for compatibility with Oracle.

    If there are no matching rows, STDDEV() returns NULL.

  • STDDEV_POP(expr)

    Returns the population standard deviation of expr (the square root of VAR_POP()). You can also use STD() or STDDEV(), which are equivalent but not standard SQL.

    If there are no matching rows, STDDEV_POP() returns NULL.

  • STDDEV_SAMP(expr)

    Returns the sample standard deviation of expr (the square root of VAR_SAMP().

    If there are no matching rows, STDDEV_SAMP() returns NULL.

  • SUM([DISTINCT] expr)

    Returns the sum of expr. If the return set has no rows, SUM() returns NULL. The DISTINCT keyword can be used to sum only the distinct values of expr.

    If there are no matching rows, SUM() returns NULL.

  • VAR_POP(expr)

    Returns the population standard variance of expr. It considers rows as the whole population, not as a sample, so it has the number of rows as the denominator. You can also use VARIANCE(), which is equivalent but is not standard SQL.

    If there are no matching rows, VAR_POP() returns NULL.

  • VAR_SAMP(expr)

    Returns the sample variance of expr. That is, the denominator is the number of rows minus one.

    If there are no matching rows, VAR_SAMP() returns NULL.

  • VARIANCE(expr)

    Returns the population standard variance of expr. VARIANCE() is a synonym for the standard SQL function VAR_POP(), provided as a MySQL extension.

    If there are no matching rows, VARIANCE() returns NULL.


User Comments
User comments in this section are, as the name implies, provided by MySQL users. The MySQL documentation team is not responsible for, nor do they endorse, any of the information provided here.
  Posted by Martin Lester on October 1, 2010
If you are writting queries where you want a COUNT (or SUM) for when a field has a specific value, there is a nice optimisation. E.g.

SELECT COUNT( IF(x=1, 1, NULL) ) AS c;
SELECT SUM( IF(x=1, 1, 0) ) AS c;
SELECT COUNT(x=1 OR NULL) AS c;

The above 3 statements are equivalent. Though the last is not immediately obvious how its working. The condition x=1 is evaluated as TRUE or FALSE. If it is TRUE, then the lazy evaluation of the OR means that it exits and runs COUNT(TRUE). When x=1 is FALSE, you get FALSE OR NULL, which evaluates to NULL, so you get COUNT(NULL). This means you only COUNT the lines where x=1.

It is also worth noting that in all cases above you can have any condition e.g. x>1, x IN (1,2,3), (x=1 AND y=2)
i.e. COUNT( (x=1 AND y=2) OR NULL ) as c

  Posted by t tomtul2 on December 15, 2010
Note that Variance is quite different from Coefficient of Variation.

Variance() and Var_pop() return the Variance. The variance of a random variable or distribution is the mean of the squared deviation of that variable from its mean.

Coefficient of Variation must be calcuated by STDDEV_POP(x)/Avg(x), as far as I know there is not a built in function.

http://en.wikipedia.org/wiki/Coefficient_of_variation
http://en.wikipedia.org/wiki/Variance

  Posted by ronel licayan on April 26, 2011
To Erel Segal:

FIRST Aggregate Function MySQL equivalent is MIN

MIN returns the lowest number for numeric fields
MIN returns the first line in the GROUP BY Clause base on Sort Order for non numeric fields
  Posted by Henry Hayes on July 24, 2011
Here's a good example of how to limit the GROUP CONCAT function.

http://www.websitefactors.co.uk/mysql/2011/07/mysql-limit-the-values-in-the-group_concat-function/
  Posted by Biju Mon on June 15, 2012
An Example of Running Total Calculations
--------------------------------------------

SELECT
B.Year_Enter,
B.Month_Enter,
B.MonthName_Enter,
@IT_Open := @IT_Open + B.IT AS IT_Open,
@CS_Open := @CS_Open + B.CS AS CS_Open
FROM
(SELECT
A.Year_Enter,
A.Month_Enter,
A.MonthName_Enter,
SUM(IF(A.Month_Enter <> A.Month_Close, A.IT, 0)) AS IT,
SUM(IF(A.Month_Enter <> A.Month_Close, A.CS, 0)) AS CS
FROM
(SELECT
YEAR(I.Date_Entered) AS Year_Enter,
MONTH(I.Date_Entered) AS Month_Enter,
MONTHNAME(I.Date_Entered) AS MonthName_Enter,
IFNULL(MONTH(I.Closed_Date),0) AS Month_Close,
IFNULL(MONTHNAME(I.Closed_Date),'') AS MonthName_Close,
SUM(IF(U.Department = 'IT',1,0)) AS IT,
SUM(IF(U.Department = 'CS',1,0)) AS CS
FROM
Incident I,
User U
WHERE
I.Initiated_By = U.userLogin
GROUP BY
YEAR(I.Date_Entered),
MONTH(I.Date_Entered),
MONTHNAME(I.Date_Entered),
MONTH(I.Closed_Date),
MONTHNAME(I.Closed_Date)
ORDER BY
YEAR(I.Date_Entered),
MONTH(I.Date_Entered),
MONTH(I.Closed_Date)) A
GROUP BY
A.Year_Enter,
A.Month_Enter) B,
(SELECT
@IT_Open := 0,
@CS_Open := 0
FROM DUAL) C
  Posted by David Cooke on August 25, 2012
Yet another approach to finding the MEDIAN of a group of numbers. It requires entering your table specification and column name in only one place. It uses prepared statements, so it may not be applicable to creating a stored function, but it works for one-off uses by copying and pasting into your MySQL client.

First it runs a query to count the (non-null) values and saves the count in variable @N_VALS.

Then it runs a query containing a subquery:
- Inner query (named SQ1): from the values, selects the median value (if the number of values is odd) or the two middle values. During the selection, stores the value(s) selected in variables @MID1 and @MID2, leaving @MID1 null or @MID1<=@MID2. To accomplish this the values are ordered using an index or sort. The full scan required by some approaches, after the ordering, to assign a sequence number to every row is avoided. By using the LIMIT keyword, we can proceed directly to (at worst) a scan over half the values (also required where sequence numbers are assigned) to fetch the median value or the two middle values. We use LIMIT with OFFSET to identify the one or two values we need by position in the ordered list of values.
- Outer query: Uses @N_VALS, @MID1 and @MID2 to calculate median.

SET @TABLE =
/* Your table (or schema.table) > */ 'the_table_specification';
SET @COLUMN =
/* Your column > */ 'the_column_name';
#
SET @MEDIAN_QUERY1 = CONCAT('SELECT COUNT(', @COLUMN,') FROM ', @TABLE,' INTO @N_VALS');
PREPARE MEDIAN_PREPARED1 FROM @MEDIAN_QUERY1;
EXECUTE MEDIAN_PREPARED1;
DEALLOCATE PREPARE MEDIAN_PREPARED1;
SET @ODD = @N_VALS%2;
SET @MEDIAN_QUERY2 = CONCAT('SELECT CAST(IF(@ODD, @MID2, (@MID1+@MID2)/2) AS CHAR) AS `MEDIAN(', @COLUMN, ')` FROM (SELECT @MID1:=@MID2, @MID2:=', @COLUMN, ' FROM ',
@TABLE, ' WHERE ', @COLUMN, ' IS NOT NULL ORDER BY ', @COLUMN, ' LIMIT ', 2-@ODD, ' OFFSET ', @N_VALS DIV 2 - 1*(@N_VALS>'0') + @ODD,') SQ1 LIMIT 1');
PREPARE MEDIAN_PREPARED2 FROM @MEDIAN_QUERY2;
EXECUTE MEDIAN_PREPARED2;
DEALLOCATE PREPARE MEDIAN_PREPARED2;

  Posted by Miklos Oszkar on September 4, 2012
Here's a good example of how to limit the GROUP CONCAT function.

http://mahmudahsan.wordpress.com/2008/08/27/mysql-the-group_concat-function/

http://www.qubed.ro
  Posted by Alexander Cheprasov on November 19, 2013
Easy way to use LIMIT in GROUP_CONCAT function:

SELECT GROUP_CONCAT(`t`.`id`) FROM
(SELECT `id` FROM `table` LIMIT 5) `t`
  Posted by Mike Jorgenstam on February 14, 2014
@David Cooke...

Great function but 1) Cannot run in stored function so must be procedure. 2) You forgot to ad " INTO @SomeOutputVariable" in the final line I believe...

Here is a corrected version...

-- ------------------------------------------------------------------------------------
-- Dynamic SQL not permitted in stored functions, thus using a stored procedure instead.
-- Median procedure => MEDIAN(Using Column,From Table,InOut Result)
--
-- Use by...
-- CALL MEDIAN('YourColumn','YourTable',@res);
-- SELECT @res AS MEDIAN;
-- ------------------------------------------------------------------------------------
DELIMITER $$
DROP PROCEDURE IF EXISTS `MEDIAN`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE `MEDIAN`(UsingCol varchar(64),FromTbl varchar(64),INOUT res decimal(50,5))
BEGIN

SET @ires = 0;

SET @TABLE = FromTbl;
/* Your table (or schema.table) > 'the_table_specification';*/
SET @COLUMN = UsingCol;
/* Your column > 'the_column_name'; */
#
SET @MEDIAN_QUERY1 = CONCAT('SELECT COUNT(', @COLUMN,') FROM ', @TABLE,' INTO @N_VALS');
PREPARE MEDIAN_PREPARED1 FROM @MEDIAN_QUERY1;
EXECUTE MEDIAN_PREPARED1;
DEALLOCATE PREPARE MEDIAN_PREPARED1;
SET @ODD = @N_VALS%2;
SET @MEDIAN_QUERY2 = CONCAT('SELECT CAST(IF(@ODD, @MID2, (@MID1+@MID2)/2) AS CHAR) AS `MEDIAN(', @COLUMN, ')` FROM (SELECT @MID1:=@MID2, @MID2:=', @COLUMN, ' FROM ',
@TABLE, ' WHERE ', @COLUMN, ' IS NOT NULL ORDER BY ', @COLUMN, ' LIMIT ', 2-@ODD, ' OFFSET ', @N_VALS DIV 2 - 1*(@N_VALS>'0') + @ODD,') SQ1 LIMIT 1 INTO @ires');
PREPARE MEDIAN_PREPARED2 FROM @MEDIAN_QUERY2;
EXECUTE MEDIAN_PREPARED2;
DEALLOCATE PREPARE MEDIAN_PREPARED2;

SET res = @ires;

END;
$$

DELIMITER ;

--- Now for the rest...

Just a few samples without needing to learn C / C++ or get a separate compiler outside MySQL

-- ------------------------------------------
-- Faculty function FACULTY(x) => x!
-- input Max value = 65, min value = 0
-- ------------------------------------------
DELIMITER $$
DROP FUNCTION IF EXISTS `FACULTY`$$
CREATE DEFINER=`root`@`localhost` FUNCTION `FACULTY`(x tinyint(3) UNSIGNED) RETURNS bigint(20) UNSIGNED
BEGIN

IF x = 1 OR x = 0 THEN
SET @res = 1;
ELSE

SET @res = 1;
SET @i = x;
WHILE @i >= 1 DO
SET @res = (@res * @i);
SET @i = @i - 1;
END WHILE;

END IF;

RETURN @res;

END;
$$

DELIMITER ;

-- ------------------------------------------
-- Combination function COMBINATIONS(x,y) => x!/y!/(x-y)!
-- input Max value = 65, min value = 0
-- ------------------------------------------
DELIMITER $$
DROP FUNCTION IF EXISTS `COMBINATIONS`$$
CREATE DEFINER=`root`@`localhost` FUNCTION `COMBINATIONS`(x tinyint(3) UNSIGNED,y tinyint(3) UNSIGNED) RETURNS bigint(20) UNSIGNED
BEGIN

IF (x < y) THEN

SET @res = 0;

ELSEIF (x = y) OR (x = 0) OR (y = 0) THEN

SET @res = 1;

ELSE

SET @varX = FACULTY(x);
SET @varY = FACULTY(y);
SET @varZ = FACULTY(x-y);
SET @res = (@varX / @varY / @varZ);

END IF;

RETURN @res;

END;
$$

DELIMITER ;
Sign Up Login You must be logged in to post a comment.