The storage requirements for each of the data types supported by MySQL are listed here by category.
The maximum size of a row in a MySQL table is 65,535 bytes. Each
BLOB and TEXT column
accounts for only nine to twelve bytes toward this size. This
limitation may be shared by other storage engines as well.
For tables using the NDBCLUSTER storage engine, you must also
take 4-byte alignment into account when calculating storage
requirements: all NDB data storage is done in multiples of
4 bytes. Thus, a column value that would take 15 bytes in a
table using a storage engine other than NDB requires 16 bytes
in an NDB table. This requirement applies in addition to any
other considerations discussed in this section. For example,
in NDBCLUSTER tables, the TINYINT, SMALLINT, MEDIUMINT, and
INTEGER (INT) column types each require 4 bytes of storage per
record because of the alignment factor.
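The alignment rule above amounts to rounding each size up to the next multiple of 4. A minimal sketch, assuming only what the paragraph states (the function name `ndb_aligned_size` is hypothetical and not part of MySQL):

```python
def ndb_aligned_size(nbytes: int) -> int:
    """Round a byte count up to the next multiple of 4 (NDB alignment)."""
    if nbytes < 0:
        raise ValueError("byte count must be non-negative")
    return (nbytes + 3) // 4 * 4


# Sizes from the numeric types table, before and after alignment.
for type_name, size in [("TINYINT", 1), ("SMALLINT", 2), ("MEDIUMINT", 3), ("INT", 4)]:
    print(type_name, size, "->", ndb_aligned_size(size))

print(ndb_aligned_size(15))  # the 15-byte example from the text: 16
```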
In addition, when calculating storage requirements for Cluster
tables, you must remember that every table using the
NDBCLUSTER storage engine requires a primary
key; if no primary key is defined by the user, then a
“hidden” primary key will be created by
NDB. This hidden primary key consumes 31-35
bytes per table record.
You may find the ndb_size.pl utility to be
useful for estimating NDB storage requirements.
This Perl script connects to a current MySQL (non-Cluster)
database and creates a report on how much space that database
would require if it used the NDBCLUSTER storage
engine. See Section 15.10.14, “ndb_size.pl — NDBCLUSTER Size Requirement Estimator”,
for more information.
Storage Requirements for Numeric Types
| Data Type | Storage Required |
| --- | --- |
| TINYINT | 1 byte |
| SMALLINT | 2 bytes |
| MEDIUMINT | 3 bytes |
| INT, INTEGER | 4 bytes |
| BIGINT | 8 bytes |
| FLOAT(p) | 4 bytes if 0 <= p <= 24, 8 bytes if 25 <= p <= 53 |
| FLOAT | 4 bytes |
| DOUBLE [PRECISION], REAL | 8 bytes |
| DECIMAL(M,D), NUMERIC(M,D) | Varies; see following discussion |
In MySQL versions up to and including 4.1,
DECIMAL(M,D) columns are represented as strings and
their storage requirements are:
M+2 bytes, if D > 0
M+1 bytes, if D = 0
D+2 bytes, if M < D
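The three cases above can be written as a single function. This is a sketch only: the name `legacy_decimal_bytes` is hypothetical, and it assumes the M < D case takes precedence over the other two, since otherwise it could never apply when D > 0.

```python
def legacy_decimal_bytes(m: int, d: int = 0) -> int:
    """Storage for a string-based DECIMAL(M,D), per the rules listed above."""
    if m < d:        # checked first; assumed to override the D > 0 case
        return d + 2
    if d > 0:
        return m + 2
    return m + 1     # D = 0


print(legacy_decimal_bytes(10, 2))  # 12
print(legacy_decimal_bytes(10, 0))  # 11
print(legacy_decimal_bytes(2, 5))   # 7
```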
Storage Requirements for Date and Time Types
| Data Type | Storage Required |
| --- | --- |
| DATE | 3 bytes |
| TIME | 3 bytes |
| DATETIME | 8 bytes |
| TIMESTAMP | 4 bytes |
| YEAR | 1 byte |
The storage requirements shown in the table arise from the way that MySQL represents temporal values:
DATE: A three-byte integer packed as DD + MM×32 + YYYY×16×32
TIME: A three-byte integer packed as DD×24×3600 + HH×3600 + MM×60 + SS
DATETIME: Eight bytes:
    A four-byte integer packed as YYYY×10000 + MM×100 + DD
    A four-byte integer packed as HH×10000 + MM×100 + SS
TIMESTAMP: A four-byte integer representing seconds UTC since the epoch ('1970-01-01 00:00:00' UTC)
YEAR: A one-byte integer
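The packing formulas above can be checked with a few lines of arithmetic. The sketch below only reproduces the integer values. It makes no claim about on-disk byte order, and all function names are hypothetical.

```python
def pack_date(year: int, month: int, day: int) -> int:
    """DATE: DD + MM*32 + YYYY*16*32."""
    return day + month * 32 + year * 16 * 32


def unpack_date(packed: int) -> tuple:
    """Invert pack_date: 5 bits of day, 4 bits of month, the rest is the year."""
    day = packed % 32
    month = (packed // 32) % 16
    year = packed // (16 * 32)
    return year, month, day


def pack_time(days: int, hours: int, minutes: int, seconds: int) -> int:
    """TIME: DD*24*3600 + HH*3600 + MM*60 + SS."""
    return days * 24 * 3600 + hours * 3600 + minutes * 60 + seconds


def pack_datetime(year, month, day, hours, minutes, seconds) -> tuple:
    """DATETIME: two four-byte integers, one for the date and one for the time."""
    date_part = year * 10000 + month * 100 + day
    time_part = hours * 10000 + minutes * 100 + seconds
    return date_part, time_part


packed = pack_date(2008, 12, 31)
print(packed, packed < 2 ** 24)   # fits in three bytes
print(unpack_date(packed))
print(pack_datetime(2008, 12, 31, 23, 59, 59))
```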
Storage Requirements for String Types
In the following table, M represents
the declared column length in characters for non-binary string
types and bytes for binary string types.
L represents the actual length in bytes
of a given string value.
| Data Type | Storage Required |
| --- | --- |
| CHAR(M) | M × w bytes, 0 <= M <= 255, where w is the number of bytes required for the maximum-length character in the character set |
| BINARY(M) | M bytes, 0 <= M <= 255 |
| VARCHAR(M), VARBINARY(M) | L + 1 bytes, 0 <= M <= 255 |
| TINYBLOB, TINYTEXT | L + 1 bytes, where L < 2^8 |
| BLOB, TEXT | L + 2 bytes, where L < 2^16 |
| MEDIUMBLOB, MEDIUMTEXT | L + 3 bytes, where L < 2^24 |
| LONGBLOB, LONGTEXT | L + 4 bytes, where L < 2^32 |
| ENUM('value1','value2',...) | 1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
| SET('value1','value2',...) | 1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum) |
Variable-length string types are stored using a length prefix plus
data. The length prefix requires from one to four bytes depending
on the data type, and the value of the prefix is
L (the byte length of the string). For
example, storage for a MEDIUMTEXT value
requires L bytes to store the value
plus three bytes to store the length of the value.
As of MySQL 4.1, to calculate the number of bytes used to store a
particular CHAR, VARCHAR, or
TEXT column value, you must take into account
the character set used for that column and whether the value
contains multi-byte characters. In particular, when using the
utf8 Unicode character set, you must keep in
mind that not all utf8 characters use the same
number of bytes and can require up to three bytes per character.
For a breakdown of the storage used for different categories of
utf8 characters, see
Section 9.1.7, “Unicode Support”.
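To see how the byte length of a value depends on its characters, you can encode sample strings yourself. The sketch below uses Python's standard UTF-8 encoder as a stand-in. It assumes every character lies in the Basic Multilingual Plane, where the encoding uses one to three bytes as described above. Python would emit four bytes for characters outside that range, which the three-byte limit stated here does not cover.

```python
def utf8_byte_length(text: str) -> int:
    """Byte length L of a string once encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = ["a", "\u00e9", "\u20ac", "na\u00efve"]  # a, e-acute, euro sign, naive with diaeresis
for s in samples:
    print(repr(s), len(s), "characters,", utf8_byte_length(s), "bytes")
```

A four-character string therefore occupies anywhere from 4 to 12 bytes before the length prefix is added.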
VARCHAR and the BLOB and
TEXT types are variable-length types. For each,
the storage requirements depend on the actual length of column
values (represented by L in the
preceding table), rather than on the type's maximum possible size.
For example, a VARCHAR(10) column can hold a
string with a maximum length of 10 characters. The actual storage
required is the length of the string
(L), plus one byte to record the length
of the string. For the string 'abcd',
L is 4 and the storage requirement is
five bytes.
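The length-prefix rule for variable-length types can be sketched as a lookup of prefix size plus the byte length of the value. The prefix sizes are taken from the string types table. The names `LENGTH_PREFIX_BYTES` and `stored_bytes` are hypothetical.

```python
# Length-prefix size in bytes for each variable-length type (from the table above).
LENGTH_PREFIX_BYTES = {
    "VARCHAR": 1,
    "TINYBLOB": 1, "TINYTEXT": 1,
    "BLOB": 2, "TEXT": 2,
    "MEDIUMBLOB": 3, "MEDIUMTEXT": 3,
    "LONGBLOB": 4, "LONGTEXT": 4,
}


def stored_bytes(type_name: str, value: bytes) -> int:
    """Return L + prefix, where L is the byte length of the value."""
    prefix = LENGTH_PREFIX_BYTES[type_name.upper()]
    if len(value) >= 2 ** (8 * prefix):
        raise ValueError("value too long for " + type_name)
    return len(value) + prefix


print(stored_bytes("VARCHAR", b"abcd"))          # 5, as in the example above
print(stored_bytes("MEDIUMTEXT", b"x" * 1000))   # 1003
```

Pass the value already encoded in the column's character set, so that L reflects bytes rather than characters.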
The NDBCLUSTER engine supports only
fixed-width columns. This means that a
VARCHAR column from a table in a MySQL
Cluster will behave almost as if it were of type
CHAR (except that each record still has one
extra byte overhead). For example, in an NDB
table, each record in a column declared as
VARCHAR(100) will take up 101 bytes for
storage, regardless of the length of the string actually stored
in any given record.
TEXT and BLOB columns are
implemented differently in the NDBCLUSTER
storage engine, wherein each record in a TEXT
column is made up of two separate parts. One of these is of fixed
size (256 bytes), and is actually stored in the original table.
The other consists of any data in excess of 256 bytes, which is
stored in a hidden table. The records in this second table are
always 2,000 bytes long. This means that the size of a
TEXT column is 256 if size <= 256 (where size represents the
size of the record); otherwise, the size is
256 + size + (2000 - (size - 256) % 2000).
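The NDBCLUSTER TEXT rule can be transcribed directly. The sketch below implements the formula exactly as the text gives it, without trying to verify it against the engine. The function name `ndb_text_size` is hypothetical.

```python
def ndb_text_size(size: int) -> int:
    """Size of an NDBCLUSTER TEXT value, following the formula in the text."""
    if size <= 256:
        return 256  # the fixed part stored in the original table
    # Fixed part, plus the data, plus padding to the 2,000-byte hidden-table records.
    return 256 + size + (2000 - (size - 256) % 2000)


for size in (100, 256, 1000):
    print(size, "->", ndb_text_size(size))
```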
The size of an ENUM object is determined by the
number of different enumeration values. One byte is used for
enumerations with up to 255 possible values. Two bytes are used
for enumerations having between 256 and 65,535 possible values.
See Section 10.4.4, “The ENUM Type”.
The size of a SET object is determined by the
number of different set members. If the set size is
N, the object occupies (N + 7) / 8 bytes,
rounded up to 1, 2, 3, 4, or 8 bytes. A SET can
have a maximum of 64 members. See Section 10.4.5, “The SET Type”.
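The ENUM and SET sizing rules can be sketched as two small functions. The names `enum_bytes` and `set_bytes` are hypothetical. The division in (N + 7) / 8 is assumed to be integer division, which is the reading that matches the 64-member, 8-byte maximum.

```python
def enum_bytes(value_count: int) -> int:
    """1 byte for up to 255 enumeration values, 2 bytes for 256 to 65,535."""
    if not 1 <= value_count <= 65535:
        raise ValueError("ENUM supports 1 to 65,535 values")
    return 1 if value_count <= 255 else 2


def set_bytes(member_count: int) -> int:
    """(N + 7) / 8 bytes, rounded up to 1, 2, 3, 4, or 8."""
    if not 1 <= member_count <= 64:
        raise ValueError("SET supports 1 to 64 members")
    raw = (member_count + 7) // 8   # integer division (assumed)
    for allowed in (1, 2, 3, 4, 8):
        if raw <= allowed:
            return allowed


print(enum_bytes(255), enum_bytes(256))
print([set_bytes(n) for n in (8, 9, 24, 25, 33, 64)])
```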

User Comments
Had a lot of trouble finding the maximum table size in bytes for capacity planning. More specifically it was InnoDB tables that I had a problem with. Average row size is good, but I wanted maximum row size.
I checked several products and could not find what I wanted. Some of the tables I deal with are 300+ fields and so manual calculation was not practical.
So I wrote a little Perl script that does it. Thought it might be of some use, so I include it here. It handles all field types except ENUM and SET, and it does not calculate anything regarding index size.
Just do a mysqldump -d (schema only) of your database to a file, then run this Perl script with the schema file as its only argument.
----------------------------------------------------------------
#!/usr/bin/perl
use Data::Dumper;
use strict;
$| = 1;
my %DataType =
("TINYINT"=>1,
"SMALLINT"=>2,
"MEDIUMINT"=>3,
"INT"=>4,
"BIGINT"=>8,
"FLOAT"=>'if ($M <= 24) {return 4;} else {return 8;}',
"DOUBLE"=>8,
"DECIMAL"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"NUMERIC"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"DATE"=>3,
"DATETIME"=>8,
"TIMESTAMP"=>4,
"TIME"=>3,
"YEAR"=>1,
"CHAR"=>'$M',
"VARCHAR"=>'$M+1',
"TINYBLOB"=>'$M+1',
"TINYTEXT"=>'$M+1',
"BLOB"=>'$M+2',
"TEXT"=>'$M+2',
"MEDIUMBLOB"=>'$M+3',
"MEDIUMTEXT"=>'$M+3',
"LONGBLOB"=>'$M+4',
"LONGTEXT"=>'$M+4');
my $D;
my $M;
my $dt;
my $fieldCount = 0;
my $byteCount = 0;
my $fieldName;
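# Stop with a clear message if the schema file cannot be opened.
open (TABLEFILE, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";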
LOGPARSE:while (<TABLEFILE>)
{
chomp;
# Allow an optional backtick before the table name, as mysqldump may quote it.
if ( $_ =~ s/create table[ ]*`?([a-zA-Z0-9_]*).*/$1/i )
{
print "Fieldcount: $fieldCount Bytecount: $byteCount\n" if $fieldCount;
$fieldCount = 0;
$byteCount = 0;
print "\nTable: $_\n";
next;
}
# DATETIME must be tried before DATE; otherwise DATETIME columns match as DATE (3 bytes).
next if $_ !~ s/(.*)[ ]+(TINYINT[ ]*\(*[0-9,]*\)*|SMALLINT[ ]*\(*[0-9,]*\)*|MEDIUMINT[ ]*\(*[0-9,]*\)*|INT[ ]*\(*[0-9,]*\)*|BIGINT[ ]*\(*[0-9,]*\)*|FLOAT[ ]*\(*[0-9,]*\)*|DOUBLE[ ]*\(*[0-9,]*\)*|DECIMAL[ ]*\(*[0-9,]*\)*|NUMERIC[ ]*\(*[0-9,]*\)*|DATETIME[ ]*\(*[0-9,]*\)*|DATE[ ]*\(*[0-9,]*\)*|TIMESTAMP[ ]*\(*[0-9,]*\)*|TIME[ ]*\(*[0-9,]*\)*|YEAR[ ]*\(*[0-9,]*\)*|CHAR[ ]*\(*[0-9,]*\)*|VARCHAR[ ]*\(*[0-9,]*\)*|TINYBLOB[ ]*\(*[0-9,]*\)*|TINYTEXT[ ]*\(*[0-9,]*\)*|BLOB[ ]*\(*[0-9,]*\)*|TEXT[ ]*\(*[0-9,]*\)*|MEDIUMBLOB[ ]*\(*[0-9,]*\)*|MEDIUMTEXT[ ]*\(*[0-9,]*\)*|LONGBLOB[ ]*\(*[0-9,]*\)*|LONGTEXT[ ]*\(*[0-9,]*\)*).*/$2/gix;
$fieldName=$1;
$_=uc;
$D=0;
($D = $_) =~ s/.*\,([0-9]+).*/$1/g if ( $_ =~ m/\,/ );
$_ =~ s/\,([0-9]*)//g if ( $_ =~ m/\,/ );
($M = $_) =~ s/[^0-9]//g;
$M=0 if ! $M;
($dt = $_) =~ s/[^A-Za-z_]*//g;
print "$fieldName $_:\t".eval($DataType{"$dt"})." bytes\n";
++$fieldCount;
$byteCount += eval($DataType{"$dt"});
}
print "Fieldcount: $fieldCount Bytecount: $byteCount\n";
Here's a modification of Marc's script above that also handles ENUM columns. Enjoy.
#!/usr/bin/perl
use Data::Dumper;
use strict;
$| = 1;
my %DataType =
("TINYINT"=>1, "SMALLINT"=>2, "MEDIUMINT"=>3,
"INT"=>4, "BIGINT"=>8,
"FLOAT"=>'if ($M <= 24) {return 4;} else {return 8;}',
"DOUBLE"=>8,
"DECIMAL"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"NUMERIC"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"DATE"=>3, "DATETIME"=>8, "TIMESTAMP"=>4, "TIME"=>3, "YEAR"=>1,
"CHAR"=>'$M', "VARCHAR"=>'$M+1',
"ENUM"=>1, # assumes at most 255 values; ENUMs with 256 to 65,535 values need 2 bytes
"TINYBLOB"=>'$M+1', "TINYTEXT"=>'$M+1',
"BLOB"=>'$M+2', "TEXT"=>'$M+2',
"MEDIUMBLOB"=>'$M+3', "MEDIUMTEXT"=>'$M+3',
"LONGBLOB"=>'$M+4', "LONGTEXT"=>'$M+4');
my ($D, $M, $dt);
my $fieldCount = 0;
my $byteCount = 0;
my $fieldName;
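# Stop with a clear message if the schema file cannot be opened.
open (TABLEFILE, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";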
LOGPARSE:while (<TABLEFILE>) {
chomp;
# Capture only the table name; the backtick quoting is optional.
if ( $_ =~ s/create table[ ]*`?([a-zA-Z0-9_]*).*/$1/i ) {
print "Fieldcount: $fieldCount Bytecount: $byteCount\n" if $fieldCount;
$fieldCount = 0;
$byteCount = 0;
print "\nTable: $_\n";
next;
}
# DATETIME must be tried before DATE; otherwise DATETIME columns match as DATE (3 bytes).
next if $_ !~ s/(.*)[ ]+(TINYINT[ ]*\(*[0-9,]*\)*|SMALLINT[ ]*\(*[0-9,]*\)*|MEDIUMINT[ ]*\(*[0-9,]*\)*|INT[ ]*\(*[0-9,]*\)*|BIGINT[ ]*\(*[0-9,]*\)*|FLOAT[ ]*\(*[0-9,]*\)*|DOUBLE[ ]*\(*[0-9,]*\)*|DECIMAL[ ]*\(*[0-9,]*\)*|NUMERIC[ ]*\(*[0-9,]*\)*|DATETIME[ ]*\(*[0-9,]*\)*|DATE[ ]*\(*[0-9,]*\)*|TIMESTAMP[ ]*\(*[0-9,]*\)*|TIME[ ]*\(*[0-9,]*\)*|YEAR[ ]*\(*[0-9,]*\)*|CHAR[ ]*\(*[0-9,]*\)*|VARCHAR[ ]*\(*[0-9,]*\)*|TINYBLOB[ ]*\(*[0-9,]*\)*|TINYTEXT[ ]*\(*[0-9,]*\)*|ENUM[ ]*\(*['A-Za-z_,]*\)*|BLOB[ ]*\(*[0-9,]*\)*|TEXT[ ]*\(*[0-9,]*\)*|MEDIUMBLOB[ ]*\(*[0-9,]*\)*|MEDIUMTEXT[ ]*\(*[0-9,]*\)*|LONGBLOB[ ]*\(*[0-9,]*\)*|LONGTEXT[ ]*\(*[0-9,]*\)*).*/$2/gix;
$fieldName=$1;
$_=uc;
$D=0;
($D = $_) =~ s/.*\,([0-9]+).*/$1/g if ( $_ =~ m/\,/ );
$_ =~ s/\,([0-9]*)//g if ( $_ =~ m/\,/ );
($M = $_) =~ s/[^0-9]//g;
$M=0 if ! $M;
($dt = $_) =~ s/\(.*\)//g;
$dt =~ s/[^A-Za-z_]*//g;
print "$fieldName $_:\t".eval($DataType{"$dt"})." bytes\n";
++$fieldCount;
$byteCount += eval($DataType{"$dt"});
}
print "Fieldcount: $fieldCount Bytecount: $byteCount\n";