WL#3196: Generic table space API
Affects: Connector/.NET-5.2
—
Status: In-Design
The generic table space interface allows storage of multiple tables in a single
OS-level file or raw partition. It aims to be as transparent as possible, and
should coexist with the possibility of storing tables in individual files.
Features
--------
· Multiple tables per space.
· Multiple table spaces in a database.
· The user can choose the table space on a table-by-table
basis.
· Where the underlying operating system supports it, table
spaces can be placed on raw partitions. No code changes
should be needed to do this.
· It should be possible to store tables from different
engines in the same table space. This needs consideration.
Advantages
----------
We expect a number of advantages from storing tables in table
space instead of in individual files:
· Far fewer open files. Under UNIX and similar operating
systems, each open table currently requires two file
descriptors. With table spaces, a single kernel file
descriptor can handle multiple tables.
· Better storage efficiency. General purpose file systems
offer many features not needed for storing the fixed-sized
large blocks used in databases. Some of these features
add overhead that is unnecessary for a database table.
· The option of optimizing layout for performance. Since
the storage allocation is under the direct control of the
server, it could be allocated to ensure locality of
reference. In a RAID system, the allocator could use knowledge
of the RAID layout to determine optimum allocation.
Constraints
-----------
· As far as possible, the use of table spaces should be
transparent. In particular, this means that little code
should be rewritten.
· All blocks in a specific table space must be of the same
size.
· It must be possible to represent a table space with a file
in an existing file system.
Table description files
-----------------------
The current proposal does not support storage of table
description files (.frm files) in table space. It may be possible to
add this at a later point, but currently there is a "chicken
and egg" problem: the .frm file is needed to locate the table
space file.
Functional interface
--------------------
The current proposal is to integrate table spaces into mysys,
in the my_open, my_write, my_read and my_close functions. As
far as possible, it should be transparent to the caller that
the table is stored in a table space and not in individual
files.
Changes in interface
--------------------
The main change in the interface is the manner in which file
descriptors are used. Currently a descriptor is the UNIX file
descriptor or similar, a small non-negative number returned by
the operating system:
+---------------------------------------+
|                                       |
|            file descriptor            |
+---------------------------------------+
31                                      0
The proposed change places a non-zero table space ID in the
high-order bits of the descriptor:
+--------------+------------------------+
|              |                        |
|table space ID|    file descriptor     |
+--------------+------------------------+
31                                      0
There are a number of considerations:
· The table space ID is non-zero, so a descriptor whose
table space field is zero is an ordinary kernel file
descriptor, and the interface for normal files remains
unchanged.
· For table spaces, the functions my_read, my_write and
friends must identify file descriptors referring to a
table space and act accordingly.
· This approach assumes that the number of bits required to
represent file descriptors allocated by the kernel is
significantly less than 32. The exact number of bits is
difficult to determine; currently it seems unlikely that
any system will have more than 65536 files open at the
same time, so the implementation might use the high-order
16 bits to identify the table space and the low-order 16
bits to identify the file. If this proves to be a
limitation, it should be possible to provide for a
different split between table space ID and table within
the space.
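The descriptor layout and the check that my_read, my_write and
friends would perform can be sketched as follows. This is only an
illustration under stated assumptions: it uses the 16/16 split
suggested above, and the names ts_pack, ts_space_id, ts_slot and
ts_io_path are hypothetical, not part of mysys. The sketch also
assumes that bit 31 is left clear so that a packed descriptor is
never negative and cannot be confused with a -1 error return.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t File;            /* stand-in for the mysys File type */

/* Assumed 16/16 split; the proposal leaves the exact split open. */
enum { TS_SLOT_BITS = 16 };
#define TS_SLOT_MASK 0xFFFFu
/* Assumption of this sketch: bit 31 stays clear, so IDs are 1..0x7FFF. */
#define TS_MAX_ID    0x7FFFu

/* Combine a non-zero table space ID with a slot inside that space. */
static File ts_pack(uint32_t space_id, uint32_t slot)
{
  assert(space_id >= 1 && space_id <= TS_MAX_ID);
  assert(slot <= TS_SLOT_MASK);
  return (File) ((space_id << TS_SLOT_BITS) | slot);
}

/* Returns 0 for an ordinary kernel descriptor (or an error value),
   otherwise the table space ID held in the high-order bits. */
static uint32_t ts_space_id(File fd)
{
  return fd < 0 ? 0 : ((uint32_t) fd >> TS_SLOT_BITS);
}

/* Low-order bits: the file or table within the table space. */
static uint32_t ts_slot(File fd)
{
  return (uint32_t) fd & TS_SLOT_MASK;
}

/* Which path an I/O wrapper would take for this descriptor. */
typedef enum { PATH_KERNEL, PATH_TABLE_SPACE } io_path;

static io_path ts_io_path(File fd)
{
  return ts_space_id(fd) != 0 ? PATH_TABLE_SPACE : PATH_KERNEL;
}
```

A wrapper such as my_read would call the equivalent of ts_io_path
first and fall through to the existing kernel call for PATH_KERNEL,
so callers holding ordinary descriptors see no change.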
New function
------------
Currently new tables are created and existing tables are opened
by calling my_open with appropriate parameters. For table
spaces, a function my_open_table will be provided with a
similar interface:
File my_open(const char *FileName,  /* Path-name of file */
             int Flags,             /* Read | write .. */
             myf MyFlags)           /* Special flags */

File my_open_table(const char *TableSpaceName,  /* Path-name of table file */
                   const char *FileName,        /* SQL-visible name of table */
                   int Flags,                   /* Read | write .. */
                   myf MyFlags)                 /* Special flags */
my_open does not need to be modified; it returns a kernel file
descriptor as before. my_open_table performs the following
steps:
· Ensure that the specified table space is open. This
requires keeping a list of open table spaces.
· Locate the FileName within the table space.
· Return a "file descriptor" derived from the table space
and the table itself. The first component could be the
index of the table space in the list of open table spaces,
and the second could be related to the location of the
table within the table space. This could make it
unnecessary to maintain any further information in memory
about the individual tables.
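The three steps above can be sketched with an in-memory model. This
is not the proposed implementation: the table space directory is a
plain array instead of an on-disk structure, the Flags and MyFlags
parameters are omitted, and every name here (table_space,
open_spaces, ensure_space_open, locate_table, sketch_add_table,
my_open_table_sketch) is hypothetical. The 16-bit shift is the split
assumed earlier in this document.

```c
#include <stdint.h>
#include <string.h>

typedef int32_t File;              /* stand-in for the mysys File type */

#define MAX_SPACES 8               /* arbitrary limits for the sketch */
#define MAX_TABLES 16

/* In-memory stand-in for an open table space and its directory.
   Name pointers are stored, not copied. */
typedef struct {
  const char *name;                /* NULL when the entry is unused */
  const char *tables[MAX_TABLES];  /* table names, indexed by slot */
} table_space;

static table_space open_spaces[MAX_SPACES];   /* list of open spaces */

/* Step 1: return the 1-based index of the space in the list,
   "opening" it if needed; 0 if the list is full. */
static int ensure_space_open(const char *space_name)
{
  int free_entry = -1;
  for (int i = 0; i < MAX_SPACES; i++) {
    if (open_spaces[i].name == NULL) {
      if (free_entry < 0) free_entry = i;
    } else if (strcmp(open_spaces[i].name, space_name) == 0) {
      return i + 1;
    }
  }
  if (free_entry < 0) return 0;
  open_spaces[free_entry].name = space_name;  /* real code opens the file */
  return free_entry + 1;
}

/* Step 2: find the slot of file_name in the directory, or -1. */
static int locate_table(const table_space *ts, const char *file_name)
{
  for (int i = 0; i < MAX_TABLES; i++)
    if (ts->tables[i] != NULL && strcmp(ts->tables[i], file_name) == 0)
      return i;
  return -1;
}

/* Helper for the sketch only: add a table to a space's directory. */
static int sketch_add_table(const char *space_name, const char *table_name)
{
  int id = ensure_space_open(space_name);
  if (id == 0) return -1;
  table_space *ts = &open_spaces[id - 1];
  for (int i = 0; i < MAX_TABLES; i++)
    if (ts->tables[i] == NULL) { ts->tables[i] = table_name; return i; }
  return -1;
}

/* Step 3: derive the descriptor from the list index and the slot. */
static File my_open_table_sketch(const char *TableSpaceName,
                                 const char *FileName)
{
  int space_id = ensure_space_open(TableSpaceName);
  if (space_id == 0) return -1;
  int slot = locate_table(&open_spaces[space_id - 1], FileName);
  if (slot < 0) return -1;
  return (File) (((uint32_t) space_id << 16) | (uint32_t) slot);
}
```

Because the descriptor carries both the list index and the slot, the
I/O functions can recover the table's location from the descriptor
alone, which is the property the last step aims for.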
Utility programs
----------------
A number of a priori objections have been raised against
table spaces. One of the biggest is that, when tables are stored
in individual UNIX files, it is possible to use UNIX commands to
copy them. This is no longer possible in this form when storing
tables in table spaces.
This document does not address the issue of loss of consistency
when using this method; it is practised, and most users
probably understand the dangers. On the other hand, it is
relatively simple to write a program that extracts tables from
a table space and converts them into individual file pairs. It
would also be a good idea to have the converse functionality,
to copy file pairs into a table space. Neither program appears
to be complicated.
New functionality
-----------------
Function my_open_table()
Functions requiring modifications
---------------------------------
my_register_filename() should register both file and table names.
At the current stage of this draft, it is possible that other
functions also require modification.
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.