NDB supports online schema changes. A
schema object such as a Table or
Index has a 4-byte
schema object version identifier, which can
be observed in the output of the ndb_desc
utility (see ndb_desc — Describe NDB Tables),
as shown here (see the Version line):
shell> ndb_desc -c 127.0.0.1 -d test t1
-- t1 --
Version: 33554434
Fragment type: HashMapPartition
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 3
Number of primary keys: 1
Length of frm data: 269
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 4
ExtraRowGciBits: 0
ExtraRowAuthorBits: 0
TableStatus: Retrieved
HashMap: DEFAULT-HASHMAP-240-4
-- Attributes --
c1 Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
c2 Int NULL AT=FIXED ST=MEMORY
c4 Varchar(50;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
-- Indexes --
PRIMARY KEY(c1) - UniqueHashIndex
PRIMARY(c1) - OrderedIndex
NDBT_ProgramExit: 0 - OK
The schema object version identifier (or simply “schema
version”) is made up of a major version and a minor
version; the major version occupies the 3 least significant
bytes of the schema version, and the minor version the remaining
(single most significant) byte. You can see these two components more
easily when viewing the schema version in hexadecimal notation. In
the example output just shown, the schema version is shown as
33554434, which in hexadecimal (filling in
leading zeroes as necessary) is 0x02000002;
this is equivalent to major version 2, minor version 2. Adding an
index to table t1 causes the schema version as
reported by ndb_desc to advance to
50331650, or 0x03000002
hexadecimal, which is equivalent to major version 2 (3 least
significant bytes 00 00 02), minor version 3
(most significant byte 03). Minor schema
versions start with 0 for a newly created table.
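The decoding just described can be sketched in a few lines of C++. This is an illustrative sketch, not NDB source code: SchemaVersion, splitSchemaVersion(), and checkWorkedExample() are hypothetical names. The byte layout is an assumption taken from the worked example (0x03000002 read as major version 2, minor version 3), with the major version in the 3 least significant bytes and the minor version in the most significant byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of the NDB API.
struct SchemaVersion {
    std::uint32_t majorVersion;  // assumed: 3 least significant bytes
    std::uint32_t minorVersion;  // assumed: most significant byte
};

// Split a 4-byte schema object version (as printed by ndb_desc) into its
// two components, using the layout assumed above.
constexpr SchemaVersion splitSchemaVersion(std::uint32_t version) {
    return SchemaVersion{version & 0x00FFFFFFu, version >> 24};
}

// Check the two values quoted in the text.
inline void checkWorkedExample() {
    // 33554434 == 0x02000002: major version 2, minor version 2
    assert(splitSchemaVersion(33554434u).majorVersion == 2);
    assert(splitSchemaVersion(33554434u).minorVersion == 2);
    // 50331650 == 0x03000002: major version 2, minor version 3
    assert(splitSchemaVersion(50331650u).majorVersion == 2);
    assert(splitSchemaVersion(50331650u).minorVersion == 3);
}
```

The same function can be applied to the value returned by getObjectVersion(), after converting it to an unsigned 32-bit integer.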
In addition, each NDB API database object class has its own
getObjectVersion() method that, like
Object::getObjectVersion(),
returns the object's schema object version. This includes
instances, not only of Object,
but of Table,
Index,
Column,
LogfileGroup,
Tablespace,
Datafile, and
Undofile, as well as
Event. (However,
NdbBlob::getVersion() has a
purpose and function that is completely unrelated to that of the
methods just listed.)
Schema changes which are considered backwards
compatible—such as adding a DEFAULT or
NULL column at the end of a table—cause
the table object's minor version to be incremented. Schema
changes which are not considered backwards compatible—such
as removing a column from a table—cause the major version to
be incremented.
While the implementation of an operation causing a schema major version change may actually involve 2 copies of the affected table (dropping and recreating the table), the final outcome can be observed as an increase in the table's major version.
Queries and DML operations which arrive from NDB clients also have an associated schema version, which is checked at the start of processing in the data nodes. If the schema version of the request differs from the affected database object's latest schema version only in its minor version component, the operation is considered compatible and is allowed to proceed. If the two versions differ in the major version component, the operation is rejected.
This mechanism allows the schema to be changed in the data nodes in various ways, without requiring a synchronized schema change in clients. Clients need not move on to the new schema version until they are ready to do so. Queries and DML operations can thus continue uninterrupted.
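The compatibility rule can be expressed as a comparison of major version components only. The following is a minimal sketch of that rule, not the data node implementation: majorComponent(), isRequestAccepted(), and checkCompatibilityRule() are hypothetical names. It assumes the layout suggested by the ndb_desc example earlier in this section, with the major version in the 3 least significant bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper; assumes the major version occupies the 3 least
// significant bytes of the schema version.
constexpr std::uint32_t majorComponent(std::uint32_t schemaVersion) {
    return schemaVersion & 0x00FFFFFFu;
}

// A request is accepted when its schema version differs from the object's
// current schema version in the minor component only (or not at all).
constexpr bool isRequestAccepted(std::uint32_t requestVersion,
                                 std::uint32_t currentVersion) {
    return majorComponent(requestVersion) == majorComponent(currentVersion);
}

inline void checkCompatibilityRule() {
    // Client still at 0x02000002 after an index was added (0x03000002):
    // only the minor component changed, so the request proceeds.
    assert(isRequestAccepted(0x02000002u, 0x03000002u));
    // Major component changed from 2 to 3: the request is rejected.
    assert(!isRequestAccepted(0x02000002u, 0x00000003u));
}
```

Because the minor component is ignored, a client holding an older cached definition keeps working until an incompatible change is committed.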
The NDB API and schema object versions.
An NDB API application normally uses an
NdbDictionary object associated
with an Ndb object to retrieve
schema objects. Schema objects are retrieved on demand from the
data nodes; signalling is used to obtain the table or index
definition; then, a local memory object is constructed which the
application can use. NDB internally caches schema objects, so
that each successive request for the same table or index by name
does not require signalling.
Global schema cache.
To avoid the need to signal to the data nodes for every schema
object lookup, a schema cache is used for each
Ndb_cluster_connection. This is
referred to as the global schema cache.
It is global in terms of spanning multiple Ndb objects.
Instantiated table and index objects are automatically put into
this cache to save on future signalling and instantiation costs.
The cache maintains a reference count for each object; this
count is used to determine when a given schema object can be
deleted. Schema objects can have their reference counts modified
by explicit API method calls or local schema cache operations.
Local schema cache.
In addition to the per-connection global schema cache, each
Ndb object's
NdbDictionary object has a
local schema cache. This cache contains
pointers to objects held in the global schema cache. Each local
schema cache holding a reference to a schema object in the
global schema cache increments the global schema cache reference
count by 1. Having a schema cache that is local to each
Ndb object allows schema objects to be looked
up without imposing any locks. The local schema cache is
normally emptied (reducing global cache reference counts in the
process) when its associated Ndb object is
deleted.
Operation without schema changes. When no schema changes take place, normal operation proceeds as described for each of the following cases:
A table is requested by some client
(Ndb object) for the first
time.
The local cache is checked; the attempt results in a miss.
The global cache is then also checked (using a lock), and
the result is another miss.
Since there were no cache hits, the data node is sent a signal; the node's response is used to instantiate the table object. A pointer to the instantiated data object is added to the global cache; another such pointer is added to the local cache, and the reference count is set to 1. A pointer to the table is returned to the client.
A second client (a different Ndb
object) requests access to the same table, also by name.
A check of the local cache results in a miss, but a check of
the global cache yields a hit.
As a result, an object pointer is added to the local cache, the global reference count is incremented—so that its value is now 2—and an object pointer is returned to the client. No new pointer is added to the global cache.
The second client requests access to the same table by name a second time. The local cache is checked, producing a hit. An object pointer is immediately returned to the client. No pointers are added to the local or global caches, and the object's reference count is not incremented (and so remains at 2).
The second client deletes its Ndb object.
Objects in this client's local schema cache have their
reference counts decremented in the global cache.
This sets the global cache reference count to 1. Since it is
not yet 0, no action is yet taken to remove the cached
table object.
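The four cases just described can be traced with a small toy model. None of the types below exist in the NDB API: CachedTable, GlobalCache, and LocalCache are hypothetical stand-ins for the cached table object, the per-connection global schema cache, and the local schema cache belonging to one Ndb object. The lock on the global cache is omitted, and a counter stands in for signalling to the data nodes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy model for illustration only; not NDB API classes.
struct CachedTable {
    std::string name;
    int refCount;  // number of local caches pointing at this object
    explicit CachedTable(const std::string& n) : name(n), refCount(0) {}
};

// Stand-in for the per-connection global schema cache (locking omitted).
class GlobalCache {
public:
    int fetches = 0;  // how often the "data nodes" had to be asked

    CachedTable* acquire(const std::string& name) {
        auto it = entries_.find(name);
        if (it == entries_.end()) {
            ++fetches;  // global miss: instantiate a new table object
            it = entries_.emplace(
                name, std::unique_ptr<CachedTable>(new CachedTable(name))).first;
        }
        ++it->second->refCount;
        return it->second.get();
    }

    void release(CachedTable* table) {
        if (--table->refCount == 0) {
            const std::string name = table->name;  // copy before the object dies
            entries_.erase(name);
        }
    }

    int refCount(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? 0 : it->second->refCount;
    }

private:
    std::map<std::string, std::unique_ptr<CachedTable>> entries_;
};

// Stand-in for the local schema cache owned by one Ndb object.
class LocalCache {
public:
    explicit LocalCache(GlobalCache& global) : global_(global) {}
    LocalCache(const LocalCache&) = delete;
    LocalCache& operator=(const LocalCache&) = delete;
    // Deleting the owner empties the local cache and drops its references.
    ~LocalCache() {
        for (auto& entry : entries_) global_.release(entry.second);
    }

    const CachedTable* getTable(const std::string& name) {
        auto it = entries_.find(name);
        if (it != entries_.end()) return it->second;  // local hit, no locking
        CachedTable* table = global_.acquire(name);   // local miss
        entries_[name] = table;
        return table;
    }

private:
    GlobalCache& global_;
    std::map<std::string, CachedTable*> entries_;
};

inline void walkThroughCases() {
    GlobalCache global;
    LocalCache first(global);
    first.getTable("t1");  // case 1: local miss, global miss
    assert(global.fetches == 1 && global.refCount("t1") == 1);
    {
        LocalCache second(global);
        second.getTable("t1");  // case 2: local miss, global hit
        assert(global.fetches == 1 && global.refCount("t1") == 2);
        second.getTable("t1");  // case 3: local hit, count unchanged
        assert(global.refCount("t1") == 2);
    }  // case 4: the second client is deleted
    assert(global.refCount("t1") == 1);
}
```

In this model the table object is destroyed only when the last local cache referring to it is emptied, which mirrors the reference counting described above.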
Schema changes.
Assuming that an object's schema never changes, the schema
version first retrieved is used for the lifetime of the
application process, and the in-memory object is deleted only
when all local cache references (that is, all references to
Ndb objects) have been deleted.
This is unlikely to occur other than during a shutdown or
cluster connection reset.
If an object's schema changes in a backwards compatible way while an application is running, this has the following effects:
The minor version at the data nodes is incremented. (Ongoing DML operations using the old schema version still succeed.)
NDB API clients subsequently retrieving the latest version of the schema object then fetch the new schema version.
NDB API clients with cached older versions do not use the new schema version unless and until their local and global caches are invalidated.
NDB API clients subscribing to events can observe a
TE_ALTER event for the table in question,
and can use this to trigger schema object cache invalidations.
Each local cache entry can be removed by calling
removeCachedTable()
or
removeCachedIndex().
This removes the entry from the local cache, and decrements
the reference count in the global cache. When (and if) the
global cache reference count reaches zero, the old cached
object can be deleted.
Alternatively, local cache entries can be removed, and the
global cache entry invalidated, by calling
invalidateTable()
or
invalidateIndex().
Subsequent calls to
getTable() or
getIndex() for
this and other clients return the new schema object version by
signalling the data nodes and instantiating a new object.
New Ndb objects fill their
local table caches on demand from the global table cache as
normal. This means that, once an old schema object has been
invalidated in the global cache, such objects retrieve the
latest table objects known at the time that the table objects
are first cached.
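The difference between removing a local cache entry and invalidating the global entry can be illustrated with another toy model. Everything here is hypothetical: TableDef, DataNodes, GlobalCache, and Client are not NDB API types, and the member functions named removeCachedTable() and invalidateTable() only imitate the behavior described above for the NDB API methods of the same names. The model assumes that a client which still holds a local entry keeps using it until that entry is removed. It uses std::shared_ptr and std::weak_ptr in place of the explicit reference counts.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy model for illustration only; not NDB API classes.
struct TableDef { int version; };

struct DataNodes { int currentVersion = 1; };  // stand-in for the cluster

class GlobalCache {
public:
    explicit GlobalCache(const DataNodes& nodes) : nodes_(nodes) {}

    std::shared_ptr<const TableDef> get(const std::string& name) {
        auto it = entries_.find(name);
        if (it != entries_.end()) {
            if (auto live = it->second.lock()) return live;  // global hit
        }
        // Miss (or no holders left): "signal" the data nodes for the definition.
        auto def = std::make_shared<const TableDef>(TableDef{nodes_.currentVersion});
        entries_[name] = def;
        return def;
    }

    void invalidate(const std::string& name) { entries_.erase(name); }

private:
    const DataNodes& nodes_;
    // Weak references: only local caches keep a definition alive.
    std::map<std::string, std::weak_ptr<const TableDef>> entries_;
};

class Client {  // stand-in for one Ndb object and its local cache
public:
    explicit Client(GlobalCache& global) : global_(global) {}

    std::shared_ptr<const TableDef> getTable(const std::string& name) {
        auto it = local_.find(name);
        if (it != local_.end()) return it->second;  // local hit
        auto def = global_.get(name);
        local_[name] = def;
        return def;
    }

    // Drops the local entry only; the global entry stays valid.
    void removeCachedTable(const std::string& name) { local_.erase(name); }

    // Drops the local entry and invalidates the global entry.
    void invalidateTable(const std::string& name) {
        local_.erase(name);
        global_.invalidate(name);
    }

private:
    GlobalCache& global_;
    std::map<std::string, std::shared_ptr<const TableDef>> local_;
};

inline void demoInvalidation() {
    DataNodes nodes;
    GlobalCache global(nodes);
    Client a(global), b(global);
    assert(a.getTable("t1")->version == 1);
    assert(b.getTable("t1")->version == 1);

    nodes.currentVersion = 2;                // compatible schema change
    assert(a.getTable("t1")->version == 1);  // cached definition still used

    a.removeCachedTable("t1");               // b still holds the old object
    assert(a.getTable("t1")->version == 1);  // global entry is still the old one

    a.invalidateTable("t1");
    assert(a.getTable("t1")->version == 2);  // refetched from the data nodes
    assert(b.getTable("t1")->version == 1);  // b's local entry is untouched

    b.removeCachedTable("t1");
    assert(b.getTable("t1")->version == 2);  // filled from the new global entry
}
```

In an application, an invalidation like this would typically be triggered by a TE_ALTER event or by a failed request, as discussed in this section.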
When an incompatible schema change is made (that is, a schema major version change), NDB API requests using the old version fail as soon as the new version is committed. An application can treat such a failure as a trigger to retrieve the new schema object version.
The rules governing the handling of schema version changes are summarized in the following list:
An online schema change (minor version change) does not affect
existing clients (Ndb
objects); clients can continue to use the old schema object
version.
A client can observe the new schema object version only after it voluntarily removes its cached objects by making API calls.
As Ndb objects remove cached objects and
are deleted, the reference count on the old schema object
version decreases.
When this reference count reaches 0, the object can be deleted.
Implications of the schema object lifecycle.
The lifespan of a schema object (such as a
Table or
Index) is limited by the
lifetime of the Ndb object from
which it is obtained. When the parent Ndb
object of a schema object is deleted, the reference count which
keeps the schema object alive is decremented. If
this Ndb object holds the last remaining
reference to a given schema object version, the deletion of the
Ndb object can also result in the deletion of
the schema object. For this reason, no other thread should be
using the schema object at that time.
Care must be exercised when pointers to schema objects are held in
the application and used between multiple
Ndb objects. A schema object
should not be used beyond the lifespan of the
Ndb object which created it.
Applications can respond, asynchronously and independently of each other, to backwards compatible schema changes, moving to the new schema only when necessary. Different threads can operate on different schema object versions concurrently.
It is thus very important to ensure that schema objects do not
outlive the Ndb objects used to
create them. To help prevent this from happening, you can take any
of the following actions to invalidate old schema objects:
To trigger invalidation when and as needed, use NDB API
TE_ALTER events (see
Section 2.3.6.1.1, “The Event::TableEvent Type”).
Use an external trigger to initiate invalidation.
Perform a periodic invalidation explicitly.
Invalidating the caches in any of these ways allows applications to obtain new versions of schema objects as required.
It is also worth noting that not all NDB API
Table getter methods return
pointers; many of them (in addition to
Table::getName()) return table
names. Such methods include
Index::getTable(),
NdbOperation::getTableName(),
Event::getTableName(), and
NdbDictionary::getRecordTableName().
