Chapter 6 NDB Schema Object Versions

NDB supports online schema changes. A schema object such as a Table or Index has a 4-byte schema object version identifier, which can be observed in the output of the ndb_desc utility (see ndb_desc — Describe NDB Tables), as shown here (emphasized text):

$> ndb_desc -c 127.0.0.1 -d test t1
-- t1 --
Version: 33554434
Fragment type: HashMapPartition
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 3
Number of primary keys: 1
Length of frm data: 269
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 4
ExtraRowGciBits: 0
ExtraRowAuthorBits: 0
TableStatus: Retrieved
HashMap: DEFAULT-HASHMAP-240-4
-- Attributes --
c1 Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
c2 Int NULL AT=FIXED ST=MEMORY
c4 Varchar(50;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
-- Indexes --
PRIMARY KEY(c1) - UniqueHashIndex
PRIMARY(c1) - OrderedIndex

NDBT_ProgramExit: 0 - OK

The schema object version identifier (or simply schema version) is made up of a major version and a minor version; the major version occupies the 3 least significant bytes of the schema version, and the minor version the remaining (single most significant) byte. You can see these two components more easily when viewing the schema version in hexadecimal notation. In the example output just shown, the schema version is shown as 33554434, which in hexadecimal (filling in leading zeroes as necessary) is 0x02000002; this is equivalent to major version 2 (3 least significant bytes 00 00 02), minor version 2 (most significant byte 02). Adding an index to table t1 causes the schema version as reported by ndb_desc to advance to 50331650, or 0x03000002 hexadecimal, which is equivalent to major version 2 (3 least significant bytes 00 00 02), minor version 3 (most significant byte 03). Minor schema versions start with 0 for a newly created table.
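The decomposition used in the worked example (major version in the 3 least significant bytes, minor version in the most significant byte) can be sketched in a few lines of C++. The type SchemaVersion and the function split_schema_version() are hypothetical names chosen for this illustration; they are not part of the NDB API.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not part of the NDB API.
struct SchemaVersion {
    std::uint32_t major_version;  // 3 least significant bytes
    std::uint32_t minor_version;  // most significant byte
};

// Split a 4-byte schema object version, as printed by ndb_desc,
// into its major and minor components.
constexpr SchemaVersion split_schema_version(std::uint32_t version) {
    return SchemaVersion{version & 0x00FFFFFFu, version >> 24};
}

// The two values from the example: 33554434 is 0x02000002 and
// 50331650 is 0x03000002.
static_assert(split_schema_version(33554434u).major_version == 2, "major 2");
static_assert(split_schema_version(33554434u).minor_version == 2, "minor 2");
static_assert(split_schema_version(50331650u).major_version == 2, "major 2");
static_assert(split_schema_version(50331650u).minor_version == 3, "minor 3");
```

The static_assert lines check the two values quoted above at compile time, so the sketch fails to build if the masks are wrong.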

In addition, each NDB API database object class has its own getObjectVersion() method that, like Object::getObjectVersion(), returns the object's schema object version. This includes instances, not only of Object, but of Table, Index, Column, LogfileGroup, Tablespace, Datafile, and Undofile, as well as Event. (However, NdbBlob::getVersion() has a purpose and function that is completely unrelated to that of the methods just listed.)

Schema changes which are considered backward compatible—such as adding a DEFAULT or NULL column at the end of a table—cause the table object's minor version to be incremented. Schema changes which are not considered backward compatible—such as removing a column from a table—cause the major version to be incremented.

Note

While the implementation of an operation causing a schema major version change may actually involve 2 copies of the affected table (dropping and recreating the table), the final outcome can be observed as an increase in the table's major version.

Queries and DML operations which arrive from NDB clients also have an associated schema version, which is checked at the start of processing in the data nodes. If the schema version of the request differs from the affected database object's latest schema version only in its minor version component, the operation is considered compatible and is allowed to proceed. If the request's schema version differs in the major version component, the request is rejected.
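The acceptance rule just described can be expressed as a comparison of major version components. The following is a minimal sketch only: the function names are hypothetical, the byte layout is taken from the ndb_desc example earlier in this chapter (major version in the 3 least significant bytes), and the actual check is performed inside the data nodes rather than by application code.

```cpp
#include <cstdint>

// Hypothetical helper: extract the major component, assuming it occupies
// the 3 least significant bytes of the schema version.
constexpr std::uint32_t major_part(std::uint32_t version) {
    return version & 0x00FFFFFFu;
}

// A request is treated as compatible when its schema version differs from
// the object's latest schema version at most in the minor component.
constexpr bool request_is_compatible(std::uint32_t request_version,
                                     std::uint32_t latest_version) {
    return major_part(request_version) == major_part(latest_version);
}
```

Under these assumptions, a request carrying 0x02000002 is accepted against a latest version of 0x03000002 (minor change only), but rejected against 0x00000003 (major change).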

This mechanism allows the schema to be changed in the data nodes in various ways, without requiring a synchronized schema change in clients. Clients need not move on to the new schema version until they are ready to do so. Queries and DML operations can thus continue uninterrupted.

The NDB API and schema object versions.  An NDB API application normally uses an NdbDictionary object associated with an Ndb object to retrieve schema objects. Schema objects are retrieved on demand from the data nodes; signalling is used to obtain the table or index definition; then, a local memory object is constructed which the application can use. NDB internally caches schema objects, so that each successive request for the same table or index by name does not require signalling.

Global schema cache.  To avoid the need to signal to the data nodes for every schema object lookup, a schema cache is used for each Ndb_cluster_connection. This is referred to as the global schema cache. It is global in terms of spanning multiple Ndb objects. Instantiated table and index objects are automatically put into this cache to save on future signalling and instantiation costs. The cache maintains a reference count for each object; this count is used to determine when a given schema object can be deleted. Schema objects can have their reference counts modified by explicit API method calls or local schema cache operations.

Local schema cache.  In addition to the per-connection global schema cache, each Ndb object's NdbDictionary object has a local schema cache. This cache contains pointers to objects held in the global schema cache. Each local schema cache holding a reference to a schema object in the global schema cache increments the global schema cache reference count by 1. Having a schema cache that is local to each Ndb object allows schema objects to be looked up without imposing any locks. The local schema cache is normally emptied (reducing global cache reference counts in the process) when its associated Ndb object is deleted.

Operation without schema changes.  Normal operation proceeds as follows in the cases listed below:

  1. A table is requested by some client (Ndb object) for the first time.  The local cache is checked; the attempt results in a miss. The global cache is then also checked (using a lock), and the result is another miss.

    Since there were no cache hits, the data node is sent a signal; the node's response is used to instantiate the table object. A pointer to the instantiated data object is added to the global cache; another such pointer is added to the local cache, and the reference count is set to 1. A pointer to the table is returned to the client.

  2. A second client (a different Ndb object) requests access to the same table, also by name.  A check of the local cache results in a miss, but a check of the global cache yields a hit.

    As a result, an object pointer is added to the local cache, the global reference count is incremented—so that its value is now 2—and an object pointer is returned to the client. No new pointer is added to the global cache.

  3. For a second time, the second client requests access to the same table by name.  The local cache is checked, producing a hit. An object pointer is immediately returned to the client. No pointers are added to the local or global caches, and the object's reference count is not incremented (and so the reference count remains constant at 2).

  4. The second client deletes its Ndb object.  Objects in this client's local schema cache have their reference counts decremented in the global cache.

    This sets the global cache reference count to 1. Since it is not yet 0, no action is yet taken to delete the cached table object.
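The four steps above can be replayed with a small single-threaded model. Everything in this sketch (TableDef, GlobalCache, ToyNdb, signals_sent) is a hypothetical stand-in written for illustration; it does not use or reproduce the NDB API classes, and it omits the lock that protects the real global cache.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a table object instantiated from a data node response.
struct TableDef {
    std::string name;
    int ref_count;  // number of local caches pointing at this object
    explicit TableDef(std::string n) : name(std::move(n)), ref_count(0) {}
};

// Toy global schema cache (one per cluster connection in the text above).
struct GlobalCache {
    std::map<std::string, std::unique_ptr<TableDef>> tables;
    int signals_sent = 0;  // simulated round trips to the data nodes

    TableDef* acquire(const std::string& name) {
        auto it = tables.find(name);
        if (it == tables.end()) {
            ++signals_sent;  // global miss: "signal" the data nodes
            it = tables.emplace(
                name, std::unique_ptr<TableDef>(new TableDef(name))).first;
        }
        ++it->second->ref_count;
        return it->second.get();
    }

    void release(TableDef* table) {
        if (--table->ref_count == 0) {
            const std::string key = table->name;  // copy before deletion
            tables.erase(key);                    // last reference gone
        }
    }
};

// Toy client: owns a local cache of pointers into the global cache.
struct ToyNdb {
    GlobalCache& global;
    std::map<std::string, TableDef*> local;

    explicit ToyNdb(GlobalCache& g) : global(g) {}
    ~ToyNdb() {
        // Emptying the local cache decrements the global reference counts.
        for (auto& entry : local) global.release(entry.second);
    }

    TableDef* getTable(const std::string& name) {
        auto it = local.find(name);
        if (it != local.end()) return it->second;  // local hit, count unchanged
        TableDef* table = global.acquire(name);    // local miss
        local[name] = table;
        return table;
    }
};

// Replays steps 1 to 4 and returns the final reference count for "t1".
int replay_steps() {
    GlobalCache global;
    ToyNdb first(global);
    first.getTable("t1");       // step 1: both caches miss, count becomes 1
    {
        ToyNdb second(global);
        second.getTable("t1");  // step 2: global hit, count becomes 2
        second.getTable("t1");  // step 3: local hit, count stays 2
    }                           // step 4: second client deleted, count is 1
    return first.getTable("t1")->ref_count;
}
```

In this model only step 1 increments signals_sent, which corresponds to the single round trip to the data nodes described in the list.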

Schema changes.  Assuming that an object's schema never changes, the schema version first retrieved is used for the lifetime of the application process, and the in-memory object is deleted only when all local cache references (that is, all references to Ndb objects) have been deleted. This is unlikely to occur other than during a shutdown or cluster connection reset.

If an object's schema changes in a backward-compatible way while an application is running, this has the following effects:

  • The minor version at the data nodes is incremented. (Ongoing DML operations using the old schema version still succeed.)

  • NDB API clients subsequently retrieving the latest version of the schema object then fetch the new schema version.

  • NDB API clients with cached older versions do not use the new schema version unless and until their local and global caches are invalidated.

  • NDB API clients subscribing to events can observe a TE_ALTER event for the table in question, and can use this to trigger schema object cache invalidations.

  • Each local cache entry can be removed by calling removeCachedTable() or removeCachedIndex(). This removes the entry from the local cache, and decrements the reference count in the global cache. When (and if) the global cache reference count reaches zero, the old cached object can be deleted.

  • Alternatively, local cache entries can be removed, and the global cache entry invalidated, by calling invalidateTable() or invalidateIndex(). Subsequent calls to getTable() or getIndex() for this and other clients return the new schema object version by signalling the data nodes and instantiating a new object.

  • New Ndb objects fill their local table caches on demand from the global table cache as normal. This means that, once an old schema object has been invalidated in the global cache, such objects retrieve the latest table objects known at the time that the table objects are first cached.
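The difference between removing a local cache entry and invalidating the global entry can be illustrated with a toy model. This is a sketch under stated assumptions, not the NDB implementation: ToyDataNodes, ToyGlobalCache, and ToyDictionary are hypothetical types, std::shared_ptr ownership stands in for the global reference count, and the method names removeCachedTable() and invalidateTable() are borrowed from the text only to label the corresponding toy operations.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>

struct TableDef {
    std::string name;
    std::uint32_t version;  // schema version at the time of retrieval
};
using TablePtr = std::shared_ptr<const TableDef>;

// Stand-in for the data nodes: latest committed schema version per table.
struct ToyDataNodes {
    std::map<std::string, std::uint32_t> latest;
};

// Toy global cache. It holds weak references, so an old object is deleted
// once the last local cache entry pointing at it has been removed.
struct ToyGlobalCache {
    ToyDataNodes& nodes;
    std::map<std::string, std::weak_ptr<const TableDef>> entries;
    explicit ToyGlobalCache(ToyDataNodes& n) : nodes(n) {}

    TablePtr get(const std::string& name) {
        TablePtr cached = entries[name].lock();
        if (cached) return cached;  // global hit
        // Global miss: "signal" the data nodes and instantiate a new object.
        TablePtr fresh =
            std::make_shared<TableDef>(TableDef{name, nodes.latest.at(name)});
        entries[name] = fresh;
        return fresh;
    }
    void invalidate(const std::string& name) { entries.erase(name); }
};

// Toy per-client dictionary with its own local cache.
struct ToyDictionary {
    ToyGlobalCache& global;
    std::map<std::string, TablePtr> local;
    explicit ToyDictionary(ToyGlobalCache& g) : global(g) {}

    TablePtr getTable(const std::string& name) {
        auto it = local.find(name);
        if (it != local.end()) return it->second;  // local hit
        return local[name] = global.get(name);
    }
    // Drops only this client's reference to the cached object.
    void removeCachedTable(const std::string& name) { local.erase(name); }
    // Drops this client's reference and discards the global entry, so the
    // next global lookup fetches the latest version.
    void invalidateTable(const std::string& name) {
        local.erase(name);
        global.invalidate(name);
    }
};
```

In this model, a client that calls removeCachedTable() while another client still references the old object gets the old version again on its next getTable() call; only after invalidateTable() does a lookup that goes through the global cache return the new version. A client that still holds a local entry keeps using the old object in the toy; this sketch makes no claim about what the real NDB API does in that case.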

When an incompatible schema change is made (that is, a schema major version change), NDB API requests using the old version fail as soon as the new version is committed. This can also be used as a trigger to retrieve a new schema object version.

The rules governing the handling of schema version changes are summarized in the following list:

  • An online schema change (minor version change) does not affect existing clients (Ndb objects); clients can continue to use the old schema object version.

  • If and only if a client voluntarily removes cached objects by making API calls can it then observe the new schema object version.

  • As Ndb objects remove cached objects and are deleted, the reference count on the old schema object version decreases.

  • When this reference count reaches 0, the object can be deleted.

Implications of the schema object lifecycle.  The lifespan of a schema object (such as a Table or Index) is limited by the lifetime of the Ndb object from which it is obtained. When the parent Ndb object of a schema object is deleted, the reference count which keeps the schema object alive is decremented. If this Ndb object holds the last remaining reference to a given schema object version, the deletion of the Ndb object can also result in the deletion of the schema object. For this reason, no other thread should still be using the schema object at that time.

Care must be exercised when pointers to schema objects are held in the application and used between multiple Ndb objects. A schema object should not be used beyond the lifespan of the Ndb object which created it.

Applications can respond, asynchronously and independently of each other, to backward-compatible schema changes, moving to the new schema only when necessary. Different threads can operate on different schema object versions concurrently.

It is thus very important to ensure that schema objects do not outlive the Ndb objects used to create them. To help prevent this from happening, you can take any of the following actions to invalidate old schema objects:

  • To trigger invalidation when and as needed, use NDB API TE_ALTER events (see Event::TableEvent).

  • Use an external trigger to initiate invalidation.

  • Perform a periodic invalidation explicitly.

Invalidating the caches in any of these ways allows applications to obtain new versions of schema objects as required.

It is also worth noting that not all NDB API Table getter methods return pointers; many of them (in addition to Table::getName()) return table names. Such methods include Index::getTable(), NdbOperation::getTableName(), Event::getTableName(), and NdbDictionary::getRecordTableName().