5.13 NDB_STTOR Phase 5

For initial starts and system restarts, this phase means executing a local checkpoint. This is handled by the master, so the other nodes return from this phase immediately. For node restarts and initial node restarts, this phase copies the records from the primary fragment replicas to the starting node's fragment replicas. Local checkpoints are enabled before the copying process begins.
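
The branching just described can be summarized in a short sketch. This is an illustration only, not actual NDBCNTR or DBDIH code; the start-type enumeration and the helper functions are hypothetical stand-ins for the real checkpoint and copy logic.

    /* Illustrative sketch only: how the NDB_STTOR phase 5 work differs by
       start type, as described above.  All names are hypothetical. */
    enum class StartType {
      InitialStart, SystemRestart, NodeRestart, InitialNodeRestart
    };

    /* Hypothetical stubs standing in for the real checkpoint and copy logic. */
    static void runLocalCheckpoint()      {}
    static void enableLocalCheckpoints()  {}
    static void copyFromPrimaryReplicas() {}

    void ndbSttorPhase5(StartType type, bool isMaster)
    {
      switch (type) {
      case StartType::InitialStart:
      case StartType::SystemRestart:
        if (isMaster)
          runLocalCheckpoint();      /* the master handles the local checkpoint */
        return;                      /* other nodes return immediately */
      case StartType::NodeRestart:
      case StartType::InitialNodeRestart:
        enableLocalCheckpoints();    /* LCPs are enabled before copying begins */
        copyFromPrimaryReplicas();   /* copy records to the starting replicas */
        return;
      }
    }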

Copying the data to a starting node is part of the node takeover protocol. As part of this protocol, the node status of the starting node is updated; this is communicated using the global checkpoint protocol. Waiting for these events to take place ensures that the new node status is communicated to all nodes and their system files.

After the node's status has been communicated, all nodes are signaled that we are about to start the takeover protocol for this node. Part of this protocol consists of Steps 3 - 9 during the system restart phase as described later in this section. This means that restoration of all the fragments, preparation for execution of the redo log, execution of the redo log, and finally reporting back to DBDIH when the execution of the redo log is completed, are all part of this process.
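
As a rough illustration, this per-node portion of the protocol amounts to the following sequence. The function names are hypothetical stand-ins that mirror the four actions listed above; this is not NDB kernel code.

    /* Illustrative sketch only: the Steps 3 - 9 work summarized above.
       All function names are hypothetical stand-ins. */
    static void restoreAllFragments()       {}   /* restore all fragments           */
    static void prepareRedoLogExecution()   {}   /* prepare the redo log for replay */
    static void executeRedoLog()            {}   /* execute (replay) the redo log   */
    static void reportRedoCompleteToDbdih() {}   /* report completion to DBDIH      */

    void runTakeoverRestartSteps()
    {
      restoreAllFragments();
      prepareRedoLogExecution();
      executeRedoLog();
      reportRedoCompleteToDbdih();   /* DBDIH is told when redo execution ends */
    }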

After preparations are complete, the copy phase must be performed for each fragment in the node. The process of copying a fragment involves the following steps (a simplified sketch of the complete sequence follows the list):

  1. The DBLQH kernel block in the starting node is informed that the copy process is about to begin by sending it a PREPARE_COPY_FRAGREQ signal.

  2. When DBLQH acknowledges this request, a CREATE_FRAGREQ signal is sent to all nodes to notify them of the preparation being made to copy data to this fragment replica for this table fragment.

  3. After all nodes have acknowledged this, a COPY_FRAGREQ signal is sent to the node from which the data is to be copied to the new node. This is always the primary fragment replica of the fragment. The node indicated copies all the data over to the starting node in response to this message.

  4. After copying has been completed, and a COPY_FRAGCONF message is sent, all nodes are notified of the completion through an UPDATE_TOREQ signal.

  5. After all nodes have updated to reflect the new state of the fragment, the DBLQH kernel block of the starting node is informed that the copy has been completed, that the fragment replica is now up to date, and that any failures should now be treated as real failures.

  6. The new fragment replica is transformed into a primary fragment replica if this is the role it had when the table was created.

  7. After this change has been completed, another round of CREATE_FRAGREQ messages is sent to all nodes, informing them that the takeover of the fragment is now committed.

  8. After this, the process is repeated with the next fragment, if any remain.

  9. When there are no more fragments to be taken over by the node, all nodes are informed of this by sending them an UPDATE_TOREQ signal.

  10. The protocol waits for the next complete local checkpoint to occur, running from start to finish.

  11. The node states are updated, using a complete global checkpoint. As with the local checkpoint in the previous step, the global checkpoint must be permitted to start and then to finish.

  12. When the global checkpoint has completed, the successful local checkpoint for this node restart is communicated by sending an END_TOREQ signal to all nodes.

  13. A START_COPYCONF signal is sent back to the starting node, informing it that the node restart has been completed.

  14. Receiving the START_COPYCONF signal ends NDB_STTOR phase 5. This provides another synchronization point for system restarts, designated as WAITPOINT_5_2.
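
Taken together, steps 1 through 14 can be summarized by the following master-side sketch. It is illustrative only: the signal names follow the text above, but the types, helper functions, and overall structure are hypothetical stand-ins, not NDB kernel code, and the signal used in step 5 is not named in the text.

    /* Illustrative sketch only: a simplified, sequential model of the
       fragment copy protocol (Steps 1 - 14 above), as seen from the master.
       Signal names follow the text; everything else is a hypothetical
       stand-in for the real signal-driven NDB kernel code. */
    #include <cstdio>
    #include <utility>
    #include <vector>

    using NodeId     = unsigned;
    using FragmentId = unsigned;

    /* Hypothetical stand-ins for sending signals and waiting for replies. */
    static void sendToNode(NodeId node, const char* what, FragmentId frag)
    { std::printf("-> node %u: %s (fragment %u)\n", node, what, frag); }
    static void sendToAllNodes(const char* what)
    { std::printf("-> all nodes: %s\n", what); }
    static void waitFor(const char* what)
    { std::printf("<- %s\n", what); }

    /* Each entry pairs a fragment with the node holding its primary replica. */
    using FragmentPlan = std::vector<std::pair<FragmentId, NodeId>>;

    void copyFragmentsToStartingNode(NodeId startingNode, const FragmentPlan& plan)
    {
      for (const auto& [frag, primaryNode] : plan) {
        /* Step 1: tell DBLQH on the starting node the copy is about to begin. */
        sendToNode(startingNode, "PREPARE_COPY_FRAGREQ", frag);
        waitFor("acknowledgement from DBLQH");

        /* Step 2: notify all nodes of the preparation of this fragment replica. */
        sendToAllNodes("CREATE_FRAGREQ (prepare)");
        waitFor("acknowledgement from all nodes");

        /* Step 3: ask the node holding the primary replica to copy its data
           over to the starting node. */
        sendToNode(primaryNode, "COPY_FRAGREQ", frag);

        /* Step 4: copying completes and COPY_FRAGCONF is sent; all nodes are
           then notified of the completion. */
        waitFor("COPY_FRAGCONF");
        sendToAllNodes("UPDATE_TOREQ");

        /* Step 5: once all nodes reflect the new state, tell DBLQH on the
           starting node that the replica is up to date and that failures are
           now treated as real failures. */
        waitFor("all nodes updated");
        sendToNode(startingNode, "copy completed (replica up to date)", frag);

        /* Steps 6 - 7: promote the replica to primary if that was its role at
           table creation, then commit the takeover on all nodes. */
        sendToAllNodes("CREATE_FRAGREQ (commit)");

        /* Step 8: the loop continues with the next fragment, if any remain. */
      }

      /* Step 9: no fragments remain for takeover; inform all nodes. */
      sendToAllNodes("UPDATE_TOREQ (no more fragments)");

      /* Steps 10 - 11: wait for a complete local checkpoint and then a
         complete global checkpoint, each running from start to finish. */
      waitFor("complete local checkpoint");
      waitFor("complete global checkpoint");

      /* Steps 12 - 13: END_TOREQ is sent to all nodes, then START_COPYCONF is
         sent back to the starting node. */
      sendToAllNodes("END_TOREQ");
      sendToNode(startingNode, "START_COPYCONF", /* fragment unused */ 0);

      /* Step 14: receiving START_COPYCONF ends NDB_STTOR phase 5
         (synchronization point WAITPOINT_5_2). */
    }

    int main()
    {
      /* Hypothetical layout: fragments 0 and 1, with primary replicas on
         nodes 1 and 2, being copied to starting node 3. */
      copyFragmentsToStartingNode(3, {{0, 1}, {1, 2}});
      return 0;
    }

The actual kernel blocks exchange these signals asynchronously; the sequential loop above only captures the ordering described in the numbered steps.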

Note

The copy process in this phase can in theory be performed in parallel by several nodes. However, all messages from the master are currently sent to a single node at a time; they could be made completely parallel, and this is likely to be done in the not too distant future.

In an initial start and an initial node restart, the SUMA block requests the subscriptions from the SUMA master node. NDBCNTR then executes NDB_STTOR phase 6. No other NDBCNTR activity takes place.