This section contains unified change history highlights for all
MySQL Cluster releases based on version 6.3 of the
NDB storage engine through MySQL
Cluster NDB 6.3.55. Included are all changelog
entries in the categories MySQL Cluster,
Disk Data, and Cluster API.
For an overview of features that were added in MySQL Cluster NDB 6.3, see MySQL Cluster Development in MySQL Cluster NDB 6.3.
Version 5.1.73-ndb-6.3.54 has no changelog entries.
Version 5.1.72-ndb-6.3.53 has no changelog entries.
Version 5.1.69-ndb-6.3.52 has no changelog entries.
Node failure during the dropping of a table could lead to the node hanging when attempting to restart.
When this happened, the
internal dictionary (DBDICT) lock taken by
the drop table operation was held indefinitely, and the logical
global schema lock taken by the SQL node from which the drop
operation originated was held until the
NDB internal operation timed out. To aid in
debugging such occurrences, a new dump code,
which dumps the contents of the
queue, has been added in the ndb_mgm client.
A slow filesystem during local checkpointing could exert undue
pressure on DBDIH kernel block file page
buffers, which in turn could lead to a data node crash when
these were exhausted. This fix limits the number of table
definition updates that
DBDIH can issue
When the buffer pool used for
NDB API requests for primary key and scans was exhausted while
KeyInfo, the error handling
path did not correctly abort the scan request. Symptoms of this
incorrect error handling included the NDB API client that
requested the scan experiencing a long timeout, as well as
permanent leakage of the scan record, scan fragment records, and
linked operation record associated with the scan.
This issue is not present in MySQL Cluster NDB 7.0 and later,
due to the replacement of the fixed-size single-purpose buffers
as well as improvements in error handling.
When reloading the redo log during a node or system restart, and
greater than or equal to 42, it was possible for metadata to be
read for the wrong file (or files). Thus, the node or nodes
involved could try to reload the wrong set of data.
If the Transaction Coordinator aborted a transaction in the “prepared” state, this could cause a resource leak. (Bug #14208924)
DUMP 2303 in the ndb_mgm
client now includes the status of the single fragment scan
record reserved for a local checkpoint.
A shortage of scan fragment records in
resulted in a leak of concurrent scan table records and key
In some circumstances, transactions could be lost during an online upgrade. (Bug #13834481)
When trying to use ndb_size.pl
to connect to a MySQL server running on a nonstandard port, the
port argument was ignored.
(Bug #13364905, Bug #62635)
Attempting to add both a column and an index on that column in
the same online
statement caused mysqld to fail. Although
this issue affected only the mysqld shipped
with MySQL Cluster, the table named in the
TABLE could use any storage engine for which online
operations are supported.
When an NDB API application called
again after the previous call had returned end-of-file (return
code 1), a transaction object was leaked. Now when this happens,
NDB returns error code 4210 (Ndb sent more info than
length specified); previously in such cases, -1 was
returned. In addition, the extra transaction object associated
with the scan is freed, by returning it to the transaction
coordinator's idle list.
When a failure of multiple data nodes during a local checkpoint (LCP) that took a long time to complete included the node designated as master, any new data nodes attempting to start before all ongoing LCPs were completed later crashed. This was due to the fact that node takeover by the new master cannot be completed until there are no pending local checkpoints. Long-running LCPs such as those which triggered this issue can occur when fragment sizes are sufficiently large (see MySQL Cluster Nodes, Node Groups, Replicas, and Partitions, for more information). Now in such cases, data nodes (other than the new master) are kept from restarting until the takeover is complete. (Bug #13323589)
When deleting from multiple tables using a unique key in the
WHERE condition, the wrong rows were deleted.
UPDATE triggers failed when rows
were changed by deleting from or updating multiple tables.
(Bug #12718336, Bug #61705, Bug #12728221)
When replicating DML statements with
between clusters, the number of operations that failed due to
nonexistent keys was expected to be no greater than the number
of defined operations of any single type. Because the slave SQL
thread defines operations of multiple types in batches together,
code which relied on this assumption could cause
mysqld to fail.
When failure handling of an API node takes longer than 300 seconds, extra debug information is included in the resulting output. In cases where the API node's node ID was greater than 48, these extra debug messages could lead to a crash, and confusing output otherwise. This was due to an attempt to provide information specific to data nodes for API nodes as well. (Bug #62208)
In rare cases, a series of node restarts and crashes during restarts could lead to errors while reading the redo log. (Bug #62206)
When global checkpoint indexes were written with no intervening end-of-file or megabyte border markers, this could sometimes lead to a situation in which the end of the redo log was mistakenly regarded as being between these GCIs, so that if the restart of a data node took place before the start of the next redo log was overwritten, the node encountered an Error while reading the REDO log. (Bug #12653993, Bug #61500)
References: See also Bug #56961.
Error reporting has been improved for cases in which API nodes are unable to connect due to apparent unavailability of node IDs. (Bug #12598398)
Error messages for Failed to convert connection transporter registration problems were not specific. (Bug #12589691)
Under certain rare circumstances, a data node process could fail
with Signal 11 during a restart. This was due to uninitialized
variables in the
QMGR kernel block.
Handling of the
configuration parameters was not consistent in all parts of the
NDB kernel, and were only strictly
enforced by the
SUMA kernel blocks. This could lead to
problems when tables could be created but not replicated. Now
these parameters are treated by
DBDICT as suggested maximums rather than hard
limits, as they are elsewhere in the
Within a transaction, after creating, executing, and closing a
scan, creating and executing but not closing a second scan caused the
application to crash.
Two unused test files in
storage/ndb/test/sql contained incorrect
versions of the GNU Lesser General Public License. The files and
the directory containing them have been removed.
References: See also Bug #11810224.
Cluster API: Performing interpreted operations using a unique index did not work correctly, because the interpret bit was kept when sending the lookup to the index table.
A scan with a pushed condition (filter) using the
CommittedRead lock mode could hang for a
short interval when it was aborted just as it had decided
to send a batch.
When aborting a multi-read range scan exactly as it was changing ranges in the local query handler, LQH could fail to detect it, leaving the scan hanging. (Bug #11929643)
Limits imposed by the size of
not always enforced consistently with regard to Disk Data undo
buffers and log files. This could sometimes cause a
CREATE LOGFILE GROUP or
ALTER LOGFILE GROUP statement to
fail for no apparent reason, or cause the log file group
to be created when starting the cluster.
Functionality Added or Changed
for ndb_restore. This option causes
ndb_restore to ignore tables corrupted due to
missing blob parts tables, and to continue reading from the
backup file and restoring the remaining tables.
References: See also Bug #51652.
Two related problems could occur with read-committed scans made in parallel with transactions combining multiple (concurrent) operations:
When committing a multiple-operation transaction that contained concurrent insert and update operations on the same record, the commit arrived first for the insert and then for the update. If a read-committed scan arrived between these operations, it could thus read incorrect data; in addition, if the scan read variable-size data, it could cause the data node to fail.
When rolling back a multiple-operation transaction having concurrent delete and insert operations on the same record, the abort arrived first for the delete operation, and then for the insert. If a read-committed scan arrived between the delete and the insert, it could incorrectly assume that the record should not be returned (in other words, the scan treated the insert as though it had not yet been committed).
A row insert or update followed by a delete operation on the same row within the same transaction could in some cases lead to a buffer overflow. (Bug #59242)
References: See also Bug #56524. This bug was introduced by Bug #35208.
The FAIL_REP signal, used inside the NDB
kernel to declare that a node has failed, now includes the node
ID of the node that detected the failure. This information can
be useful in debugging.
In some circumstances, an SQL trigger on an
NDB table could read stale data.
During a node takeover, it was possible in some circumstances
for one of the remaining nodes to send an extra transaction
LQH_TRANSCONF) signal to the
DBTC kernel block, conceivably leading to a
crash of the data node trying to take over as the new
A query having multiple predicates joined by
OR in the
WHERE clause and
which used the
sort_union access method (as
EXPLAIN) could return
Trying to drop an index while it was being used to perform scan updates caused data nodes to crash. (Bug #58277, Bug #57057)
When handling failures of multiple data nodes, an error in the construction of internal signals could cause the cluster's remaining nodes to crash. This issue was most likely to affect clusters with large numbers of data nodes. (Bug #58240)
The number of rows affected by a statement that used a
WHERE clause having an
IN condition with a value list
containing a great many elements, and that deleted or updated
enough rows such that
them in batches, was not computed or reported correctly.
A query using
BETWEEN as part of a
WHERE condition could cause
mysqld to hang or crash.
In some circumstances, it was possible for
mysqld to begin a new multi-range read scan
without having closed a previous one. This could lead to
exhaustion of all scan operation objects, transaction objects,
or lock objects (or some combination of these) in
NDB, causing queries to fail with
such errors as Lock wait timeout exceeded
or Connect failure - out of connection
References: See also Bug #58750.
a table with a unique index created with
returned an empty result.
enabled, a query using
LIKE on an
ENUM column of an
NDB table failed to return any
results. This issue is resolved by disabling
performing such queries.
When a slash character (
/) was used as part
of the name of an index on an
table, attempting to execute a
TABLE statement on the table failed with the error
Index not found, and the table was
In certain cases, a race condition could occur when
DROP LOGFILE GROUP removed the
logfile group while a read or write of one of the affected files
was in progress, which in turn could lead to a crash of the data
node.
A race condition could sometimes be created when
DROP TABLESPACE was run
concurrently with a local checkpoint; this could in turn lead to
a crash of the data node.
Disk Data: What should have been an online drop of a multi-column index was actually performed offline. (Bug #55618)
When at least one data node was not running, queries against the
INFORMATION_SCHEMA.FILES table took
an excessive length of time to complete because the MySQL server
waited for responses from any stopped nodes to time out. Now, in
such cases, MySQL does not attempt to contact nodes which are
not known to be running.
Attempting to read the same value (using
than 9000 times within the same transaction caused the
transaction to hang when executed. Now when more reads are
performed in this way than can be accommodated in a single
transaction, the call to
with a suitable error.
Functionality Added or Changed
The Id configuration parameter used with
MySQL Cluster management, data, and API nodes (including SQL
nodes) is now deprecated, and the
parameter (long available as a synonym for
when configuring these types of nodes) should be used instead.
Id continues to be supported for reasons of
backward compatibility, but now generates a warning when used
with these types of nodes, and is subject to removal in a future
release of MySQL Cluster.
This change affects the name of the configuration parameter
only, establishing a clear preference for
Id in the
sections of the MySQL Cluster global configuration
config.ini) file. The behavior of unique
identifiers for management, data, and SQL and API nodes in MySQL
Cluster has not otherwise been altered.
The Id parameter as used in the
[computer] section of the MySQL Cluster
global configuration file is not affected by this change.
MySQL Cluster RPM distributions did not include a
shared-compat RPM for the MySQL Server, which
meant that MySQL applications depending on
libmysqlclient.so.15 (MySQL 5.0 and
earlier) no longer worked.
The LQHKEYREQ request message used by the
local query handler when checking the major schema version of a
table, being only 16 bits wide, could cause this check to fail
with an Invalid schema version error
(NDB error code 1227). This issue
occurred after creating and dropping (and re-creating) the same
table 65537 times, then trying to insert rows into the table.
References: See also Bug #57897.
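The truncation described above can be illustrated with a short Python sketch. It assumes, for illustration only, that the table's major schema version is an integer that grows each time the table is re-created and that the request carries only its low 16 bits. The function names are hypothetical and are not part of the NDB kernel or the NDB API, and the masked comparison is just one possible remedy, not necessarily the fix that was applied.

```python
SCHEMA_VERSION_BITS = 16                      # field width stated in the text
VERSION_MASK = (1 << SCHEMA_VERSION_BITS) - 1


def truncate_version(major_version: int) -> int:
    """Keep only the low 16 bits, as a 16-bit message field would."""
    return major_version & VERSION_MASK


def naive_check(table_version: int, request_field: int) -> bool:
    """Compare the full table version with the truncated request field."""
    return table_version == request_field


def width_aware_check(table_version: int, request_field: int) -> bool:
    """Truncate the table version to the same width before comparing."""
    return truncate_version(table_version) == request_field


# A version that still fits in 16 bits passes either check.
small_ok = naive_check(5, truncate_version(5))

# Once the version no longer fits, the naive comparison fails even though
# the request refers to the current version of the table.
large_naive = naive_check(65537, truncate_version(65537))
large_aware = width_aware_check(65537, truncate_version(65537))
```

A width-aware comparison of this kind cannot tell versions apart once they differ by a multiple of 65536, which is the usual trade-off of a fixed-width field.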
An internal buffer overrun could cause a data node to fail. (Bug #57767)
Data nodes compiled with gcc 4.5 or higher crashed during startup. (Bug #57761)
ndb_restore now retries failed transactions when replaying log entries, just as it does when restoring data. (Bug #57618)
During a GCP takeover, it was possible for one of the data nodes
not to receive a
with the result that it would report itself as
GCP_COMMITTING while the other data nodes
of the form
when selecting from
NDB table having a primary key
on multiple columns could result in Error 4259
Invalid set of range scan bounds if
range2 started exactly where
range1 ended and the primary key
definition declared the columns in a different order relative to
the order in the table's column list. (Such a query should
simply return all rows in the table, since any expression
is always true.)
CREATE TABLE t (a INT, b INT, PRIMARY KEY (b, a)) ENGINE NDB;  -- column types (INT) are illustrative
This issue could then be triggered by a query such as this one:
SELECT * FROM t WHERE b < 8 OR b >= 8;
In addition, the order of the ranges in the
WHERE clause was significant; the issue was
not triggered, for example, by the query
SELECT * FROM t WHERE b <= 8 OR b > 8.
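As a quick check of the reasoning above, the following Python sketch evaluates both WHERE conditions over a small set of integer values for b. The sample values are assumptions made for illustration. Because primary key columns cannot be NULL, every row satisfies either predicate, so both queries should return the whole table.

```python
# Hypothetical sample values for column b (primary key columns are NOT NULL).
sample_b_values = list(range(0, 17))


def first_query_matches(b: int) -> bool:
    """WHERE b < 8 OR b >= 8: the second range starts where the first ends."""
    return b < 8 or b >= 8


def second_query_matches(b: int) -> bool:
    """WHERE b <= 8 OR b > 8: the same rows, split at a different bound."""
    return b <= 8 or b > 8


rows_first = [b for b in sample_b_values if first_query_matches(b)]
rows_second = [b for b in sample_b_values if second_query_matches(b)]
```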
A GCP stop is detected using 2 parameters which determine the
maximum time that a global checkpoint or epoch can go unchanged;
one of these controls this timeout for GCPs and one controls the
timeout for epochs. Suppose the cluster is configured such that
is 100 ms but
1500 ms. A node failure can be signalled after 4 missed
heartbeats—in this case, 6000 ms. However, this would
causing false detection of a GCP. To prevent this from
happening, the configured value for
is automatically adjusted, based on the values of
The current issue arose when the automatic adjustment routine
did not correctly take into consideration the fact that, during
cascading node-failures, several intervals of length
* (HeartbeatIntervalDBDB + ArbitrationTimeout) may
elapse before all node failures have internally been resolved.
This could cause false GCP detection in the event of a cascading
node failure.
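The arithmetic in the example above can be written out as a small Python sketch. It models only the relationship stated in the text (4 missed heartbeats at 1500 ms each, compared with a 100 ms timeout). The function and variable names are hypothetical, and the real adjustment routine takes further parameters into account that are not modelled here.

```python
MISSED_HEARTBEATS_BEFORE_FAILURE = 4  # figure taken from the text above


def node_failure_detection_ms(heartbeat_interval_ms: int) -> int:
    """Time that can pass before a node failure is signalled."""
    return MISSED_HEARTBEATS_BEFORE_FAILURE * heartbeat_interval_ms


def adjusted_timeout_ms(configured_timeout_ms: int,
                        heartbeat_interval_ms: int) -> int:
    """Simplified adjustment (an assumption): never allow the timeout to be
    shorter than the time needed to detect a node failure."""
    return max(configured_timeout_ms,
               node_failure_detection_ms(heartbeat_interval_ms))


detection = node_failure_detection_ms(1500)   # 4 * 1500 ms
effective = adjusted_timeout_ms(100, 1500)    # 100 ms is too short here
```

With these inputs the configured 100 ms timeout would expire long before the node failure is detected, which is why an upward adjustment is needed.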
WHERE against an
NDB table having a
VARCHAR column as its primary key
failed to return all matching rows.
When a data node angel process failed to fork off a new worker
process (to replace one that had failed), the failure was not
handled. This meant that the angel process either transformed
itself into a worker process, or itself failed. In the first
case, the data node continued to run, but there was no longer
any angel to restart it in the event of failure, even with
StopOnError set to 0.
An application dropping a table at the same time that another
application tried to set up a replication event on the same
table could lead to a crash of the data node. The same issue
could sometimes cause
An NDB API client program under load could abort with an
assertion error in
References: See also Bug #32708.
Functionality Added or Changed
References: See also Bug #34325, Bug #11747863.
The MGM API function
was previously internal, has now been moved to the public API.
This function can be used to get
engine and other version information from the management server.
References: See also Bug #51273.
A data node can be shut down having completed and synchronized a
given GCI x, while having written a
great many log records belonging to the next GCI
x + 1, as part of normal operations.
However, when starting, completing, and synchronizing GCI
x + 1, the log records from the
original start must not be read. To make sure that this does not
happen, the REDO log reader finds the last GCI to restore, scans
forward from that point, and erases any log records that were
not (and should never be) used.
The current issue occurred because this scan stopped as soon as it encountered an empty page. This was problematic because the REDO log is divided into several files; thus, it could be that there were log records at the beginning of the next file, even if the end of the previous file was empty. These log records were never invalidated; following a start or restart, they could be reused, leading to a corrupt REDO log. (Bug #56961)
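The difference between the two scanning strategies can be sketched in Python. This is a simplified model under stated assumptions: the log is a plain list of files, each file is a list of pages, an empty list stands for an empty page, and the circular nature of the real REDO log is ignored. The function names are hypothetical, and the second function is only one plausible reading of the corrected behavior.

```python
from typing import List, Tuple

Page = List[str]        # records on a page; [] means an empty page
LogFile = List[Page]


def scan_stop_at_first_empty(files: List[LogFile], start_file: int,
                             start_page: int) -> List[Tuple[int, int]]:
    """Old behavior: stop at the first empty page, wherever it is."""
    to_invalidate = []
    f, p = start_file, start_page
    while f < len(files):
        while p < len(files[f]):
            if not files[f][p]:
                return to_invalidate
            to_invalidate.append((f, p))
            p += 1
        f, p = f + 1, 0
    return to_invalidate


def scan_checking_next_file(files: List[LogFile], start_file: int,
                            start_page: int) -> List[Tuple[int, int]]:
    """Sketch of a fix: an empty page ends the current file only; stop
    when a later file begins with an empty page."""
    to_invalidate = []
    f, p = start_file, start_page
    while f < len(files):
        first_checked = p
        while p < len(files[f]) and files[f][p]:
            to_invalidate.append((f, p))
            p += 1
        if p == first_checked and f != start_file:
            break  # a later file starts empty: nothing further to erase
        f, p = f + 1, 0
    return to_invalidate


# File 0 ends with empty pages, but file 1 begins with a leftover record.
log = [[["a"], ["b"], [], []], [["c"], [], []], [[], []]]
old_result = scan_stop_at_first_empty(log, 0, 0)
new_result = scan_checking_next_file(log, 0, 0)
```

In this model the old scan leaves the record at the start of the second file untouched, which corresponds to the stale log records described above.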
An error in program flow in
result in data node shutdown routines being called multiple
DROP TABLE operations among
several SQL nodes attached to a MySQL Cluster. the
LOCK_OPEN lock normally protecting
mysqld's internal table list is released
so that other queries or DML statements are not blocked.
However, to make sure that other DDL is not executed
simultaneously, a global schema lock (implemented as a row-level
NDB) is used, such that all
operations that can modify the state of the
mysqld internal table list also need to
acquire this global schema lock. The
TABLE STATUS statement did not acquire this lock.
In certain cases,
could sometimes leave behind a cached table object, which caused
problems with subsequent DDL operations.
Memory pages used for
assigned to ordered indexes, were never freed, even after any
rows that belonged to the corresponding indexes had been
MySQL Cluster stores, for each row in each
NDB table, a Global Checkpoint Index (GCI)
which identifies the last committed transaction that modified
the row. As such, a GCI can be thought of as a coarse-grained
Due to changes in the format used by
store local checkpoints (LCPs) in MySQL Cluster NDB 6.3.11, it
could happen that, following cluster shutdown and subsequent
recovery, the GCI values for some rows could be changed
unnecessarily; this could possibly, over the course of many node
or system restarts (or both), lead to an inconsistent database.
When multiple SQL nodes were connected to the cluster and one of
them stopped in the middle of a DDL operation, the
mysqld process issuing the DDL timed out with
the error distributing
tbl_name timed out.
TABLE ... ADD COLUMN operation that changed the table
schema such that the number of 32-bit words used for the bitmask
allocated to each DML operation increased during a transaction
in DML which was performed prior to DDL which was followed by
either another DML operation or—if using
replication—a commit, led to data node failure.
This was because the data node did not take into account that the bitmask for the before-image was smaller than the current bitmask, which caused the node to crash. (Bug #56524)
References: This bug is a regression of Bug #35208.
The text file
containing old MySQL Cluster changelog information was no longer
being maintained, and so has been removed from the tree.
The failure of a data node during some scans could cause other data nodes to fail. (Bug #54945)
Exhausting the number of available commit-ack markers
(controlled by the
parameter) led to a data node crash.
When running a
SELECT on an
NDB table with
TEXT columns, memory was
allocated for the columns but was not freed until the end of the
SELECT. This could cause problems
with excessive memory usage when dumping (using for example
mysqldump) tables with such columns and
having many rows, large column values, or both.
References: See also Bug #56488, Bug #50310.
Functionality Added or Changed
More finely grained control over restart-on-failure behavior is
provided with two new data node configuration parameters
limits the total number of retries made before giving up on
starting the data node;
the number of seconds between retry attempts.
These parameters are used only if
StopOnError is set to 0.
For more information, see Defining MySQL Cluster Data Nodes. (Bug #54341)
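The interaction of a retry limit, a retry delay, and StopOnError can be sketched as follows in Python. The parameter names max_retries and retry_delay_s are hypothetical stand-ins, and the sketch assumes that the limit counts retries made after the first failed attempt; it is an illustration of the idea rather than the data node's actual restart logic.

```python
import time
from typing import Callable, Tuple


def start_with_retries(start_once: Callable[[], bool],
                       max_retries: int,
                       retry_delay_s: float,
                       stop_on_error: bool = False,
                       sleep: Callable[[float], None] = time.sleep
                       ) -> Tuple[bool, int]:
    """Try to start; on failure retry up to max_retries times, waiting
    retry_delay_s seconds between attempts. Retries are made only when
    stop_on_error is false. Returns (started, attempts_made)."""
    attempts = 0
    while True:
        attempts += 1
        if start_once():
            return True, attempts
        retries_used = attempts - 1
        if stop_on_error or retries_used >= max_retries:
            return False, attempts
        sleep(retry_delay_s)
```

Passing a custom sleep function, as the tests for this sketch do, makes the delay observable without actually waiting.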
Following a failure of the master data node, the new master sometimes experienced a race condition which caused the node to terminate with a GcpStop error. (Bug #56044)
The warning MaxNoOfExecutionThreads
#) > LockExecuteThreadToCPU count
#), this could cause
contention could be logged when running
ndbd, even though the condition described can
occur only when using ndbmtd.
The graceful shutdown of a data node could sometimes cause transactions to be aborted unnecessarily. (Bug #18538)
References: See also Bug #55641.
Functionality Added or Changed
Important Change; Cluster API:
The poll and select calls made by the MGM API were not
interrupt-safe; that is, a signal caught by the process while
waiting for an event on one or more sockets returned error -1
with errno set to
EINTR. This caused problems with MGM API
functions such as
To fix this problem, the internal
ndb_socket_poller::poll() function has been
made EINTR-safe.
The old version of this function has been retained as
poll_unsafe(), for use by those parts of NDB
that do not need the EINTR-safe version
of the function.
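A common way to make a poll-style call tolerate interruption by a signal is to retry it with the time that remains. The Python sketch below shows that general pattern; it is not a description of how ndb_socket_poller::poll() is implemented. The wrapper name and the poll_once callback are hypothetical, and the interruption is simulated by raising InterruptedError, since Python 3.5 and later retry most system calls automatically.

```python
import time
from typing import Callable


def poll_retrying_on_eintr(poll_once: Callable[[int], int],
                           timeout_ms: int) -> int:
    """Call poll_once(remaining_ms) until it completes without being
    interrupted, shrinking the timeout by the time already spent."""
    deadline = time.monotonic() + timeout_ms / 1000.0
    remaining_ms = timeout_ms
    while True:
        try:
            return poll_once(remaining_ms)
        except InterruptedError:
            # Interrupted by a signal (EINTR): recompute the time left
            # and try again instead of reporting an error to the caller.
            left_s = deadline - time.monotonic()
            remaining_ms = max(0, int(left_s * 1000))
```

Recomputing the remaining time keeps the overall wait close to the requested timeout even if several signals arrive.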
When another data node failed, a given data node
DBTC kernel block could time out while
DBDIH to signal commits of
pending transactions, leading to a crash. Now in such cases the
timeout generates a printout, and the data node continues to
run.
The configure.js option
WITHOUT_DYNAMIC_PLUGINS=TRUE was ignored when
building MySQL Cluster for Windows using
CMake. Among the effects of this issue was
that CMake attempted to build the
InnoDB storage engine as a plugin
(.DLL file) even though the
Plugin is not currently supported by MySQL Cluster.
It was possible for a
DATABASE statement to remove
NDB hidden blob tables without
removing the parent tables, with the result that the tables,
although hidden to MySQL clients, were still visible in the
output of ndb_show_tables but could not be
dropped using ndb_drop_table.
An excessive number of timeout warnings (normally used only for debugging) were written to the data node logs. (Bug #53987)
Disk Data: As an optimization when inserting a row to an empty page, the page is not read, but rather simply initialized. However, this optimization was performed in all cases when an empty row was inserted, even though it should have been done only if it was the first time that the page had been used by a table or fragment. This is because, if the page had been in use, and then all records had been released from it, the page still needed to be read to learn its log sequence number (LSN).
This caused problems only if the page had been flushed using an incorrect LSN and the data node failed before any local checkpoint was completed—which would remove any need to apply the undo log, hence the incorrect LSN was ignored.
The user-visible result of the incorrect LSN was that it caused the data node to fail during a restart. It was perhaps also possible (although not conclusively proven) that this issue could lead to incorrect data. (Bug #54986)
Functionality Added or Changed
Restrictions on some types of mismatches in column definitions when restoring data using ndb_restore have been relaxed. These include the following types of mismatches:
Different default values
Different distribution key settings
Now, when one of these types of mismatches in column definitions is encountered, ndb_restore no longer stops with an error; instead, it accepts the data and inserts it into the target table, while issuing a warning to the user.
For more information, see ndb_restore — Restore a MySQL Cluster Backup. (Bug #54423)
References: See also Bug #53810, Bug #54178, Bug #54242, Bug #54279.
Added the HeartbeatOrder data node
configuration parameter, which can be used to set the order in
which heartbeats are transmitted between data nodes. This
parameter can be useful in situations where multiple data nodes
are running on the same host and a temporary disruption in
connectivity between hosts would otherwise cause the loss of a
node group, leading to failure of the cluster.
The disconnection of all API nodes (including SQL nodes) during
ALTER TABLE caused a memory leak.
If a node shutdown (either in isolation or as part of a system shutdown) occurred directly following a local checkpoint, it was possible that this local checkpoint would not be used when restoring the cluster. (Bug #54611)
When performing an online alter table where 2 or more SQL nodes connected to the cluster were generating binary logs, an incorrect message could be sent from the data nodes, causing mysqld processes to crash. This problem was often difficult to detect, because restarting SQL node or data node processes could clear the error, and because the crash in mysqld did not occur until several minutes after the erroneous message was sent and received. (Bug #54168)
A table having the maximum number of attributes permitted could not be backed up using the ndb_mgm client.
The maximum number of attributes supported per table is not the same for all MySQL Cluster releases. See Limits Associated with Database Objects in MySQL Cluster, to determine the maximum that applies in the release which you are using.
During initial node restarts, initialization of the REDO log was always performed 1 node at a time, during start phase 4. Now this is done during start phase 2, so that the initialization can be performed in parallel, thus decreasing the time required for initial restarts involving multiple nodes. (Bug #50062)
Cluster API: When using the NDB API, it was possible to rename a table to the same name as that of an existing table.
This issue did not affect table renames executed using SQL on MySQL servers acting as MySQL Cluster API nodes.
Cluster API: An excessive number of client connections, such that more than 1024 file descriptors, sockets, or both were open, caused NDB API applications to crash. (Bug #34303)
Functionality Added or Changed
The --wait-nodes option has been added for
ndb_waiter. When this option is used, the
program waits only for the nodes having the listed IDs to reach
the desired state. For more information, see
ndb_waiter — Wait for MySQL Cluster to Reach a Given Status.
option for ndb_restore. This option causes
ndb_restore to ignore any schema objects
which it does not recognize. Currently, this is useful chiefly
for restoring native backups made from a cluster running MySQL
Cluster NDB 7.0 to a cluster running MySQL Cluster NDB 6.3.
Incompatible Change; Cluster API: The default behavior of the NDB API Event API has changed as follows:
Previously, when creating an
Event, DDL operations (alter
and drop operations on tables) were automatically reported on
any event operation that used this event, but as a result of
this change, this is no longer the case. Instead, you must now
invoke the event's
setReport() method, with
ER_DDL, to get this behavior.
For existing NDB API applications where you wish to retain the
old behavior, you must update the code as indicated previously,
then recompile, following an upgrade. Otherwise, DDL operations
are no longer reported after upgrading
When attempting to create an
table on an SQL node that had not yet connected to a MySQL
Cluster management server since the SQL node's last
statement failed as expected, but with the unexpected Error 1495
For the partitioned engine it is necessary to define
(Bug #11747335, Bug #31853)
Creating a Disk Data table, dropping it, then creating an
in-memory table and performing a restart, could cause data node
processes to fail with errors in the
kernel block if the new table's internal ID was the same as
that of the old Disk Data table. This could occur because undo
log handling during the restart did not check that the table
having this ID was now in-memory only.
A table created while
enabled was not always stored to disk, which could lead to a
data node crash with Error opening DIH schema files
An internal buffer allocator used by
NDB has the form
alloc( and attempts to
wanted pages, but is
permitted to allocate a smaller number of pages, between
minimum. However, this allocator
could sometimes allocate fewer than
minimum pages, causing problems with
multi-threaded building of ordered indexes.
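The contract described above (grant as many pages as possible up to the wanted number, but never fewer than the minimum) can be sketched in Python. The PagePool class is hypothetical and is not NDB's allocator; in particular, returning 0 when the minimum cannot be met is an assumption made for this illustration.

```python
class PagePool:
    """Toy page pool that only tracks how many pages are free."""

    def __init__(self, free_pages: int) -> None:
        self.free_pages = free_pages

    def alloc(self, wanted: int, minimum: int) -> int:
        """Grant between minimum and wanted pages, or none at all.

        Returns the number of pages granted; 0 means the request failed
        because fewer than minimum pages were available (assumed
        failure convention)."""
        if minimum > wanted:
            raise ValueError("minimum must not exceed wanted")
        if self.free_pages < minimum:
            return 0
        granted = min(wanted, self.free_pages)
        self.free_pages -= granted
        return granted


pool = PagePool(free_pages=10)
full_grant = pool.alloc(wanted=6, minimum=2)     # enough pages: all 6
partial_grant = pool.alloc(wanted=6, minimum=2)  # only 4 left: grants 4
refused = pool.alloc(wanted=6, minimum=2)        # nothing left: refused
```

The key property is that a caller never receives a non-zero grant smaller than the minimum it asked for.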
When compiled with support for
epoll but this
functionality is not available at runtime, MySQL Cluster tries
to fall back to use the
select() function in
its place. However, an extra
in the transporter registry code caused ndbd
to fail instead.
NDB truncated a column declared as
DECIMAL(65,0) to a length of 64.
Now such a column is accepted and handled correctly. In cases
where the maximum length (65) is exceeded,
NDB now raises an error instead of
When a watchdog shutdown occurred due to an error, the process
was not terminated quickly enough, sometimes resulting in a
hang. (To correct this, the internal
function is now called in such situations, rather than
higher than 4G on 32-bit platforms caused
ndbd to crash, instead of failing gracefully
with an error.
(Bug #52536, Bug #50928)
NDB did not distinguish correctly between table names differing
only by lettercase when
lower_case_table_names was set
ndb_mgm -e "ALL STATUS" erroneously reported
that data nodes remained in start phase 0 until they had
Functionality Added or Changed
It is now possible to determine, using the
ndb_desc utility or the NDB API, which data
nodes contain replicas of which partitions. For
ndb_desc, a new
--extra-node-info option is
added to cause this information to be included in its output. A
added to the NDB API for obtaining this information
DUMP commands returned output to all
ndb_mgm clients connected to the same MySQL
Cluster. Now, these commands return their output only to the
ndb_mgm client that actually issued the
If a node or cluster failure occurred while
mysqld was scanning the
ndb.ndb_schema table (which it does when
attempting to connect to the cluster), insufficient error
handling could lead to a crash by mysqld in
certain cases. This could happen in a MySQL Cluster with a great
many tables, when trying to restart data nodes while one or more
mysqld processes were restarting.
After running a mixed series of node and system restarts, a system restart could hang or fail altogether. This was caused by setting the value of the newest completed global checkpoint too low for a data node performing a node restart, which led to the node reporting incorrect GCI intervals for its first local checkpoint. (Bug #52217)
When performing a complex mix of node restarts and system
restarts, the node that was elected as master sometimes required
optimized node recovery due to missing
information. When this happened, the node crashed with
Failure to recreate object ... during restart, error
721 (because the
code was run twice). Now when this occurs, node takeover is
executed immediately, rather than being made to wait until the
remaining data nodes have started.
References: See also Bug #48436.
The redo log protects itself from being filled up by
periodically checking how much space remains free. If
insufficient redo log space is available, it sets the state
TAIL_PROBLEM which results in transactions
being aborted with error code 410 (out of redo
log). However, this state was not set following a
node restart, which meant that if a data node had insufficient
redo log space following a node restart, it could crash a short
time later with Fatal error due to end of REDO
log. Now, this space is checked during node
The output of the ndb_mgm client
REPORT BACKUPSTATUS command could sometimes
contain errors due to uninitialized data.
GROUP BY query against
NDB tables sometimes did not use
any indexes unless the query included a
INDEX option. With this fix, indexes are used by such
queries (where otherwise possible) even when
INDEX is not specified.
The ndb_mgm client sometimes inserted extra
prompts within the output of the
Issuing a command in the ndb_mgm client after it had lost its connection to the management server could cause the client to crash. (Bug #49219)
The ndb_print_backup_file utility failed to function, due to a previous internal change in the NDB code. (Bug #41512, Bug #48673)
configuration parameter was set in
config.ini, the ndb_mgm
REPORT MEMORYUSAGE command printed its
output multiple times.
ndb_mgm -e "... REPORT ..." did not write any
The fix for this issue also prevents the cluster log from being
INFO messages when
DataMemory usage reaches
100%, and ensures that when the usage is decreased, an
appropriate message is written to the cluster log.
(Bug #31542, Bug #44183, Bug #49782)
Disk Data: Inserts of blob column values into a MySQL Cluster Disk Data table that exhausted the tablespace resulted in misleading no such tuple error messages rather than the expected error tablespace full.
This issue appeared similar to Bug #48113, but had a different underlying cause. (Bug #52201)
The error message returned after attempting to execute
ALTER LOGFILE GROUP on a
nonexistent logfile group did not indicate the reason for the
When reading blob data with lock mode
LM_SimpleRead, the lock was not upgraded as
A number of issues were corrected in the NDB API coding examples
found in the
directory in the MySQL Cluster source tree. These included
possible endless recursion in
ndbapi_scan.cpp as well as problems running
some of the examples on systems using Windows or Mac OS X due to
the lettercase used for some table names.
(Bug #30552, Bug #30737)
Functionality Added or Changed
A new configuration parameter
HeartbeatThreadPriority makes it possible to
select between a first-in, first-out or round-robin scheduling
policy for management node and API node heartbeat threads, as
well as to set the priority of these threads. See
Defining a MySQL Cluster Management Server, or
Defining SQL and Other API Nodes in a MySQL Cluster, for more
The ndb_desc utility can now show the extent
space and free extent space for subordinate
TEXT columns (stored in hidden
BLOB tables by NDB). A
--blob-info option has been
added for this program that causes ndb_desc
to generate a report for each subordinate
BLOB table. For more information, see
ndb_desc — Describe NDB Tables.
When one or more data nodes read their LCPs and applied undo logs significantly faster than others, this could lead to a race condition causing system restarts of data nodes to hang. This could most often occur when using both ndbd and ndbmtd processes for the data nodes. (Bug #51644)
When deciding how to divide the REDO log, the
DBDIH kernel block saved more than was needed
to restore the previous local checkpoint, which could cause REDO
log space to be exhausted prematurely (
DML operations can fail with
NDB error 1220
(REDO log files overloaded...) if the
opening and closing of REDO log files takes too much time. If
this occurred as a GCI marker was being written in the REDO log
while REDO log file 0 was being opened or closed, the error
could persist until a GCP stop was encountered. This issue could
be triggered when there was insufficient REDO log space (for
example, with configuration parameter settings
NoOfFragmentLogFiles = 6 and
FragmentLogFileSize = 6M) with a load
including a very high number of updates.
References: See also Bug #20904.
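To put the example settings above in perspective, the following Python sketch estimates the total redo log capacity per data node. It assumes the layout described in the NDB configuration documentation, in which NoOfFragmentLogFiles counts sets of 4 files, each of size FragmentLogFileSize; the function name is illustrative only and is not part of any MySQL Cluster tool.

```python
FILES_PER_SET = 4  # assumption: redo log files are allocated in sets of 4


def redo_log_capacity_mb(no_of_fragment_log_files: int,
                         fragment_log_file_size_mb: int) -> int:
    """Return the approximate redo log capacity of one data node, in MB."""
    return no_of_fragment_log_files * FILES_PER_SET * fragment_log_file_size_mb


# Settings quoted in the entry above: 6 sets of 6 MB files
print(redo_log_capacity_mb(6, 6))    # 144

# A larger configuration, for comparison: 16 sets of 16 MB files
print(redo_log_capacity_mb(16, 16))  # 1024
```

With only 144 MB of redo log available, an update-heavy load can plausibly fill the log quickly, which matches the conditions described for this issue.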
During an online upgrade from MySQL Cluster NDB 6.2 to MySQL
Cluster NDB 6.3, a sufficiently large amount of traffic with
more than 1 DML operation per transaction could lead an NDB 6.3
data node to crash an NDB 6.2 data node with an internal error
DBLOQH kernel block.
A side effect of the ndb_restore
--rebuild-indexes options is
to change the schema versions of indexes. When a
mysqld later tried to drop a table that had
been restored from backup using one or both of these options,
the server failed to detect these changed indexes. This caused
the table to be dropped, but the indexes to be left behind,
leading to problems with subsequent backup and restore
ndb_restore crashed while trying to restore a corrupted backup, due to missing error handling. (Bug #51223)
The ndb_restore message
created index `PRIMARY`... was directed to
stderr instead of
equal to 1 or 2, if data nodes from one node group were
restarted 256 times and applications were running traffic such
that it would encounter
1204 (Temporary failure, distribution
changed), the live node in the node group would
crash, causing the cluster to crash as well. The crash occurred
only when the error was encountered on the 256th restart; having
the error on any previous or subsequent restart did not cause
For a Disk Data tablespace whose extent size was not equal to a
whole multiple of 32K, the value of the
FREE_EXTENTS column in the
INFORMATION_SCHEMA.FILES table was
smaller than the value of
As part of this fix, the implicit rounding of
UNDO_BUFFER_SIZE performed by
CREATE TABLESPACE Syntax) is now done explicitly, and
the rounded values are used for calculating
values and other purposes.
References: See also Bug #31712.
Once all data files associated with a given tablespace had been
dropped, there was no way for MySQL client applications
(including the mysql client) to tell that the
tablespace still existed. To remedy this problem,
INFORMATION_SCHEMA.FILES now holds
an additional row for each tablespace. (Previously, only the
data files in each tablespace were shown.) This row shows
TABLESPACE in the
FILE_TYPE column, and
It was possible to issue a
TABLESPACE statement in which
INITIAL_SIZE was less than
EXTENT_SIZE. (In such cases,
erroneously reported the value of the
FREE_EXTENTS column as
and that of the
TOTAL_EXTENTS column as
0.) Now when either of these statements is
issued such that
INITIAL_SIZE is less than
EXTENT_SIZE, the statement fails with an
appropriate error message.
References: See also Bug #49709.
Cluster API: An issue internal to ndb_mgm could cause problems when trying to start a large number of data nodes at the same time. (Bug #51273)
References: See also Bug #51310.
greater than 2GB could cause data nodes to crash while starting.
An initial restart of a data node configured with a large amount of memory could fail with a Pointer too large error. (Bug #51027)
References: This bug was introduced by Bug #47818.
Functionality Added or Changed
The maximum permitted value of the
system variable has been increased from 256 to 65536.
greater than 1 with more than 31 ordered indexes caused node and
system restarts to fail.
Dropping unique indexes in parallel while they were in use could cause node and cluster failures. (Bug #50118)
When setting the
configuration parameter failed, only the error Failed
to memlock pages... was returned. Now in such cases
the operating system's error code is also returned.
If a query on an
NDB table compared
a constant string value to a column, and the length of the
string was greater than that of the column, condition pushdown
did not work correctly. (The string was truncated to fit the
column length before being pushed down.) Now in such cases, the
condition is no longer pushed down.
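The effect of truncating the constant can be illustrated with a small Python model. This is a simplified sketch, not the NDB implementation: it ignores padding and collation rules, and both function names are hypothetical.

```python
def truncated_match(column_value: str, constant: str, column_length: int) -> bool:
    """Model of the old behavior: the constant is cut to the column length
    before the comparison is evaluated."""
    return column_value == constant[:column_length]


def can_push_down(constant: str, column_length: int) -> bool:
    """Model of the fixed behavior: push the condition down only when the
    constant fits in the column without truncation."""
    return len(constant) <= column_length


# A 3-character column holding 'abc', compared with the constant 'abcdef'
print(truncated_match("abc", "abcdef", 3))  # True, although 'abc' != 'abcdef'
print(can_push_down("abcdef", 3))           # False: evaluate on the SQL node instead
print(can_push_down("abc", 3))              # True
```

In this model the truncated comparison reports a match that the untruncated comparison would reject, which is why declining to push the condition down is the safer choice.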
Performing intensive inserts and deletes in parallel with a high
scan load could cause data node crashes due to a failure in the
DBACC kernel block. This was because checking
for when to perform bucket splits or merges considered the first
4 scans only.
During Start Phases 1 and 2, the
command sometimes (falsely) returned
Connected for data nodes running
mysqld could sometimes crash during a commit while trying to handle NDB Error 4028 Node failure caused abort of transaction. (Bug #38577)
the stated memory was not allocated when the node was started,
but rather only when the memory was used by the data node
process for other reasons.
Trying to insert more rows than would fit into an
NDB table caused data nodes to crash. Now in
such situations, the insert fails gracefully with error 633
Table fragment hash index has reached maximum
When a crash occurs due to a problem in Disk Data code, the
currently active page list is printed to
stdout (that is, in one or more
files). One of these lists could contain an endless loop; this
caused a printout that was effectively never-ending. Now in such
cases, a maximum of 512 entries is printed from each list.
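The idea of capping the printout can be sketched in Python as a bounded walk over a linked list that may loop back on itself. The Page class and dump_page_list function are hypothetical names used for illustration; only the limit of 512 entries comes from the entry above.

```python
from typing import List, Optional

MAX_ENTRIES = 512  # limit stated in the changelog entry


class Page:
    """Minimal stand-in for an entry in a page list."""

    def __init__(self, page_id: int) -> None:
        self.page_id = page_id
        self.next: Optional["Page"] = None


def dump_page_list(head: Optional[Page], limit: int = MAX_ENTRIES) -> List[int]:
    """Collect page IDs from the list, stopping after `limit` entries so that
    a corrupted (cyclic) list cannot produce unbounded output."""
    ids: List[int] = []
    node = head
    while node is not None and len(ids) < limit:
        ids.append(node.page_id)
        node = node.next
    return ids


# Build a three-page list whose last page points back to the first
a, b, c = Page(1), Page(2), Page(3)
a.next, b.next, c.next = b, c, a

print(len(dump_page_list(a)))  # 512
print(dump_page_list(a)[:4])   # [1, 2, 3, 1]
```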
Functionality Added or Changed
Added multi-threaded ordered index building capability during
system restarts or node restarts, controlled by the
node configuration parameter (also introduced in this release).
Functionality Added or Changed
This enhanced functionality is supported for upgrades to MySQL
Cluster NDB 7.0 when the
version is 7.0.10 or later.
(Bug #48528, Bug #49163)
Whether a system restart or a node restart is required when resetting that parameter;
Whether cluster nodes need to be restarted using the
--initial option when resetting the
Node takeover during a system restart occurs when the REDO log for one or more data nodes is out of date, so that a node restart is invoked for that node or those nodes. If this happens while a mysqld process is attached to the cluster as an SQL node, the mysqld takes a global schema lock (a row lock), while trying to set up cluster-internal replication.
However, this setup process could fail, causing the global schema lock to be held for an excessive length of time, which made the node restart hang as well. As a result, the mysqld failed to set up cluster-internal replication, which led to tables being read only, and caused one node to hang during the restart.
This issue could actually occur in MySQL Cluster NDB 7.0 only, but the fix was also applied to MySQL Cluster NDB 6.3, to keep the two codebases in alignment.
If the master data node receiving a request for a node ID from a newly started API or data node died before the request had been handled, the management server waited (and kept a mutex) until all handling of this node failure was complete before responding to any other connections, instead of responding to other connections as soon as it was informed of the node failure (that is, it waited until it had received an NF_COMPLETEREP signal rather than a NODE_FAILREP signal). One visible effect of this misbehavior was that management client commands such as SHOW and ALL STATUS responded with unnecessary slowness in such circumstances. (Bug #49207)
When evaluating the options
ndb_restore program overwrote the result of
the database-level options with the result of the table-level
options rather than merging these results together, sometimes
leading to unexpected and unpredictable results.
As part of the fix for this problem, the semantics of these options have been clarified; because of this, the rules governing their evaluation have changed slightly. These changes can be summed up as follows:
--exclude-* options are now evaluated from
right to left in the order in which they are passed to
--exclude-* options are now cumulative.
In the event of a conflict, the first (rightmost) option takes precedence.
For more detailed information and examples, see ndb_restore — Restore a MySQL Cluster Backup. (Bug #48907)
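The evaluation rules listed above can be modeled with a short Python sketch. This is a toy model under stated assumptions, not the ndb_restore implementation: options are represented as hypothetical (kind, target) pairs, a target is either a database name or db.table, and a table that no option mentions is assumed to be restored only when no include option was given.

```python
from typing import List, Tuple

# (kind, target): kind is "include" or "exclude"; target is "db" or "db.table"
Option = Tuple[str, str]


def matches(target: str, db: str, table: str) -> bool:
    """True if the option target names this database or this exact table."""
    return target == db or target == f"{db}.{table}"


def is_restored(options: List[Option], db: str, table: str) -> bool:
    """Scan the options from right to left; the first (rightmost) option that
    mentions the table decides whether it is restored."""
    for kind, target in reversed(options):
        if matches(target, db, table):
            return kind == "include"
    # Assumption: with no matching option, restore only if nothing was
    # explicitly included.
    return not any(kind == "include" for kind, _ in options)


opts = [("include", "db1"), ("exclude", "db1.t2")]
print(is_restored(opts, "db1", "t1"))  # True
print(is_restored(opts, "db1", "t2"))  # False: the rightmost option excludes it
print(is_restored(opts, "db2", "t1"))  # False: db2 was never included

# Reversing the order changes which option wins the conflict
print(is_restored(list(reversed(opts)), "db1", "t2"))  # True
```

The last call shows why ordering matters in this model: when two options conflict for the same table, the one passed furthest to the right is reached first and takes precedence.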
When performing tasks that generated large amounts of I/O (such as when using ndb_restore), an internal memory buffer could overflow, causing data nodes to fail with signal 6.
Subsequent analysis showed that this buffer was not actually required, so this fix removes it. (Bug #48861)
Exhaustion of send buffer memory or long signal memory caused data nodes to crash. Now an appropriate error message is provided instead when this situation occurs. (Bug #48852)
Under certain conditions, accounting of the number of free scan records in the local query handler could be incorrect, so that during node recovery or a local checkpoint operation, the LQH could find itself lacking a scan record that it expected to find, causing the node to crash. (Bug #48697)
References: See also Bug #48564.
The creation of an ordered index on a table undergoing DDL operations could cause a data node crash under certain timing-dependent conditions. (Bug #48604)
During an LCP master takeover, when the newly elected master did
not receive a
COPY_GCI LCP protocol message
but other nodes participating in the local checkpoint had
received one, the new master could use an uninitialized
variable, which caused it to crash.
When running many parallel scans, a local checkpoint (which performs a scan internally) could find itself not getting a scan record, which led to a data node crash. Now an extra scan record is reserved for this purpose, and a problem with obtaining the scan record returns an appropriate error (error code 489, Too many active scans). (Bug #48564)
During a node restart, logging was enabled on a per-fragment
basis as the copying of each fragment was completed but local
checkpoints were not enabled until all fragments were copied,
making it possible to run out of redo log file space
NDB error code 410) before the
restart was complete. Now logging is enabled only after all
fragments have been copied, just prior to enabling local
--configinfo now indicates that
parameters belonging in the
[SHM DEFAULT] sections of the
config.ini file are deprecated or
experimental, as appropriate.
NDB stores blob column data in a
separate, hidden table that is not accessible from MySQL. If
this table was missing for some reason (such as accidental
deletion of the file corresponding to the hidden table) when
making a MySQL Cluster native backup, ndb_restore crashed when
attempting to restore the backup. Now in such cases, ndb_restore
fails with the error message Table
table_name has blob column
column_name) with missing parts
table in backup instead.
DROP DATABASE failed when there
were stale temporary
NDB tables in
the database. This situation could occur if
mysqld crashed during execution of a
DROP TABLE statement after the
table definition had been removed from
NDBCLUSTER but before the
.ndb file had been removed
from the crashed SQL node's data directory. Now, when
DATABASE, it checks for these files and removes them
if there are no corresponding table definitions for them found
NDB table with an
excessive number of large
columns caused the cluster to fail. Now, an attempt to create
such a table is rejected with error 791 (Too many
total bits in bitfields).
References: See also Bug #42047.
When a long-running transaction lasting long enough to cause
Error 410 (REDO log files overloaded) was
later committed or rolled back, it could happen that
NDBCLUSTER was not able to release
the space used for the REDO log, so that the error condition
The most likely cause of such transactions is a bug in the application using MySQL Cluster. This fix should handle most cases where this might occur. (Bug #36500)
Disk Data: When running a write-intensive workload with a very large disk page buffer cache, CPU usage approached 100% during a local checkpoint of a cluster containing Disk Data tables. (Bug #49532)
Disk Data: Repeatedly creating and then dropping Disk Data tables could eventually lead to data node failures. (Bug #45794, Bug #48910)
configuration parameter was set to a nonexistent path, the
data nodes shut down with the generic error code 2341
(Internal program error). Now in such
cases, the error reported is error 2815 (File not
When a DML operation failed due to a uniqueness violation on an
NDB table having more than one
unique index, it was difficult to determine which constraint
caused the failure; it was necessary to obtain an
NdbError object, then decode
details property, which in turn could lead to
memory management issues in application code.
To help solve this problem, a new API method
added, providing a well-formatted string containing more precise
information about the index that caused the unique constraint
violation. The following additional changes are also made in the
NdbError.details is now deprecated
in favor of the new method.
method has been modified to provide more information.
When using blobs, calling
requires the full key to have been set using
getBlobHandle() must access the key for
adding blob table operations. However, if
getBlobHandle() was called without first
setting all parts of the primary key, the application using it
crashed. Now, an appropriate error code is returned instead.
(Bug #28116, Bug #48973)
Using a large number of small fragment log files could cause
NDBCLUSTER to crash while trying to
read them during a restart. This issue was first observed with
1024 fragment log files of 16 MB each.
When the combined length of all names of tables using the
NDB storage engine was greater than
or equal to 1024 bytes, issuing the
BACKUP command in the ndb_mgm
client caused the cluster to crash.
Functionality Added or Changed
Performance: Significant improvements in redo log handling and other file system operations can yield a considerable reduction in the time required for restarts. While actual restart times observed in a production setting will naturally vary according to database size, hardware, and other conditions, our own preliminary testing shows that these optimizations can yield startup times that are faster than those typical of previous MySQL Cluster releases by a factor of 50 or more.
--with-ndb-port-base option for
configure did not function correctly, and has
been deprecated. Attempting to use this option produces the
warning Ignoring deprecated option
Beginning with MySQL Cluster NDB 7.1.0, the deprecation warning
itself is removed, and the
option is simply handled as an unknown and invalid option if you
try to use it.
References: See also Bug #38502.
In certain cases, performing very large inserts on
NDB tables when using
ndbmtd caused the memory allocations for
ordered or unique indexes (or both) to be exceeded. This could
cause aborted transactions and possibly lead to data node
References: See also Bug #48113.
IGNORE statements, batching of updates is now
disabled. This is because such statements failed when batching
of updates was employed if any updates violated a unique
constraint, due to the fact that a unique constraint violation could not
be handled without aborting the transaction.
Starting a data node with a very large amount of
(approximately 90G or more) could lead to a crash of the node due
to job buffer congestion.
For example, consider the table created and populated using these statements:
CREATE TABLE t1 (
    c1 INT NOT NULL,
    c2 INT NOT NULL,
    PRIMARY KEY(c1),
    KEY(c2)
) ENGINE = NDB;
INSERT INTO t1 VALUES(1, 1);
even though they did not change any rows, each still matched a
row, but this was reported incorrectly in both cases, as shown
mysql> UPDATE t1 SET c2 = 1 WHERE c1 = 1;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0

mysql> UPDATE t1 SET c1 = 1 WHERE c2 = 1;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0
Now in such cases, the number of rows matched is correct. (In
the case of each of the example
UPDATE statements just shown,
this is displayed as Rows matched: 1, as it should be.)
This issue could affect
statements involving any indexed columns in
NDB tables, regardless of the type
of index (including
PRIMARY KEY) or the number
of columns covered by the index.
On Solaris, shutting down a management node failed when issuing the command to do so from a client connected to a different management node. (Bug #47948)
FragmentLogFileSize to a
value greater than 256 MB led to errors when trying to read the
redo log file.
SHOW CREATE TABLE did not display
AUTO_INCREMENT value for
NDB tables having
Under some circumstances, when a scan encountered an error early
in processing by the
DBTC kernel block (see
The DBTC Block), a node
could crash as a result. Such errors could be caused by
applications sending incorrect data, or, more rarely, by a
DROP TABLE operation executed in
parallel with a scan.
When starting a node and synchronizing tables, memory pages were allocated even for empty fragments. In certain situations, this could lead to insufficient memory. (Bug #47782)
A very small race-condition between
LQH_TRANSREQ signals when handling node
failure could lead to operations (locks) not being taken over
when they should have been, and subsequently becoming stale.
This could lead to node restart failures, and applications
getting into endless lock-conflicts with operations that were
not released until the node was restarted.
References: See also Bug #41297.
configure failed to honor the
--with-zlib-dir option when trying to build
MySQL Cluster from source.
ndbd was not built correctly when compiled using gcc 4.4.0. (The ndbd binary was built, but could not be started.) (Bug #46113)
If a node failed while sending a fragmented long signal, the receiving node did not free long signal assembly resources that it had allocated for the fragments of the long signal that had already been received. (Bug #44607)
When starting a cluster with a great many tables, it was possible for MySQL client connections as well as the slave SQL thread to issue DML statements against MySQL Cluster tables before mysqld had finished connecting to the cluster and making all tables writeable. This resulted in Table ... is read only errors for clients and the Slave SQL thread.
This issue is fixed by introducing the
--ndb-wait-setup option for the
MySQL server. This provides a configurable maximum amount of
time that mysqld waits for all
NDB tables to become writeable,
before enabling MySQL clients or the slave SQL thread to
References: See also Bug #46955.
When building MySQL Cluster, it was possible to configure the
--with-ndb-port without supplying a
port number. Now in such cases, configure
fails with an error.
References: See also Bug #47941.
When the MySQL server SQL mode included
engine warnings and error codes specific to
NDB were returned when errors occurred,
instead of the MySQL server errors and error codes expected by
some programming APIs (such as Connector/J) and applications.
When a copying operation exhausted the available space on a data
node while copying large
columns, this could lead to failure of the data node and a
Table is full error on the SQL node which
was executing the operation. Examples of such operations could
ALTER TABLE that
INT column to a
BLOB column, or a bulk insert of
BLOB data that failed due to
running out of space or to a duplicate key error.
(Bug #34583, Bug #48040)
References: See also Bug #41674, Bug #45768.
Disk Data: A local checkpoint of an empty fragment could cause a crash during a system restart which was based on that LCP. (Bug #47832)
References: See also Bug #41915.
Cluster API: If an NDB API program reads the same column more than once, it is possible to exceed the maximum permissible message size, in which case the operation should be aborted with NDB error 880 Tried to read too much - too many getValue calls. However, due to a change introduced in MySQL Cluster NDB 6.3.18, the check for this was not done correctly, which instead caused a data node crash. (Bug #48266)
The NDB API methods
NdbOperation::getErrorLine() formerly had
const and non-
variants. The non-
const versions of these
methods have been removed. In addition, the
method has been re-implemented to provide consistent internal
Cluster API: A duplicate read of a column caused NDB API applications to crash. (Bug #45282)
The error handling shown in the example file
ndbapi_scan.cpp included with the MySQL
Cluster distribution was incorrect.
The disconnection of an API or SQL node having a node ID greater than 49 caused a forced shutdown of the cluster. (Bug #47844)
The error message text for
error code 410 (REDO log files
overloaded...) was truncated.
Functionality Added or Changed
Two new columns have been added to the output of
ndb_desc to make it possible to determine how
much of the disk space allocated to a given table or fragment
remains free. (This information is not available from the
FILES table applies only
to Disk Data files.) For more information, see
ndb_desc — Describe NDB Tables.
mysqld allocated an excessively large buffer
BLOB values due to
overestimating their size. (For each row, enough space was
allocated to accommodate every
TEXT column value in the result
set.) This could adversely affect performance when using tables
TEXT columns; in a few extreme
cases, this issue could also cause the host system to run out of
References: See also Bug #47572, Bug #47573.
When an instance of the
handler was recycled (this can happen due to table definition
cache pressure or to operations such as
FLUSH TABLES or
ALTER TABLE), if the last row
read contained blobs of zero length, the buffer was not freed,
even though the reference to it was lost. This resulted in a
For example, consider the table defined and populated as shown here:
CREATE TABLE t (a INT PRIMARY KEY, b LONGTEXT) ENGINE=NDB;
INSERT INTO t VALUES (1, REPEAT('F', 20000));
INSERT INTO t VALUES (2, '');
Now execute the following two statements:
SELECT a, length(b) FROM t ORDER BY a;
FLUSH TABLES;
Prior to the fix, this resulted in a memory leak proportional to
the size of the stored
each time these two statements were executed.
References: See also Bug #47572, Bug #47574.
Large transactions involving joins between tables containing
BLOB columns used excessive
References: See also Bug #47573, Bug #47574.
A variable was left uninitialized while a data node copied data from its peers as part of its startup routine; if the starting node died during this phase, this could lead to a crash of the cluster when the node was later restarted. (Bug #47505)
When a data node restarts, it first runs the redo log until
reaching the latest restorable global checkpoint; after this it
scans the remainder of the redo log file, searching for entries
that should be invalidated so they are not used in any
subsequent restarts. (It is possible, for example, if restoring
GCI number 25, that there might be entries belonging to GCI 26
in the redo log.) However, under certain rare conditions, during
the invalidation process, the redo log files themselves were not
always closed while scanning ahead in the redo log. In rare
cases, this could lead to
exceeded, causing the data node to crash.
For very large values of
overflow when creating large numbers of tables, leading to
NDB error 773 (Out of
string memory, please modify StringMemory config
parameter), even when
StringMemory was set to
100 (100 percent).
The default value for the
configuration parameter, unlike other MySQL Cluster
configuration parameters, was not set in
Signals from a failed API node could be received after an
API_FAILREQ signal (see
Operations and Signals)
has been received from that node, which could result in invalid
states for processing subsequent signals. Now, all pending
signals from a failing API node are processed before any
API_FAILREQ signal is received.
References: See also Bug #44607.
Using triggers on
NDB tables caused
to be treated as having the NDB kernel's internal default
value (32) and the value for this variable as set on the
cluster's SQL nodes to be ignored.
When performing auto-discovery of tables on individual SQL
NDBCLUSTER attempted to overwrite
files and corrupted them.
In the mysql client, create a new table
t2) with same definition as the corrupted
t1). Use your system shell or file
manager to rename the old
.MYD file to
the new file name (for example, mv t1.MYD
t2.MYD). In the mysql client,
repair the new table, drop the old one, and rename the new
table using the old file name (for example,
RENAME TABLE t2
Running ndb_restore with the
could cause it to crash.
(Bug #40428, Bug #33040)
An insert on an
NDB table was not
always flushed properly before performing a scan. One way in
which this issue could manifest was that
LAST_INSERT_ID() sometimes failed
to return correct values when using a trigger on an
When a data node received a
signal from the master before that node had received a
NODE_FAILREP, a race condition could in
References: See also Bug #25364, Bug #28717.
Some joins on large
BLOB columns could cause
mysqld processes to leak memory. The joins
did not need to reference the
BLOB columns directly for this
issue to occur.
On Mac OS X 10.5, commands entered in the management client
failed and sometimes caused the client to hang, although
management client commands invoked using the
-e) option from the system shell worked
For example, the following command failed with an error and hung until killed manually, as shown here:
ndb_mgm> SHOW
Warning, event thread startup failed, degraded printouts as result, errno=36
However, the same management client command, invoked from the system shell as shown here, worked correctly:
ndb_mgm -e "SHOW"
References: See also Bug #34438.
Disk Data: Calculation of free space for Disk Data table fragments was sometimes done incorrectly. This could lead to unnecessary allocation of new extents even when sufficient space was available in existing ones for inserted data. In some cases, this might also lead to crashes when restarting data nodes.
This miscalculation was not reflected in the contents of the
as it applied to extents allocated to a fragment, and not to a
In some circumstances, if an API node encountered a data node
failure between the creation of a transaction and the start of a
scan using that transaction, then any subsequent calls to
closeTransaction() could cause the same
transaction to be started and closed repeatedly.
Performing multiple operations using the same primary key within
could lead to a data node crash.
This fix does not change the fact that performing
multiple operations using the same primary key within the same
not supported; because there is no way to determine the order
of such operations, the result of such combined operations
References: See also Bug #44015.
Functionality Added or Changed
On Solaris platforms, the MySQL Cluster management server and
NDB API applications now use
as the default clock.
A new option
--exclude-missing-columns has been
added for the ndb_restore program. In the
event that any tables in the database or databases being
restored to have fewer columns than the same-named tables in the
backup, the extra columns in the backup's version of the
tables are ignored. For more information, see
ndb_restore — Restore a MySQL Cluster Backup.
This issue, originally resolved in MySQL 5.1.16, re-occurred due to a later (unrelated) change. The fix has been re-applied.
Restarting the cluster following a local checkpoint and an
ALTER TABLE on a non-empty
table caused data nodes to crash.
Full table scans failed to execute when the cluster contained more than 21 table fragments.
The number of table fragments in the cluster can be calculated
as the number of data nodes, times 8 (that is, times the value
of the internal constant
MAX_FRAG_PER_NODE), divided by the number
of replicas. Thus, when
NoOfReplicas = 1, at
least 3 data nodes were required to trigger this issue, and with
NoOfReplicas = 2, at least 4 data nodes
were required to do so.
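The fragment count formula given above can be written out as a short Python sketch. The function name is illustrative; the constant value 8 and the formula itself are taken from the text.

```python
MAX_FRAG_PER_NODE = 8  # value of the internal constant, as stated above


def table_fragments(data_nodes: int, replicas: int) -> int:
    """Number of table fragments: data nodes times MAX_FRAG_PER_NODE,
    divided by the number of replicas."""
    return data_nodes * MAX_FRAG_PER_NODE // replicas


print(table_fragments(2, 1))  # 16
print(table_fragments(3, 1))  # 24, above the limit of 21
print(table_fragments(4, 2))  # 16
print(table_fragments(6, 2))  # 24
```

With one replica, three data nodes are the smallest configuration that exceeds 21 fragments. Applied literally, the formula gives 16 fragments for four data nodes with two replicas, so the four-node figure quoted above presumably depends on details that this entry does not state.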
Killing MySQL Cluster nodes immediately following a local checkpoint could lead to a crash of the cluster when later attempting to perform a system restart.
The exact sequence of events causing this issue was as follows:
Local checkpoint occurs.
Immediately following the LCP, kill the master data node.
Kill the remaining data nodes within a few seconds of killing the master.
Attempt to restart the cluster.
Ending a line in the
config.ini file with
an extra semicolon character (
reading the file to fail with a parsing error.
When combining an index scan and a delete with a primary key delete, the index scan and delete failed to initialize a flag properly. This could in rare circumstances cause a data node to crash. (Bug #46069)
configuration parameter for API nodes (including SQL nodes) has
been added. This is intended to prevent API nodes from re-using
allocated node IDs during cluster restarts. For more
information, see Defining SQL and Other API Nodes in a MySQL Cluster.
The signals used by ndb_restore to send progress information about backups to the cluster log accessed the cluster transporter without using any locks. Because of this, it was theoretically possible that these signals could be interfered with by heartbeat signals if both were sent at the same time, causing the ndb_restore messages to be corrupted. (Bug #45646)
Problems could arise when using
whose size was greater than 341 characters and which used the
utf8_unicode_ci collation. In some cases,
this combination of conditions could cause certain queries and
OPTIMIZE TABLE statements to
An internal NDB API buffer was not properly initialized. (Bug #44977)
When a data node had written its GCI marker to the first page of a megabyte, and that node was later killed during restart after having processed that page (marker) but before completing an LCP, the data node could fail with file system errors. (Bug #44952)
References: See also Bug #42564, Bug #44291.
The warning message Possible bug in Dbdih::execBLOCK_COMMIT_ORD ... could sometimes appear in the cluster log. This warning is obsolete, and has been removed. (Bug #44563)
If the cluster crashed during the execution of a
CREATE LOGFILE GROUP statement,
the cluster could not be restarted afterward.
References: See also Bug #34102.
Partitioning; Disk Data:
NDB table created with a very
large value for the
could—if this table was dropped and a new table with fewer
partitions, but having the same table ID, was
created—cause ndbd to crash when
performing a system restart. This was because the server
attempted to examine each partition whether or not it actually
References: See also Bug #58638.
If the value set in the
config.ini file for
was identical to the value set for
parameter was ignored when starting the data node with
--initial option. As a result, the
Disk Data files in the corresponding directory were not removed
when performing an initial start of the affected data node or
Disk Data: During a checkpoint, restore points are created for both the on-disk and in-memory parts of a Disk Data table. Under certain rare conditions, the in-memory restore point could include or exclude a row that should have been in the snapshot. This would later lead to a crash during or following recovery. (Bug #41915)
References: See also Bug #47832.
Functionality Added or Changed
Two new server status variables
Ndb_scan_count gives the
number of scans executed since the cluster was last started.
the number of scans for which
NDBCLUSTER was able to use
partition pruning. Together, these variables can be used to help
determine in the MySQL server whether table scans are pruned by
The ndb_config utility program can now
provide an offline dump of all MySQL Cluster configuration
parameters including information such as default and permitted
values, brief description, and applicable section of the
config.ini file. A dump in text format is
produced when running ndb_config with the new
--configinfo option, and in XML format when the options
--configinfo --xml are used together.
For more information and examples, see
ndb_config — Extract MySQL Cluster Configuration Information.
Important Change; Partitioning:
User-defined partitioning of an
NDBCLUSTER table without any
primary key sometimes failed, and could cause
mysqld to crash.
Now, if you wish to create an
NDBCLUSTER table with user-defined
partitioning, the table must have an explicit primary key, and
all columns listed in the partitioning expression must be part
of the primary key. The hidden primary key used by the
NDBCLUSTER storage engine is not
sufficient for this purpose. However, if the list of columns is
empty (that is, the table is defined using
[LINEAR] KEY()), then no explicit primary key is required.
This change does not affect the partitioning of tables using any
storage engine other than
pkg installer for MySQL Cluster on
Solaris did not perform a complete installation due to an
invalid directory reference in the postinstall script.
When ndb_config could not find the file
referenced by the
--config-file option, it
tried to read
my.cnf instead, then failed
with a misleading error message.
When a data node was down so long that its most recent local checkpoint depended on a global checkpoint that was no longer restorable, it was possible for it to be unable to use optimized node recovery when being restarted later. (Bug #44844)
References: See also Bug #26913.
did not output any entries for the
parameter. In addition, the default listed for
MaxNoOfFiles was outside the permitted range
References: See also Bug #44685, Bug #44746.
The output of ndb_config
did not provide information about all sections of the
References: See also Bug #44746, Bug #44749.
Inspection of the code revealed that several assignment
=) were used in place of
comparison operators (
References: See also Bug #44570.
It was possible for NDB API applications to insert corrupt data into the database, which could subsequently lead to data node crashes. Now, stricter checking is enforced on input data for inserts and updates. (Bug #44132)
ndb_restore failed when trying to restore data on a big-endian machine from a backup file created on a little-endian machine. (Bug #44069)
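As a general illustration of why byte order matters when a file written on one platform is read on another, the following sketch packs an integer in little-endian order and then reads the same bytes back with each byte order. It does not reflect the actual backup file format used by ndb_restore; the helper name `read_u32` is hypothetical.

```python
import struct


def read_u32(raw: bytes, byteorder: str) -> int:
    """Interpret 4 raw bytes as an unsigned 32-bit integer (hypothetical helper)."""
    return int.from_bytes(raw, byteorder)


# The value 1 as written by a little-endian machine.
raw = struct.pack("<I", 1)

as_written = read_u32(raw, "little")  # reader uses the writer's byte order
misread = read_u32(raw, "big")        # reader assumes its own (big-endian) order

print(as_written, misread)
```

A reader that knows the byte order of the writer can convert values as they are read; one that does not sees a different number for the same bytes.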
ndberror.c contained a C++-style
comment, which caused builds to fail with some C compilers.
When trying to use a data node with an older version of the management server, the data node crashed on startup. (Bug #43699)
In some cases, data node restarts during a system restart could fail due to insufficient redo log space. (Bug #43156)
NDBCLUSTER did not build correctly
on Solaris 9 platforms.
References: See also Bug #39036, Bug #39038.
The output of ndbd
did not provide clear information about the program's
It was theoretically possible for the value of a nonexistent
column to be read as
NULL, rather than
causing an error.
Disk Data: This fix supersedes and improves on an earlier fix made for this bug in MySQL 5.1.18. (Bug #24521)
Cluster Replication: If a data node failed during an event creation operation, there was a slight risk that a surviving data node could send an invalid table reference back to NDB, causing the operation to fail with a false Error 723 (No such table). This could take place when a data node failed as a mysqld process was setting up MySQL Cluster Replication. (Bug #43754)
Cluster API: Partition pruning did not work correctly for queries involving multiple range scans.
As part of the fix for this issue, several improvements have
been made in the NDB API, including the addition of a new
method, a new variant of
and a new
values less than 100 were treated as 100. This could cause scans
to time out unexpectedly.
A race condition could occur when a data node failed to restart just before being included in the next global checkpoint. This could cause other data nodes to fail. (Bug #43888)
was measured from the end of one local checkpoint to the
beginning of the next, rather than from the beginning of one LCP
to the beginning of the next. This meant that the time spent
performing the LCP was not taken into account when determining
interval, so that LCPs were not started often enough, possibly
causing data nodes to run out of redo log space prematurely.
Using indexes containing variable-sized columns could lead to internal errors when the indexes were being built. (Bug #43226)
When a data node process had been killed after allocating a node ID, but before making contact with any other data node processes, it was not possible to restart it due to a node ID allocation failure.
This issue could affect either ndbd or ndbmtd processes. (Bug #43224)
References: This bug was introduced by Bug #42973.
Some queries using combinations of logical and comparison
operators on an indexed column in the
clause could fail with the error Got error 4541
'IndexBound has no bound information' from
ndb_restore crashed when trying to restore a backup made to a MySQL Cluster running on a platform having different endianness from that on which the original backup was taken. (Bug #39540)
When aborting an operation involving both an insert and a delete, the insert and delete were aborted separately. This was because the transaction coordinator did not know that the operations affected the same row, and, in the case of a committed-read (tuple or index) scan, the abort of the insert was performed first, then the row was examined after the insert was aborted but before the delete was aborted. In some cases, this would leave the row in an inconsistent state. This could occur when a local checkpoint was performed during a backup. This issue did not affect primary key operations or scans that used locks (these are serialized).
After this fix, for ordered indexes, all operations that follow the operation to be aborted are now also aborted.
Disk Data: When a log file group had an undo log file whose size was too small, restarting data nodes failed with Read underflow errors.
As a result of this fix, the minimum permitted
INITIAL_SIZE for an undo log file is now
1M (1 megabyte).
If the largest offset of a
RecordSpecification used for an
NdbRecord object was for the
NULL bits (and thus not a column), this
offset was not taken into account when calculating the size used
This meant that the space for the
could be overwritten by key or other information.
BIT columns created using the
native NDB API format that were not created as nullable could
still sometimes be overwritten, or cause other columns to be
This issue did not affect tables having
BIT columns created using the
mysqld format (always used by MySQL Cluster SQL nodes).
When performing insert or write operations,
NdbRecord permits key columns
to be specified in both the key record and in the attribute
record. Only one key column value for each key column should be
sent to the NDB kernel, but this was not guaranteed. This is now
ensured as follows: For insert and write operations, key column
values are taken from the key record; for scan takeover update
operations, key column values are taken from the attribute
Ordered index scans using
NdbRecord formerly expressed a
BoundEQ range as separate lower and upper
bounds, resulting in 2 copies of the column values being sent to
the NDB kernel.
Now, when a range is specified by
the passed pointers, key lengths, and inclusive bits are
compared, and only one copy of the equal key columns is sent to
the kernel. This makes such operations more efficient, as half
the amount of
KeyInfo is now sent for a
BoundEQ range as before.
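The comparison described in this entry can be sketched roughly as follows. This is a simplified model, not the NDB API implementation: it compares key values instead of pointers and lengths, and the names `Bound` and `key_info_to_send` are hypothetical.

```python
from typing import List, Tuple

# (key bytes, inclusive flag) for one end of a range; hypothetical type.
Bound = Tuple[bytes, bool]


def key_info_to_send(low: Bound, high: Bound) -> List[bytes]:
    """Return the key copies that would be sent for a range in this model."""
    low_key, low_inclusive = low
    high_key, high_inclusive = high
    # Equal keys with both ends inclusive describe an equality bound,
    # so a single copy of the key is enough.
    if low_key == high_key and low_inclusive and high_inclusive:
        return [low_key]
    # Otherwise both the lower and the upper bound are sent.
    return [low_key, high_key]


equality = key_info_to_send((b"k42", True), (b"k42", True))
open_range = key_info_to_send((b"k10", True), (b"k42", False))
print(len(equality), len(open_range))
```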
Functionality Added or Changed
A new data node configuration parameter
been introduced to facilitate parallel node recovery by causing
a local checkpoint to be delayed while recovering nodes are
synchronizing data dictionaries and other meta-information. For
more information about this parameter, see
Defining MySQL Cluster Data Nodes.
Updates of the
SYSTAB_0 system table to
obtain a unique identifier did not use transaction hints for
tables having no primary key. In such cases the NDB kernel used
a cache size of 1. This meant that each insert into a table not
having a primary key required an update of the corresponding
SYSTAB_0 entry, creating a potential
With this fix, inserts on
NDB tables without
primary keys can under some conditions be performed up to
100% faster than previously.
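The effect of the cache size on the number of SYSTAB_0 updates can be illustrated with a simple count. This is a sketch under stated assumptions: the function name `systab_updates` is hypothetical, and the cache size of 32 is an arbitrary example value, not the size used by the NDB kernel.

```python
def systab_updates(inserts: int, cache_size: int) -> int:
    """Updates needed when each update reserves `cache_size` identifiers."""
    if inserts < 0 or cache_size < 1:
        raise ValueError("inserts must be >= 0 and cache_size >= 1")
    return -(-inserts // cache_size)  # ceiling division


# With a cache size of 1, every insert needs its own update.
print(systab_updates(1000, 1))
# With a larger (example) cache size, far fewer updates are needed.
print(systab_updates(1000, 32))
```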
Packages for MySQL Cluster were missing the
ALTER TABLE ... REORGANIZE
PARTITION on an
NDBCLUSTER table having only one
partition caused mysqld to crash.
References: See also Bug #40389.
Backup IDs greater than 2^31 were not handled correctly, causing negative values to be used in backup directory names and printouts. (Bug #43042)
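Negative values of this kind are what results when an unsigned value of 2^31 or more is interpreted as a signed 32-bit integer. The following sketch shows the effect in isolation; the helper name `as_signed_32` is hypothetical and the code is not taken from the NDB sources.

```python
def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    raw = (value & 0xFFFFFFFF).to_bytes(4, "little")
    return int.from_bytes(raw, "little", signed=True)


for backup_id in (2**31 - 1, 2**31, 2**31 + 5):
    print(backup_id, "->", as_signed_32(backup_id))
```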
When using ndbmtd, NDB kernel threads could
hang while trying to start the data nodes with
set to 1.
When using multiple management servers and starting several API nodes (possibly including one or more SQL nodes) whose connection strings listed the management servers in different order, it was possible for 2 API nodes to be assigned the same node ID. When this happened it was possible for an API node not to get fully connected, consequently producing a number of errors whose cause was not easily recognizable. (Bug #42973)
ndb_error_reporter worked correctly only with GNU tar. (With other versions of tar, it produced empty archives.) (Bug #42753)
caused such tables to become locked.
References: See also Bug #16229, Bug #18135.
Given a MySQL Cluster containing no data (that is, whose data
nodes had all been started using
into which no data had yet been imported) and having an empty
backup directory, executing
START BACKUP with
a user-specified backup ID caused the data nodes to crash.
In some cases,
NDB did not check
correctly whether tables had changed before trying to use the
query cache. This could result in a crash of the debug MySQL
It was not possible to add an in-memory column online to a table
that used a table-level or column-level
DISK option. The same issue prevented
ONLINE TABLE ... REORGANIZE PARTITION from working on
Disk Data tables.
Disk Data: Creating a Disk Data tablespace with a very large extent size caused the data nodes to fail. The issue was observed when using extent sizes of 100 MB and larger. (Bug #39096)
Trying to execute a
GROUP statement using a value greater than
caused data nodes to crash.
As a result of this fix, the upper limit for
UNDO_BUFFER_SIZE is now
600M; attempting to set a higher value now
fails gracefully with an error.
References: See also Bug #36702.
Disk Data: When attempting to create a tablespace that already existed, the error message returned was Table or index with given name already exists. (Bug #32662)
Using a path or file name longer than 128 characters for Disk
Data undo log files and tablespace data files caused a number of
issues, including failures of
TABLESPACE statements, as well as crashes of
management nodes and data nodes.
With this fix, the maximum length for path and file names used for Disk Data undo log files and tablespace data files is now the same as the maximum for the operating system. (Bug #31769, Bug #31770, Bug #31772)
Disk Data: Attempting to perform a system restart of the cluster where there existed a logfile group without any undo log files caused the data nodes to crash.
While issuing a
GROUP statement without an
UNDOFILE option fails with an error in the MySQL
server, this situation could arise if an SQL node failed
during the execution of a valid
LOGFILE GROUP statement; it is also possible to
create a logfile group without any undo log files using the
Some error messages from ndb_mgmd contained
\n) characters. This could break the
MGM API protocol, which uses the newline as a line separator.
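The problem with embedded newlines in a line-oriented protocol can be shown with a small model. This is only a sketch: the reply layout and the field names `result` and `message` are hypothetical and do not reproduce the actual MGM API wire format.

```python
from typing import List


def build_reply(status: str, message: str) -> str:
    """Build a reply with one field per line (hypothetical layout)."""
    return f"result: {status}\nmessage: {message}\n\n"


def parse_reply(raw: str) -> List[str]:
    """Split a reply into its non-empty lines, as a line-based reader would."""
    return [line for line in raw.split("\n") if line]


def sanitize(message: str) -> str:
    """Replace line breaks so that the message occupies a single line."""
    return message.replace("\r", " ").replace("\n", " ")


text = "could not open file\ncheck permissions"
broken = parse_reply(build_reply("error", text))
fixed = parse_reply(build_reply("error", sanitize(text)))
print(len(broken), len(fixed))
```

In the unsanitized case the reader sees an extra line that is not a valid field, which is how an error message can break the framing of the reply.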
Cluster API: When using an ordered index scan without putting all key columns in the read mask, this invalid use of the NDB API went undetected, which resulted in the use of uninitialized memory. (Bug #42591)
Functionality Added or Changed
New options are introduced for ndb_restore for determining which tables or databases should be restored:
--include-databases can be used to restore
specific tables or databases.
--exclude-databases can be used to exclude
the specified tables or databases from being restored.
For more information about these options, see ndb_restore — Restore a MySQL Cluster Backup. (Bug #40429)
It is now possible to specify default locations for Disk Data
data files and undo log files, either together or separately,
using the data node configuration parameters
For information about these configuration parameters, see
Data file system parameters.
It is also now possible to specify a log file group, tablespace,
or both, that is created when the cluster is started, using the
node configuration parameters. For information about these
configuration parameters, see
Data object creation parameters.
When performing more than 32 index or tuple scans on a single fragment, the scans could be left hanging. This caused unnecessary timeouts, and in addition could possibly lead to a hang of an LCP. (Bug #42559)
References: This bug is a regression of Bug #42084.
A data node failure that occurred between calls to
was not correctly handled; a subsequent call to
caused a null pointer to be dereferenced, leading to a segfault in
SHOW GLOBAL STATUS LIKE 'NDB%' before
mysqld had connected to the cluster caused a
Data node failures that occurred before all data nodes had connected to the cluster were not handled correctly, leading to additional data node failures. (Bug #42422)
When a cluster backup failed with Error 1304 (Node
node_id1: Backup request from
node_id2 failed to start), no clear
reason for the failure was provided.
As part of this fix, MySQL Cluster now retries backups in the event of sequence errors. (Bug #42354)
References: See also Bug #22698.
NDBCLUSTER STATUS on an SQL node before the management
server had connected to the cluster caused
mysqld to crash.
Functionality Added or Changed
Formerly, when the management server failed to create a
transporter for a data node connection,
elapsed before the data node was actually permitted to
disconnect. Now in such cases the disconnection occurs
References: See also Bug #41713.
It is now possible while in Single User Mode to restart all data
nodes using ALL RESTART in the management
client. Restarting of individual nodes while in Single User Mode
remains not permitted.
Formerly, when using MySQL Cluster Replication, records for
“empty” epochs—that is, epochs in which no
NDBCLUSTER data or
tables took place—were inserted into the
ndb_binlog_index tables on the slave even
disabled. Beginning with MySQL Cluster NDB 6.2.16 and MySQL
Cluster NDB 6.3.13 this was changed so that these
“empty” epochs were no longer logged. However, it
is now possible to re-enable the older behavior (and cause
“empty” epochs to be logged) by using the
--ndb-log-empty-epochs option. For more
information, see Replication Slave Options and Variables.
References: See also Bug #37472.
A maximum of 11
TUP scans were permitted in
Trying to execute an
ALTER ONLINE TABLE
... ADD COLUMN statement while inserting rows into the
table caused mysqld to crash.
If the master node failed during a global checkpoint, it was possible in some circumstances for the new master to use an incorrect value for the global checkpoint index. This could occur only when the cluster used more than one node group. (Bug #41469)
API nodes disconnected too aggressively from the cluster when data nodes were being restarted. This could sometimes lead to the API node being unable to access the cluster at all during a rolling restart. (Bug #41462)
It was not possible to perform online upgrades from a MySQL Cluster NDB 6.2 release to MySQL Cluster NDB 6.3.8 or a later MySQL Cluster NDB 6.3 release. (Bug #41435)
Cluster log files were opened twice by internal log-handling code, resulting in a resource leak. (Bug #41362)
A race condition in transaction coordinator takeovers (part of node failure handling) could lead to operations (locks) not being taken over and subsequently getting stale. This could lead to subsequent failures of node restarts, and to applications getting into an endless lock conflict with operations that would not complete until the node was restarted. (Bug #41297)
References: See also Bug #41295.
An abort path in the
DBLQH kernel block
failed to release a commit acknowledgment marker. This meant
that, during node failure handling, the local query handler
could be added multiple times to the marker record which could
lead to additional node failures due to an array overflow.
During node failure handling (of a data node other than the
master), there was a chance that the master was waiting for a
GCP_NODEFINISHED signal from the failed node
after having received it from all other data nodes. If this
occurred while the failed node had a transaction that was still
being committed in the current epoch, the master node could
crash in the
DBTC kernel block when
discovering that a transaction actually belonged to an epoch
which was already completed.
EXIT in the management client
sometimes caused the client to hang.
In the event that a MySQL Cluster backup failed due to file permissions issues, conflicting reports were issued in the management client. (Bug #34526)
If all data nodes were shut down, MySQL clients were unable to access
NDBCLUSTER tables and data
even after the data nodes were restarted, unless the MySQL
clients themselves were restarted.
Disk Data: Starting a cluster under load such that Disk Data tables used most of the undo buffer could cause data node failures.
The fix for this bug also corrected an issue in the
LGMAN kernel block where the amount of free
space left in the undo buffer was miscalculated, causing buffer
overruns. This could cause records in the buffer to be
overwritten, leading to problems when restarting data nodes.
mgmapi.h contained constructs which only
worked in C++, but not in C.
Functionality Added or Changed
methods have been added to help in diagnosing problems with NDB
API client connections. The
method tells whether or not the latest connection attempt
succeeded; if the attempt failed,
provides an error message giving the reason.
If a transaction was aborted during the handling of a data node failure, this could lead to the later handling of an API node failure not being completed. (Bug #41214)
SHOW TABLES repeatedly could cause
NDBCLUSTER tables to be dropped.
Statements of the form
UPDATE ... ORDER BY ...
LIMIT run against
NDBCLUSTER tables failed to update
all matching rows, or failed with the error Can't
find record in
Start phase reporting was inconsistent between the management client and the cluster log. (Bug #39667)
Status messages shown in the management client when restarting a
management node were inappropriate and misleading. Now, when
restarting a management node, the messages displayed are as
follows, where node_id is the
management node's node ID:
Shutting down MGM node node_id for restart
Node node_id is being restarted
ndb_mgm>
Disk Data: This improves on a previous fix for this issue that was made in MySQL Cluster 6.3.8. (Bug #37116)
References: See also Bug #29186.
When creating a scan using an
NdbScanFilter object, it was
possible to specify conditions against a
column, but the correct rows were not returned when the scan was
As part of this fix, 4 new comparison operators have been
implemented for use with scans on
For more information about these operators, see The NdbScanFilter::BinaryCondition Type.
Functionality Added or Changed
Important Change; Cluster API: MGM API applications exited without raising any errors if the connection to the management server was lost. The fix for this issue includes two changes:
The MGM API now provides its own
SIGPIPE handler to catch the
“broken pipe” error that occurs when writing
to a closed or reset socket. This means that MGM API now
behaves the same as NDB API in this regard.
A new function
has been added to the MGM API. This function makes it
possible to bypass the
provided by the MGM API.
When performing an initial start of a data node, fragment log
files were always created sparsely—that is, not all bytes
were written. Now it is possible to override this behavior using
Failed operations on
TEXT columns were not always
reported correctly to the originating SQL node. Such errors were
sometimes reported as being due to timeouts, when the actual
problem was a transporter overload due to insufficient buffer
(Bug #39867, Bug #39879)
Undo logs and data files were created in 32K increments. Now these files are created in 512K increments, resulting in shorter creation times. (Bug #40815)
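The difference between the two increment sizes can be put in numbers by counting how many increments are needed to create a file of a given size. The 1 GB file size below is an arbitrary example, and the function name `increments_needed` is hypothetical.

```python
KIB = 1024


def increments_needed(file_size: int, increment: int) -> int:
    """Number of fixed-size increments needed to reach `file_size` bytes."""
    return -(-file_size // increment)  # ceiling division


file_size = 1024 * 1024 * KIB  # example: a 1 GB file
old = increments_needed(file_size, 32 * KIB)
new = increments_needed(file_size, 512 * KIB)
print(old, new, old // new)
```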
Redo log creation was very slow on some platforms, causing MySQL Cluster to start more slowly than necessary with some combinations of hardware and operating system. This was due to all write operations being synchronized to disk while creating a redo log file. Now this synchronization occurs only after the redo log has been created. (Bug #40734)
Transaction failures took longer to handle than was necessary.
When a data node acting as transaction coordinator (TC) failed, the surviving data nodes did not inform the API node initiating the transaction of this until the failure had been processed by all protocols. However, the API node needed only to know about failure handling by the transaction protocol—that is, it needed to be informed only about the TC takeover process. Now, API nodes (including MySQL servers acting as cluster SQL nodes) are informed as soon as the TC takeover is complete, so that they can carry on operating more quickly. (Bug #40697)
It was theoretically possible for stale data to be read from
NDBCLUSTER tables when the
transaction isolation level was set to
SET SESSION ndb_optimized_node_selection = 1
failed with an invalid warning message.
A restarting data node could fail with an error in the
DBDIH kernel block when a local or global
checkpoint was started or triggered just as the node made a
request for data from another data node.
Restoring a MySQL Cluster from a dump made using mysqldump failed due to a spurious error: Can't execute the given command because you have active locked tables or an active transaction. (Bug #40346)
O_DIRECT was incorrectly disabled when making
MySQL Cluster backups.
Heavy DDL usage caused the mysqld processes
to hang due to a timeout error (
error code 266).
Events logged after setting
STATISTICS=15 in the management client did not always
include the node ID of the reporting node.
A segfault in
ndbd to hang indefinitely. This fix improves
on an earlier one for this issue, first made in MySQL Cluster
NDB 6.2.16 and MySQL Cluster NDB 6.3.17.
References: See also Bug #38609.
Memory leaks could occur in handling of strings used for storing cluster metadata and providing output to users. (Bug #38662)
A duplicate key or other error raised when inserting into an
NDBCLUSTER table caused the current
transaction to abort, after which any SQL statement other than a
failed. With this fix, the
NDBCLUSTER storage engine now
performs an implicit rollback when a transaction is aborted in
this way; it is no longer necessary to issue an explicit
statement, and the next statement that is issued automatically
begins a new transaction.
It remains necessary in such cases to retry the complete transaction, regardless of which statement caused it to be aborted.
References: See also Bug #47654.
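On the application side, retrying the complete transaction can be sketched as a loop that replays every statement from the beginning after an abort. This is a minimal sketch under stated assumptions: `TransactionAborted`, `run_transaction`, and the `execute` callback are hypothetical names standing in for whatever client library the application uses.

```python
from typing import Callable, Sequence


class TransactionAborted(Exception):
    """Hypothetical error raised when the server aborts the transaction."""


def run_transaction(statements: Sequence[str],
                    execute: Callable[[str], None],
                    max_attempts: int = 3) -> int:
    """Run all statements; on abort, start again from the first statement.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            for statement in statements:
                execute(statement)
            return attempt
        except TransactionAborted:
            # The whole transaction was rolled back, so replay all of it.
            continue
    raise TransactionAborted(f"gave up after {max_attempts} attempts")
```

Replaying only the failed statement would not be enough in this model, since the earlier statements of the transaction were rolled back along with it.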
Error messages for
codes 1224 and 1227 were missing.
GROUP statements on separate SQL nodes caused a
resource leak that led to data node crashes when these
statements were used again later.
Disk Data: Disk-based variable-length columns were not always handled like their memory-based equivalents, which could potentially lead to a crash of cluster data nodes. (Bug #39645)
O_SYNC was incorrectly disabled on platforms
that do not support
O_DIRECT. This issue was
noted on Solaris but could have affected other platforms not
Cluster API: The MGM API reset error codes on management server handles before checking them. This meant that calling an MGM API function with a null handle caused applications to crash. (Bug #40455)
It was not always possible to access parent objects directly
NdbScanOperation objects. To
alleviate this problem, a new
method has been added to
NdbBlob and new
getNdbTransaction() methods have been added to
NdbScanOperation. In addition,
a const variant of
NdbOperation::getErrorLine() is now also
getBlobHandle() failed when used with
incorrect column names or numbers.
The MGM API function
As part of this fix, it is now possible to specify bind addresses in connection strings. See MySQL Cluster Connection Strings, for more information. (Bug #38473)
Cluster API: The NDB API example programs included in MySQL Cluster source distributions failed to compile. (Bug #37491)
References: See also Bug #40238.
Functionality Added or Changed
It is no longer a requirement for database autodiscovery that an
SQL node already be connected to the cluster at the time that a
database is created on another SQL node. It is no longer
necessary to issue
SCHEMA) statements on an SQL node joining the cluster
after a database is created for the new SQL node to see the
database and any
NDBCLUSTER tables that it
Starting the MySQL Server with the
--ndbcluster option plus an
invalid command-line option (for example, using
--foobar) caused it to hang while shutting down the
binary log thread.
Dropping and then re-creating a database on one SQL node caused other SQL nodes to hang. (Bug #39613)
Setting a low value of
100) and performing a large number of (certain) scans could
cause the Transaction Coordinator to run out of scan fragment
records, and then crash. Now when this resource is exhausted,
the cluster returns Error 291 (Out of scanfrag
records in TC (increase MaxNoOfLocalScans)) instead.
When a transaction included a multi-row insert to an
NDBCLUSTER table that caused a
constraint violation, the transaction failed to roll back.
Creating a unique index on an
NDBCLUSTER table caused a memory
leak in the
SUMA) which could lead to mysqld
hanging, due to the fact that the resource shortage was not
reported back to the
References: See also Bug #39450.
Embedded libmysqld with
NDB did not drop table events.
Unique identifiers in tables having no primary key were not
cached. This fix has been observed to increase the efficiency of
INSERT operations on such tables
by as much as 50%.
When restarting a data node, an excessively long shutdown message could cause the node process to crash. (Bug #38580)
After a forced shutdown and initial restart of the cluster, it
was possible for SQL nodes to retain
files corresponding to
tables that had been dropped, and thus to be unaware that these
tables no longer existed. In such cases, attempting to re-create
the tables using
CREATE TABLE IF NOT EXISTS
could fail with a spurious Table ... doesn't
A statement of the form
where there was no row whose primary key column had the stated
value appeared to succeed, with the
server reporting that 1 row had been changed.
This issue was only known to affect MySQL Cluster NDB 6.3.11 and later NDB 6.3 versions. (Bug #37153)
Support for the
InnoDB storage engine was
missing from the GPL source releases. An updated GPL source
includes code for building
InnoDB can be
MySQL FTP site.
MgmtSrvr::allocNodeId() left a mutex locked
following an Ambiguity for node if %d
An invalid path specification caused mysql-test-run.pl to fail. (Bug #39026)
During transaction coordinator takeover (directly after node
failure), the LQH finding an operation in the
LOG_COMMIT state sent an
LQH_TRANS_CONF signal twice, causing the TC
An invalid memory access caused the management server to crash on Solaris Sparc platforms. (Bug #38628)
A segfault in
ndbd to hang indefinitely.
ndb_mgmd failed to start on older Linux distributions (2.4 kernels) that did not support e-polling. (Bug #38592)
ndb_mgmd sometimes performed unnecessary network I/O with the client. This in combination with other factors led to long-running threads that were attempting to write to clients that no longer existed. (Bug #38563)
ndb_restore failed with a floating point exception due to a division by zero error when trying to restore certain data files. (Bug #38520)
A failed connection to the management server could cause a resource leak in ndb_mgmd. (Bug #38424)
Failure to parse configuration parameters could cause a memory leak in the NDB log parser. (Bug #38380)
NDBCLUSTER table on one
SQL node, caused a trigger on this table to be deleted on
another SQL node.
Attempting to add a
UNIQUE INDEX twice to an
NDBCLUSTER table, then deleting
rows from the table could cause the MySQL Server to crash.
ndb_restore failed when a single table was specified. (Bug #33801)
GCP_COMMIT did not wait for transaction
takeover during node failure. This could cause
GCP_SAVE_REQ to be executed too early. This
could also cause (very rarely) replication to skip rows.
Support for Multi-Range Read index scans using the old API
(using, for example,
was dropped in MySQL Cluster NDB 6.2. This functionality is
restored in MySQL Cluster NDB 6.3 beginning with 6.3.17, but
remains unavailable in MySQL Cluster NDB 6.2. Both MySQL Cluster
NDB 6.2 and 6.3 support Multi-Range Read scans through the
method could be called multiple times without error.
Certain Multi-Range Read scans involving
IS NOT NULL comparisons
failed with an error in the
local query handler.
Problems with the public headers prevented
NDB applications from being built
with warnings turned on.
object using an
NdbScanOperation object that
had not yet had its
method called resulted in a crash when later attempting to use
interpreted delete created with an
option caused the transaction to abort.
Functionality Added or Changed
Event buffer lag reports are now written to the cluster log. (Bug #37427)
Added the --no-binlog option for
ndb_restore. When used, this option prevents
information being written to SQL node binary logs from the
restoration of a cluster backup.
Cluster API: Changing the system time on data nodes could cause MGM API applications to hang and the data nodes to crash. (Bug #35607)
Failure of a data node could sometimes cause mysqld to crash. (Bug #37628)
DELETE ... WHERE
deleted the wrong row from the table.
If a subscription was terminated while a node was down, the epoch was not properly acknowledged by that node. (Bug #37442)
libmysqld failed to wait for the cluster
binary log thread to terminate before exiting.
In rare circumstances, a connection followed by a disconnection could give rise to a “stale” connection where the connection still existed but was not seen by the transporter. (Bug #37338)
When some operations succeeded and some failed following a call
AO_IgnoreOnError), a race condition could cause
spurious occurrences of NDB API Error 4011 (Internal
Cluster API: Creating a table on an SQL node, then starting an NDB API application that listened for events from this table, then dropping the table from an SQL node, prevented data node restarts. (Bug #32949, Bug #37279)
A buffer overrun in
erroneous results on Mac OS X.
In certain rare situations,
could fail with the error Can't use string
value") as a HASH ref while "strict
refs" in use.
A failed attempt to create an
table could in some cases lead to resource leaks or cluster
Attempting to create a native backup of
NDB tables having a large number of
NULL columns and data could lead to node
Checking of API node connections was not efficiently handled. (Bug #36843)
References: See also Bug #36851.
If the combined total of tables and indexes in the cluster was
greater than 4096, issuing
caused data nodes to fail.
Where column values to be compared in a query were of the
NDBCLUSTER passed a value padded to
the full size of the column, which caused unnecessary data to be
sent to the data nodes. This also had the effect of wasting CPU
and network bandwidth, and causing condition pushdown to be
disabled where it could (and should) otherwise have been
When dropping a table failed for any reason (such as when in
single user mode), the corresponding
.ndb file was still removed.
Cluster API: Ordered index scans were not pruned correctly where a partitioning key was specified with an EQ-bound. (Bug #36950)
When an insert operation involving
BLOB data was attempted on a row
which already existed, a duplicate key error was not correctly
reported and the transaction was incorrectly aborted. In some
cases, the existing row could also become corrupted.
References: See also Bug #26756.
NdbApi.hpp depended on
ndb_global.h, which was not actually
installed, causing the compilation of programs that used
NdbApi.hpp to fail.
SET GLOBAL ndb_extra_logging caused
mysqld to crash.
A race condition caused by a failure in epoll handling could cause data nodes to fail. (Bug #36537)
Under certain rare circumstances, the failure of the new master node while attempting a node takeover would cause takeover errors to repeat without being resolved. (Bug #36199, Bug #36246, Bug #36247, Bug #36276)
When more than one SQL node connected to the cluster at the same
time, creation of the
failed on one of them with an explicit Table
exists error, which was not necessary.
mysqld failed to start after running mysql_upgrade. (Bug #35708)
Notification of a cascading master node failure could sometimes
not be transmitted correctly (that is, transmission of the
NF_COMPLETEREP signal could fail), leading to
transactions hanging and timing out
(NDB error 4012), scans hanging,
and failure of the management server process.
NDB error 1427 (Api node
died, when SUB_START_REQ reached node) was
incorrectly classified as a schema error rather than a temporary
If an API node disconnected and then reconnected during Start
Phase 8, then the connection could be
“blocked”—that is, the
QMGR kernel block failed to detect that the
API node was in fact connected to the cluster, causing issues
NDB Subscription Manager
Accessing the debug version of
dlopen() resulted in a segmentation
Attempting to pass a nonexistent column name to the
NdbOperation caused NDB API
applications to crash. Now the column name is checked, and an
error is returned in the event that the column is not found.
mysqld_safe now traps Signal 13
(SIGPIPE) so that this signal no longer kills
the MySQL server process.
Node or system restarts could fail due to an uninitialized variable
in the DTUP kernel block. This issue was
found in MySQL Cluster NDB 6.3.11.
It was not possible to determine the value used for the
--ndb-cluster-connection-pool option in the
mysql client. Now this value is reported as a
system status variable.
The ndb_waiter utility wrongly calculated timeouts. (Bug #35435)
ndb_restore incorrectly handled some data types when applying log files from backups. (Bug #35343)
In some circumstances, a stopped data node was handled incorrectly, leading to redo log space being exhausted following an initial restart of the node, or an initial or partial restart of the cluster (the wrong GCI might be used in such cases). This could happen, for example, when a node was stopped following the creation of a new table, but before a new LCP could be executed. (Bug #35241)
SELECT ... LIKE ... queries yielded incorrect
results when used on
NDB tables. As
part of this fix, condition pushdown of such queries has been
disabled; re-enabling it is expected to be done as part of a
later, permanent fix for this issue.
ndb_mgmd reported errors to
STDOUT rather than to
Nested Multi-Range Read scans failed when the second Multi-Range Read released the first read's unprocessed operations, sometimes leading to an SQL node crash. (Bug #35137)
In some situations, a problem with synchronizing checkpoints between nodes could cause a system restart or a node restart to fail with Error 630 during restore of TX. (Bug #34756)
References: This bug is a regression of Bug #34033.
A node failure during an initial node restart followed by another node start could cause the master data node to fail, because it incorrectly gave the node permission to start even if the invalidated node's LCP was still running. (Bug #34702)
When a secondary index on a
DECIMAL column was used to
retrieve data from an
NDB table, no
results were returned even if the target table had a matched
value in the column that was defined with the secondary index.
If a data node in one node group was placed in the “not
started” state (using
), it was not possible to stop a data node in a
different node group.
NDBCLUSTER test failures
occurred in builds compiled using icc on IA64
If a START BACKUP command was issued while
ndb_restore was running, the backup being
restored could be overwritten.
Cluster API: Closing a scan before it was executed caused the application to segfault. (Bug #36375)
Using NDB API applications from older MySQL Cluster versions
with libndbclient from newer ones caused the
cluster to fail.
Cluster API: Some ordered index scans could return tuples out of order. (Bug #35908)
Cluster API: Scans having no bounds set were handled incorrectly. (Bug #35876)
which was inadvertently removed in MySQL Cluster NDB 6.3.11, has
Due to the reduction of the number of local checkpoints from 3 to 2 in MySQL Cluster NDB 6.3.8, a data node using ndbd from MySQL Cluster NDB 6.3.8 or later started using a file system from an earlier version could incorrectly invalidate local checkpoints too early during the startup process, causing the node to fail. (Bug #34596)
Cluster failures could sometimes occur when performing more than
three parallel takeovers during node restarts or system
restarts. This affected MySQL Cluster NDB
x releases only.
Upgrades of a cluster using while a
DataMemory setting in
excess of 16 GB caused data nodes to fail.
In certain rare circumstances, a race condition could occur between an aborted insert and a delete, leading to a data node crash. (Bug #34260)
Multi-table updates using ordered indexes during handling of node failures could cause other data nodes to fail. (Bug #34216)
When configured with
MySQL failed to compile using gcc 4.3 on
64-bit FreeBSD systems.
The failure of a DDL statement could sometimes lead to node failures when attempting to execute subsequent DDL statements. (Bug #34160)
When configured with
MySQL failed to compile on 64-bit FreeBSD systems.
Statements executing multiple inserts performed poorly on
NDB tables having
The ndb_waiter utility polled ndb_mgmd excessively when obtaining the status of cluster data nodes. (Bug #32025)
References: See also Bug #32023.
Transaction atomicity was sometimes not preserved between reads and inserts under high loads. (Bug #31477)
Having tables with a great many columns could cause Cluster backups to fail. (Bug #30172)
Disk Data; Cluster Replication:
Statements violating unique keys on Disk Data tables (such as
attempting to insert
NULL into a
NULL column) could cause data nodes to fail. When the
statement was executed from the binary log, this could also
result in failure of the slave cluster.
Disk Data: Updating in-memory columns of one or more rows of a Disk Data table, followed by deletion of these rows and re-insertion of them, caused data node failures. (Bug #33619)
Functionality Added or Changed
Important Change; Cluster API:
memory page sizes in bytes rather than kilobytes, it has been
page_size_bytes. The name
page_size_kb is now deprecated and thus
subject to removal in a future release, although it currently
remains supported for reasons of backward compatibility. See
Ndb_logevent_type Type, for more information
OPTIMIZE TABLE can now be
interrupted. This can be done, for example, by killing the SQL
thread performing the
Now only 2 local checkpoints are stored, rather than 3 as in previous MySQL Cluster versions. This lowers disk space requirements and reduces the size and number of redo log files needed.
The mysqld option
--ndb-batch-size has been added. This enables
control of the size of batches used for running transactions.
Node recovery can now be done in parallel, rather than sequentially, which can result in much faster recovery times.
NDB tables can now
be controlled using the session variables
NDB tables not to be checkpointed
has the same effect; in addition, when
ndb_table_temporary is used, no
NDB table schema files are created.
ndb_restore now supports basic
attribute promotion; that is, data from a
column of a given type can be restored to a column using a
“larger” type. For example, Cluster backup data
taken from a
SMALLINT column can
be restored to a
For more information, see ndb_restore — Restore a MySQL Cluster Backup.
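The notion of restoring to a "larger" type can be pictured as a range check: promotion is lossless when every value of the source type also fits in the target type. The sketch below is an illustration only and is not how ndb_restore is implemented. The table of signed integer ranges and the helper name `can_promote` are assumptions introduced here, and the changelog does not state which type pairs ndb_restore actually accepts.

```python
# Signed value ranges for the standard MySQL integer types.
# Illustration only; not taken from the ndb_restore source code.
SIGNED_RANGES = {
    "TINYINT": (-2**7, 2**7 - 1),
    "SMALLINT": (-2**15, 2**15 - 1),
    "MEDIUMINT": (-2**23, 2**23 - 1),
    "INT": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}


def can_promote(source_type, target_type):
    """Return True if every value of source_type fits in target_type.

    Hypothetical helper: it only models the idea of a "larger" type.
    """
    src_lo, src_hi = SIGNED_RANGES[source_type]
    dst_lo, dst_hi = SIGNED_RANGES[target_type]
    return dst_lo <= src_lo and src_hi <= dst_hi


if __name__ == "__main__":
    print(can_promote("SMALLINT", "INT"))   # widening: no value is lost
    print(can_promote("INT", "SMALLINT"))   # narrowing: values may not fit
```

Under this model, restoring in the opposite direction (from a wider type to a narrower one) is rejected because some stored values could not be represented.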
Important Change; Disk Data:
It is no longer possible on 32-bit systems to issue statements
appearing to create Disk Data log files or data files greater
than 4 GB in size. (Trying to create log files or data files
larger than 4 GB on 32-bit systems led to unrecoverable data
node failures; such statements now fail with
NDB error 1515.)
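A deployment script could check a requested file size against this limit before issuing the statement. The following is a minimal sketch under stated assumptions: the function name `exceeds_32bit_limit` is hypothetical, and "4 GB" is taken to mean 4 × 2^30 bytes, which the changelog does not spell out.

```python
# Assumption: "4 GB" in the changelog means 4 * 2**30 bytes.
FOUR_GB_BYTES = 4 * 1024**3


def exceeds_32bit_limit(size_bytes):
    """Return True if a Disk Data file of this size is greater than 4 GB.

    Hypothetical pre-check; the server itself reports NDB error 1515
    for such statements on 32-bit systems.
    """
    return size_bytes > FOUR_GB_BYTES


if __name__ == "__main__":
    print(exceeds_32bit_limit(2 * 1024**3))      # a 2 GB file
    print(exceeds_32bit_limit(FOUR_GB_BYTES + 1))  # just over the limit
```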
Replication: The code implementing heartbeats did not check for possible errors in some circumstances; this kept the dump thread hanging in its loop waiting for heartbeats even though the slave was no longer connected. (Bug #33332)
High numbers of insert operations, delete operations, or both
NDB error 899
(Rowid already allocated) to occur
A periodic failure to flush the send buffer by the
NDB TCP transporter could cause an
unnecessary delay of 10 ms between operations.
DROP TABLE did not free all data
memory. This bug was observed in MySQL Cluster NDB 6.3.7 only.
A race condition could occur (very rarely) when the release of a GCI was followed by a data node failure. (Bug #33793)
Some tuple scans caused the wrong memory page to be accessed, leading to invalid results. This issue could affect both in-memory and Disk Data tables. (Bug #33739)
A failure to initialize an internal variable led to sporadic crashes during cluster testing. (Bug #33715)
The server failed to properly reject the creation of an
NDB table having an unindexed
The Cluster backup process could not detect when there was no more disk space and instead continued to run until killed manually. Now the backup fails with an appropriate error when disk space is exhausted. (Bug #28647)
It was possible in
config.ini to define
cluster nodes having node IDs greater than the maximum permitted
Under some circumstances, a recovering data node did not use its own data, instead copying data from another node even when this was not required. This in effect bypassed the optimized node recovery protocol and caused recovery times to be unnecessarily long. (Bug #26913)
Transactions containing inserts or reads would hang during
made from NDB API applications built against a MySQL Cluster
version that did not support micro-GCPs accessing a later
version that supported micro-GCPs. This issue was observed while
upgrading from MySQL Cluster NDB 6.1.23 to MySQL Cluster NDB
6.2.10 when the API application built against the earlier
version attempted to access a data node already running the
later version, even after disabling micro-GCPs by setting
When reading a
BIT(64) value using
bytes were written to the buffer rather than the expected 8
Functionality Added or Changed
Compressed local checkpoints and backups are now supported,
resulting in space savings of 50% or more over uncompressed
LCPs and backups. Compression of these can be enabled in the
config.ini file using the two new data node
It is now possible to cause statements occurring within the same
transaction to be run as a batch by setting the session variable
To use this feature,
autocommit must be disabled.
Only in-memory tables are supported.
OPTIMIZE still has no effect on Disk Data
Only variable-length columns are supported. However, you can
force columns defined using fixed-length data types to be
dynamic using the
COLUMN_FORMAT option with a
CREATE TABLE or
ALTER TABLE statement.
The performance of
NDB tables can be regulated by
adjusting the value of the
ndb_optimization_delay system variable.
When partition pruning on an
table resulted in an ordered index scan spanning only one
partition, any descending flag for the scan was wrongly
ORDER BY DESC to be
ORDER BY ASC,
MAX() to be handled incorrectly, and similar
When all data and SQL nodes in the cluster were shut down
abnormally (that is, other than by using
in the cluster management client), ndb_mgm
used excessive amounts of CPU.
When using micro-GCPs, if a node failed while preparing for a global checkpoint, the master node would use the wrong GCI. (Bug #32922)
Under some conditions, performing an
TABLE on an
table failed with a Table is full error,
even when only 25% of
DataMemory was in use
and the result should have been a table using less memory (for
example, changing a
VARCHAR(100) column to
Functionality Added or Changed
The output of the ndb_mgm client
now indicates when the cluster is in single user mode.
Unnecessary reads when performing a primary key or unique key update have been reduced, and in some cases, eliminated. (It is almost never necessary to read a record prior to an update, the lone exception to this being when a primary key is updated, since this requires a delete followed by an insert, which must be prepared by reading the record.) Depending on the number of primary key and unique key lookups that are performed per transaction, this can yield a considerable improvement in performance.
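The distinction drawn above can be pictured with a toy in-memory table that counts row reads. This is a sketch for illustration only and does not reflect the NDB kernel's implementation; the class `ToyTable` and its method names are assumptions introduced here.

```python
class ToyTable:
    """Toy table keyed by primary key that counts how often a row is read."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def insert(self, pk, values):
        self.rows[pk] = dict(values)

    def update_columns(self, pk, changes):
        # Non-key update: the new values are written directly,
        # so no prior read of the row is counted.
        self.rows[pk].update(changes)

    def update_primary_key(self, old_pk, new_pk):
        # Key update: modelled as a read of the existing row,
        # followed by a delete and an insert under the new key.
        self.reads += 1
        values = self.rows[old_pk]
        del self.rows[old_pk]
        self.rows[new_pk] = values


if __name__ == "__main__":
    table = ToyTable()
    table.insert(1, {"b": 2})
    table.update_columns(1, {"b": 5})
    print(table.reads)   # reads counted after a non-key update
    table.update_primary_key(1, 7)
    print(table.reads)   # reads counted after a key update
```

In this model only the key update increments the read counter, which mirrors the single exception described in the entry.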
In a cluster running in diskless mode and with arbitration disabled, the failure of a data node during an insert operation caused other data nodes to fail. (Bug #31980)
An insert or update with combined range and equality constraints
failed when run against an
table with the error Got unknown error from
NDB. An example of such a statement would be
UPDATE t1 SET b = 5 WHERE a IN (7,8) OR a >=
An error with an
if statement in
sql/ha_ndbcluster.cc could potentially lead
to an infinite loop in case of failure when working with
AUTO_INCREMENT columns in
NDB storage engine code was not
safe for strict-alias optimization in gcc
ndb_restore displayed incorrect backup file version information. This meant (for example) that, when attempting to restore a backup made from a MySQL 5.1.22 cluster to a MySQL Cluster NDB 6.3.3 cluster, the restore process failed with the error Restore program older than backup version. Not supported. Use new restore program. (Bug #31723)
Following an upgrade, ndb_mgmd failed with an ArbitrationError. (Bug #31690)
NDB management client command
provided no output when
node_id was the node ID of a
management or API node. Now, when this occurs, the management
client responds with
node_id: is not a data
after a data node had been shut down could lead to inconsistent
data following a restart of the node.
UPDATE IGNORE could sometimes fail on
NDB tables due to the use of
uninitialized data when checking for duplicate keys to be ignored.
Functionality Added or Changed
option for mysqld now permits a wider range
of values and corresponding behaviors for SQL nodes when
selecting a transaction coordinator.
You should be aware that the default value and behavior as well
as the value type used for this option have changed, and that
you may need to update the setting used for this option in your
my.cnf file prior to upgrading
Server System Variables, for more information.
It was possible in some cases for a node group to be “lost” due to missed local checkpoints following a system restart. (Bug #31525)
NDB tables having names containing
nonalphanumeric characters (such as
$”) were not discovered
A node failure during a local checkpoint could lead to a subsequent failure of the cluster during a system restart. (Bug #31257)
A cluster restart could sometimes fail due to an issue with table IDs. (Bug #30975)
Transaction timeouts were not handled well in some circumstances, leading to an excessive number of transactions being aborted unnecessarily. (Bug #30379)
In some cases, the cluster management server logged entries multiple times following a restart of ndb_mgmd. (Bug #29565)
--help did not
display any information about the
An interpreted program of sufficient size and complexity could cause all cluster data nodes to shut down due to buffer overruns. (Bug #29390)
The cluster log was formatted inconsistently and contained extraneous newline characters. (Bug #25064)
Functionality Added or Changed
NDB error codes to MySQL
storage engine error codes has been improved.
Attempting to restore a backup made on a cluster host using one byte order (endianness) to a machine using the other byte order could cause the cluster to fail. (Bug #29674)
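Why byte order matters for binary backup data can be shown with Python's standard `struct` module: the same four bytes yield different integers depending on the byte order assumed when reading them. This is a generic illustration, not code from ndb_restore; the helper name `reinterpret_uint32` is an assumption introduced here.

```python
import struct


def reinterpret_uint32(value, written_as, read_as):
    """Pack a 32-bit unsigned integer with one byte order, unpack with another.

    written_as and read_as are struct byte-order prefixes:
    "<" for little-endian, ">" for big-endian.
    """
    raw = struct.pack(written_as + "I", value)
    return struct.unpack(read_as + "I", raw)[0]


if __name__ == "__main__":
    # Same byte order on both sides: the value round-trips unchanged.
    print(reinterpret_uint32(1, "<", "<"))   # prints 1
    # Written little-endian, read big-endian: the bytes are reversed.
    print(reinterpret_uint32(1, "<", ">"))   # prints 16777216
```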
The description of the
Functionality Added or Changed
It is now possible to control whether fixed-width or
variable-width storage is used for a given column of an
NDB table by means of the
COLUMN_FORMAT specifier as part of the
column's definition in a
It is also possible to control whether a given column of an
NDB table is stored in memory or on
disk, using the
STORAGE specifier as part of
the column's definition in a
For permitted values and other information about
see CREATE TABLE Syntax.
A new cluster management server startup option
--bind-address makes it
possible to restrict management client connections to
ndb_mgmd to a single host and port. For more
ndb_mgmd — The MySQL Cluster Management Server Daemon.
DROP INDEX operations
can now be performed explicitly for
NDB tables—that is, without
copying or locking of the affected tables—using
ALTER ONLINE TABLE. Indexes can also be
created and dropped online using
respectively, using the
You can force operations that would otherwise be performed
online to be done offline using the
Renaming of tables and columns for
tables is performed in place without table copying.
NDB event was left behind
but the corresponding table was later recreated and received a
new table ID, the event could not be dropped.
When creating an
NDB table with a column that has
COLUMN_FORMAT = DYNAMIC, but the table itself uses
ROW_FORMAT=FIXED, the table is
considered dynamic, but any columns for which the row format is
unspecified default to
FIXED. Now in such
cases the server issues the warning Row format FIXED
incompatible with dynamic attribute
An insufficiently descriptive and potentially misleading Error 4006 (Connect failure - out of connection objects...) was produced when either of the following two conditions occurred:
There were no more transaction records in the transaction coordinator
NDB object in the NDB API
was initialized with insufficient parallelism
Separate error messages are now generated for each of these two cases. (Bug #11313)
Functionality Added or Changed
Reporting functionality has been significantly enhanced in this release:
A new configuration parameter
now makes it possible to cause the management client to
provide status reports at regular intervals as well as for
such reports to be written to the cluster log (depending on
cluster event logging levels).
A new REPORT command has been added in
the cluster management client.
BackupStatus enables you to obtain a backup status
report at any time during a backup.
MemoryUsage reports the current data memory and
index memory used by each data node. For more about the
REPORT command, see
Commands in the MySQL Cluster Management Client.
ndb_restore now provides running reports of its progress when restoring a backup. In addition, a complete status report on the backup is written to the cluster log.