The first version of MySQL Cluster 7.4 has now been released on MySQL Labs. Note that labs loads are not suitable for production use (in fact they’re even less mature than Development Milestone Releases); their purpose is to give users a chance to see what’s in the works, try it for themselves and then provide feedback. Having read that, if you’d like to try it out, then download MySQL Cluster 7.4 from MySQL Labs.
The focus of this first Cluster 7.4 load is performance and data node restart times.
Performance
MySQL Cluster was designed from the outset to be a distributed, in-memory database and has been deployed that way for many years (it’s interesting to see that the idea of in-memory databases has now really come into vogue, with excitement around new arrivals on the scene such as Hekaton). Not surprisingly, when people are considering MySQL Cluster, performance and scalability are key requirements (High Availability is another), so performance improvements are a focus of every release, and MySQL Cluster 7.4 is no exception.
The graphs show what’s already been achieved, with the Read Only Sysbench benchmark showing a 47% increase in throughput and the Read/Write benchmark a 38% improvement. Even better improvements are seen when configuring the data nodes to use more threads. For those not familiar with Sysbench, you should realise that each of the transactions involves quite a lot of work: 10 Primary Key lookups and 5 different types of scans where we fetch 100 records (normal select through ordered index, followed by order by, group by and so forth).
Restart Times
While less glamorous than performance, the time taken for a data node to restart can make a huge difference to how easy it is to manage your cluster. As the size and activity of the database increase, the restart time for a single data node goes up. If you then multiply that time by the number of data nodes you have, maintenance activities can start to take longer than you’d like.
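To make that multiplication concrete, here is a minimal back-of-the-envelope sketch. The function name and the numbers (a 20-minute restart, 8 data nodes) are made up for illustration and are not measurements from this release; the `nodes_in_parallel` parameter is likewise an assumption, for anyone who restarts more than one node at a time.

```python
def rolling_restart_minutes(node_restart_minutes, data_nodes, nodes_in_parallel=1):
    """Rough wall-clock estimate for restarting every data node in turn.

    Hypothetical helper: assumes every node takes the same time to restart
    and that nodes are restarted in fixed-size batches.
    """
    if nodes_in_parallel < 1:
        raise ValueError("nodes_in_parallel must be at least 1")
    # Ceiling division: a final, partly filled batch still costs a full restart.
    batches = -(-data_nodes // nodes_in_parallel)
    return batches * node_restart_minutes


# Illustrative numbers only: 8 data nodes, 20 minutes per restart.
one_at_a_time = rolling_restart_minutes(20, 8)
two_at_a_time = rolling_restart_minutes(20, 8, nodes_in_parallel=2)
print(f"one at a time: {one_at_a_time} min, two at a time: {two_at_a_time} min")
```

Under these assumptions, any reduction in the per-node restart time is multiplied across the whole maintenance window, which is why shaving time off a single restart matters.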
This first MySQL Cluster 7.4 labs release makes some significant improvements to restart times, mostly by allowing more of the work to be done in parallel.