MySQL :: MySQL Replication Defaults After 5.7

The default or “out of the box” experience is simply the experience for most people, so it’s critically important for us to ensure that the default configuration offers a good experience for the vast majority of cases. This is always a challenge, due to the matrix of factors involved–hardware configurations, software stacks, application use cases, etc.–but it’s most certainly a worthy endeavor. Morgan Tocker and I spent a significant amount of time towards improving the Server defaults in MySQL 5.7, and we are continuing that effort moving forward.

The overall goals of this effort here are:

Simplicity — things should simply work well without any unnecessary tuning for most users.
Interoperability — MySQL replication should work well with other MySQL products (Utilities, Fabric, Group Replication, etc.) without additional re-configuration.
Security and durability — the default behavior should choose data integrity, consistency, and access security over raw performance and other factors.
Supportability — while the above helps to eliminate many potential issues, we should also provide additional information that makes it easier to debug problems.

I wanted to lay out my initial thoughts for improving the Replication specific defaults after 5.7 in order to get community input and feedback. I can’t overstate how valuable that input is for us! So here they are:

log-bin — Binary logging should be enabled by default, as it’s critical for MySQL Replication, point-in-time recovery, and integration with other data processing components commonly used today (e.g. Kafka, Impala, and Spark).
binlog-row-image = MINIMAL — ROW based binary logging is now the default in MySQL 5.7 and later. With a minimal row image you limit the potential overhead impacts of this change on disk I/O, disk space, and network I/O. Minimal images meet the basic requirements of MySQL replication and point-in-time recovery, so it’s a sensible default.
binlog-rows-query-log-events = ON — This option makes it far easier for a DBA to understand what’s being replicated, thus alleviating one of the common pain points in moving from STATEMENT to ROW based replication.
log-bin-trust-function-creators = TRUE — This setting eliminates the restrictions placed on stored programs when binary logging is enabled.
gtid-mode = ON — Global Transaction IDs are required for modern replication environments — Fabric, Group Replication, etc. — and they generally make replication simpler and more robust. Now that we’ve improved the behavior and upgrade path in MySQL 5.7, we should make this the default replication mode moving forward.
enforce-gtid-consistency = ON — This setting prevents the execution of statements that could unintentionally cause data drift within your replication groups. This is logically and technically required when using GTID based replication.
expire-logs-days = 90 — This setting simply places a default limit on the lifespan of binary logs. 90 days is a fairly safe minimum limit (removal happens at startup and when the binary log is flushed). This can prevent excessive amounts of disk space being wasted on binary logs that are no longer needed for replication or recovery purposes.
max-binlog-size = 1G — This setting places a reasonable approximate limit (transaction boundaries are respected and not split between logs) on the maximum size of each individual binary log. This works in conjunction with --expire-logs-days to limit excessive and unnecessary disk use, while also preventing potential issues with larger single files that have existed on some filesystems (although uncommon today).
master-info-repository = TABLE — This setting offers ACID guarantees and MVCC characteristics for binary log metadata. This is required for crash-safe replication and for reliable replication generally. We want the default behavior to prefer correctness and safety over raw performance, and this is one important example.
relay-log-info-repository = TABLE — This setting offers the same benefits on the slave side that I noted for the master in the above --master-info-repository paragraph. We want crash-safe replication to be the default. For those willing to sacrifice correctness, consistency, or reliability for raw performance, the options to do that still exist. Again, we want to reverse the legacy MySQL tendency to favor performance to correctness, thus getting rid of many historical MySQL “gotchas”.
relay-log-recovery = ON — This option ensures replication consistency after a crash on the slave side by ensuring that no possibly corrupted relay logs are processed. This is yet another part of ensuring crash-safe replication by default.
log-slave-updates = ON — This option ensures correct behavior in various replication chain setups, which have become the norm today.
slave-exec-mode = IDEMPOTENT — This setting ensures that the slave is in a consistent state with the master when the transaction completes, while at the same time preventing unnecessary stoppages of the replication stream. In essence, it ignores errors caused by cases where the data diverged prior to transaction execution, as long as the data is consistent with the master after the transaction executes, then we proceed (e.g. the exact key and row already existed on the slave at trx start time).
slave-parallel-type = LOGICAL_CLOCK — This setting offers the optimal concurrency method for parallel slave execution. As “slave lag” is one of the more common issues that MySQL DBAs have to deal with in production environments, the default configuration should limit that issue as much as possible.
slave-parallel-workers = 8 — This setting, combined with LOGICAL_CLOCK based parallelism, will offer a default behavior that eliminates most cases of temporary replication divergence due to the asynchronous application of data changes (“slave lag”).
slave-preserve-commit-order = ON — For multi-threaded slaves, this option ensures that transactions are externalized on the slave in the same order as they appear in the slave’s relay log (it has no effect on slaves without multi-threading enabled) and it ensures that the slave never enters a state that the master was not also in. In effect, this makes slaves well suited for read scale-out, which is the norm today with MySQL. It also prevents gaps and other potential inconsistencies that can otherwise occur with multi-threaded execution.
slave-rows-search-algorithms = 'INDEX_SCAN,HASH_SCAN' — This change–adding HASH_SCAN as an optional algorithm–simply offers the best possible performance when applying transactions on tables that have no viable index for uniquely identifying rows when applying ROW based events (tables without a primary key or other unique index). With the HASH_SCAN algorithm we can still try and avoid doing a full-table scan in such cases and instead search for the applicable row(s) using generated hashes.
slave-type-conversions = ALL_NON_LOSSY — This setting allows slave execution to proceed as long as data consistency can be retained. This avoids unnecessary stoppages of the replication stream (what we care about is that the data is consistent in the end).
sync-master-info = 1000 — Lowering the default for this setting from 10,000 to 1,000 limits the discrepancy between internal binary log data structures and those exposed via the mysql.slave_master_info system table. For monitoring the status, you can continue to use the legacy SHOW MASTER STATUS command.
sync-relay-log = 1000 — Lowering the default for this setting from 10,000 to 1,000 limits the number of events that we may lose if the slave crashes, thus decreasing the recovery time when --relay-log-recovery is enabled.
plugin-load = group_replication.so — The specifics remain uncertain here, but the goal is that MySQL Group Replication is ready for use in any new MySQL Server instance.

Please let us know what you think! Feel free to add comments directly here, shoot me an email, or start a discussion with me on Twitter. THANK YOU for using MySQL!