MySQL Blog Archive
For the latest blogs go to blogs.oracle.com/mysql
Planning the defaults for MySQL 5.8

In the MySQL team we pay close attention to the default configuration of MySQL, and aim for users to have the best out of the box experience possible.

Following on from a series of blog posts for MySQL 5.7, we are now planning the defaults for MySQL 5.8. We are looking for feedback from you, our users, to share insights on what can be improved.

An Introduction

Being the default means that these settings should appeal to the largest group of users possible. This creates a constraint where values are often the lowest common denominator. The general guidelines for new defaults is as follows:

  1. Must work on virtual machines and cloud instances out of the box. Currently this stands at 64-bit Linux with 512M memory and swap disabled.
  2. Follow the principle of least surprise. This means that if a new option is introduced to potentially lose data (at increased performance) it should be off by default.
  3. No change for the sake of change. That is to say that some changes may not be suitable if the upgrade effort is disproportionate to the value brought.
  4. The test-suite should pass. For some changes, this can result in significant work behind the scenes.

Areas of Interest

We are looking at making improvements to the following areas. I have included in italics what feedback would be most useful. Please leave a comment, or get in touch! We would love to hear from you.

Character Set and Collation

We are considering changing the default character set to utf8mb4. Modern applications frequently store 4 byte characters, as emoji input is common for mobile devices. This change has some impact. For some backstory:

  • In MySQL 5.7 we introduced a new type of internal on-disk InnoDB table that is used by the optimizer for storing intermediate results which do not fit within the bounds of tmp_table_size (which currently use the MEMORY/HEAP engine). We hope to further enhance the performance of internal InnoDB tables so that they may also replace the current use of the MEMORY/HEAP engine, which does not support variable length columns, and thus may under-perform and over-consume when using multi-byte character sets.
  • In MySQL 5.7 we changed the default row format to DYNAMIC, which supports longer indexes than earlier row formats (previous: 767 bytes now: up to 3072 bytes).
  • The DYNAMIC (and COMPACT) row format will use variable length storage for multi-byte characters for both data and indexes, leading to no regression in storage on disk.
  • As of MySQL 5.7, the sort buffer will also pack variable length columns.
  • As the character set is saved on a per column, table, and schema basis (and preserved by mysqldump etc), MySQL upgrades will work smoothly with the user continuing to use their existing character set. New schemas will by default use the new default character set.

In addition to utf8mb4, we are also considering switching the default collation to be utf8mb4_unicode_520_ci. We plan to add collations to include per-language collation rules as we currently offer with latin1 based collations. They will have a more modern version of the UCA (Unicode Collation Algorithm) than the
utf8mb4_unicode_520_ci.

We invite feedback on this change.

Event Scheduler

We see opportunities to expand the SYS schema to include functionality such as idle transaction timeout. Enabling the event_scheduler by default is a stepping stone in the process. We are also excited to see what opportunities this will have for third party applications, for example Yahoo recently announced a Partition Manager based on events.

We invite feedback on this change.

Misc Session Buffers

With our goal being to fit within 512M of RAM out of of the box, we are looking to take a conservative approach on increasing per-session buffers. Having said that, there are a few settings which we have been received feedback on:

  • group_concat_max_len (currently 1K) – This buffer is the maximum size of GROUP_CONCAT() elements before truncation.
  • join_buffer_size (currently 256K) – The minimum size of the buffer that is used for plain index scans, range index scans, and joins that do not use indexes and thus perform full table scans. Also used by the BKA optimization (which is disabled by default).
  • read_buffer_size (currently 128K) – This buffer is used by MyISAM as well as all other engines.
  • read_rnd_buffer_size (currently 256K) – This buffer is used by MyISAM and the MRR optimization introduced by MySQL 5.6.
  • sort_buffer_size (currently 256K) – In MySQL 5.7, we introduced an optimization to pack variable length columns inside the sort buffer. We are interested to see if this changes recommendations for configuring the sort buffer.
  • max_allowed_packet (currently 4M) – This is the per-connection maximum size of an SQL query or row returned. The value was last increased in MySQL 5.6.

We invite feedback on new values for these settings.

Security

Our goal is to make sure that when you download and install MySQL, it is secure by default. This means that you should not require any configuration changes in order to deploy in production, and you must opt out to have less secure options. We are proposing the following changes:

  • symbolic_links – Many of our configuration files for MySQL packages (as well as those that ship with Linux distributions) currently default to disabling symbolic links. We feel that changing a de facto default (disabling symbolic links using configuration files) to a compiled default will improve user experience.
  • local_infile – This setting controls whether LOCAL is supported for LOAD DATA INFILE statements. We are proposing that this be disabled by default.
  • default_authentication_plugin – We would like to move towards sha256_password being the default authentication plugin, as the current SHA1 hash used by mysql_native_password has known vulnerabilities that are projected to be exploitable in the near future. This has the attached constraint that sha256_password requires TLS connections or RSA encrypting the password in transit.

We invite feedback on new values for these settings.

Slow Query Log

The default long query time of 10 seconds is more than two orders of magnitude larger than what we would expect most queries would take in a modern web application (100us-100ms). We are considering lowering long_query_time to be in a range closer to one order of magnitude longer than typical query execution time. We would also like to consider changing the values of log_slow_admin_statements and log_slow_slave_statements.

We would be interested to hear how you configure the slow query log. Since the slow query log is not enabled by default, we do not run the risk of filling a volume accidentally, and users who enable the slow query log will benefit from more optimal settings.

InnoDB

We have identified the following settings in InnoDB as candidates for a change in default:

  • innodb_autoinc_lock_mode (currently 1) – With row-based replication now the default in MySQL 5.7, we would like to also change the default value of innodb_autoinc_lock_mode to 2. This has the advantage of auto increment allocation being concurrent for multi-insert statements (by default it is only concurrent for single insert statements).
  • innodb_log_file_size (currently 48M) – With the InnoDB buffer pool now dynamic, we see an opportunity to increase the default log file so that a running system can be re-configured dynamically to perform better. We are hesitant to make the log files too large, as it increases the first-start time, which is important for our Docker efforts.
  • innodb_print_all_deadlocks (currently OFF) – Since MySQL 5.6 we have the option to print all deadlocks to the error log, which can be useful to retroactively investigate locking problems. To date, we have not enabled this option by default, because on a high-throughput system some level of deadlocks is natural, and logging all has the ability to fill logs very quickly. We are seeking opinions from the community as to which is the better default.
  • innodb_sort_buffer_size (currently 1M) – The InnoDB sort buffer is used by InnoDB during the creation of indexes. Larger values can improve index creation performance.
  • innodb_stats_persistent_sample_pages (currently 20) – In MySQL 5.6 we introduced a new persistent version of InnoDB sampling statistics (enabling it default). As part of this change, we also increased the default sampling from 8 to 20 pages, and have seen a good result in improving query plan stability. We have seen customer cases where increasing the sampling to 100-200 has been beneficial, and are considering increasing the default.

We invite feedback on new values for these settings.

InnoDB IO Configuration

Our current defaults are conservative in their assumptions around IO performance. With SSDs being prevalent on even low-end cloud machines, we may consider a set of defaults which assume faster IO performance. This will result in a reversal where users with slower IO (hard drives) will now need to make changes in order to perform better. The specific settings are:

  • innodb_flush_method – InnoDB supports direct IO via either O_DIRECT or O_DIRECT_NO_FSYNC. The benefit of doing this, is that the buffer pool pages will not be double buffered in the fileystem’s cache. We think this is a good default for fast IO systems, but may be worse in slow IO systems where the filesystem cache has benefit. As part of this change, we would also like to switch Windows to use unbuffered IO.
  • innodb_flush_neighbors – InnoDB by default will attempt to merge writes to adjacent pages as part of its background flushing operations. This optimization works very well for slow IO devices (such as hard drives), but is not always useful on fast IO devices. We are considering changing the default to be OFF.
  • innodb_page_cleaners – In some workloads the default of one page cleaner is unable to keep up with the background work of flushing dirty pages from the buffer pool. With fast IO devices benefitting from writes occurring in parallel, we are considering a larger default value.

We invite feedback on whether you agree the defaults should now assume fast IO devices. Users with slower IO performance can of course restore previous defaults.

Timeouts

We are considering changing some of the default values for various timeouts, many of which are very long for today’s web applications. The settings we invite feedback on are:

We will likely need to be conservative in making changes to timeouts, but would be interested in hearing what values our users are using for these settings.

Misc

The following settings do not fit under a specific category, but have been previously raised as suggestions from our users:

  • lower_case_table_names (currently 0) – MySQL is currently case-sensitive on filesystems that are case-sensitive, and insensitive on those that are not. It has been suggested that being case-insensitive on all may provide a more consistent cross platform experience.
  • max_connect_errors (currently 100) – If more than this many successive connection requests from a host are interrupted without a successful connection, the server blocks that host from further connections.
  • max_digest_length (currently 1024) – We use the max_digest_length in a number of places in the server; such as the MySQL Firewall and statement rewrite. We have heard feedback from a number of users that the default is too low and leads to statement truncation. Changing the setting to a higher value is also inconvenient as it is not dynamic. We would like to increase default max_digest_length to a value which covers the maximum length for most typical queries.
  • thread_cache_size (autosized) – In MySQL 5.6 we changed the thread cache to autosize by default. We’ve heard feedback from users that they are still changing this setting, and are inviting feedback as to whether the autosize algorithm could be improved.

We invite feedback on new values for these settings.

Replication

Configuration changes to replication are covered in this dedicated post on the MySQL High Availability blog.

Conclusion

This post is our first in what will likely be many more to follow. We look forward to your feedback, and thank you for helping make MySQL better!