MySQL Blog Archive
Offline mode error improvement

Problem

Your MySQL server can end up in offline (maintenance) mode, and suddenly your client connections start to get rejected with a generic error ER_SERVER_OFFLINE_MODE: “The server is currently in offline mode”.

There can be a number of reasons why this happens. For example, an admin may have set the offline_mode system variable to put the server into this mode, or the mode may have been activated because of an issue with a Group Replication node, or perhaps because of low disk space in a cloud instance.

Finding the root cause of this issue can be a pain: an admin user has to connect to the machine and sift through the error log searching for possible clues. Even then, the error log might not help if, for example, the plugin or component that put the server into offline mode did not add a clear log entry for this action. All you may see is that your server is suddenly running in a degraded mode, with no indication why.

We recognize the cause of this issue needs to be communicated better. So we've added a way for the server to attach a reason as to why the offline mode was set or what user account made the change.

The improvement described below landed in MySQL 9.0. The mechanism works regardless of the MySQL protocol used by the client (classic or X protocol). It is available in MySQL Community and Enterprise Editions, and HeatWave.

Solution

The improvement adds a generic option to attach key/value attributes to a global system variable when its value is changed.

For this particular use case, the MySQL code was instrumented to attach a “reason” attribute to the “offline_mode” system variable each time the server itself puts the system into offline mode. The “reason” value is a piece of text describing why the system was put into offline mode, and it is used to enrich the error message with additional information.
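To make the idea of attributes attached to a variable change more concrete, here is a minimal Python sketch. It is only a toy model: the class name GlobalVariable, its fields, and the rule that every change replaces the attributes of the previous one are assumptions made for illustration, not the server's actual data structures.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class GlobalVariable:
    """Toy stand-in for a global system variable (hypothetical, not server code)."""
    name: str
    value: str
    attributes: Dict[str, str] = field(default_factory=dict)
    set_by: Optional[str] = None          # account that made the last change, if any
    set_time: Optional[datetime] = None   # when the last change happened

    def set(self, value, user=None, attributes=None, now=None):
        # Assumption: each change replaces the attributes of the previous change,
        # so a stale "reason" never outlives the change that produced it.
        self.value = value
        self.set_by = user
        self.set_time = now or datetime.now()
        self.attributes = dict(attributes or {})


# Server-side code (for example a replication plugin) attaches a reason.
offline_mode = GlobalVariable(name="offline_mode", value="OFF")
offline_mode.set("ON", attributes={"reason": "(GR) autorejoin failed"})
```

In this model, a later change made without attributes, such as one issued by a user over SQL, leaves the variable with a user name and timestamp but no “reason”.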

With these changes in place, the server now chooses between three possible errors when it rejects a connection because of offline mode. The choice depends on whether a “reason” attribute exists for the “offline_mode” system variable, and on whether the user name and timestamp of the last change to “offline_mode” are available.

The “reason” attribute is not attached when the system variable is modified via SQL (SET GLOBAL offline_mode = ON).

In short, the server now reports the most informative error available, in the following order of preference:

  • ER_SERVER_OFFLINE_MODE_REASON (when “reason” attribute exists), example text:

    The server is currently in offline mode since 2024-03-18 07:47:36.000581, reason: (GR) autorejoin failed

  • ER_SERVER_OFFLINE_MODE_USER (no reason given, but user name available), example text:

    The server is currently in offline mode since 2024-03-18 08:38:29.146141, set by user db_admin

  • ER_SERVER_OFFLINE_MODE (legacy error, no contextual information available), example:

    The server is currently in offline mode
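The order of preference above can be sketched as a small Python function. This is an illustrative model rather than the server's implementation: the function name offline_mode_error and its parameters are hypothetical, and the requirement that a timestamp be present for both enriched errors is an assumption read from the example messages.

```python
from datetime import datetime
from typing import Optional, Tuple

BASE_TEXT = "The server is currently in offline mode"


def offline_mode_error(since: Optional[datetime] = None,
                       reason: Optional[str] = None,
                       user: Optional[str] = None) -> Tuple[str, str]:
    """Return (error symbol, message text) for a rejected connection."""
    # Most informative first: a "reason" attribute wins over a user name.
    if since is not None and reason:
        return ("ER_SERVER_OFFLINE_MODE_REASON",
                f"{BASE_TEXT} since {since}, reason: {reason}")
    if since is not None and user:
        return ("ER_SERVER_OFFLINE_MODE_USER",
                f"{BASE_TEXT} since {since}, set by user {user}")
    # Legacy error: no contextual information is available.
    return ("ER_SERVER_OFFLINE_MODE", BASE_TEXT)
```

For example, calling offline_mode_error(datetime(2024, 3, 18, 8, 38, 29, 146141), user="db_admin") yields the ER_SERVER_OFFLINE_MODE_USER variant, while calling it with no arguments falls back to the legacy error.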

Examples

Example #1: ER_SERVER_OFFLINE_MODE_USER error caused by setting global system variable “offline_mode” to ON.

Example #2: ER_SERVER_OFFLINE_MODE error caused by starting server process with --offline_mode=ON command line parameter.

Example #3: ER_SERVER_OFFLINE_MODE_REASON error caused by using test component functions to set the global system variable “offline_mode” to ON, and set its “reason” attribute to some string value.
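On the client side, the three message variants can be told apart and their details extracted for logging or alerting. The Python sketch below assumes the message formats are exactly those quoted in this post; the function name parse_offline_error is hypothetical, and the real server text may differ between versions, so matching on the error code reported by your connector is likely to be more robust where available.

```python
import re
from typing import Dict, Optional

# Pattern built from the example messages quoted in this post (an assumption).
_OFFLINE_RE = re.compile(
    r"^The server is currently in offline mode"
    r"(?: since (?P<since>[0-9-]+ [0-9:.]+)"
    r"(?:, reason: (?P<reason>.*)|, set by user (?P<user>.*)))?$"
)


def parse_offline_error(message: str) -> Optional[Dict[str, Optional[str]]]:
    """Extract 'since', 'reason' and 'user' from an offline mode error message.

    Returns None when the message is not an offline mode error at all.
    Fields missing from the message are returned as None.
    """
    match = _OFFLINE_RE.match(message.strip())
    if match is None:
        return None
    return match.groupdict()


info = parse_offline_error(
    "The server is currently in offline mode since "
    "2024-03-18 07:47:36.000581, reason: (GR) autorejoin failed"
)
```

A monitoring script could, for instance, page the on-call admin only when the “reason” field is populated, and simply note the user name when the change was made deliberately by an account.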

Conclusion

Understanding why the server ended up in offline mode can help users troubleshoot connectivity problems more quickly and reduce the load on support teams.

The implemented solution is a generic mechanism for adding arbitrary attributes to any system variable, and it can be put to other uses in the future, observability for example. Component developers interested in using the mechanism for their own purposes can find the details in the MySQL documentation.

The improvement is available in MySQL Community and Enterprise Editions, and HeatWave.

Thank you for using MySQL!