WL#7293: Make sure that when 32-bit connection-IDs roll over, still in-use IDs are not re-issued
Status: Complete
PROBLEM

Even when the server is compiled for 64-bit (and connection-IDs consistently had that type internally, which currently they don't), the wire protocol still only specifies 32 bits for the ID. Short of requiring all apps (including those our customers may not have the source for) to be switched to a 64-bit protocol all at once, support for the 32-bit IDs must remain for the foreseeable future.

The problem with this is that some customers have a sufficient number of connections for this 32-bit ID to roll over (exceed the data-type's number space and start at 1 again, much like in the Y2K bug). In most cases, the "low numbers" were used by connections that have long since disappeared, so those IDs could be "recycled" and issued again. However, there are some outliers, thread-IDs that may persist for a long time (such as replication threads or the event scheduler, perhaps monitoring), so we must prevent thread-IDs that are actually still in use from being re-issued to new threads (if two threads had the same ID, a KILL would be ambiguous).

Finally, there is a possible-but-unlikely race condition:

  1. thread A is created
  2. thread B saves A's connection-ID
  3. thread A eventually disconnects
  4. thread C is created, and gets the now-free ID thread A had
  5. thread B tries to KILL thread A, but instead KILLs C, as C now has the ID previously associated with A, and B is unaware that A went away and was "replaced"

PROPOSAL

REQUIRED: The server must be prevented from issuing a connection-ID that is currently in use.

* For minimum invasiveness in the code base and maximum compatibility with legacy clients, it is proposed that the ID space remain at 32 bits for the foreseeable future.
* The current C API (libmysql) and SQL functions will likewise remain available for backward compatibility.
* The server MUST prevent in-use IDs from being re-issued.

DEFERRED: As the aforementioned race condition is very unlikely to occur, the customer has confirmed they do not require this functionality, and even many operating systems do not protect against similar issues, handling the race condition shall be considered beyond the scope of this implementation. If another case is later opened to add such a safeguard, the following method is suggested for consideration (a sketch follows below, after FURTHER READING):

* The server SHOULD send compatible clients a tracker item containing a timestamp (time_t) on connect.
* Clients wishing to avoid the aforementioned race condition MAY use new syntax that submits both connection-ID and connect-timestamp, either as KILL <id> STARTED AT <timestamp> or the equivalent mysql_kill_safe(MYSQL *, id, timestamp).
* This will uniquely identify any current or historic connection throughout the lifetime of the MySQL server, even if a connection-ID has been re-issued.

INTERESTED PARTIES

Runtime, Support, Protocol-interested Group, Replication.

FURTHER READING

BUG#44167

User Documentation
==================
http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-5.html
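As a footnote to the DEFERRED proposal above, a minimal C++ sketch of how the server-side check for such a timestamp-qualified KILL might look. This is purely illustrative: Thd_registry, Connection, and kill_safe are hypothetical names, not actual server symbols, and std::mutex/std::map stand in for the server's own thread-list structures.

  #include <cstdint>
  #include <ctime>
  #include <map>
  #include <mutex>

  struct Connection {
    time_t connect_time;   // would be sent to the client as a tracker item
    // ...
  };

  class Thd_registry {
    std::mutex lock_;
    std::map<uint32_t, Connection *> by_id_;

  public:
    // Succeeds only if a connection matches BOTH the ID and the start
    // timestamp, i.e. refuses the request if the ID was re-issued after
    // the client sampled it.
    bool kill_safe(uint32_t id, time_t started_at) {
      std::lock_guard<std::mutex> guard(lock_);
      auto it = by_id_.find(id);
      if (it == by_id_.end() || it->second->connect_time != started_at)
        return false;   // unknown or re-issued ID: never KILL the wrong thread
      // ... signal the matching thread to terminate ...
      return true;
    }
  };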
Func-Req 1    In busy installations, more than 2^32 connections may be made, at which point the connection-ID will roll over and start at 0 once again. At this point, the implementation must necessarily re-issue connection-IDs that have been issued before:

Func-Req 1.1  The implementation must avoid re-issuing IDs that are actually still in use.

Func-Req 1.2  The implementation must address threads which have thread-IDs but are not actually connections in the strict sense, such as the event scheduler.

NFR 1         The update should clean up the code to the effect that the designated data-type for connection-IDs (my_thread_id) is used consistently throughout, and that format strings referring to such items are adjusted and have their arguments cast accordingly.
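To make NFR 1 concrete, here is a hypothetical logging call of the kind the clean-up targets (illustrative only, not an actual server call site). Since my_thread_id is ulong, its width differs between platforms; casting the argument to a fixed-width type lets a single format string serve everywhere:

  #include <cstdio>

  typedef unsigned long my_thread_id;   // as defined in my_pthread.h

  void log_connect(my_thread_id id) {
    // Cast so the format string is correct whether unsigned long is
    // 32 or 64 bits wide on this platform.
    fprintf(stderr, "connection %llu established\n",
            (unsigned long long) id);
  }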
Changes to interface specification: n/a

The selected design's only observable feature is that duplicates of connection-IDs in use at a given time are prevented from being issued. Keeping the protocol unchanged was a specific design goal, and there are no new configuration options or monitors: no changes to options, system variables, SQL, server errors, client errors, files, logs, monitoring, protocol, plugins, API, or information schema.
OTHER CONSIDERATIONS / BACKGROUND

This section discusses details of the current implementation in trunk, alternative solutions, and why they were not chosen.

"32-bit connection-ID externally"

The wire protocol currently reserves 4 bytes / 32 bits for connection-IDs. Connection-IDs are sent by the server in the handshake packet. This specifically means that the ID is sent exactly once, when the connection is established. (If the connection breaks and is re-established, either explicitly by the client or implicitly because auto-reconnect is enabled, this happens again, and we receive a new connection-ID. Auto-reconnect is a special consideration here, as it transparently changes the connection-ID; see below.) Connection-IDs are also used for temp-tables; the current code again works with 32 bits, even on 64-bit platforms. The binlog likewise uses 32 bits.

"64-bit connection-ID internally"

my_thread_id is defined as ulong in my_pthread.h, meaning that connection-IDs are nominally 32-bit on 32-bit platforms and 64-bit on 64-bit platforms. However, the my_thread_id type is not currently used everywhere it should be. On the SQL level, everything has been updated (SELECT CONNECTION_ID(), KILL, column type for connection-IDs in log tables) to cope with 64-bit wide IDs.

"64-bit everything" -- far future

64-bit IDs would push back the ID roll-over far enough for it to be considered "safe for all practical scenarios". Seeing as this would require consistent use of the correct data-type, extensive testing, an update of temp-table handling, an update of the protocol, etc., this is no short-term solution. (It would also arguably imply redefining the data-type so that connection-IDs are 64-bit even on 32-bit platforms.) Even so, unless this new 64-bit server would only accept connections from clients compiled against a matching 64-bit client library, the server would still have to support 32-bit clients. This would make the implementation more complicated and brittle: would 32-bit clients draw IDs from 1..2^32-1, either with the same roll-over issues as before or with one of the roll-over protections described here, while 64-bit clients draw safe IDs from 2^32..2^64-1? Seeing as 32-bit clients likely won't disappear soon, due to legacy clients (and new clients, if we don't force 64-bit IDs even on 32-bit clients), "64-bit everything" is non-trivial.

"Limitations of solutions utilizing the current SQL layer"

Since the SQL layer nominally supports 64 bits (but needs more testing), clients could in theory do

  SELECT CONNECTION_ID() INTO @x;
  ...
  KILL @x;

even for 64-bit IDs. If however an auto-reconnect happens during our "..." above, we'd once again be using a stale connection-ID. (Our SELECT returned the ID our connection had before it broke; after reconnect, a new ID was issued.) This could be alleviated with a 64-bit tracker item sent to compatible clients on connect, but this would once again raise the issue of different numberspaces and different handling for old and new clients. A workaround for having to handle 32-bit and 64-bit IDs differently would be updating the server like so:

- once mysqld in 64-bit exceeds the 32-bit numberspace, it will always send a wire-protocol ID of 0. It will always send the correct 64-bit ID to compatible clients as a tracker item.
- if mysqld receives a mysql_kill(0) or a KILL 0, it refuses the request.

This will prevent old clients from killing the wrong thread.
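A sketch of that workaround, shown only to make the idea concrete (it was not implemented, and the names are hypothetical):

  #include <cstdint>

  // Set once the server's 64-bit counter has exceeded the 32-bit numberspace.
  bool id_space_exceeded = false;

  // Value placed in the 4-byte handshake field for legacy clients; compatible
  // clients additionally receive the true 64-bit ID as a tracker item.
  uint32_t wire_protocol_id(uint64_t real_id) {
    if (id_space_exceeded)
      return 0;                      // sentinel: "ID not representable"
    return (uint32_t) real_id;
  }

  // Corresponding guard in the KILL handler: refuse mysql_kill(0) / KILL 0,
  // so a legacy client can never kill the wrong thread by accident.
  bool kill_allowed(uint64_t requested_id) {
    return requested_id != 0;
  }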
This combination should have the following effect:

  mysqld32: all clientlib and app combinations work as before.
  mysqld64: clientlib64 will always work. Legacy clientlib32 will work as before in smaller installations; in high-request settings, the API function mysql_kill() will be rejected once the 32-bit numberspace is exceeded.

As a business consideration, this may break legacy apps that a client may not have the option to easily recompile. If we allow a mixed mode -- 64-bit for clients that support it, with a sub-range 1..2^32-1 used by legacy clients that don't -- we're back where we started: we need to address the case of the 32-bit IDs rolling over. (But now we've got two problems: one thread-ID counter for 32-bit IDs, and one for 64-bit IDs.)

"Alternative protection against race condition"

(A footnote, in case addressing the race condition is reconsidered at a later date.) An alternative solution for the (deferred) issue of the race condition described above implemented a server-side-only model that would not make an ID available for re-issue immediately after its connection ends, but would reserve it for a configurable time (not entirely unlike SO_LINGER/TIME_WAIT and friends). The advantage of that model is that it has no need for the client side to use a current/updated library. The disadvantage is that it's more invasive on the server side, requiring memory for the block/linger-list and code for housekeeping of that list, with the least locking overhead possible. In contrast, the "shared solution" (updated server and updated client-lib co-operating if the client desires protection against the race condition) shows minimal overhead: memory-wise, the additional overhead per THD is sizeof(time_t); code-wise, the onus is partially on the client app (which has to use the new syntax). On balance, the shared solution would seem preferable because of its simpler locking and lower memory and CPU requirements, thanks to which it should scale much better. It furthermore has the advantage that the protection does not just cover a linger time, but the entire runtime of the server.

"ID space"

The assumption is that there will be hundreds or thousands of connections at any one time, in a numberspace of 2^32, i.e. more than 4,000 million. Even at capacity (when we actually have about max_connections connections), this is a sparse use of the ID space. Further, IDs increment until roll-over (at 2^32-1). It is assumed that, as a general rule, there will be few extremely long-lived connections (replication threads, perhaps monitoring, etc.). For the most part, connections will be made, used for a while, and then broken. It is therefore assumed that at the time of the roll-over / wrap-around, most connection-IDs will be in the top half of the numberspace (to say the very least), with only a few long-lived outliers deviating from this (e.g. a replication thread created at server start with, say, connection-ID 3, never disconnected, still holding that ID). The design was made with this scenario in mind, as well as safety of operation in other, non-optimum scenarios.

"ID selection"

Several methods to select IDs and guard against duplicates (i.e. to make sure that the new ID does not coincide with any of the currently in-use IDs) were considered. For instance, the ID space could be sub-partitioned, up to the case where each thread has its own "sub-space" that rolls over individually and specifically does not need a lock for protection from other threads (see the sketch below).
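A minimal sketch of that sub-partitioning idea, with hypothetical names, shown only to clarify the scheme being rejected next:

  #include <cstdint>

  // Every thread draws IDs only from its own fixed slice of the 32-bit
  // space, so no lock is needed; but each slice is only
  // UINT32_MAX / n_threads wide and therefore rolls over much sooner.
  struct Id_subspace {
    uint32_t base;   // first ID of this thread's slice
    uint32_t size;   // slice width, shrinking as the thread count grows
    uint32_t next;   // current position within the slice

    uint32_t get_id() {
      uint32_t id = base + next;
      next = (next + 1) % size;   // wraps within this slice only
      return id;
    }
  };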
Unfortunately, sub-partitioning works against us here in that we'd have to sub-divide the space equally, leading to a very quick roll-over within each sub-partition when max_connections is configured to large values.

A more realistic approach is to leverage the infrastructure already in place: in trunk, atomic primitives are used to create a new connection-ID, which is then inserted into a std::set (an ADT that enforces uniqueness and is usually implemented as a tree). We can maintain the speed and locking characteristics by simply changing the set so that it is sorted by ID. This would give us pseudocode along the lines of:

  thd->id= get_next_id();                    // atomic
  lock();
  while (thd_list->insert(thd) != SUCCESS)   // proposed ID still in use?
    thd->id= get_next_id();                  // draw the next candidate
  unlock();

As ID generation would remain unchanged (and atomic), and we'd still do one insertion into the thread-list, which would remain the same data-type, no grave changes in terms of memory consumption, behavior, or performance are expected. (A self-contained sketch of this scheme follows at the end of this subsection.)

* As a theoretical consideration, a pathological case could be artificially created where the "next ID" is at the beginning of a large range of in-use IDs, and a lot of connections are suddenly made at the same time, so all the new connections would start off with consecutive optimistic/prospective IDs which all happen to be in the in-use range that lies ahead. Even then, the first connection that manages to get the lock for the thread-list will traverse the in-use block and take the first free ID behind it. The other new connections can then benefit from the updated global counter and need to update their ID only once, rather than having to each scan the in-use range themselves. Even so, this case is highly academic in its assumptions: even with a very high value for max_connections, there is a natural trade-off in that if many connections are still in use, then not a lot of new connections could be made that "congest" ID-creation; and conversely, if many new connections are made, no large clusters or ranges of in-use IDs should be expected ahead (rather than behind) the current ID counter in the already sparse ID space.
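To make the chosen scheme concrete, a minimal, self-contained C++ sketch follows. The names (Connection_id_allocator, acquire, release) are illustrative rather than actual server symbols, and std::mutex stands in for the existing thread-list lock:

  #include <atomic>
  #include <cstdint>
  #include <mutex>
  #include <set>

  class Connection_id_allocator {
    std::atomic<uint32_t> next_id_{1};   // global counter; 0 stays reserved
    std::mutex lock_;                    // stands in for the thread-list lock
    std::set<uint32_t> in_use_;          // sorted by ID, enforces uniqueness

  public:
    uint32_t acquire() {
      // Draw an optimistic candidate atomically, as trunk already does;
      // unsigned arithmetic makes the counter wrap around naturally.
      uint32_t id = next_id_.fetch_add(1);
      std::lock_guard<std::mutex> guard(lock_);
      // insert().second is false while the candidate is still in use
      // (or reserved), so keep drawing until a free ID is found.
      while (id == 0 || !in_use_.insert(id).second)
        id = next_id_.fetch_add(1);
      return id;
    }

    void release(uint32_t id) {
      std::lock_guard<std::mutex> guard(lock_);
      in_use_.erase(id);   // the ID may now be re-issued safely
    }
  };

As in the proposal, the only lock taken is the one the thread-list insertion needs anyway; the counter itself remains lock-free.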