WL#11926: GR: IPv6 support
Affects: Server-8.0
—
Status: Complete
EXECUTIVE SUMMARY
=================

This worklog implements IPv6 support for MySQL Group Replication. After this
worklog, the user will be able to fully deploy Group Replication not only on
an IPv4 network but also on an IPv6 network.

USER/DEV STORIES
================

As a system administrator, I want to deploy an IPv6 network while running
MySQL Group Replication at the same time, so I can make use of IPv6 features.

SCOPE
=====

The scope of this worklog is to make XCom support IPv6.

REFERENCES
==========

- Add IPv6 support for Group Replication
  https://bugs.mysql.com/bug.php?id=90217

PROTOTYPE
=========

As part of the effort to estimate how much work this task would take, we came
up with a working prototype for GCS running on IPv6. It runs the simple_xcom
example over IPv6. It is located in the branch mysql-trunk-xcom-ipv6.

Considerations about the prototype can be found at
https://confluence.oraclecorp.com/confluence/pages/viewpage.action?pageId=727464494

FR1: GCS/XCom must support IPv6 as a valid addressing protocol.
FR2: GCS/XCom must continue to support IPv4 as a valid addressing protocol.
FR3: A node must allow its local_node_address to be an IPv6 address.
FR4: A node must allow its list of peers to be a list of IPv6 addresses.
FR5: A node must allow its list of peers to be a list of mixed IPv4 and IPv6
     addresses.
FR6: A node must allow its whitelist to be configured using IPv6 addresses.
FR7: A node must allow its whitelist to be configured using IPv4 and IPv6
     addresses.
FR8: When a node is configured with IPv6, it must show its new address type
     in Performance Schema tables.
FR9: If a node with this feature implemented wants to join a group that does
     not have this feature implemented, it must enter the group presenting
     itself with a local IPv4 address.
FR10: If a node does not respect FR9, an error must be thrown when joining.
FR11: If a node without this feature implemented wants to join a group in
      which there are nodes with this feature implemented, then all group
      members must present themselves with an IPv4 address configured.
FR12: If FR11 is not respected, the seed node must reject the node entering
      the group.
FR13: group_replication_force_members must support IPv6 as an input.

NFR1: There must not be any performance regression due to the usage of IPv6.
NFR2: None of the existing IPv4 functionality should be affected.

1. Introduction
===============

As modern networks grow in size, IPv6 is finally taking its place even in
internal professional networks, as a replacement for the old and depleted
IPv4.

XCom, and consequently GCS, has only been built around IPv4 networking, with
all its limitations, such as:
- All internal references to addresses and their parsing consider only v4
  addresses;
- Whitelisting only considers v4 addresses as input;
- Hardware queries only consider v4 networks and interfaces (ioctl);
- All low-level network layer code only creates socket structures for v4;
- Client code only considers v4 servers.

The goal of this WL is to eliminate all of these limitations and allow XCom
to be an IPv6 dual-stacked application, supporting both client and server
IPv6 and v4 connections.

The next chapters describe in detail which code areas need to be changed in
order to fully support IPv6.

2. IPv6 Address Storage
=======================

Both in GCS and in XCom, addresses are stored as IPv4 string literals, with a
specific interpretation of the format "IP:PORT". String literal inputs of
that type are used for:
- Member identification, which goes up to Group Replication;
- Member configuration, both the local address and member seeds;
- XCom server identification, for the sake of consensus;
- XCom server addressing, for the sake of physical connections.

Literal address configuration will continue to be a reality, but with some
challenges, such as:
- Different address formats
- Different parsing rules

IPv4 addresses are known for their XXX.XXX.XXX.XXX format, each block
representing 1 byte, commonly followed by a port. An example is the classic
localhost: "127.0.0.1:12345".

IPv6 has a longer 128-bit format in which the blocks are separated by colons.
The generic format is 8 blocks, such as
XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX. In order to have a port here, the
standard recommends the usage of square brackets.

String literals fed into XCom will need to have this format:
[2606:b400:8f0:80:8000::705]:12345

In this process, one will have to check if the input is a V4 or V6 address.
We can simplify this by checking for the existence of square brackets in the
input, for IP:PORT values, and for the existence of colons in pure IP
entries, such as the whitelist, leaving the validation to methods such as
getaddrinfo. Last in line will be the failure from the connect itself if the
address cannot be parsed. One can augment the error messages, by processing
the returned error code, to report that an incorrect address was used to
connect to a remote member.

3. IPv6 physical interface retrieval
====================================

XCom uses physical interfaces for two reasons:
- Whitelisting, in order to automatically add private addresses to the
  whitelist;
- Node identification, since one only adds nodes that match existing physical
  addresses.

Currently, one uses the legacy method ioctl to accomplish the task of
interface retrieval, but ioctl only supports legacy IPv4 information. More
modern implementations advise the use of getifaddrs, which conveys the same
information as ioctl but is able to retrieve both IPv4 and IPv6 addresses.
That method is only available on *Nix. On Windows, it is advisable to use
GetAdaptersInfo and GetAdaptersAddresses, which have an output similar to
that of getifaddrs.

4. Low-level network code
=========================

All the network code that opens sockets, receives connections and does name
resolution is not ready to cope with IPv6. Common issues are:
- Most of the code is hardcoded to use sockaddr_in, which is a v4 structure;
- Socket creation is made with IPv4 only, both for clients and for servers.

To overcome the first limitation, one must start using the generic
getaddrinfo, in order to obtain a struct sockaddr of the correct type that
can be used in a generic fashion in all socket library methods.

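As an illustration of the parsing and validation approach described above
(square brackets to detect an IPv6 IP:PORT value, getaddrinfo to validate the
literal), here is a minimal sketch. The helper names split_host_port and
literal_family are hypothetical and are not part of XCom; the sketch only
handles numeric literals, not host names that need DNS resolution.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: split "IP:PORT" or "[IPv6]:PORT" into host and port.
   Returns 0 on success, -1 on malformed input or too-small buffers. */
static int split_host_port(const char *in, char *host, size_t host_len,
                           char *port, size_t port_len) {
  const char *host_begin = in, *host_end, *colon;
  size_t hl, pl;
  assert(in != NULL && host != NULL && port != NULL);
  if (in[0] == '[') { /* square brackets signal an IPv6 literal */
    host_begin = in + 1;
    host_end = strchr(host_begin, ']');
    if (host_end == NULL || host_end[1] != ':') return -1;
    colon = host_end + 1;
  } else { /* IPv4 or name: exactly one colon, separating the port */
    colon = strchr(in, ':');
    if (colon == NULL || strchr(colon + 1, ':') != NULL) return -1;
    host_end = colon;
  }
  hl = (size_t)(host_end - host_begin);
  pl = strlen(colon + 1);
  if (hl == 0 || hl >= host_len || pl == 0 || pl >= port_len) return -1;
  memcpy(host, host_begin, hl);
  host[hl] = '\0';
  memcpy(port, colon + 1, pl + 1); /* copies the terminating NUL as well */
  return 0;
}

/* Hypothetical helper: let getaddrinfo validate a numeric address literal.
   Returns AF_INET, AF_INET6, or -1 if the literal is not a valid address. */
static int literal_family(const char *host, const char *port) {
  struct addrinfo hints, *res = NULL;
  int family;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC; /* accept either address family */
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV; /* no DNS lookups */
  if (getaddrinfo(host, port, &hints, &res) != 0 || res == NULL) return -1;
  family = res->ai_family;
  freeaddrinfo(res);
  return family;
}
```

Because the result of getaddrinfo already carries the address family and a
generic struct sockaddr, the caller does not need separate v4 and v6 code
paths after this point.
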
Overcoming the last limitation will depend on the way one decides to
implement v6. If we go for a dual-stack approach, in which we only expose v6
and all v4 peers appear as translated addresses, one needs to separate the
creation of client and server sockets:
- Server sockets would be created with the v6 type, but we would need to set
  dual-stack mode via a socket option;
- Client sockets would need to distinguish if the input address is v4 or v6
  and act accordingly, OR always use v6 with v4-translated addresses.

5. Whitelisting
===============

Whitelisting has several challenges regarding configuration, since we will
now need to store 128-bit addresses and be aware of longer SIDs. IPv6
interfaces also have their particularities, such as the notion of link-only
(link-local) and global addresses.

In case of AUTOMATIC setup, we need to retrieve the correct hardware
configuration. This case is covered in the section above with the usage of
getifaddrs.

Finally, one needs to extend the current octet comparison, which is limited
to 4 bytes.

6. Addresses in GR
==================

In GR, there are 3 items where we configure addresses:
- Local Node address
- Whitelist
- Group Seeds

All of them are addresses, but some of them have both physical and logical
attributes that are not user-visible.

The local address has two purposes:
- Uniquely identify a member in the group;
- Serve as the address that other group members will use to contact us back,
  after an add_node request is accepted.

The whitelist is purely physical. It is triggered when one receives a
physical connection from other nodes. As such, no local address is at play
here, only physical counterparts.

Group Seeds are also purely physical. A seed is an address to which we can
send an add_node request. Since GR/GCS/XCom binds to all addresses in the
host, one can send the request to any available address.

6.1 Practical implications
==========================

Having said the above, when one adds a member to the group, it will contact a
seed node in order to send an add_node request. Consider node A, already in
the group, and node B attempting to join:

1. B will create a physical connection to A, using the seed address
   configured in B's seed list;
2. A receives a physical connection from B and checks if B is allowed to
   connect, following the permissions configured in the whitelist.
   2.1 If the physical address of B used to connect to A is not in A's
       whitelist, we will reject the connection.
   2.2 If it belongs to the whitelist, the physical connection is allowed to
       continue.
3. B sends an add_node request to A, which contains the Local Address of B.
4. A receives the add_node from B and runs a series of checks to see if B is
   allowed to join the group.
   4.1 If B is rejected, it receives a REQUEST_FAIL answer.
5. A proposes B to be added to the group.
6. B will then receive physical connections from all group members,
   including A.
7. When B receives a physical connection from A, it will run step 2 of this
   algorithm.

Considering the steps above, we see no issues in the following scenarios:
- IPv4 only
- IPv6 only
- Mixed IPv4 and IPv6 with old IPv4 binaries,
since they will all talk to each other using pure IPv4 or pure IPv6 clients
and servers.

With this WL implemented, there is an issue that emphasizes the separation
between what is logical and what is physical.

Let's use the following example:

Node A:
  NIC 1:
    10.10.172.123
    2606:b400:8b0:40:3d9c:cc43:e006:19e4

Node B:
  NIC 1:
    10.10.172.124
    2606:b400:8b0:40:3d9c:cc43:e006:19e8

Node A configuration
  Bootstrap     = YES
  Local Address = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  Seeds         = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  WhiteList     = 10.10.172/24, 2606:b400:8b0:40:3d9c:cc43:e006:19e4

Node B configuration
  Bootstrap     = NO
  Local Address = 10.10.172.124
  Seeds         = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  WhiteList     = 10.10.172/24, 2606:b400:8b0:40:3d9c:cc43:e006:19e4

Node A will boot the group. Then node B will try to join the group and it
will fail. The question is: why?

Node B will try to contact node A at its IPv6 address. As such, Node B will
use an IPv6 connection. When the connection arrives on the other side, Node A
will run step 2 of the join algorithm. The address that it will see is the
IPv6 address of Node B, and that address is not configured in Node A's
whitelist.

6.2 Correct configurations in a mixed scenario using IPv6 capable binaries
==========================================================================

If one wants to maintain a mixed scenario as described above, we need to take
into consideration that:
- The protocol in which the seed is configured is the protocol that we will
  use to create the connection;
- Whitelist verification will use the address that is used to create the
  connection.

The corollary is that the group whitelists need to include not only the
logical local addresses but also the physical addresses of each participating
node. We also need to remember that this must be reciprocal: A needs to have
B in its whitelist and vice versa.

A correct configuration of the scenario above will be:

Node A:
  NIC 1:
    10.10.172.123
    2606:b400:8b0:40:3d9c:cc43:e006:19e4

Node B:
  NIC 1:
    10.10.172.124
    2606:b400:8b0:40:3d9c:cc43:e006:19e8

Node A configuration
  Bootstrap     = YES
  Local Address = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  Seeds         = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  WhiteList     = 10.10.172/24, 2606:b400:8b0:40:3d9c:cc43:e006:19e4,
                  2606:b400:8b0:40:3d9c:cc43:e006:19e8

Node B configuration
  Bootstrap     = NO
  Local Address = 10.10.172.124
  Seeds         = 2606:b400:8b0:40:3d9c:cc43:e006:19e4
  WhiteList     = 10.10.172/24, 2606:b400:8b0:40:3d9c:cc43:e006:19e4

Note that we added Node B's IPv6 address to Node A's whitelist. Adding a
range would also do the trick.

7. General Code Refactoring
===========================

XCom network code is spread across the client code, the server code and the
whitelist code. This WL must take the chance to unify socket and sockaddr
creation, to avoid having duplicated code that accomplishes the same task. If
possible, also refactor the headers to have well-defined interfaces.

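To make the range-based whitelist entry from section 6.2 and the wider octet
comparison from section 5 more concrete, here is a minimal sketch of a prefix
match that works for both 4-byte and 16-byte addresses. The helper names
prefix_match and ipv6_in_range are hypothetical, and the /64 range used in
the usage note is an assumption made only for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare the first prefix_bits bits of two octet
   arrays of len bytes (4 for IPv4, 16 for IPv6). Returns 1 on match. */
static int prefix_match(const unsigned char *addr, const unsigned char *net,
                        size_t len, unsigned prefix_bits) {
  size_t full = prefix_bits / 8; /* whole octets covered by the prefix */
  unsigned rest = prefix_bits % 8; /* leftover bits in the next octet */
  unsigned char mask;
  assert(addr != NULL && net != NULL);
  if (prefix_bits > len * 8) return 0;
  if (memcmp(addr, net, full) != 0) return 0;
  if (rest == 0) return 1;
  mask = (unsigned char)(0xFFu << (8 - rest));
  return (addr[full] & mask) == (net[full] & mask);
}

/* Hypothetical helper: check a textual IPv6 address against an IPv6 range. */
static int ipv6_in_range(const char *addr, const char *net, unsigned bits) {
  struct in6_addr a, n;
  if (inet_pton(AF_INET6, addr, &a) != 1) return 0;
  if (inet_pton(AF_INET6, net, &n) != 1) return 0;
  return prefix_match(a.s6_addr, n.s6_addr, sizeof(a.s6_addr), bits);
}
```

With such a check, a whitelist entry like 2606:b400:8b0:40::/64 on Node A
would accept the connection arriving from Node B's IPv6 address
2606:b400:8b0:40:3d9c:cc43:e006:19e8, whereas the single-address entry for
Node A's own address would not.
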
1. Introduction
===============

This section describes in detail which code will change, which new methods
will be used and, finally, which code will be refactored.

The following sections will describe:
- How and when to change parsing for IPs;
- Implementation of a new way to retrieve physical interfaces;
- Replacing legacy structures with getaddrinfo;
- Dual-Stack: how and where to implement;
- Whitelist augmentation.

2. Address input and parsing
============================

From GCS, we have direct IP:PORT input from group_replication_local_address
and group_replication_group_seeds. We also have inputs in IP format from the
whitelist, but those will be considered in another section.

From now on, one will need to accept both IPv4 and IPv6 addresses. The
correct way to accept those addresses is to use the same notation that is
recommended for browser URLs, which is: [IPv6]:PORT. IPv4 will remain the
same.

IP parsing will start by checking whether we are in the presence of an IPv4
or an IPv6 address, by detecting the existence of square brackets in the
address. The validity of the address will then be checked by running it
through getaddrinfo.

The parsing will happen at GCS level in:
- Gcs_xcom_node_address class, which decomposes a string into IP and PORT
- is_valid_hostname

And at XCom level in:
- int end_token
- char *get_name
- xcom_port get_port

3. Physical address retrieval
=============================

ioctl is not an option when it comes to IPv6 physical interface address
retrieval. In GCS/XCom, this is used in two cases:
- Whitelist configuration;
- To determine one's node index when adding a new member to the group.

Currently this is done in the triptych:

              sock_probe.c
              /          \
             /            \
            V              V
  sock_probe_ix.c    sock_probe_win32.c

sock_probe.c includes either sock_probe_ix.c or sock_probe_win32.c, depending
on the platform where the code is built.

Both files implement their version of the methods:
- static int init_sock_probe(sock_probe *s)
- static void close_sock_probe(sock_probe *s)
- static int number_of_interfaces(sock_probe *s)
- static bool_t is_if_running(sock_probe *s, int count)
- static sockaddr get_sockaddr(sock_probe *s, int count,
                               struct sockaddr **out)

The current implementation retrieves interfaces from ioctl and creates an
index on top of them. getifaddrs simplifies this task since it returns a
linked list of all existing interfaces. As such, one just needs to replace
the current sock_probe content with a reference to that linked list. Note
that one needs to be careful to consider as valid only the interfaces that
belong to the AF_INET (IPv4) and AF_INET6 (IPv6) families.

On Windows, one must migrate the current solution, which is based on
WSAIoctl, to a more modern version using GetAdaptersAddresses, which
retrieves all adapter addresses for all address families. It works the same
way as getifaddrs.

4. Usage of getaddrinfo instead of raw structures
=================================================

Most of the raw network code uses sockaddr_in structures. Some notable
examples are all the client code within XCom, both in the synchronous client
methods and in the dial() and connect() methods used to connect back to
joining nodes.

That code is tied to the usage of IPv4 and, in order to make it generic, one
must change it to use checked_getaddrinfo when possible, since the Socket API
methods can use the returned structures directly without the need for casting
back and forth between "struct sockaddr" and "struct sockaddr_in".

An example is:

[snip]
  struct addrinfo *addr = 0;
  char buffer[20];
  sprintf(buffer, "%d", port);

  checked_getaddrinfo(server, buffer, 0, &addr);
  if (addr == 0) {
    return 0;
  }

  /* Connect socket to address */
  SET_OS_ERR(0);
  if (timed_connect(fd.val, addr->ai_addr, addr->ai_addrlen) == -1) {
[/snip]

The code becomes much cleaner, since checked_getaddrinfo fills all necessary
fields in a generic struct sockaddr.

What needs to be taken care of is that the return of getaddrinfo is a linked
list of addresses. This means that, if we are resolving a name, we need to
check whether it returns both V4 and V6 versions of the same name.

We need to take into consideration:
- The Upgrade and Downgrade scenario, described in chapter 7;
- The dual-stack ability of the new code.

The wise approach is to always default to V4, since it is the omnipresent
protocol in both old and new nodes. As such, if a node is configured in DNS
with both V4 and V6, the address to be used will always be the V4 address.

If one wants to use exclusively IPv6 with name configurations, name
resolution for those names should always point to the V6 address. If needed,
one can always create a new parameter to choose the default name resolution:
either v4 or v6.

5. Dual-Stack
=============

With this modification, one will support both IPv4 and IPv6 connections, as
MySQL does. There are two ways to implement this:
- Have two sockets bound, one in V6 and another in V4;
- Use what MySQL uses, which is kernel support for dual stacking. This works
  by creating an IPv6 server socket and setting a socket option on it via
  setsockopt.

An example follows:

  int sock = socket(AF_INET6, SOCK_STREAM, 0);
  int mode = 0;
  setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY, (char*)&mode, sizeof(mode));

This needs to be done when creating the server socket, which is done in:
  result announce_tcp(xcom_port port);

It will allow the application to have only one open socket, but receive both
IPv4 and IPv6 connections. The only caveat of this approach is that, when
converting IPv4 addresses to text, they will be represented as IPv4-mapped
addresses, which have the first 80 bits set to zero, followed by the next 16
bits set to all ones and, finally, the last 32 bits written in dotted decimal
appended to the end, forming a 128-bit IPv6 address.

For example, the IPv4 Class A address 12.155.166.101 would look like this as
an IPv4-mapped address:
  0000:0000:0000:0000:0000:FFFF:12.155.166.101
or
  ::FFFF:12.155.166.101
in IPv6's short form.

As MySQL server does, we need to accept both formats as inputs in the
parameters:
- ::FFFF:12.155.166.101
- 12.155.166.101

For physical storage in the "struct server", we shall not store the mapped
version, since there is no need to maintain it.

6. Whitelist augmentation
=========================

Whitelisting in GCS/XCom has two moments where adding IPv6 becomes relevant:
- Configuration
- Runtime

When configuring the whitelist:

- bool Gcs_ip_whitelist::configure(const std::string &the_list)

  This is the entry method where the input string from the whitelist is split
  into several strings. One needs to add support for splitting IPv6 addresses
  and detect if localhost is configured. If not, we must add both the IPv4
  and the IPv6 localhost address.

- bool Gcs_ip_whitelist::add_address(std::string addr, std::string mask)

  Indirectly, add_address uses Gcs_ip_whitelist_entry derivatives, which are
  Gcs_ip_whitelist_entry_ip and Gcs_ip_whitelist_entry_hostname.
  Gcs_ip_whitelist_entry_ip uses bool get_address_for_whitelist, which needs
  to be checked for IPv6 support.

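As an illustration of the two points above (accepting both
::FFFF:12.155.166.101 and 12.155.166.101, and converting whitelist entries
into octets for either family), here is a minimal sketch. The helper name
whitelist_octets is hypothetical and is not the actual
get_address_for_whitelist implementation; the family is guessed from the
presence of a colon, as suggested for pure IP entries.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: convert a textual whitelist entry into raw octets.
   Writes 4 octets for IPv4 (including IPv4-mapped IPv6 input) or 16 octets
   for IPv6 into out, which must hold at least 16 bytes.
   Returns the number of octets written, or 0 if the text is not an address. */
static size_t whitelist_octets(const char *text, unsigned char *out) {
  struct in_addr v4;
  struct in6_addr v6;
  assert(text != NULL && out != NULL);
  if (strchr(text, ':') == NULL) { /* no colon: treat the entry as IPv4 */
    if (inet_pton(AF_INET, text, &v4) != 1) return 0;
    memcpy(out, &v4, 4);
    return 4;
  }
  if (inet_pton(AF_INET6, text, &v6) != 1) return 0;
  if (IN6_IS_ADDR_V4MAPPED(&v6)) { /* ::FFFF:a.b.c.d: keep only a.b.c.d */
    memcpy(out, &v6.s6_addr[12], 4);
    return 4;
  }
  memcpy(out, v6.s6_addr, 16);
  return 16;
}
```

Storing the unmapped 4-octet form means that an IPv4 peer seen through a
dual-stack socket and the same peer configured in dotted decimal compare as
equal.
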
When using the whitelist at runtime:

- bool Gcs_ip_whitelist::do_check_block_whitelist

  This method already uses octet blocks to compare entries in the whitelist.
  One must ensure that the input is generic and that it is able to compare
  either v4 or v6 addresses.

- bool Gcs_ip_whitelist::do_check_block_xcom

  This method compares the new entry with the existing group. It also needs
  to take into account the new entries regarding IPv6 addresses.

The whitelist also supports the AUTOMATIC feature, in which GCS automatically
fills the whitelist field with private addresses. For more detail on that,
please refer to WL#9345.

In this WL, we need to augment this in order to add IPv6 private addresses,
which are of 3 types:
- Localhost, ::1
- Link-only (link-local) addresses, which start with fe80::/10
- IPv6 reserved private addresses, which start with fc00::/7

For more detail on this subject, please refer to the standards in:
- https://tools.ietf.org/html/rfc4193
- https://tools.ietf.org/html/rfc5156#page-2

For general knowledge on IPv6 addressing, please refer to
https://tools.ietf.org/html/rfc4291

7. Upgrade/Downgrade
====================

7.1 Upgrade
===========

When upgrading the group, one cannot join the group with an IPv6 address,
since old nodes won't be able to contact you back. You need to join the group
with an IPv4 address and, when all members are up-to-date, start switching
the local addresses to the desired IPv6 address.

To avoid that situation and have a meaningful error in the joiner node, one
should bump the XCom protocol version and validate that one is using a
correctly configured address when contacting a lower version that does not
support V6 connections.

Since this will happen at the add_node client request level, we should
consider adding this check in the function:

  static int64_t xcom_send_client_app_data(connection_descriptor *fd,
                                           app_data_ptr a, int force)

7.2 Downgrade
=============

When downgrading the group, we have the same issue as in Upgrade. Old nodes
can't speak IPv6 and, even if they are reachable via IPv4, the new
configuration that they receive will contain IPv6 addresses in string format,
which they do not know how to interpret.

Having said that, before starting a group downgrade, all nodes must have
their local addresses reconfigured to IPv4. After that, one can start joining
older nodes that do not support IPv6 to the group.

8. Security
===========

There are no security considerations regarding adding IPv6.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.