WL#11652: Support multiple addresses for the --bind-address command option

Affects: Server-8.0   —   Status: Complete

The MySQL server can be configured to listen on:
1) One (and only one) IPv4 interface and one (and only one) IPv6 interface
 or
2) All interfaces that are configured on the server host.
See
https://dev.mysql.com/doc/refman/5.7/en/server-options.html#option_mysqld_bind-address

If the server host has multiple network addresses at the OS level, there is a
need to configure the MySQL server to listen on a chosen set of them. This also
makes it possible to exclude addresses the server should *not* listen on.

A typical example is to have network segments split on parts like:

1. Business flow
2. Management and monitoring
3. Backup

With the current --bind-address option it is not possible to configure the
MySQL server to listen on, say, exactly two of these network addresses.

User's Use case.
================
This worklog was inspired by user requests. An example of one such request
is quoted below:
Currently, mysql seems to support only exactly one IP address to bind to
(besides binding to IN_ADDR_ANY that means all addresses).

This is a disadvantage in HA Cluster configurations, where a cluster
framework stops/starts stand-alone mysql instances (non-NDB) on
physical servers within a cluster.

Within such frameworks, you don't want to bind to IN_ADDR_ANY
but rather to specific IP addresses in order to allow the cluster framework
to switch two instances of an application onto the same cluster node
(with the same (standard) port number).

This is currently possible with mysql, but only for exactly one IP address
per mysql instance.

In situations where you need more than one IP address for an HA
instance, you either have to bind to IN_ADDR_ANY and somehow
make sure that no two mysql instances will run on any cluster node
at the same time, or find other means by which to provide the second
address (tcp forwarder, NAT on a firewall - whatever).

Both solutions are not ideal, so this feature request is for a configuration
option allowing more than one IP address to be specified.
 

General description of a way to satisfy user's Feature Request.
===============================================================
The suggestion is to support a comma-separated list of addresses on which the
MySQL server should listen for incoming connections. Since the system call
"bind" binds a socket either to one address or to all configured addresses of
a host, a separate socket needs to be created for each network address given in
this list. Connect performance is sensitive to this change, so it should be
implemented in a way that minimizes the penalty when only one address is given
to --bind-address.

A list of addresses cannot contain any of the wildcard values ('*', '::' and
'0.0.0.0'). The server should not start if any of the addresses given through
the --bind-address option is not available at OS level.

Worklog scope
=============
This worklog won't affect Xplugin or Group Replication including XCom
in any way.

References.
===========
http://bugs.mysql.com/bug.php?id=14979

Functional Requirements.
========================
FR.1 Server implementation must provide support for specifying multiple
addresses as a value for the command-line parameter --bind-address.

FR.1.2 The symbol ',' must be used as a delimiter.

Rationale for FR.1.2: Another symbol typically used as a delimiter is ';'.
Since the semicolon is interpreted as a special character by some command
interpreters (Unix shells, for example, treat it as a command terminator),
using it as a value separator would require enclosing the whole string value
in quotes. To avoid quoting and make the user's life easier, it makes sense
to use another symbol as the delimiter. A good candidate in this case is ','.

FR.2 When multiple values are specified for the option --bind-address, the
specially treated values '*', '::' and '0.0.0.0' must not be allowed as
individual values in the comma-separated list.

FR.2.1 A new error code, ER_INVALID_VALUE_OF_BIND_ADDRESSES, must be introduced
to report errors in parsing the multi-valued command-line option
--bind-address.

FR.2.2 An attempt to specify any of the specially treated values '*', '::' and
'0.0.0.0' in a multi-valued command-line option --bind-address must cause the
error ER_INVALID_VALUE_OF_BIND_ADDRESSES to be written to the error stream and
the server to stop.

FR.3 Several adjacent commas, or a value string that starts or ends with a
comma, must lead to a parse error for the multi-valued command-line option
--bind-address.

FR.3.1 If a parse error happens for the multi-valued command-line option
--bind-address, the error ER_INVALID_VALUE_OF_BIND_ADDRESSES must be written
to the error stream and the server must stop.
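
The checks required by FR.1.2, FR.2 and FR.3 can be illustrated with a small
standalone sketch. It is not the server's parsing code: the function name
parse_bind_addresses and its signature are hypothetical, a return value of
false stands in for reporting ER_INVALID_VALUE_OF_BIND_ADDRESSES, and
whitespace around commas is assumed not to need handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only. Splits a --bind-address value
// on ',' and applies the FR.2 and FR.3 checks. Returns false where the server
// would report ER_INVALID_VALUE_OF_BIND_ADDRESSES and refuse to start.
bool parse_bind_addresses(const std::string &value,
                          std::vector<std::string> *out) {
  assert(out != nullptr);
  out->clear();
  std::vector<std::string> items;
  std::size_t begin = 0;
  for (;;) {
    const std::size_t comma = value.find(',', begin);
    const std::size_t len =
        (comma == std::string::npos) ? std::string::npos : comma - begin;
    const std::string item = value.substr(begin, len);
    // An empty item means a leading, trailing, or doubled comma (FR.3).
    if (item.empty()) return false;
    items.push_back(item);
    if (comma == std::string::npos) break;
    begin = comma + 1;
  }
  // Wildcard values are rejected only inside a multi-valued list (FR.2).
  if (items.size() > 1) {
    for (const std::string &item : items) {
      if (item == "*" || item == "::" || item == "0.0.0.0") return false;
    }
  }
  out->swap(items);
  return true;
}
```

Under these assumptions a value such as 127.0.0.1,::1 is accepted as two
addresses, while 127.0.0.1,* and 10.0.0.1,,10.0.0.2 are rejected. A single
wildcard value on its own is still accepted, since FR.2 applies only to
multi-valued lists.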

FR.4 Server must bind successfully to every address specified in multivalued
command option --bind-address.

FR.4.1 Failure to bind to any of the values specified in the multi-valued
command-line option --bind-address must lead to an error and prevent the
server from running.

FR.5 The server must accept connection requests on any of the addresses
specified by the multi-valued command-line option --bind-address.

Non-Functional Requirements.
============================
UFR.1 When a single address is specified in the command-line option
--bind-address, connect performance must not be penalized compared with a
server without support for the multi-valued command-line option --bind-address.

To support acceptance of incoming TCP connections on several IP addresses the
following steps are done:
* The addresses specified in the command-line option --bind-address are parsed
at server startup as part of network initialization;
* A sanity check is done for every component of the bind-address value;
* The error ER_INVALID_VALUE_OF_BIND_ADDRESSES is reported if either a parse
error happens or any component of the comma-separated value of the option
--bind-address is one of the values '*', '::' and '0.0.0.0';
* The checked values are placed into a list of string objects that is later
passed to the constructor of the class Mysqld_socket_listener and stored in a
data member of that class.

These steps are done during the call of the function network_init().

Handling of incoming TCP connection requests is done without any changes.
To set up sockets for listening for incoming connections, the current server
implementation calls the member function Mysqld_socket_listener::setup_listener
to bind to the passed address values. This member function iterates over the
list of values specified by the command-line option --bind-address, creates a
TCP socket for every address value, and calls the network API function
bind() to bind the socket to the address.
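
The per-address loop can be sketched with plain POSIX calls. This is a minimal
illustration, not the code of Mysqld_socket_listener::setup_listener: the
function names open_listener and setup_listeners are hypothetical, addresses
are assumed to be numeric IP literals, and setting SO_REUSEADDR is an
assumption of the sketch. A failure on any address closes the sockets opened
so far, mirroring FR.4.1.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: creates one listening TCP socket bound to `address`.
// Returns the descriptor, or -1 if resolving, binding, or listening fails.
int open_listener(const std::string &address, const std::string &port) {
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;  // accept both IPv4 and IPv6 literals
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE | AI_NUMERICHOST | AI_NUMERICSERV;
  addrinfo *res = nullptr;
  if (getaddrinfo(address.c_str(), port.c_str(), &hints, &res) != 0) return -1;

  int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
  if (fd >= 0) {
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
    if (::bind(fd, res->ai_addr, res->ai_addrlen) != 0 ||
        listen(fd, SOMAXCONN) != 0) {
      close(fd);
      fd = -1;
    }
  }
  freeaddrinfo(res);
  return fd;
}

// Hypothetical helper: one socket per address, all or nothing (FR.4.1).
bool setup_listeners(const std::vector<std::string> &addresses,
                     const std::string &port, std::vector<int> *fds) {
  assert(fds != nullptr);
  for (const std::string &address : addresses) {
    const int fd = open_listener(address, port);
    if (fd < 0) {
      // Roll back: the server must not run with only some addresses bound.
      for (int open_fd : *fds) close(open_fd);
      fds->clear();
      return false;
    }
    fds->push_back(fd);
  }
  return true;
}
```

With a single address the loop runs once and creates exactly one socket, so
this structure adds no per-connection work for the single-value case.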

A TCP socket in the MySQL server is represented by the wrapper class
TCP_socket, and all activities required for creating a TCP socket and binding
it to an address are encapsulated in the current implementation inside the
member function TCP_socket::get_listener_socket().

Created instances of the class TCP_socket are stored in the data member
Mysqld_socket_listener::m_socket_map. This data member is used both to specify
the socket descriptors on which to listen for incoming connection requests and
to close the server sockets during server shutdown.

The current server implementation supports two ways to wait for incoming
network connection requests: the first uses the API function poll(), the second
uses the API function select(). When poll() is used to wait for server sockets
to become ready to accept incoming TCP connection requests, the data type of
Mysqld_socket_listener::m_poll_info must be modified to allow storing a
variable number of listening sockets. This means that the type of the data
members poll_info_t::m_fds and poll_info_t::m_pfs_fds must be changed from an
array of predefined size (specified by the constant MAX_SOCKETS, which had
the value 2) to std::vector. When select() is used to listen for incoming
connection requests, no modifications are required in the nested data type
Mysqld_socket_listener::select_info_t.
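
The poll() change can be sketched as follows. The type Poll_info_sketch is a
hypothetical, simplified stand-in for poll_info_t that keeps only an m_fds
vector (the m_pfs_fds member is omitted), and the function names are invented
for this illustration. The point is that a std::vector of pollfd entries can
hold any number of listening sockets, where the fixed array was limited to
MAX_SOCKETS entries.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <vector>

// Hypothetical stand-in for poll_info_t: one pollfd entry per listening
// socket, stored in a vector instead of an array of MAX_SOCKETS elements.
struct Poll_info_sketch {
  std::vector<pollfd> m_fds;
};

// Registers a listening descriptor for readability notifications.
void add_listener(Poll_info_sketch *info, int fd) {
  assert(info != nullptr);
  pollfd entry{};
  entry.fd = fd;
  entry.events = POLLIN;
  info->m_fds.push_back(entry);
}

// Waits up to timeout_ms and returns the first descriptor that is ready,
// or -1 on timeout or error.
int wait_for_ready_listener(Poll_info_sketch *info, int timeout_ms) {
  assert(info != nullptr);
  const int ready = poll(info->m_fds.data(),
                         static_cast<nfds_t>(info->m_fds.size()), timeout_ms);
  if (ready <= 0) return -1;
  for (const pollfd &entry : info->m_fds) {
    if (entry.revents & POLLIN) return entry.fd;
  }
  return -1;
}

// Closes every registered descriptor, as would happen at server shutdown.
void close_listeners(Poll_info_sketch *info) {
  assert(info != nullptr);
  for (const pollfd &entry : info->m_fds) close(entry.fd);
  info->m_fds.clear();
}
```

The call to poll() itself is unchanged: it still receives a pointer to
contiguous pollfd entries and a count, so only the storage type differs.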

Since the implementation of this worklog makes no changes to the handling of
incoming TCP connection requests, no performance degradation occurs when
the option --bind-address specifies a single value.