WL#10703: scale to 50k connections
Motivation
Router's network core relies on a 1:1 thread-per-connection design to forward connections. This design limits the number of connections the router can handle, as OSes apply limits on:
- number of threads per process
- max memory used per process
- number of open file-descriptors per process
which set the boundaries for how many connections such a design can handle.
In WL#???? the limit of concurrent connections was raised from 500 to ~5000 connections by using poll() instead of select(), but the underlying design of 1-thread==1-connection was kept intact.
To go beyond 5k concurrent connections, the 1:1 design of the routing core needs to be replaced.
System Limits
MacOS X
The number of threads per process is limited by the OS based on installed memory:
RAM | max_threads | max_taskthreads |
----|-------------|-----------------|
16G | 20480       | 4096            |
32G | 40960       | 8192            |
The values can be retrieved via sysctl:

sysctl hw.memsize
sysctl kern.num_taskthreads
sysctl kern.num_threads
Goal
Refactor the routing plugin into an event-driven + IO-threadpool design which:

- uses non-blocking IO
  - no thread-stack issues
  - low memory usage
  - no thundering herd
- uses a low number of threads (~number of cores)
  - no more max-threads-per-process limits

FR1
- router MUST handle 50k or more connections
Configuration Options
section [io]

backend

io backend which handles async operations. The generic poll backend is
available on all platforms, while each platform may provide faster, more
scalable backends.

- possible values: platform specific. On Linux: linux_epoll and poll;
  elsewhere: poll.
- default: best available platform-specific backend.
threads

number of IO threads which handle connections.

- possible values:
  - 0 == as many as available CPU cores/threads
  - 1..1024 == number of io-threads. At runtime the system may restrict the
    upper limit further.
- default: as many as available CPU cores/threads
Example
[io]
backend=linux_epoll
threads=32
Implementation
Currently, the routing plugin spawns one thread per connection:

- waits for the listen-socket to become readable
- accepts a connection
- finds a valid backend according to the list of destinations
- spawns a thread
- forwards the client/server data as-is in that thread with blocking socket ops
- ends the thread when the connection is done
As the number of threads a system can handle is limited, the routing plugin is changed to:

- spawn io-threads
- wait for the listen-socket to become readable
- accept a connection
- find a valid backend according to the list of destinations
- assign the connection to an io-thread
- async-wait for the client-socket to become readable
- async-wait for the server-socket to become readable
Instead of running blocking socket operations in a dedicated thread, non-blocking IO is used and an io-thread is only used when a socket becomes ready.
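A minimal sketch of that flow, assuming the net_ts layer mirrors the Networking-TS API (net::ip::tcp::socket, async_wait()); the function names and header path are illustrative, not the final router code:

#include <system_error>

#include "net_ts/internet.h"  // header path assumed

// one direction of the connection: re-armed every time the socket is readable
void forward(net::ip::tcp::socket &from, net::ip::tcp::socket &to) {
  from.async_wait(net::socket_base::wait_read,
                  [&from, &to](std::error_code ec) {
                    if (ec) return;  // wait cancelled or connection failed

                    // 'from' is readable: do a non-blocking read, write the
                    // data to 'to', then re-arm the wait for the next chunk.
                    forward(from, to);
                  });
}

void start_forwarding(net::ip::tcp::socket &client_sock,
                      net::ip::tcp::socket &server_sock) {
  forward(client_sock, server_sock);  // client -> server
  forward(server_sock, client_sock);  // server -> client
}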
Implementation
The implementation is based on the networking-ts, which provides:

- a portable socket layer
- async and blocking socket ops
- non-blocking socket ops
- running completion-handlers (callbacks) in a pool of worker threads
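The worker-pool part could look roughly like this; a sketch assuming net::io_context::run() may be called concurrently from several threads, as the Networking TS allows (header path assumed):

#include <thread>
#include <vector>

#include "net_ts/io_context.h"

// run the io_context's completion-handlers on a pool of worker threads
void run_io_threads(net::io_context &io_ctx, unsigned num_threads) {
  std::vector<std::thread> workers;

  for (unsigned ndx = 0; ndx < num_threads; ++ndx) {
    workers.emplace_back([&io_ctx]() { io_ctx.run(); });
  }

  for (auto &worker : workers) worker.join();
}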
error-codes

To handle the failure of a socket operation like recv():

- on Windows, WSAGetLastError() needs to be called
- on POSIX, errno contains the value

On Windows the error will be of the kind WSAEWOULDBLOCK, on POSIX
EWOULDBLOCK: both mean the same thing, but use different error-codes.

To handle this, std::error_code is used in all places.

net::impl::socket::last_error_code() returns either:

- std::error_code{WSAGetLastError(), std::system_category()}, or
- std::error_code{errno, std::generic_category()}

which can be compared with:
std::error_code ec = impl::socket::last_error_code();
if (ec == make_error_condition(std::errc::operation_would_block)) {
// ...
}
Expected Return values
Contrary to the networking-ts functions, error-reporting relies on
stdx::expected<T, std::error_code> instead of throwing an exception or
passing a std::error_code by reference.
// stdx::expected<size_t, std::error_code> recv(...) noexcept;
auto recv_res = impl::socket::recv(...);
if (!recv_res) {
// recv failed, .error() contains the error_code
auto ec = recv_res.error();
} else {
size_t received = recv_res.value();  // number of bytes received
}
Low-Level abstractions
Router already had an abstraction for socket and poll operations which is low-level and only partially covers portability. Its main concern was mock-ability in tests.
To improve on that, the low-level socket/poll layer is replaced with:
- portable (win32/posix) socket layer
- portable (win32/posix) readiness layer
- returns std::error_code
Implemented in:
- net_ts/impl/socket.h
- net_ts/impl/poll.h
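A rough sketch of what such a wrapper could look like (POSIX flavour only; a Windows variant would wrap WSAGetLastError() instead; the exact signatures and the stdx::expected header path are assumptions, see also the error-code section above):

#include <sys/socket.h>

#include <cerrno>
#include <system_error>

#include "mysql/harness/stdx/expected.h"  // header path assumed

namespace impl {
namespace socket {

stdx::expected<size_t, std::error_code> recv(int native_handle, void *data,
                                             size_t data_len, int flags) {
  const ssize_t res = ::recv(native_handle, data, data_len, flags);
  if (res == -1) {
    // errno (WSAGetLastError() on Windows) wrapped into a std::error_code
    return stdx::make_unexpected(
        std::error_code{errno, std::generic_category()});
  }
  return static_cast<size_t>(res);
}

}  // namespace socket
}  // namespace impl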
Example: socketpair()
On POSIX, socketpair() returns two connected file handles for the given address-family.
On Windows, no socketpair() call exists, but it can be emulated with an AF_INET socket that is accepted from a randomly assigned port, as sketched below.
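A minimal sketch of that emulation, written with BSD-socket calls; on Windows the same sequence uses the Winsock equivalents (SOCKET, closesocket(), ...). The helper name is made up for illustration and error cleanup is omitted:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// fills fds[0]/fds[1] with a connected pair, returns 0 on success, -1 on error
int inet_socketpair(int fds[2]) {
  int listener = ::socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0) return -1;

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;  // let the OS assign a random port

  socklen_t addr_len = sizeof(addr);
  if (::bind(listener, reinterpret_cast<sockaddr *>(&addr), addr_len) < 0 ||
      ::listen(listener, 1) < 0 ||
      ::getsockname(listener, reinterpret_cast<sockaddr *>(&addr),
                    &addr_len) < 0) {
    return -1;
  }

  // connect one end to the randomly assigned port ...
  fds[0] = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fds[0] < 0 ||
      ::connect(fds[0], reinterpret_cast<sockaddr *>(&addr), addr_len) < 0) {
    return -1;
  }

  // ... and accept the other end from the listener.
  fds[1] = ::accept(listener, nullptr, nullptr);
  ::close(listener);

  return fds[1] < 0 ? -1 : 0;
}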
IO Readiness
An io_context owns the socket-descriptors that are waiting for readiness and the callbacks to call when a socket becomes ready or the wait is cancelled.
- net_ts/io_context.h
On the low level, two backends are supported:

- poll
- linux_epoll

They are implemented in:
- net_ts/impl/linux_epoll.h
- net_ts/impl/linux_epoll_io_service.h
- net_ts/impl/poll.h
- net_ts/impl/poll_io_service.h
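As a sketch of what that ownership means for a single socket (Networking-TS style, header paths assumed): a pending wait is not silently dropped, it is completed with operation_canceled.

#include <system_error>

#include "net_ts/internet.h"
#include "net_ts/io_context.h"

void wait_or_cancel(net::ip::tcp::socket &sock) {
  // the io_context behind 'sock' stores the descriptor and this callback
  sock.async_wait(net::socket_base::wait_read, [](std::error_code ec) {
    if (ec == make_error_condition(std::errc::operation_canceled)) {
      return;  // the wait was cancelled, e.g. on shutdown
    }
    // the socket became readable: handle the data
  });

  // ... later, e.g. when the other side of the connection failed:
  sock.cancel();  // completes the pending async_wait with operation_canceled
}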
Buffers
Buffers are an abstraction over a memory-range (start-pointer and length) and have conversions for:
- std::vector
- std::array
- char arr[N]
They can be passed as net::const_buffer to socket ops like send() to send a
single buffer.

To avoid merging multiple buffers into a single buffer before sending, the
socket layer uses the syscalls sendmsg() and recvmsg(). They require a
sequence of buffers, which is implemented as a ConstBufferSequence: any type
that is iterable and yields const-buffers.
The application is free to provide:
- std::list
- std::vector
- ...
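A sketch of both cases in the Networking-TS style that net_ts mirrors (header paths assumed, error handling omitted): single buffers created via net::buffer() and a buffer-sequence sent in one scatter/gather call:

#include <array>
#include <cstdint>
#include <vector>

#include "net_ts/buffer.h"    // header paths assumed
#include "net_ts/internet.h"

void send_header_and_payload(net::ip::tcp::socket &sock,
                             const std::array<char, 4> &header,
                             const std::vector<uint8_t> &payload) {
  // a ConstBufferSequence: any iterable of const-buffers works, here a vector
  std::vector<net::const_buffer> buffers{
      net::buffer(header),   // std::array  -> net::const_buffer
      net::buffer(payload),  // std::vector -> net::const_buffer
  };

  // sent as one scatter/gather write (sendmsg() or its platform equivalent),
  // without copying into a merged buffer first.
  sock.send(buffers);
}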
Socket, Tcp
The socket-layer provides socket-operations in a typesafe manner:
- instead of passing sockaddr-structs and requiring reinterpret-casts, an endpoint-class ensures that types and sizes are correct.
- only a SocketAcceptor can actually call accept()
- ...
Implemented in:
- net_ts/socket.h
- net_ts/internet.h
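A usage sketch in that style (TS-style throwing overloads used for brevity; the port number and names are only illustrative):

#include "net_ts/internet.h"
#include "net_ts/io_context.h"

void listen_and_accept(net::io_context &io_ctx) {
  // an endpoint instead of a hand-filled sockaddr_in + reinterpret_cast
  net::ip::tcp::endpoint ep(net::ip::address_v4::any(), 6446);

  net::ip::tcp::acceptor acceptor(io_ctx);  // only this type offers accept()
  acceptor.open(ep.protocol());
  acceptor.bind(ep);
  acceptor.listen();

  net::ip::tcp::socket client_sock(io_ctx);
  acceptor.accept(client_sock);  // type- and size-checked, no casts needed
}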
Classic Protocol Codec
The old classic protocol tracker relied on a partial classic protocol implementation which handled IO itself.
- rewritten to work with net::const_buffer instead of std::vector<uint8>
- only works on buffers, does no socket IO
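Purely illustrative sketch of that split; ClientGreeting and decode() are hypothetical names, not the router's actual codec API. The point is that the codec parses from a buffer and reports how much it consumed, without doing socket IO:

#include <cstddef>
#include <cstdint>
#include <system_error>
#include <utility>

#include "mysql/harness/stdx/expected.h"  // header path assumed
#include "net_ts/buffer.h"                // header path assumed

struct ClientGreeting {  // hypothetical message type
  uint32_t capabilities{};
  // ... further handshake fields ...
};

// decode one message from a const-buffer: returns (bytes-consumed, message)
// or an error such as "need more data"; never does any socket IO.
stdx::expected<std::pair<size_t, ClientGreeting>, std::error_code> decode(
    net::const_buffer buf) {
  if (buf.size() < 4) {
    // not enough data for the 4-byte classic-protocol frame header
    return stdx::make_unexpected(
        make_error_code(std::errc::operation_would_block));
  }
  // ... parse the wire format from buf.data()/buf.size() only ...
  return std::make_pair(buf.size(), ClientGreeting{});
}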
Classic Protocol Tracker
The routing plugin tracks the classic protocol's handshake to:
- track if SSL is enabled on the connection
- avoid max_connect_errors on the server if the client aborts the connection early, by sending a client::Greeting message
It needs to be rewritten to use the new Codec implementation.