WL#4896: PERFORMANCE_SCHEMA Instrumenting Net IO

Affects: Server-5.6   —   Status: Complete   —   Priority: Very High

Add instrumentation for tracking network and interprocess communication to and 
from the server. This communication includes network activity such as socket 
instances, socket operations, number of bytes transmitted and received, etc. In 
future versions, it may also include interprocess communication via named pipes, 
shared memory, message queuing, etc.

Target Audience

* Server core developers

Data provided by this instrumentation can be used by implementers of the server to 
readily assess the volume, direction and impact of data being moved in and out of 
the server via network subprocesses and/or interprocess communication.

* Application Developers

Data provided by this instrumentation can by used by application developers, 
particularly those developing web-based applications, to assess the volume of 
network traffic directly attributable to queries generated by their application. 
This can be particularly useful when developing applications intended for large-
scale implementations. 

* Production Engineers

Production engineers can use the data provided by this instrumentation to
1) Diagnose problematic connections to the MySQL server,
2) Trace external connections to the MySQL server to a unique hardware address,
3) Identify firewall activity associated with the MySQL server,
4) Assess the connection status of remote MySQL servers and remote MySQL files 
(e.g. Federated tables),
5) Gauge per-socket throughput in the MySQL server.
Three new tables will be added to the Performance Schema: SOCKET_INSTANCES,
SOCKET_SUMMARY_BY_INSTANCE and SOCKET_SUMMARY_BY_EVENT_NAME. These tables are
described here in detail.

Also, some existing tables are modified to show more content,
also described in more details below.

TABLE performance_schema.SOCKET_INSTANCES
=========================================

Columns
-------

'EVENT_NAME' varchar(128) NOT NULL,
'OBJECT_INSTANCE_BEGIN' bigint(20) unsigned NOT NULL,
'THREAD_ID' int(11),
'SOCKET_ID' int(11) NOT NULL,
'IP' varchar(64) NOT NULL,
'PORT' int(11) NOT NULL,
'STATE' enum('IDLE','ACTIVE')

Operations
----------

SELECT is supported.

Normal privilege checks do apply: a user must be granted the proper table
privileges to perform an operation.

Semantic
--------

This is a real-time snapshot of the active connections to the MySQL server. Each
external TCP/IP connection is represented by a single row identified by a unique
IP:PORT combination.

EVENT_NAME is the "wait/io/socket/*" instrument identifier.

OBJECT_INSTANCE_BEGIN is an arbitrary identifier, guaranteed to be unique.

THREAD_ID is the internal thread identifier assigned by the server. Each client
connection is managed by a single thread, therefore, each client connection can
be mapped to a thread which can be mapped to a server process.

SOCKET_ID is the internal file handle assigned to the socket.

The IP column may contain either IPv4 or IPv6 addresses.

PORT is the TCP/IP port, 0 - 65535.

STATE is the socket status: IDLE or ACTIVE. Wait times for active sockets are
tracked using the corresponding socket instrument. Wait times for idle sockets
are tracked using the new "idle" instrument, detailed below under setup_timers
and setup_instruments.


TABLE performance_schema.SOCKET_SUMMARY_BY_INSTANCE
===================================================

Columns
-------

`EVENT_NAME` varchar(128) NOT NULL,
`OBJECT_INSTANCE_BEGIN` bigint(20) unsigned NOT NULL,
`COUNT_STAR` bigint(20) unsigned NOT NULL,
`SUM_TIMER_WAIT bigint(20) unsigned NOT NULL,
`MIN_TIMER_WAIT bigint(20) unsigned NOT NULL,
`AVG_TIMER_WAIT bigint(20) unsigned NOT NULL,
`MAX_TIMER_WAIT bigint(20) unsigned NOT NULL,

`COUNT_READ` bigint(20) unsigned NOT NULL,
`SUM_TIMER_READ` bigint(20) unsigned NOT NULL,
`MIN_TIMER_READ` bigint(20) unsigned NOT NULL,
`AVG_TIMER_READ` bigint(20) unsigned NOT NULL,
`MAX_TIMER_READ` bigint(20) unsigned NOT NULL,
`SUM_NUMBER_OF_BYTES_READ` bigint(20) unsigned NOT NULL,

`COUNT_WRITE` bigint(20) unsigned NOT NULL,
`SUM_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`MIN_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`AVG_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`MAX_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`SUM_NUMBER_OF_BYTES_WRITE` bigint(20) unsigned NOT NULL,

`COUNT_MISC` bigint(20) unsigned NOT NULL,
`SUM_TIMER_MISC` bigint(20) unsigned NOT NULL,
`MIN_TIMER_MISC` bigint(20) unsigned NOT NULL,
`AVG_TIMER_MISC` bigint(20) unsigned NOT NULL,
`MAX_TIMER_MISC` bigint(20) unsigned NOT NULL,
) ENGINE=PERFORMANCE_SCHEMA DEFAULT CHARSET=utf8

Operations
----------

SELECT is supported.
TRUNCATE is supported, and is used to reset all summaries.

Normal privilege checks do apply: a user must be granted the proper table
privileges to perform an operation.

Semantic
--------

This table aggregates the timer and byte count statistics generated by the
"wait/io/socket/*" instruments for all socket I/O operations, per socket
instance, where each socket instance is identified by OBJECT_INSTANCE_BEGIN.

The columns COUNT_STAR, SUM/MIN/AVG/MAX TIMER_WAIT aggregates all operations.

The columns COUNT/SUM/MIN/AVG/MAX TIMER_READ and NUMBER_OF_BYTES_READ aggregates
all RECEIVE operations (RECV, RECVFROM, RECVMSG)

The columns COUNT/SUM/MIN/AVG/MAX TIMER_WRITE and NUMBER_OF_BYTES_WRITE
aggregates all SEND (SEND, SENDTO, SENDMSG) operations.

The columns COUNT/SUM/MIN/AVG/MAX TIMER_MISC aggregates all remaining socket
operations, including CONNECT, LISTEN, ACCEPT, CLOSE, SHUTDOWN, etc. There are
no byte counts for these operations.

Convenience columns
-------------------

Columns such as SUM/MIN/AVG/MAX TIMER_WAIT/TIMER_READ/TIMER_WRITE, etc., are
derived from lower level columns that count statistics per operation. These
columns are redundant. Given that we anticipate that many users will want to
have these statistics, the columns are provided as convenience, instead of
forcing every user to define their own view to compute the same.

The timer wait and byte count summary columns are grouped together in order to
allow for per-operation throughput comparisons.


TABLE performance_schema.SOCKET_SUMMARY_BY_EVENT_NAME
=====================================================

Columns
-------

`EVENT_NAME` varchar(128) NOT NULL,
`COUNT_STAR` bigint(20) unsigned NOT NULL,
`SUM_TIMER_WAIT bigint(20) unsigned NOT NULL,
`MIN_TIMER_WAIT bigint(20) unsigned NOT NULL,
`AVG_TIMER_WAIT bigint(20) unsigned NOT NULL,
`MAX_TIMER_WAIT bigint(20) unsigned NOT NULL,

`COUNT_READ` bigint(20) unsigned NOT NULL,
`SUM_TIMER_READ` bigint(20) unsigned NOT NULL,
`MIN_TIMER_READ` bigint(20) unsigned NOT NULL,
`AVG_TIMER_READ` bigint(20) unsigned NOT NULL,
`MAX_TIMER_READ` bigint(20) unsigned NOT NULL,
`SUM_NUMBER_OF_BYTES_READ` bigint(20) unsigned NOT NULL,

`COUNT_WRITE` bigint(20) unsigned NOT NULL,
`SUM_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`MIN_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`AVG_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`MAX_TIMER_WRITE` bigint(20) unsigned NOT NULL,
`SUM_NUMBER_OF_BYTES_WRITE` bigint(20) unsigned NOT NULL,

`COUNT_MISC` bigint(20) unsigned NOT NULL,
`SUM_TIMER_MISC` bigint(20) unsigned NOT NULL,
`MIN_TIMER_MISC` bigint(20) unsigned NOT NULL,
`AVG_TIMER_MISC` bigint(20) unsigned NOT NULL,
`MAX_TIMER_MISC` bigint(20) unsigned NOT NULL,
) ENGINE=PERFORMANCE_SCHEMA DEFAULT CHARSET=utf8

Operations
----------

SELECT is supported.
TRUNCATE is supported, and is used to reset all summaries.

Normal privilege checks do apply: a user must be granted the proper table
privileges to perform an operation.

Semantic
--------

This table aggregates the timer and byte count statistics generated by the
"wait/io/socket/*" instruments for all socket I/O operations, per socket
instrument.

The columns COUNT_STAR, SUM/MIN/AVG/MAX TIMER_WAIT aggregates all operations.

The columns COUNT/SUM/MIN/AVG/MAX TIMER_READ and NUMBER_OF_BYTES_READ aggregates
all RECEIVE operations (RECV, RECVFROM, RECVMSG)

The columns COUNT/SUM/MIN/AVG/MAX TIMER_WRITE and NUMBER_OF_BYTES_WRITE
aggregates all SEND (SEND, SENDTO, SENDMSG) operations.

The columns COUNT/SUM/MIN/AVG/MAX TIMER_MISC aggregates all remaining socket
operations, including CONNECT, LISTEN, ACCEPT, CLOSE, SHUTDOWN, etc. There are
no byte counts for these operations.

Convenience columns
-------------------

Columns such as SUM/MIN/AVG/MAX TIMER_WAIT/TIMER_READ/TIMER_WRITE, etc., are
derived from lower level columns that count statistics per operation. These
columns are redundant. Given that we anticipate that many users will want to
have these statistics, the columns are provided as convenience, instead of
forcing every user to define their own view to compute the same.

The timer wait and byte count summary columns are grouped together in order to
allow for per-operation throughput comparisons.


TABLE performance_schema.events_waits_current
=============================================

The column OBJECT_INSTANCE_BEGIN will contain the unique identifier of the
socket event.
The column OBJECT_NAME will contain the IP:PORT of the socket.

The above also applies to the events_waits_history and events_waits_history_long
tables.

TABLE performance_schema.setup_timers
=====================================

A new timer is available, "idle".
The default timer_name for the "idle" timer is MICROSECOND.

TABLE performance_schema.setup_instruments
==========================================

Socket instrumentation is prefixed with "wait/io/socket/".

There are three new event names:
  'wait/io/socket/sql/server_tcpip_socket' - server listening socket - TCP
  'wait/io/socket/sql/server_unix_socket' - server listening socket - UNIX
domain
  'wait/io/socket/sql/server_client_connection' - where new connections are
handled

A new instrument name is available, "idle".

This instrument is used when timing the time spent while waiting
for the next command in the server.
No events are recorded in the socket when the server is idle.

Server variables
================

There are two new server variables:

  performance_schema_max_socket_classes:  Limits the number of socket
instruments in the server. It is global and read only, and therefore cannot be
changed during operation.

  performance_schema_max_socket_instances: Maximum number of socket instances
that can be tracked at any one time. It is global and read only, and therefore
cannot be changed during operation.

Status variables
================

There is one new status variable:

  performance_schema_socket_instances_lost: Count of socket instances that could
not be tracked because the number of active sockets exceeded
max_socket_instances.


[ REQUIREMENTS ]

(1) Table performance_schema.socket_instances
---------------------------------------------

Func-Req (1.1): A fresh MySQL installation of CURRENT_VERSION must create the
table performance_schema.socket_instances. See the HLS for the table structure.

Func-Req (1.2): An upgrade from PREVIOUS_VERSION to CURRENT_VERSION must create
the table performance_schema.socket_instances.

Func-Req (1.3): Security privileges for performance_schema.socket_instances are
enforced. Legal operations are SELECT, CREATE TABLE and DROP TABLE are currently
also legal, as they are used during install/upgrade.

Func-Req (1.4): Table performance_schema.socket_instances is visible in the
information_schema.

Func-Req (1.5): Table performance_schema.socket_instances is visible in SHOW
TABLES.

Func-Req (1.6): Each external TCP/IP connection to the MySQL server must be
represented by one row in performance_schema.socket_instances that is
identified by the connection's IP:PORT combination (not all rows in
socket_instances represent external connections to the server). See the HLS for
details.

Func-Req (1.7): When a connection to the MySQL server is terminated, the
corresponding row in performance_schema.socket_instances must be deleted.

Func-Req (1.8) The number of rows in performance_schema.socket_instances must
not exceed the amount defined by the server variable
'performance_schema_max_sockets'.


(2) Table performance_schema.socket_summary_by_instance.
--------------------------------------------------------

Func-Req (2.1): A fresh MySQL installation of CURRENT_VERSION must create the
table performance_schema.socket_summary_by_instance. See the HLS for the table
structure.

Func-Req (2.2): An upgrade from PREVIOUS_VERSION to CURRENT_VERSION must create
the table performance_schema.socket_summary_by_instance.

Func-Req (2.3): Security privileges for
performance_schema.socket_summary_by_instance are enforced. Legal operations are
SELECT, TRUNCATE, CREATE TABLE and DROP TABLE are currently also legal, as they
are used during install/upgrade.

Func-Req (2.4): Table performance_schema.socket_summary_by_instance is visible
in the information_schema.

Func-Req (2.5): Table performance_schema.socket_summary_by_instance is visible
in SHOW TABLES.

Func-Req (2.6): Each external TCP/IP connection to the MySQL server must be
represented by one row in  performance_schema.socket_summary_by_instance that is
identified by the connection's IP:PORT combination (not all rows in
socket_summary_by_instance represent external connections to the server). See
the HLS for details.

Func-Req (2.7): When a connection to the MySQL server is terminated, the
corresponding row in performance_schema.socket_summary_by_instance must be
deleted.


(3) Table performance_schema.socket_summary_by_event_name.
----------------------------------------------------------

Func-Req (3.1): A fresh MySQL installation of CURRENT_VERSION must create the
table performance_schema.socket_summary_by_event_name. See the HLS for the table
structure.

Func-Req (3.2): An upgrade from PREVIOUS_VERSION to CURRENT_VERSION must create
the table performance_schema.socket_summary_by_event_name.

Func-Req (3.3): Security privileges for
performance_schema.socket_summary_by_event_name are enforced. Legal operations
are SELECT, TRUNCATE, CREATE TABLE and DROP TABLE are currently also legal, as
they are used during install/upgrade.

Func-Req (3.4): Table performance_schema.socket_summary_by_event_name is visible
in the information_schema.

Func-Req (3.5): Table performance_schema.socket_summary_by_event_name is visible
in SHOW TABLES.

Func-Req (3.6): Each instrumented socket in the MySQL server must be represented
by one row in performance_schema.socket_summary_by_event_name, identified by a
unique instrumentation key prefixed with "wait/io/socket/sql/".


(4) Existing performance schema tables.
---------------------------------------

Func-Req (4.1) Event summary and history tables must contain the socket IP:PORT
combination in the OBJECT_NAME column.


(5) Server variables.
---------------------

Func-Req (5.1) performance_schema_max_socket_classes: Must limit the number of
socket instruments (currently there are three). It is global and read only, and
therefore cannot be changed during operation.

Func-Req (5.2) performance_schema_max_socket_instances: Must limit the number
socket instances that can be tracked simultaneously. It is global and read only,
and therefore cannot be changed during operation.


(6) Status variables.
---------------------

Func-Req (5.1) performance_schema_socket_instances_lost: Must accurately reflect
the number of socket instances that the performance schema could not track
because the number of active sockets exceded max_socket_instances.
The detailed design is documented in doxygen format. See instructions in the file 
Doxyfile-perfschema, in the project branch.

Included in the low-level design are details of the instrumentation interface of 
the socket API, relevant object classes and structures within the performance 
schema storage engine, and diagrams describing the flow of captured statistical 
data from the instrument buffers into a more meaningful, aggregated form.