WL#4813: PERFORMANCE_SCHEMA Instrumenting Stages
Affects: Server-Prototype Only
—
Status: Complete
Add EVENTS_STAGES viewed tables in the performance schema. Rows in the tables will have information about parts of statements (in progress or completed) for active users. What we say about stages applies mutatis mutandis to statements, and stages might be the first nesting event which we implement, so WL#2515 "PERFORMANCE_SCHEMA statements" refers often to this worklog task.
Contents -------- Terminology No Overlaps SETUP_INSTRUMENTS Table EVENTS_STAGES Tables Nesting-Event Summaries Chip and Operating System Counters SETUP_TIMERS Table SETUP_CONSUMERS Table Server Variables Change to EVENTS_WAITS Tables Requirements References Terminology ----------- A stage is a process that takes place during statement processing (it doesn't span a statement), and examples that we've talked about include "parse" and "filesort". The word "stages" is accepted. Other possible names were STATE (which MySQL uses already), or SEGMENT, or STEP, or PHASE, or PART OF A STATEMENT, or DATABASE CALL (which Oracle 11g uses), or [ fill in your own favourite here ] -- nobody voted for them. We follow the matryoshka design principle http://en.wikipedia.org/wiki/Matryoshka From biggest to smallest: Session -> Transaction -> Statement -> Stage -> Wait. We say that a wait is "nested" in a stage ("matryoshkificated" is too long), so a stage is a "nesting event" for a wait. In turn, the nesting event for a stage is a statement. The worklog task WL#5556 Nesting-Event Summaries" will describe the appropriate columns of any nesting event. SETUP_INSTRUMENTS Table ----------------------- There will be new rows in PERFORMANCE_SCHEMA.SETUP_INSTRUMENTS for stages. But what are they? We considered these possibilities: 1. Use the "backup-progress" values. 2. Use the "General Thread States" values. 3. Use modified "thd_proc_info" values. 4. Follow Oracle 11g. 5. Overhaul. 6. From optimizer. 7. [ Fill in your own favourite here. ] The winner was '2.'. For details about the rejected possibilities, see 'Progress' for 2010-08-16. The MySQL Reference Manual section 7.8.5.2. General Thread States http://dev.mysql.com/doc/refman/5.5/en/general-thread-states.html and following sections (Delayed-Insert Thread States, Query Cache Thread States, Replication Master Thread States, Replication Slave I/O Thread States, Replication Slave SQL Thread States, Replication Slave Connection Thread States, Event Scheduler Thread States) describe the values we are looking at. SHOW PROFILE CPU and SHOW PROCESSLIST have similar bases. Stages begin and end when thread state values change. For example, when the general thread state changes to "checking permissions", then an old stage ends and a new stage named "checking permissions" begins. For example, suppose a statement has 10 thread states: starting Opening tables System lock Table lock init end query end freeing items logging slow query cleaning up When the state changes to 'cleaning up', then instrumenting of the previous stage ('logging slow query') ends, and instrumenting of the new stage ('cleaning up') begins. So EVENTS_STAGES_CURRENT will have a 'cleaning up' row. The state value becomes part of EVENT_NAME in SETUP_INSTRUMENTS and in events tables, following our 'taxonomy' convention: Class: stage Order relevant code area, for example 'sql' or 'myisam' Family Thread States value For example: 'stage/sql/cleaning up' A stage's event_name has only three components, this differs from waits which have five components. So the columns of SETUP_INSTRUMENTS are: EVENT_NAME as above ENABLED 'NO' means the stage is disabled, 'YES' means the stage is enabled. TIMED 'NO' means the stage timers are off, 'YES' means the stage timers are on. For example: mysql> select * from setup_instruments where name like "stage/%" order by name limit 10; +-----------------------------------------------+---------+-------+ | NAME | ENABLED | TIMED | +-----------------------------------------------+---------+-------+ | stage/sql/After create | YES | YES | | stage/sql/allocating local table | YES | YES | | stage/sql/Changing master | YES | YES | | stage/sql/Checking master version | YES | YES | | stage/sql/checking permissions | YES | YES | | stage/sql/checking privileges on cached query | YES | YES | | stage/sql/checking query cache for query | YES | YES | | stage/sql/cleaning up | YES | YES | | stage/sql/closing tables | YES | YES | | stage/sql/Connecting to master | YES | YES | +-----------------------------------------------+---------+-------+ 10 rows in set (0.00 sec) The ENABLED and TIMED values affect only the instrumentation of the stage itself. They do not affect instrumentation of waits nested within the stage -- there is no 'cascading'. So waits can be enabled even if stages are disabled. No Overlaps ----------- Stages "should" be sequential. That is: * one stage should not be nested within another stage * one stage should not overlap with another stage. But we must accept that nesting and overlapping will happen. Keeping a stack is possible, but not for all the thousands of aggregator bytes. We considered and rejected that we could have: Row#1 First part of Event A Row#2 All of Event B Row#3 Second part of Event A For this, we could recover the fact that we are resuming Event A when Event B ends. So, when starting Event B, we could save that we were in Event A. And when ending Event B, we could recover that fact. And in Row#3 there could be an indication that "this is Part 2" or "Continued". However, this will not be done. Everything will appear to be linear. EVENTS_STAGES Tables -------------------- The EVENTS_STAGES tables in PERFORMANCE_SCHEMA are EVENTS_STAGES_CURRENT EVENTS_STAGES_HISTORY EVENTS_STAGES_HISTORY_LONG EVENTS_STAGES_SUMMARY_BY_THREAD_BY_EVENT_NAME EVENTS_STAGES_SUMMARY_GLOBAL_BY_EVENT_NAME They're analogous to EVENTS_WAITS tables (EVENTS_WAITS_CURRENT etc.) and the column descriptions are the same as in EVENTS_WAITS tables, plus summary columns (see section "Nesting-Event Summaries"), plus extra timer columns (see section "Chip and Operating System Counters"). (We may summarize _BY_FAMILY or _BY_GENUS rather than BY_EVENT_NAME, but until it's decided let us use _BY_EVENT_NAME.) To be more specific about the columns that are in common for CURRENT and HISTORY and HISTORY_LONG tables: THREAD_ID identifies the thread EVENT_ID is a unique identifier "of the event", that is, it's the same counter that we use for waits, therefore stage 101 does not follow stage 100, it follows wait 100 (assuming both are instrumented). EVENT_NAME see section "SETUP_INSTRUMENTS Table" SOURCE identifies the source code where instrumented TIMER_START when instrumentation starts TIMER_END when instrumentation ends (the assumption is that there are no blank areas between stages, so the end of one stage is always equal to the start of the next stage) TIMER_WAIT TIMER_END - TIMER_START NESTING_EVENT_ID the EVENT_ID of the nesting event, usually statement NESTING_EVENT_TYPE 'statement' or 'stage' or 'wait' Notice the removal of the columns SPINS, OBJECT_SCHEMA, OBJECT_NAME, OBJECT_TYPE, OBJECT_INSTANCE_BEGIN, OPERATION, NUMBER_OF_BYTES, FLAGS. So why have a different set of tables for stages? One thought was: their content is backed up by a different set of internal buffers. Another thought was: they obviously aren't WAITS. Indeed, they're obviously not EVENTS either, but we're sticking to the word EVENTS for all the tables with instrumentation data. EVENTS_STAGES_SUMMARY_BY_THREAD_BY_EVENT_NAME is the STAGES summary table corresponding to EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME. THREAD_ID BIGINT /* or INTEGER */ NOT NULL EVENT_NAME VARCHAR(128) NOT NULL /* i.e. stage name */ COUNT_STAR BIGINT unsigned NOT NULL SUM_TIMER_WAIT BIGINT UNSIGNED NOT NULL MIN_TIMER_WAIT BIGINT UNSIGNED NOT NULL AVG_TIMER_WAIT BIGINT UNSIGNED NOT NULL MAX_TIMER_WAIT BIGINT UNSIGNED NOT NULL EVENTS_STAGES_SUMMARY_GLOBAL_BY_EVENT_NAME is the STAGES summary table corresponding to EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME. EVENT_NAME VARCHAR(128) NOT NULL /* i.e. stage name */ COUNT_STAR BIGINT unsigned NOT NULL SUM_TIMER_WAIT BIGINT UNSIGNED NOT NULL MIN_TIMER_WAIT BIGINT UNSIGNED NOT NULL AVG_TIMER_WAIT BIGINT UNSIGNED NOT NULL MAX_TIMER_WAIT BIGINT UNSIGNED NOT NULL There are also of course the nesting-event summaries. We will consider alternative ways of filling nesting-event summary columns in these tables, in WL#5556 Nesting-Event Summaries. Chip and Operating System Counters ---------------------------------- [ 2010-09-03 This section has been moved to WL#2515. Due to voting, stages will not have timer2 or timer3, and getrusage-like timers are okay for statements. ] SETUP_TIMERS Table ------------------ There will be an additional "stages" row in performance_schema.SETUP_TIMERS: mysql> select * from SETUP_TIMERS; +--------------------+--------------+ | NAME | TIMER_NAME | +--------------------+--------------+ | wait | CYCLE | | stage | NANOSECOND | +--------------------+--------------+ If the TIMER_NAME column (which is updatable) is set to 'NANOSECOND' for the 'stage' row, then stage instrumentation will be done with a nanosecond timer. Possible values for TIMER_NAME are what's in PERFORMANCE_TIMERS (CYCLE, NANOSECOND, MICROSECOND, MILLISECOND, TICK). The timer for 'stage' can differ from the timer for 'wait'. SETUP_CONSUMERS Table --------------------- For every 'events_waits_...' row in the PERFORMANCE_SCHEMA.SETUP_CONSUMERS table, there will be a corresponding 'events_stages_...' row, thus: mysql> select * from SETUP_CONSUMERS; +----------------------------+---------+ | NAME | ENABLED | +----------------------------+---------+ | events_stages_current | YES | | events_stages_history | YES | | events_stages_history_long | YES | +----------------------------+---------+ There might be more consumers depending on how aggregates are implemented, the implementor may decide this. For example, we may have events_stage_summary_... consumers. Server Variables ---------------- The default number of history rows is 10; the default number of history_long rows is 10000; this is settable with mysqld and visible thus: mysql> show variables like 'performance%stages%'; +----------------------------------------------------+---------+ | Variable_name | Value | +----------------------------------------------------+---------+ | performance_schema_events_stages_history_long_size | 10000 | | performance_schema_events_stages_history_size | 10 | +----------------------------------------------------+---------+ Change to EVENTS_WAITS Tables ----------------------------- The NESTING_EVENT_ID column in EVENTS_WAITS_CURRENT etc. (that is, EVENTS_WAITS_CURRENT and EVENTS_WAITS_HISTORY and EVENTS_WAITS_HISTORY_LONG) is NULL. It should be the EVENT_ID of the wait's STAGES row, if any. Or, it should be the EVENT_ID of another wait that this wait is nested in, and in that case NESTING_EVENT_TYPE should be 'stage'. (NESTING_EVENT_TYPE will have to be a new EVENTS_WAITS_CURRENT column.) Requirements ============ Note: This section is refining the general specification written in this HLS into more detailed technical or functional requirements, to help scope the tests. This is *not* yet another spec, just some logical consequences of the HLS sections written above. (1) Table performance_schema.events_stages_current. --------------------------------------------------- Func-Req (1.1): A fresh MySQL installation of CURRENT-VERSION must create the table performance_schema.events_stages_current. Func-Req (1.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create the table performance_schema.events_stages_current. Func-Req (1.3): Security privileges for events_stages_current are enforced. Legal operations are SELECT, TRUNCATE. CREATE TABLE and DROP TABLE are currently also legal, as they are used during install/upgrade. Func-Req (1.4): Table performance_schema.events_stages_current is visible in the information_schema. Func-Req (1.5): Table performance_schema.events_stages_current is visible in SHOW TABLES. (2) Table performance_schema.events_stages_history. --------------------------------------------------- Func-Req (2.1): A fresh MySQL installation of CURRENT-VERSION must create the table performance_schema.events_stages_history. Func-Req (2.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create the table performance_schema.events_stages_history. Func-Req (2.3): Security privileges for events_stages_history are enforced. Legal operations are SELECT, TRUNCATE. CREATE TABLE and DROP TABLE are currently also legal, as they are used during install/upgrade. Func-Req (2.4): Table performance_schema.events_stages_history is visible in the information_schema. Func-Req (2.5): Table performance_schema.events_stages_history is visible in SHOW TABLES. (3) Table performance_schema.events_stages_history_long. -------------------------------------------------------- Func-Req (3.1): A fresh MySQL installation of CURRENT-VERSION must create the table performance_schema.events_stages_history_long. Func-Req (3.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create the table performance_schema.events_stages_history_long. Func-Req (3.3): Security privileges for events_stages_history_long are enforced. Legal operations are SELECT, TRUNCATE. CREATE TABLE and DROP TABLE are currently also legal, as they are used during install/upgrade. Func-Req (3.4): Table performance_schema.events_stages_history_long is visible in the information_schema. Func-Req (3.5): Table performance_schema.events_stages_history_long is visible in SHOW TABLES. (4) Table performance_schema.events_stages_summary_by_thread_by_event_name. --------------------------------------------------------------------------- Func-Req (4.1): A fresh MySQL installation of CURRENT-VERSION must create the table performance_schema.events_stages_summary_by_thread_by_event_name. Func-Req (4.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create the table performance_schema.events_stages_summary_by_thread_by_event_name. Func-Req (4.3): Security privileges for events_stages_summary_by_thread_by_event_name are enforced. Legal operations are SELECT, TRUNCATE. CREATE TABLE and DROP TABLE are currently also legal, as they are used during install/upgrade. Func-Req (4.4): Table performance_schema.events_stages_summary_by_thread_by_event_name is visible in the information_schema. Func-Req (4.5): Table performance_schema.events_stages_summary_by_thread_by_event_name is visible in SHOW TABLES. (5) Table performance_schema.events_stages_summary_global_by_event_name. ------------------------------------------------------------------------ Func-Req (5.1): A fresh MySQL installation of CURRENT-VERSION must create the table performance_schema.events_stages_summary_global_by_event_name. Func-Req (5.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create the table performance_schema.events_stages_summary_global_by_event_name. Func-Req (5.3): Security privileges for events_stages_summary_global_by_event_name are enforced. Legal operations are SELECT, TRUNCATE. CREATE TABLE and DROP TABLE are currently also legal, as they are used during install/upgrade. Func-Req (5.4): Table performance_schema.events_stages_summary_global_by_event_name is visible in the information_schema. Func-Req (5.5): Table performance_schema.events_stages_summary_global_by_event_name is visible in SHOW TABLES. (6) Table performance_schema.setup_timers. ------------------------------------------ Func-Req (6.1): Table setup_timers has a new timer named "stage". (7) Table performance_schema.setup_instruments. ----------------------------------------------- Func-Req (7.1): Table setup_instruments displays new rows for the stage instrumentation, with a prefix "stage/". (8) Table performance_schema.setup_consumers. --------------------------------------------- Func-Req (8.1): Table setup_consumers has a new consumer named "events_stages_current", which control the table of the same name. Func-Req (8.2): Table setup_consumers has a new consumer named "events_stages_history", which control the table of the same name. Func-Req (8.3): Table setup_consumers has a new consumer named "events_stages_history_long", which control the table of the same name. (9) Server start options and variables. --------------------------------------- Func-Req (9.1): A new server start option is available, "performance-schema-events-stages-history-size", to control the size of the table events_stage_history. Func-Req (9.2): A new server start option is available, "performance-schema-events-stages-history-long-size", to control the size of the table events_stage_history_long. Func-Req (9.3): A new server start option is available, "performance-schema-max-stage-classes", to control the maximum number of stage instrument classes. (10) Server status. ------------------- Func-Req (10.1): A new server status variable is available, "Performance_schema_stage_classes_lost". (11) Table performance_schema.events_waits_current. -------------------------------------------------- Func-Req (11.1): A fresh MySQL installation of CURRENT-VERSION must create table events_waits_current with the new NESTING_EVENT_TYPE column. Func-Req (11.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must upgrade table events_waits_current to add the new NESTING_EVENT_TYPE column. (12) Table performance_schema.events_waits_history. -------------------------------------------------- Func-Req (12.1): A fresh MySQL installation of CURRENT-VERSION must create table events_waits_history with the new NESTING_EVENT_TYPE column. Func-Req (12.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must upgrade table events_waits_history to add the new NESTING_EVENT_TYPE column. (13) Table performance_schema.events_waits_history_long. ------------------------------------------------------- Func-Req (13.1): A fresh MySQL installation of CURRENT-VERSION must create table events_waits_history_long with the new NESTING_EVENT_TYPE column. Func-Req (13.2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must upgrade table events_waits_history_long to add the new NESTING_EVENT_TYPE column. References ---------- dev-private email thread "Tables for showing what stage a statement has reached" starting with [mysql internal address]/secure/mailarchive/mail.php?folder=4&mail=21981 WL#1023 Check if we can get CPU time used by thread (1023) WL#4116 Online Backup: Record Online Backup Progress WL#4170 Provide progress indication for long queries and operations Blog article “Show profile” + “Information_schema.profiling” http://blogs.mysql.com/peterg/2008/11/06/show-profile-information_schemaprofiling/
The code in bzr will contain the low-level structures and main algorithms, commented and set apart from other code with if/endif as described in the HLS. That suffices as an LLD for a task of this nature.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.