WL#2333: SHOW ENGINE ... LOCK STATUS

Affects: Server-7.1 — Status: Un-Assigned

Description
Dependent Tasks

This description arises from a discussion on dev-public, thread
"Re: TODO: Jan, please implement SHOW LOCKS" in January 2005.
[mysql internal address]/secure/mailarchive/mail.php?folder=5&mail=31862
At the time we expected that InnoDB would act on lock display soon,
and we would have liked to agree on a common general idea for all
storage engines. Now (December 2008) we seem to have failed in the
original objective, but we still can propose what the server should have.
So here's an editable document.

PERFORMANCE_SCHEMA . EVENTS_WAITS_CURRENT
-----------------------------------------

Locks are just another type of event, so they'll be in
EVENTS_WAITS_CURRENT and all histories or summaries.
Phylum: Instrument
Class: Wait
Order Synchronization Object /* abbreviated 'Synch' */
Family Lock

Example: 'Wait/Synch/Lock/Row-Exclusive/InnoDB'.

That would be the end of this task description, if it weren't
for the fact that people also want to know what's blocking what.

PERFORMANCE_SCHEMA . LOCKS
--------------------------

LOCKS is a single (noticeably un-normalized) table with one row for
each lock that is "current" on a database, tablespace, table, or row.
In theory, a new lock is inserted when a request is made, updated when
a request changes (e.g. a shared lock becomes exclusive or status changes
to granted), deleted when lock is released (e.g. at end of transaction).
Values come from the storage engine or the performance schema instruments.

WL#2360 says:
"Values in PERFORMANCE_SCHEMA tables are non-deterministic,
non-consistent, non-repeatable. PERFORMANCE_SCHEMA tables cannot be locked,
indexed, or backed up. Selections on PERFORMANCE_SCHEMA tables should not
be cached."
That is true for PERFORMANCE_SCHEMA.LOCKS too.

WL#2360 also says:
"Every user has SELECT privilege for all PERFORMANCE_SCHEMA tables, and no other
privilege." That is true for PERFORMANCE_SCHEMA.LOCKS too. There is no
requirement for a SUPER privilege or a PROCESS privilege.

We'll try to use the same table for all storage engines. That means
that we can't guarantee that it's up to date or consistent, and it means
that some columns will be always NULL since they're not applicable.

Other DBMSs might include locks of internal spaces (e.g. the sequence number
instance) in tables like this. But we'd expect to control such things
with a mutex. Mutexes aren't part of this task. See later section "Mutexes".

Alternative names: PERFORMANCE_LOCKS, CURRENT_LOCKS.

Columns in PERFORMANCE_SCHEMA . LOCKS:

Name Comments
---- --------

/* The object_xxxx fields are whatever is necessary to identify
an object. "Resource" is a better word but events_waits_current
already has object_schema, object_name, object_type, object_instance_begin.
Often some of the object_xxxx fields will be blank (NULL).
The data type is always VARCHAR(64) CHARACTER SET UTF8,
but if a storage engine has a numeric identifier for an object,
that's fine, we will show it as decimal with no zerofill. */
object_schema Ordinarily this will have a database name.
object_name A table name, a routine name, a file name, or
some other resource identifier.
If one can lock all objects of a given type,
this will be a wildcard.
INNODB_LOCKS has: lock_table, lock_index.
object_type INNODB.LOCKS_TYPE has: 'RECORD', 'TABLE'.
We prefer the term 'row', and we expect there
will be more possibilities with more storage
engines and future considerations. So our list is:
DATABASE, TABLESPACE, TABLE, ROW, INDEX, INDEX KEY,
FILE, SEQUENCE, ROUTINE, BLOCK.
object_instance_begin The identifier of the lock. It's not enough to
identify a lock by object_name. We might
have a sequence number from a storage engine here,
or we might have a memory address.
INNODB_LOCKS has: lock_id.
location This is NULL except for rows, index keys, or blocks.
This has a location within a table / tablespace.
It might be a partial row primary key value,
a ROWID, a block number plus a record number within
the block (INNODB_LOCKS has lock_space, lock_page,
lock_rec). Usually it's numeric.
This might have the beginning of a range, but we do
not have a way to show the end of a range.

/* The transaction_xxxx_id columns have whatever is necessary to identify
a transaction. Again, depending on circumstances, some of these may
be blank (NULL). In a more normalized structure, we'd probably want
to have a single ID and use it to look up the other information.
These are always INTEGER fields. */
transaction_process_id Corresponds to "SHOW PROCESSLIST" number
transaction_thread_id Corresponds to events_waits_current.thread_id
transaction_serial_id Corresponds to InnoDB "trx id"

/* The transaction_xxxx_id columns show "who requested the lock".
The blocking_transaction_xxxx_id columns show "who currently has the
lock (and is therefore blocking)".
But the "who is blocking" question is tough. See later discussion. */
blocking_transaction_process_id Who is blocking.
blocking_transaction_thread_id Who is blocking.
blocking_transaction_serial_id Who is blocking.
INNODB_LOCKS has: lock_trx_id.

blocking_object_instance_id The lock that we're waiting for.
This should point to another row in
performance_schema.locks, with a status of
"Granted", and with the same resource_xxxx_id
columns. Sometimes we're trying to get a
row lock and we're waiting for a table lock,
so the resource_xxxx_id columns will differ.
Sometimes we're trying to get an exclusive
lock and we're blocked by several transactions
that have a shared lock, so this column might
only be showing one of many 'blocking' rows.

request_mode CHAR. Use the same abbreviations as in
INNODB_LOCKS.LOCK_MODE but allow other common
abbreviations for other possible storage engines.
'X' exclusive
'X.GAP' exclusive on gap between keys
'AUTO_INC' exclusive on sequence
'IX' intent exclusive
'IS' intent shared
'IS.GAP' intent shared on gap between keys
'IU' intent update
'S' shared
'S.GAP' shared on gap between keys
'U' update
The "mode that you request the lock" may differ from
the "mode that you hold the lock", but we show only
request_mode.

request_status CHAR or ENUM
'GRANTED' request was successful
'UPGRADE' e.g. converting from shared to exclusive
'WAIT' request is entered, probably blocked
'ESCALATE' maybe use this for granularity change
'FAIL' deadlock, timeout, table full

request_count If request_status='granted', the value here
is the number of other transactions which are
waiting for this lock to be released. Might be
just a binary value ("is blocking (yes/no)?").
In WL#321 "SHOW LOCK QUEUE" the column name is
number_of_threads_waiting_for_this_lock_to_be_released
We do not list the waiting transactions, and we do
not list the waiting lock requests, that would be
hard. Maybe 'waiting_count' would be a better name.

storage_engine CHAR. Typically this is 'INNODB'.
This is NULL when no storage engine is involved.
The storage engine might also be in EVENT_NAME.

/* These further columns exist in events_waits_current. */
event_id
event_name E.g. 'Wait/Synch/Lock/Row-Exclusive/InnoDB'
timer_start When lock was requested.
Doesn't appear in our model DBMSs.
Data type might be TIMESTAMP.
timer_end When lock was granted, or when it failed.
timer_wait This is either "timer_end - timer_start"
or "current_time - timer_start" in pseudo-picoseconds.
nesting_event_id

/* These further columns exist in Robin Schumacher's 'syslocks' request.
Technically user name and user host are unnecessary, because we can
look up them up via thread_id. But we don't want to force users to
do subqueries and joins every time, so if they always need these
columns, they should get them. */
user_name
user_host

Perhaps a way to think of a LOCKS table is as a join of
EVENTS_WAITS_HISTORY wait/synch/lock rows from different threads, where
thread-1.object_name,object-location = thread-2.object-name,object-location
and event_name like 'wait/synch/lock%'.

Additional columns in Event Summaries
-------------------------------------

Since locks are events, we have all the summary information
that we have for all events. For example:

select * from EVENTS_WAITS_SUMMARY_BY_EVENT_NAME\G
...
EVENT_NAME: wait/synch/lock/sql/table
COUNT_STAR: 8
SUM_TIMER_WAIT: 31412802980
MIN_TIMER_WAIT: 758914
AVG_TIMER_WAIT: 3926600372
MAX_TIMER_WAIT: 24824640490

But we want a breakdown by result -- what happened to those 8 requests?
Therefore we'd have more columns. For example:
COUNT_STAR_GRANTED 2
COUNT_STAR_ESCALATED 2 /* DB2 name = lock_escals */
COUNT_STAR_DEADLOCKED 2
COUNT_STAR_TIMEOUT 2
COUNT_STAR_OTHER_ERROR 0
COUNT_STAR_IN_PROGRESS 0
COUNT_STAR_KILLED 0

We must add these columns to EVENTS_WAITS_SUMMARY tables,
so a description is necessary in WL#4816 PERFORMANCE_SCHEMA Summaries.
They will always be NULL for non-lock events.
There is no need for separate LOCKS_SUMMARY views of these tables.

Two more possible columns are "high water marks", that is,
counters for the maximum events that were ever happening simultaneously.
Specifically:
HIGH_WATER_MARK_REQUESTED /* How many were ever active at the same time */
HIGH_WATER_MARK_WAITED /* How many of the above were blocked */
You can get the figures by counting rows in PERFORMANCE_SCHEMA.LOCKS.
Although this is good for calculating resource usage, and perhaps for
deciding how much memory to allocate for some lock tables, it is not
really monitoring. We won't have high water marks.

Deadlock counting suggestions should go to WL#4689 Deadlock Monitor.

Source of Information
---------------------

Often we can expect that the storage engine has an internal lock list
so the easiest way to get information is by querying the
storage engine directly. That would mean changing storage-engine API.

Alternatively there might be tables specific to the storage engine
that we can read in order to populate PERFORMANCE_SCHEMA.LOCKS.
For example, the InnoDB plugin supplies, in information_schema:
INNODB_TRX, INNODB_LOCKS, and INNODB_LOCK_WAITS
http://www.innodb.com/doc/innodb_plugin-1.0/innodb-information-schema.html#innodb-information-schema-transactions
If the plugin is unavailable one can struggle with SHOW ENGINE INNODB STATUS.

Alternatively the performance schema instruments will give us
* "who is blocked" and "what the last lock request was"
PERFORMANCE_SCHEMA.EVENTS_WAITS_CURRENT
* "how many times a table was locked, and for how long, and current status"
See WL#4816 HLS text after the words "[ OBJECT SUMMARIES ]"
* "locking events in the history"
PERFORMANCE_SCHEMA.EVENTS_WAITS_HISTORY
* "summary of locks by thread or globally"
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME.COUNT_STAR
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME.SUM_TIMER_WAIT
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME.MIN_TIMER_WAIT
EVENTS_WAITS_SUMMARY_BY_EVENT_NAME.AVG_TIMER_WAIT

But PERFORMANCE_SCHEMA "history" tables are fixed size, so even a
long history might not have every row lock. Perhaps we should
have a count of row locks in the "object summaries" table, that is,
you could find out that thread x has y row locks on table z. Perhaps
we should have the last ten granted row locks in the "object summaries"
table, that is, you could find out the rows that thread x has on table
z, but only ten of them. With a combination of "long history" plus
"count of row locks by table" plus "last 10 row locks on table" you
probably can answer the important question "who is blocking". How
probably? Peter thinks: if it's 90% that's not as good as what the
the storage engine's own table has, but it would be universal, and
we could test in a real installation to see if it would be adequate.

So MySQL populates PERFORMANCE_SCHEMA.LOCKS in different ways for
different storage engines, depending whether we instrument the
storage engine code, and depending whether the storage engine
exposes information from its own internal tables. What the user
sees should be consistent, though -- MySQL must try to make the
information look the same, and be in one table.

SHOW Syntax
-----------

The original (January 2005) request was for a new SHOW statement.
Perhaps this requirement is now superseded. We only need the tables.
But the specification of "SHOW Syntax" remains in case somebody likes it.

SHOW ENGINE INNODB LOCK STATUS
[LIKE 'tablename']
[LIMIT number_of_locks_to_print]

SHOW LOCK STATUS
WHERE engine='InnoDB' /* later, 'engine='MyISAM' etc. */
[AND resource_object_id='something']
[LIMIT number_of_locks_to_print]

Some comments (from the discussion thread)

Nowadays MySQL warns that SHOW INNODB STATUS is deprecated syntax
and users should use SHOW ENGINE INNODB STATUS instead. Peter expects
that this is analogous.

Since MySQL has LIKE 'pattern' for other SHOW statements, e.g.
SHOW STATUS [LIKE 'pattern'], Peter believes users will expect to see
"LIKE 'tablename'" rather than "tablename".

And Peter believes "LIMIT number_of_locks_to_print", rather than
"number_of_locks_to_print", might help avoid a parsing problem later
if we add a new clause before number_of_locks_to_print. Specifically,
he worries that someone will want to have
SHOW ENGINE INNODB LOCK STATUS WHERE ... LIMIT ...

SHOW LOCK STATUS will be the same as SELECT * FROM PERFORMANCE_SCHEMA.LOCKS.

Mutexes
-------

Originally this specification included mutexes.
Now it doesn't.

There is a similarity. We could be showing mutexes and
filling in these columns of PERFORMANCE_SCHEMA.LOCKS:
object_name = event_name, or part of event_name
object_type = 'MUTEX'
object_instance_begin = address
request_mode = 'X' (mutex locking is always exclusive)
request_status = 'WAIT' or 'GRANTED'
request_count = how many others are blocked

Currently one can get some information per mutex
with MUTEX_INSTANCES, for example:
NAME: wait/synch/mutex/sql/LOG::LOCK_log
OBJECT_INSTANCE_BEGIN: 34754648
LOCKED_BY_THREAD_ID: NULL

Or one could have a separate table which lists
only the holders of currently-held mutexes.
In that case, it's simply a list of copies of
EVENTS_WAITS_CURRENT mutex rows which don't
disappear until the mutex is unlocked.

Other lock situations to consider are: GET_LOCKS, FLUSH TABLES WITH READ LOCK,
metadata locks.

Other worklog entries
---------------------

In the past, there have been other suggestions for MySQL lock display:
"SHOW TABLE LOCKS [FROM (db|table)] [FOR user];" (WL#320, Jeremy Cole)
"SHOW LOCK QUEUE [FROM (db|table)]" (WL#321, Jeremy Cole)
"SHOW FULL PROCESSLIST" (WL#1275, Mark Matthews)
These are raw-idea-bin suggestions, and Peter thinks we can ignore them.
(Sergei Golubchik, on the other hand, thinks we should consider them.)

And there is a task in the benchmarks queue:
WL#2288 Patch Innodb-code to add status variable to show number of row locks

Some of the "mysqladmin debug" code might be worth looking at.

Falcon has WL#3765 Falcon Diagnostics.

References
----------

BUG#9742: Put a SHOW LOCK TABLES

BUG#23076 No way of viewing locks from within the server (Mark Leith)
http://bugs.mysql.com/bug.php?id=23076

MySQL 5.6 Reference manual re INNODB_LOCKS table
http://dev.mysql.com/doc/refman/5.6/en/innodb-locks-table.html

"MySQL Internals: SHOW LOCKS or equivalent"
http://lists.mysql.com/internals/33907

Harrison Fisk's patch
[mysql internal address]/secure/mailarchive/mail.php?folder=4&mail=24248

BUG#33601 adding a view for information_schema to identify lock waiting
relations (another patch)
http://bugs.mysql.com/bug.php?id=33601

BUG#46391 No statistics are displayed for named locks from GET_LOCK() etc.

A table [performance_schema] locks is mentioned in the WL#2360 file attachment
[mysql internal address]/worklog/download/performance.txt?fid=400

Some more private references are visible by clicking on the 'Progress'
for 2009-04-13, to see what was deleted from 'References' on this date.