Disconnects are performed asynchronous. This is most probably good for the throughput of usual application but has some annoying consequences for the development of tests.
disconnect <connection> or
KILL <connection_id> or
--exec <client program which connects to the server>
occurs within a script for mysqltest than the next statements will be executed even if the disconnect or kill is not complete finished.
I guess this also applies to the end (-> end of statements to be executed by mysqltest) of a test. It triggers some disconnect for any open session but this is also asynchronous.
Not complete finished means that the session is visible within the processlist and the entry about the 'quit' of the session is not already appended to the general log.
So it could happen that a succeeding test suffers by unexpected events like
observing the session belonging to the previous test within the processlist
being surprised by an entry (the 'quit' of the disconnected session) within the general log which was not caused by its own activity etc.
According to my experiences the likelihood that some test could harm a successing test by unfinished disconnects increases with
additional parallel load on the testing box
"disconnect <connection>" is missing within the test
"shorter" distance between "disconnect <connection>", "KILL <connection_id>" or "--exec ..." and end of test
the session runs a "heavy" SQL statement just before its disconnect
In my opinion the most robust solution is to add a
just after the disconnect.
In case you don't believe me, run the sniplet following later
./mtr --skip-ndb --no-check-testcases --repeat=100 <sniplet>
and you will most probably observe something like
TEST RESULT TIME (ms) ------------------------------------------------------------ ... <sniplet> [ pass ] 4 <sniplet> [ fail ] ... CURRENT_TEST: <sniplet> --- <result> +++ <reject> @@ -1,4 +1,4 @@ SELECT COUNT(*) FROM information_schema.processlist WHERE id < CONNECTION_ID(); COUNT(*) -0 +1 SELECT SLEEP(10); ... mysqltest: Result content mismatch
Sniplet for demonstration purposes:
SELECT COUNT(*) FROM information_schema.processlist WHERE id < CONNECTION_ID(); # (1) connect (con1,localhost,root,,); send SELECT SLEEP(10); # (2) connection default; # Wait till the SQL statement of con1 is "in work" let $wait_condition= SELECT COUNT(*) = 1 FROM information_schema.processlist WHERE State = 'User sleep' AND Info = 'SELECT SLEEP(10)'; --source include/wait_condition.inc # (3) SELECT COUNT(*) = 1 FROM information_schema.processlist WHERE State = 'User sleep' AND Info = 'SELECT SLEEP(10)'; # (4) disconnect con1; (5) <end of script>
The sniplet is intended for demonstration purposes and contains code which is obvious "junk" but needed to enforce the intended effect. Some notes:
(1) Such a statement at the begin of a test makes this test very sensitive to not finished disconnects.
(2) The wait routine (3) prevents that our look on the process list (4) happens too early = before the server has started the processing of (2). Otherwise we get random result set differences for (4) under high parallel load.
How to prevent that this test harms the succeeding test?
(5) should be replaced by connection con1; disconnect con1; --source include/wait_until_disconnected.inc connection default;
Where is the "junk"?
There should be a 'reap' for every preceeding 'send <statement>'. This is not just for academic completeness it also ensures here that (2) is really finished.
Do we really need 10 seconds within the SLEEP? According to my experiences sleep times <= 2 seconds are critical under high load. What about SLEEP(5)?