WL#1616: new design of the relay log to decrease disk writes
Affects: Server-7.1 — Status: Un-Assigned — Priority: Medium
Currently the I/O thread flushes the relay log after writing every event to it. This adds disk writes and is therefore slower than the 3.23 design, where the slave read one event from the socket and executed it immediately without writing it anywhere.

The relay log was introduced in 4.0 for the case where the slave lags behind the master. If the slave is one hour of execution behind the master, and the master is destroyed before this gap has been closed, the slave can never become up-to-date: because it fetches, executes, fetches, executes, it did not have time to fetch the rest of the binlog. With a relay log, the slave can quickly fetch the whole binlog gap from the master, so that if the master crashes soon afterwards, its binlog is safe on the slave and the slave can still become up-to-date.

The relay log was not intended to slow down the normal case where the slave is close to the master (in terms of binlog lag). In this normal case, the I/O and SQL threads could often be writing and reading the same in-memory tail of the relay log, so there is no need to write this tail to disk (just as nothing was written to disk in 3.23).

In other words, the proposal is to change the design as follows:
- The slave I/O thread should not flush the relay log after writing every event.
- The relay log should still be IO_CACHE-like (that is, an IO_CACHE or a new variant of it).
- The slave I/O thread writes to the relay log (IO_CACHE). The slave SQL thread "prunes" the IO_CACHE buffer as it advances into it: executed events are wiped out of the buffer and never go to disk. The beginning of the buffer can thus be re-used by the slave I/O thread, making it a circular buffer.
- As a result, if both threads stay close to each other, there is never any disk write.
- If a lag appears, the IO_CACHE buffer becomes full and overflows to the disk relay log, and the slave SQL thread then reads from this disk relay log. As the SQL thread catches up, it "re-enters" the IO_CACHE buffer; the disk relay log is then no longer written to and is deleted.
- The disk relay log becomes more of a temporary file. It is unparsable (it may be missing events or even pieces of events) and is just a temp file that sometimes backs the IO_CACHE.
- In case of a crash, because the relay log is unreliable (missing parts etc.), we simply restart fetching from the master from the last *executed* position.
- We *might* want to provide an option that gives the old behaviour (relay log as a parsable log, flushed after each event), if there is an interest (is there?).
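
The proposed behaviour can be illustrated with a small model. This is only a sketch under stated assumptions, not the server's IO_CACHE code: the class RelayBuffer, its methods, and the tab-separated spill format are hypothetical names invented for the illustration; events are simplified to (master position, text payload) pairs whose payload contains no newline; and the two threads are modelled as plain sequential calls with no locking. It shows the three points above: no disk write while the reader keeps up, overflow to a temporary file when the buffer is full, and deletion of that file once the reader has caught up.

```python
import os
import tempfile
from collections import deque


class RelayBuffer:
    """Toy model: bounded in-memory event queue that spills to a temp file."""

    def __init__(self, capacity):
        self.capacity = capacity        # max number of events kept in memory
        self.memory = deque()           # in-memory tail shared by both threads
        self.spill_path = None          # temp file; exists only while lagging
        self.spill_read_offset = 0      # next unread byte in the spill file
        self.disk_writes = 0            # counter, for illustration only
        self.last_executed_pos = 0      # master position of last executed event

    def write_event(self, master_pos, payload):
        """I/O thread side: store one event received from the master."""
        if self.spill_path is None and len(self.memory) < self.capacity:
            self.memory.append((master_pos, payload))  # no disk write
            return
        # Buffer full, or a spill is already in progress: append to disk
        # so that event order is preserved.
        if self.spill_path is None:
            fd, self.spill_path = tempfile.mkstemp(prefix="relay-spill-")
            os.close(fd)
            self.spill_read_offset = 0
        with open(self.spill_path, "ab") as f:
            f.write(f"{master_pos}\t{payload}\n".encode("utf-8"))
        self.disk_writes += 1

    def read_event(self):
        """SQL thread side: return the next event to execute, or None."""
        if self.memory:
            # "Pruning": the executed event leaves the buffer for good.
            event = self.memory.popleft()
        elif self.spill_path is not None:
            event = self._read_spilled()
        else:
            return None
        self.last_executed_pos = event[0]
        return event

    def _read_spilled(self):
        with open(self.spill_path, "rb") as f:
            f.seek(self.spill_read_offset)
            line = f.readline()
            self.spill_read_offset = f.tell()
        pos_text, payload = line.decode("utf-8").rstrip("\n").split("\t", 1)
        # Caught up: the temp file is no longer needed, so delete it.
        if self.spill_read_offset >= os.path.getsize(self.spill_path):
            os.remove(self.spill_path)
            self.spill_path = None
        return int(pos_text), payload
```

In this model a crash would lose both the in-memory events and the spill file, so a restarted slave would ask the master for everything after last_executed_pos, which corresponds to the "restart from the last executed position" rule. A real implementation would persist that position and would need synchronisation between the two threads, neither of which is shown here.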