WL#7299: Binlog_sender: do not reallocate the event buffer for every event sent
Affects: Server-5.7
—
Status: Complete
EXECUTIVE SUMMARY
-----------------

This worklog implements an optimization in the dump thread that removes unnecessary reallocation of the send buffer. The user-visible effect is that each dump thread spawned by the master uses less CPU.

MOTIVATION
----------

For several reasons:

1. A high-profile MySQL replication user has made several recurring requests for this;
2. To make the MySQL server better utilize the hardware resources (adaptive memory allocation by the dump thread and less CPU usage);
3. The general direction is to improve replication performance and scalability, and this is one more step down that path.

REFERENCES
----------

MySQL BUG#31932
Functional Requirements
=======================

None.

Non-Functional Requirements
===========================

- NF1. The sender thread SHALL use less CPU (how much exactly, depends on the workload).
- NF2. The buffer size SHALL grow automatically and dynamically, without the need of user intervention.
- NF3. The buffer size SHALL shrink if over time the memory allocated is not used.
- NF4. There SHALL not be any memory leak when the thread is killed.
There are no user-visible changes other than fewer reallocations and lower CPU utilization.
PROBLEM STATEMENT
-----------------

For every connected slave, the master keeps a binary log sender thread, aka dump thread, running. The sender thread is responsible for reading the binary log and pushing it to the slave receiver thread, aka IO thread. The send unit is an event. For every event that the sender thread reads from the binary log, it puts the event in a memory buffer and then calls the network send primitive with the contents of this buffer as a parameter.

However, for every event read, the sender thread frees the buffer memory and then reallocates it when it handles the next event. This is sub-optimal and results in unnecessary CPU usage.

ANALYSIS
--------

The problem can be pinpointed by looking at the code in mysql-trunk. In rpl_binlog_sender.cc we find that the buffer used is a String buffer in THD, called packet (THD::packet). The contents of this buffer are sent by calling the member function Binlog_sender::send_packet. Crawling upwards in the call graph, one can find that this function call results from four major points:

1. Binlog_sender::send_heartbeat_event
     Binlog_sender::send_packet_and_flush
       Binlog_sender::send_packet

2. Binlog_sender::send_format_description_event
     Binlog_sender::send_packet

3. Binlog_sender::fake_rotate_event
     Binlog_sender::send_packet

4. Binlog_sender::send_events
     Binlog_sender::send_packet

There may be a 5th call to Binlog_sender::send_packet, made indirectly from Binlog_sender::send_events, but in that case the buffer used is a temporary buffer:

5. Binlog_sender::send_events
     Binlog_sender::send_heartbeat_event
       Binlog_sender::send_packet_and_flush
         Binlog_sender::send_packet

A temporary buffer is used here because the sender thread needs to send a heartbeat before actually sending the data that it has read from the binary log. Since it cannot just drop the data it has read in order to use THD::packet again, the sender thread uses a temporary local buffer.

Now, the problem is that the buffer needs to be reset before an event is sent.
This happens in the member function Binlog_sender::reset_transmit_packet, inside which we find this code:

    packet->length(0);
    /*
      set() will free the original memory. It causes dump thread to
      free and reallocate memory for each sending event. It consumes
      a little bit more CPU resource.
      TODO: Use a shared send buffer to eliminate memory reallocating.
    */
    packet->set("\0", 1, &my_charset_bin);

As the comment says, the set function frees the buffer memory. Later this memory is reallocated either explicitly, in stacks #1 and #3 above, or implicitly, by read_log_event when it calls String::append(...). The latter happens in stacks #2 and #4 above.

SOLUTION
--------

The solution to this problem is to not reallocate the event memory unless really needed. Doing this requires removing the reallocation calls from stacks #1 and #3, and removing the reset of the buffer through String::set in Binlog_sender::reset_transmit_packet. Furthermore, it requires that the buffer is pre-allocated before it is actually used.

We already know the size needed for the buffer beforehand in #1, #2, #3 and #5. In #4, we just need to read/peek the event header and determine the event_len before calling read_log_event. Therefore, since we know the size of the event before calling reset_transmit_packet, we just pass that size into the function and let it allocate the buffer if needed.

Conversely, to prevent the buffer from growing too big and staying that large, the buffer size must be re-evaluated periodically. As such, every N events a decision needs to be taken on whether to shrink the buffer or keep its current size. The approach is detailed further below.

LOW LEVEL IMPLEMENTATION CONSIDERATIONS
---------------------------------------

The proposed solution requires three big blocks of changes:

1. Encapsulating the allocation and shrinkage of the buffer.

- Growing the buffer

To better encapsulate the logic that grows the buffer, we move the actual reallocation of the buffer into Binlog_sender::reset_transmit_packet.
This function is called every time the dump thread loads an event into the buffer, right before sending it. This means:

a) we can remove the calls to packet->realloc from Binlog_sender::fake_rotate_event and Binlog_sender::send_heartbeat_event.

b) every time an event is to be sent, the reset_transmit_packet function needs to be called with the size of the event that is to be loaded into the buffer. Therefore, the function signature must change to include a new parameter that states how much buffer space the event will require. This lets reset_transmit_packet decide whether to realloc the buffer or not:

    inline int Binlog_sender::reset_transmit_packet(
        String *packet, ushort flags, uint32 min_buff_size)

Now, inside the function, we need to remove this:

    packet->set("\0", 1, &my_charset_bin);

and replace it with:

    packet->qs_append('\0');

Then the reallocation is done, if needed, after the call to the hook:

    /* reserve and set default header */
    if (RUN_HOOK(binlog_transmit, reserve_header, (m_thd, flags, packet)))
    {
      set_unknow_error("Failed to run hook 'reserve_header'");
      DBUG_RETURN(1);
    }

    needed_buffer_size= packet->length() + min_buff_size;

    /* Resizes the buffer if needed. */
    this->grow_packet(cur_buffer_size, needed_buffer_size, packet);

We encapsulate the realloc call inside grow_packet, since it hides the check that decides whether to reallocate or not.

- Shrinking the buffer

The buffer can be shrunk after the event is sent, which happens in Binlog_sender::send_packet. There we can simply add a call to a member function, shrink_packet:

    /* Shrink the packet if needed. */
    this->shrink_packet(packet);

Inside this function we implement the logic to shrink the buffer size.

2. Logic to adjust the buffer size dynamically and online

- Growing the buffer size

If the buffer is too small, increase it to the required size, but by at least a factor K:

    new_size = max(needed_size, buffer_size * K)

The implementation could look like this:

    inline void grow_packet(ulong cur_buffer_size,
                            ulong needed_buffer_size,
                            String *packet)
    {
      /*
        Grow the buffer if needed. If not, update the counters used
        to decide if we are ever going to shrink the buffer after
        sending the packet.
      */
      if (needed_buffer_size > cur_buffer_size)
      {
        ulong new_buffer_size=
          min(max(static_cast<ulong>(cur_buffer_size * PACKET_GROWTH_FACTOR),
                  needed_buffer_size),
              m_thd->variables.max_allowed_packet);
        packet->realloc(new_buffer_size);
      }
    }

- Shrinking the buffer size

We will shrink the buffer by a factor M if less than 1/M of the buffer has been used for the last N consecutive events. In the code below, PACKET_SHRINKAGE_FACTOR plays the role of 1/M and PACKET_SHRINKING_COUNTER_THRESHOLD the role of N. The implementation should be something similar to this:

    inline void shrink_packet(String *packet)
    {
      ulong cur_buffer_size= packet->alloced_length();
      ulong buffer_used= packet->length();

      if (buffer_used <
          static_cast<ulong>(cur_buffer_size * PACKET_SHRINKAGE_FACTOR))
        this->m_half_buffer_size_req_counter++;
      else
        this->m_half_buffer_size_req_counter= 0;

      /* Check if we should shrink the buffer. */
      if (m_half_buffer_size_req_counter == PACKET_SHRINKING_COUNTER_THRESHOLD)
      {
        uint32 new_buffer_size= cur_buffer_size * PACKET_SHRINKAGE_FACTOR;
        if (new_buffer_size >= PACKET_MINIMUM_SIZE &&
            new_buffer_size != cur_buffer_size)
        {
          /*
            The last PACKET_SHRINKING_COUNTER_THRESHOLD consecutive
            packets required less than half of the current buffer size.
            Let's shrink it to not waste memory.
          */
          packet->shrink(new_buffer_size);
        }

        /* Reset the counter. */
        this->m_half_buffer_size_req_counter= 0;
      }
      DBUG_ASSERT(packet->alloced_length() >= PACKET_MINIMUM_SIZE);
    }

3. Logic to read the size of a log event from the binlog.
Change Binlog_sender::read_event to read the event header from the binary log before calling Binlog_sender::reset_transmit_packet. That way, we can load the event header, read the event length and calculate how much buffer will be needed before we actually call Log_event::read_log_event. Something like this goes before the call to reset_transmit_packet:

    char header[LOG_EVENT_MINIMAL_HEADER_LEN];
    if ((error= Log_event::peek_event_header(header, log_cache)))
    {
      error= (error == LOG_READ_EOF) ? LOG_READ_IO : error;
      set_fatal_error(log_read_error_msg(error));
      DBUG_RETURN(1);
    }

    uint32 buffer_needed= uint4korr(header + EVENT_LEN_OFFSET);
    if (reset_transmit_packet(packet, 0, buffer_needed))
      DBUG_RETURN(1);
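The three blocks above can be tried out together outside the server. What follows is only a minimal sketch under stated assumptions, not the server code: SendBuffer, peek_event_length and send_event are hypothetical stand-ins for THD::packet, Log_event::peek_event_header plus uint4korr, and the reset/send path, and every constant value (growth factor 2, shrink factor 0.5, threshold 100, minimum 4096 bytes, 1 MiB cap) is chosen for illustration only. It tracks sizes only and allocates no memory. The sketch assumes the event length is a 4-byte little-endian field at offset 9 of the common event header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative values only; the worklog does not fix these numbers.
constexpr double kGrowthFactor = 2.0;         // K
constexpr double kShrinkFactor = 0.5;         // 1/M
constexpr int kShrinkThreshold = 100;         // N consecutive small events
constexpr std::size_t kMinSize = 4096;        // never shrink below this
constexpr std::size_t kMaxPacket = 1 << 20;   // stand-in for max_allowed_packet
constexpr std::size_t kEventLenOffset = 9;    // assumed offset of event_len

// Hypothetical stand-in for the send buffer: sizes only, no real memory.
struct SendBuffer {
  std::size_t capacity = kMinSize;  // like alloced_length()
  std::size_t used = 0;             // like length()
  int small_event_streak = 0;

  // Grow rule: new_size = min(max(capacity * K, needed), cap).
  void grow(std::size_t needed) {
    if (needed > capacity) {
      std::size_t grown = static_cast<std::size_t>(capacity * kGrowthFactor);
      capacity = std::min(std::max(grown, needed), kMaxPacket);
    }
  }

  // Shrink rule: after N consecutive events that used less than
  // capacity * (1/M), shrink to capacity * (1/M), but never below kMinSize.
  void shrink() {
    std::size_t shrunk = static_cast<std::size_t>(capacity * kShrinkFactor);
    if (used < shrunk)
      ++small_event_streak;
    else
      small_event_streak = 0;

    if (small_event_streak == kShrinkThreshold) {
      if (shrunk >= kMinSize) capacity = shrunk;
      small_event_streak = 0;
    }
    assert(capacity >= kMinSize);
  }
};

// Decode the 4-byte little-endian event length from a raw header.
std::uint32_t peek_event_length(const unsigned char *header) {
  const unsigned char *p = header + kEventLenOffset;
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// One simulated send: reset the length (keeping the allocation), grow if
// the peeked event does not fit, "load" the event, then re-evaluate.
// Returns the buffer capacity after the send.
std::size_t send_event(SendBuffer &buf, const unsigned char *header) {
  std::uint32_t event_len = peek_event_length(header);
  buf.used = 0;
  buf.grow(event_len);
  buf.used = event_len;
  buf.shrink();
  return buf.capacity;
}
```

Under these assumed constants, a single 10000-byte event grows the buffer from 4096 to 10000 bytes, since doubling alone would not be enough. A following run of 100 events of 100 bytes each halves it to 5000, and a further run of small events leaves it at 5000 because halving again would drop below the minimum size.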
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.