WL#11567: Robust event deserialization by libbinlogevents
Affects: Server-8.0
—
Status: Complete
Event deserialization in libbinlogevents must ensure that it will never read outside the limits of the buffer being deserialized. When an event deserialization tries to reach out of buffer positions, the event shall be considered as invalid and cannot be applied (by replication applier|worker threads) or displayed (by mysqlbinlog client program or SHOW BINARY|RELAY LOG EVENTS). This worklog aims to create a buffer reader interface and class at libbinlogevents to ease and standardize events deserialization with respect to avoiding reads out of buffer boundaries. This interface shall provide all buffer access types libbinlogevents needs to deserialize events, for example: - memcpy: copy a portion of the buffer to other memory buffer; - get ordinary values: copy a value from the buffer, such as byte, integer, long, sometimes doing additional operations such as le16toh, le32toh, le64toh; - strings: copy a string content from the buffer; - get_field_length: get a packed field length from the buffer; Use case: As a MySQL user, I want to know that my MySQL server is robust by default and will not crash or leak data when replicating corrupt events, regardless of the use of binary logging checksums.
Functional requirements ======================= F-1: MySQL server SHALL NOT crash or leak data when replicating corrupt events regardless of CRC usage. F-2: mysqlbinlog client program SHALL NOT crash or lead data when processing corrupt events regardless of CRC usage. F-3: No data collected from corrupted events shall be displayed. Error messages about corrupted events should be limited to inform that the event was assumed as corrupted. Debug builds may give additional information about the event corruption in debug traces. Non-Functional requirements =========================== NF-1: Any regression in replication performance should not be greater than 5%.
Interface Specification ======================= I-1: No new files I-2: No changes in existing syntax I-3: No new commands I-4: No new tools I-5: No impact on existing functionality Problems ======== While investigating to implement this worklog, the following problems were found: P1) libbinlogevents has no standard way of tagging an event as invalid Even though Log_event class has is_valid_param (false by default) to be set true by children classes when assuming all event parameters are valid, some of event parameters are collected at event deserialization in libbinlogevents. Example: Gtid_log_event Binary_log_event ------> Log_event_header ^ no check no check | | Binary_log::Gtid_event Log_event no check \ / is_valid_param=false \ / \ / \ / Gtid_log_event is_valid_param=true|false Gtid_log_event deserialization starts with calling Binary_log_event constructor, that will create a Log_event_header and a Log_event_footer. The Log_event_header will deserialize the event header. Then, Binary_log::Gtid_event constructor will be called to do the event specific deserialization. Finally, the Log_event constructor will be called, passing the header and footer objects to be used. Currently, is_valid_parameter is only set at Gtid_log_event constructor. P2) Event deserialization issues Not all event buffer reads check for buffer boundaries. Not all buffer boundaries checks are correct. There are also other issues related to reading from the buffer and checking for errors, aborting event deserialization in the middle of the process. P2a) Avoid reading out of buffer boundaries Some event constructors dealing with event deserialization check buffer boundaries, but not all reads are verified. Also, some checks are performed after reading from the buffer. All reads from the event buffer shall be guaranteed to be withing buffer boundaries. P2b) Be careful when calculating sizes and boundaries: avoiding overflow Many of buffer size, pointers and boundaries calculations are performed using unsigned numeric variables. Assume all values as 64 bit unsigned: - size: size of the buffer; - buffer: base pointer of the buffer; - ptr: current position in the buffer to be read; - wanted: the amount of bytes that is wanted to be read from the buffer; Consider (((ptr - buffer) + wanted) <= size) as one logic to check buffer boundaries. Assume: buffer = x; ptr = 16 + x; size = 32. Buffer has only 16 bytes available to be read. The "wanted" part of the logic was read from the buffer (as the size of a field about to be read). An ordinary "wanted" (i.e. 8) would pass the logic: 16 + 8 = 24; 24 <= 32: "wanted" is allowed to be read. A not so big invalid "wanted" (i.e. 32) would also pass the logic: 16 + 32 = 48; 48 > 32: "wanted" is not allowed to be read. But notice that a very large "wanted" (2^64 - 8) may mislead the logic: 16 + (2^64 - 8) = (overflowed) 8; 8 <= 32: "wanted" is allowed to be read. The correct logic in this case should trust in non-deserialized values: ((size - (ptr - buffer)) >= wanted) wanted = 8: (32 - 16) >= 8: "wanted" is allowed to be read. wanted = 32: (32 - 16) >= 32: "wanted" is not allowed to be read. wanted = 2^64 - 8: (32 - 16) >= 2^64 - 8: "wanted" is not allowed to be read. P2c) Pointers and extra buffers initialization and cleanup Some event destructors rely on the fact that some pointers always point to allocated buffers. Doing additional buffer boundary checks may lead to abort event deserialization before allocating memory for some structures. This worklog should make all event pointers initialized as nullptr by default, and free everything that was allocated when pointers are not nullptr. P2d) Some asserts should in fact be turned to error and be properly handled Example: in Log_event_head constructor, there is a line with: BAPI_ASSERT(type_code < ENUM_END_EVENT || flags & LOG_EVENT_IGNORABLE_F); This line is asserting that the event type being deserialized is supported by the server format description or the event has the LOG_EVENT_IGNORABLE_F flag set. This logic holds only if the code (both server and mysqlbinlog codes) ensure one of this conditions before trying to instantiate an event from a given buffer. Once having P1 solved, the Log_event_header can set the event as invalid if at least one of the conditions are not true. This would allow developers to be able to test event deserialization using invalid buffers and seeing the server (or mysqlbinlog) handling the errors without controlled crashes. Minor Problems -------------- P3) The event object is not fully aware of CRC payload All deserialized events have a "data_written" field on its reader with the amount of bytes the event had when it was serialized. This field takes into account also the CRC payload when the format description states that the events have CRC. When an event is being deserialized using a buffer and expecting a Log_event child object in return (Log_event::read_log_event(char *, uint, ...)), is the read_log_event function that adjusts the event_size to be passed to event constructors, removing the CRC payload if necessary. This leads to complex logic when, for example, the receiver thread is deserializing a Rotate event it received from the master. P4) There is almost no debugging information in libbinlogevents Log_event classes use to have DBUG_ENTER/DBUG_VOID_RETURN to mark class construction in debug traces. Some events also generate extra information about what is being deserialized. Unfortunately, libbinlogevents has almost no debugging information. High Level Specification ======================== Below are listed the proposed solutions that should be accomplished by this worklog. S1) Moved is_valid_param into libbinlogevents common_header The idea is to move is_valid_param functionality to the Log_event_header object, so both Binary_log_event and Log_event children classes can use it. By moving the is_valid_param logic to Log_event_header, all objects that read from the buffer to deserialize the event (both header and event specific parts) will have access to set the event as invalid and check it. Example: Gtid_log_event Binary_log_event ------------------> Log_event_header ^ if (!is_valid) return is_valid=true|false | ... | is_valid=true|false | | Binary_log::Gtid_event Log_event if (!is_valid) return \ / if (!is_valid) return ... \ / ... is_valid=true|false \ / is_valid=true|false \ / Gtid_log_event if (!is_valid) return ... is_valid_param=true|false So, all *_event constructors inheriting from Binary_log_events and Log_event should, before any other activity, check if a parent constructor set the event as invalid (so no further action is necessary). S2) A new class should be implemented to handle robust event buffer access Adapting events deserialization to use this new class instead of directly accessing the buffer should address all P2 issues. As this process will change all events constructors, we can address P4 too. The "Event_reader" class should be instantiated by passing as parameter the base pointer to the buffer and the buffer size. It should be part of the Log_event_header event, so all children classes will be able to use it to deserialize event content from the buffer with proper error handling. The class should have at least the following variables: /* The buffer to read from */ const char *m_buffer; /* The pointer to the current read position in the buffer */ const char *m_ptr; /* The length of the buffer */ unsigned long long m_length; The class should provide some error handling. In theory, any error detected by the Event_reader class would result in a single server error code: ER_READ_LOG_EVENT_FAILED eng "Error in Log_event::read_log_event(): '%s', data_len: %lu, event_type: %d." Note that the I/O thread also deserialize some events (Rotate, FD and GTID) and should also report about invalid events. Maybe we should create a new error type for a failure detected while deserializing an event. As read_log_event() error handling expects a string do give more information about the failure, the Event_reader class should also have something like: /* The pointer to the current error message, or nullptr */ const char *m_error; In this way, m_error == nullptr means no error, and m_error != nullptr means that an error happened while reading from the buffer and m_error points to the error message. The error message shall never contain any information obtained from the buffer, or else we risk leaking information. Some example error messages are: e1) Unable to read content from the event This should be used every time an event deserialization tries to read out of event buffer boundaries. e2) Invalid checksum algorithm This should be used when deserializing a format description event trying to use and invalid checksum algorithm. The class should support all type of operations performed by event deserialization functions, such as: f1) go_to(ulong) jump to a given buffer position; f2) forward(ulong) advance the read position; f3)read_ (): read a from the buffer and forwards sizeof( ); f4) read _and_le toh() read from the buffer, storing them into a variable, call le toh, forward /8 bytes; f5) copy(var, length) copy the length from the buffer into var, forwarding length bytes; All event classes constructors that deserialize events from a given buffer will have to do some standard verification before starting deserializing. Unfortunately, MySQL coding standard does not support exceptions. In theory, an event constructor will have to follow more or less the following pattern: { BAPI_ENTER("Constructor"); /* Check if event deserialization failed in parent objects */ if (!header->is_valid()) BAPI_VOID_RETURN; Event_reader reader = header->get_event_reader(); /* Read contents from the buffer */ var1_size = reader->read_and_le64toh(); if (reader->has_error()) goto err; var1_content = reader->ptr(); reader->forward(var_size); if (reader->has_error()) goto err; var2_size = reader->read_and_le64toh(); if (reader->has_error()) goto err; reader->copy(var2_content, var2_size); if (reader->has_error()) goto err; ... if ( ) { reader->set_error('Unexpected condition'); goto err; } ... err: if (reader->has_error()) header->set_is_valid(false); BAPI_VOID_RETURN; } The idea is to wrap the reader function with the check in a macro, maybe wrapping also the initial check and the 'err:' label: { BAPI_ENTER("Constructor"); EVENT_READER_EVENT_CONSTRUCTOR_HEADER; // initial check, declare reader /* Read contents from the buffer */ EVENT_READER_SET_AND_CHECK(var1_size, read_and_le64toh); EVENT_READER_SET(var1_content, ptr); EVENT_READER_CALL_AND_CHECK(forward, var1_size); EVENT_READER_SET_AND_CHECK(var2_size, read_and_le64toh); EVENT_READER_CALL_AND_CHECK(copy, var2_content, var2_size); ... if ( ) EVENT_READER_THROW('Unexpected condition'); ... EVENT_READER_EVENT_CONSTRUCTOR_FOOTER; // err: and call set_is_valid BAPI_VOID_RETURN; } S3) Move CRC logic to Log_event_footer This should address P3. This CRC logic will be moved to Log_event_footer, simplifying and standardizing the CRC payload processing. With this change, there will be no need to pass the event size to an event constructor: the event header will know its size and the event footer will know if it has CRC payload or not, and the CRC algorithm used.
Low Level Design ================ The worklog changes will be divided into 7 steps: Step 1) Cover HLS' S1: move is_valid_param to Log_event_header -------------------------------------------------------------- This step will not require specific test cases. The current MTR suites should be enough to validate the changes as no change in behavior should be expected. Step 2 to step 6) Cover HLS' S2: introduce the Event_reader class and use it ---------------------------------------------------------------------------- These steps will not require specific test cases too. The current MTR suites should be enough to validate the changes as no changes in behavior are expected. Maybe some test cases will have to be refactored as a result of invalidation of events. Step 2 will introduce the Event_reader class and should also cover HLS' S3. Three new files will be added to libbinlogevents: libbinlogevents/include/event_reader.h libbinlogevents/include/event_reader_macros.h libbinlogevents/src/event_reader.cpp They will contain the Event_reader class to be used by any function requiring event buffer deserialization. They should also contain the macros to standardize and "simulate" the exception behavior: #define READER_CONSTRUCTOR_HEADER \ if (!header()->is_valid()) BAPI_VOID_RETURN #define READER_CONSTRUCTOR_FOOTER \ event_reader_footer: \ header()->set_is_valid(READER_CALL(has_error) == false) #define READER_CHECK \ if (header()->reader()->get_error()) goto event_reader_footer #define READER_SET_AND_CHECK(var, func, ...) \ var = header()->reader()->func(__VA_ARGS__); \ EVENT_READER_CHECK #define READER_CALL_AND_CHECK(func, ...) \ header()->reader()->func(__VA_ARGS__); \ EVENT_READER_CHECK #define READER_THROW(message) \ { \ header()->reader()->set_error(message); \ goto event_reader_footer; \ } Step 3 will change control events. Step 4 will change statement events. Step 5 will change load data events. Step 6 will change rows events. Step 7) Related test cases and cleaning up ------------------------------------------ This step will introduce some test cases of related issues fixed by the worklog. It will also remove code not used anymore by libbinlogevents after migrating all the event deserialization code to the Event_reader class.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.