MySQL 8.4.3
Source Code Documentation
log_sanitizer.h
// Copyright (c) 2022, 2024, Oracle and/or its affiliates.
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License, version 2.0,
// as published by the Free Software Foundation.
//
// This program is designed to work with certain software (including
// but not limited to OpenSSL) that is licensed under separate terms,
// as designated in a particular file or component or in included license
// documentation. The authors of MySQL hereby grant you an additional
// permission to link the program and your derivative works with the
// separately licensed software that they have either included with
// the program or referenced in the documentation.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License, version 2.0, for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

#ifndef BINLOG_LOG_SANITIZER_H
#define BINLOG_LOG_SANITIZER_H

#include <functional>
#include "sql/binlog.h"
#include "sql/binlog/decompressing_event_object_istream.h"  // binlog::Decompressing_event_object_istream
#include "sql/binlog_ostream.h"  // binlog::tools::Iterator
#include "sql/binlog_reader.h"   // Binlog_file_reader
#include "sql/log_event.h"       // Log_event
#include "sql/xa.h"              // XID

namespace binlog {

/// @brief Class used to recover binary / relay log file
/// @details This base class is responsible for finding the last valid
/// position of a relay log / binary log file, meaning the position of the
/// last finished event that occurs outside of a transaction boundary.
/// Validation starts when the first reliable position has been found, i.e.:
/// - source rotation event
/// - source FDE
/// - source STOP event
/// - first finished transaction:
///   * Query log event with: COMMIT / ROLLBACK / XA COMMIT / XA ROLLBACK /
///     atomic DDL
///   * XID Log event
/// Validation ends at the end of the binlog file / relay log file or in case
/// further reading is not possible.
/// Binary log recovery:
///   A binary log file always starts with an FDE, which is the first valid
///   position within the file. Binary log files are never removed by a log
///   sanitizer.
/// Relay log recovery:
///   If no valid position has been found in any of the relay log files,
///   the log sanitizer will keep all of the relay log files.
///   In case a valid position has been found in any of the first relay log
///   files, relay log files that do not contain a valid position outside of
///   a transaction boundary will be removed.
class Log_sanitizer {
 public:
  /// @brief Ctor
  Log_sanitizer();

  /// @brief Dtor
  virtual ~Log_sanitizer() = default;

  /// @brief Retrieves the position of the last binlog/relay log event that
  /// ended a transaction, or the position after the RLE/FDE/SE that comes
  /// from the source
  /// @return The position of the last binlog event that ended a transaction
  my_off_t get_valid_pos() const;

  /// @brief Retrieves the last valid source position of an event
  /// read from the binary log / relay log file, which may be:
  /// - source position of the event ending a transaction
  /// - source position written in the source RLE
  /// @return The position of the last binlog event that ended a transaction
  /// and an indicator of whether this position is valid
  std::pair<my_off_t, bool> get_valid_source_pos() const;

  /// @brief Retrieves the updated name of the binlog source file
  /// @return Updated source file or empty string; indicator equal to true in
  /// case filename is valid
  std::pair<std::string, bool> get_valid_source_file() const;

  /// @brief Checks whether the log is malformed, i.e. whether it could not
  /// be correctly processed in full.
  /// @return true if the log processing ended with errors, false otherwise.
  bool is_log_malformed() const;

  /// @brief Retrieves the textual representation of the encountered failure,
  /// if any.
  /// @return the string containing the textual representation of the failure,
  /// an empty string otherwise.
  std::string const &get_failure_message() const;

  /// @brief Retrieves the name of the last log file containing a finished
  /// transaction
  /// @return Name of the last valid log file
  std::string get_valid_file() const { return m_valid_file; }

  /// @brief Checks whether a valid sanitized log file needs truncation of
  /// the last, partially written transaction or events that cannot be
  /// safely read
  /// @return true in case log file needs to be truncated, false
  /// otherwise
  bool is_log_truncation_needed() const;

  /// @brief Checks whether a fatal error occurred during log sanitization
  /// (OOM / decompression error which we cannot handle)
  /// @return true in case a fatal error occurred, false otherwise
  bool is_fatal_error() const;

 protected:
  /// @brief Function used to obtain memory key for derived classes
  /// @returns Reference to a memory key
  virtual PSI_memory_key &get_memory_key() const = 0;

  /// @brief This function goes through the opened file and searches for
  /// a valid position in a binary log file. It also gathers
  /// information about XA transactions which will be used during the
  /// binary log recovery
  /// @param reader Log reader, must be opened
  template <class Type_reader>
  void process_logs(Type_reader &reader);

  /// @brief This function iterates over the relay log files
  /// in the 'list_of_files' container, starting from the most recent one.
  /// It gathers information about XA transactions and performs
  /// a small validation of the log files. Validation starts
  /// in case a first reliable position has been found (FDE/RLE/SE from the
  /// source or the end of a transaction), and proceeds till the end of file
  /// or until a read error has occurred.
  /// In case a valid position has been found within a file,
  /// relay log files that were created after this file will be removed.
  /// In case no valid position has been found within a file, sanitizer will
  /// iterate over events in the previous (older) relay log file.
  /// In case no valid position has been found in any of the files listed in
  /// the 'list_of_files' container, relay log files won't be removed. It may
  /// happen e.g. in case we cannot decrypt events.
  /// @param reader Relay log file reader object
  /// @param list_of_files The list of relay logs we know, obtained
  /// from the relay log index
  /// @param log MYSQL_BIN_LOG object used to manipulate relay log files
  template <class Type_reader>
  void process_logs(Type_reader &reader,
                    const std::list<std::string> &list_of_files,
                    MYSQL_BIN_LOG &log);

  /// @brief This function will obtain the list of relay log files using the
  /// object of MYSQL_BIN_LOG class and iterate over them to find the last
  /// valid position within a relay log file. It will remove relay log files
  /// that contain only parts of the last, partially written transaction
  /// @param reader Relay log file reader object
  /// @param log MYSQL_BIN_LOG object used to manipulate relay log files
  template <class Type_reader>
  void process_logs(Type_reader &reader, MYSQL_BIN_LOG &log);

  /// @brief Reads and validates one log file
  /// @param[in] reader Reference to reader able to read processed log
  /// file
  /// @param[in] filename Name of the log file to process
  /// @returns true if processed log contains a valid log position outside
  /// of transaction boundaries
  template <class Type_reader>
  bool process_one_log(Type_reader &reader, const std::string &filename);

  /// @brief Indicates whether validation has started.
  /// In case of relay log sanitization, we start validation
  /// when we are sure that we are at transaction boundary and we are able
  /// to recover source position, meaning, when we detect:
  /// - first encountered Rotation Event that comes from the source
  /// - end of a transaction (Xid event, QLE containing
  ///   COMMIT/ROLLBACK/XA COMMIT/XA ROLLBACK)
  /// - an atomic DDL transaction
  /// Since binary logs always start at transaction boundary, when doing
  /// a binary log recovery, we start validation right away.
  /// By default, we are assuming that we are in the binary log recovery
  /// procedure
  bool m_validation_started;

  /// Position of the last binlog/relay log event that ended a transaction
  my_off_t m_valid_pos;
  /// Position of the last binlog event that ended a transaction (source
  /// position which corresponds to m_valid_pos)
  my_off_t m_valid_source_pos;
  /// Currently processed binlog file set in case source rotation
  /// event is encountered
  std::string m_valid_source_file{""};
  /// Last log file containing finished transaction
  std::string m_valid_file{""};
  /// Whether or not the event being processed is within a transaction
  bool m_in_transaction{false};
  /// Whether or not the binary log is malformed/corrupted or error occurred
  bool m_is_malformed{false};
  /// Whether or not the binary log has a fatal error
  bool m_fatal_error{false};
  /// Textual representation of the encountered failure
  std::string m_failure_message{""};
  /// Memory pool to use for the XID lists
  MEM_ROOT m_mem_root;
  /// Memory pool allocator to use with the normal transaction list
  Mem_root_allocator<my_xid> m_set_alloc;
  /// Memory pool allocator to use with the XA transaction list
  Mem_root_allocator<std::pair<const XID, XID_STATE::xa_states>> m_map_alloc;
  /// List of normal transactions fully written to the binary log
  Xid_commit_list m_internal_xids;
  /// List of XA transactions and states that appear in the binary log
  Xa_state_list::list m_external_xids;

  /// Information on whether log needs to be truncated, i.e.
  /// log is not ending at transaction boundary or we cannot read it till the
  /// end
  bool m_is_log_truncation_needed;

  /// Indicator on whether a valid position has been found in the log file
  bool m_has_valid_pos{false};

  /// Indicator on whether a valid source position has been found in the log
  /// file
  bool m_has_valid_source_pos;

  /// Last opened file size
  my_off_t m_last_file_size;

  /// @brief Invoked when a `Query_log_event` is read from the binary log file
  /// reader.
  /// @details The underlying query string is inspected to determine if the
  /// SQL command starts or ends a transaction. The following commands are
  /// searched for:
  /// - BEGIN
  /// - COMMIT
  /// - ROLLBACK
  /// - DDL
  /// - XA START
  /// - XA COMMIT
  /// - XA ROLLBACK
  /// Check below for the description of the action that is taken for each.
  /// @param ev The `Query_log_event` to process
  void process_query_event(Query_log_event const &ev);

  /// @brief Invoked when an `Xid_log_event` is read from the binary log file
  /// reader.
  /// @details Actions taken to process the event:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   internally coordinated transactions `m_internal_xids`.
  /// - If the XID already exists in the list, `m_is_malformed` is set to
  ///   true, indicating that the binary log is malformed.
  /// @param ev The `Xid_log_event` to process
  void process_xid_event(Xid_log_event const &ev);

  /// @brief Invoked when an `XA_prepare_log_event` is read from the binary
  /// log file reader.
  /// @details Actions taken to process the event:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state COMMITTED if the event represents an `XA COMMIT ONE_PHASE` or
  ///   PREPARED if not.
  /// - If the XID already exists in the list associated with a state other
  ///   than `COMMITTED` or `ROLLEDBACK`, `m_is_malformed` is set to true,
  ///   indicating that the binary log is malformed.
  /// @param ev The `XA_prepare_log_event` to process
  void process_xa_prepare_event(XA_prepare_log_event const &ev);

  /// @brief Invoked when a `BEGIN` or an `XA START` is found in a
  /// `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to true, indicating that the
  ///   event starts a transaction.
  void process_start();

  /// @brief Invoked when a `COMMIT` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  void process_commit();

  /// @brief Invoked when a `ROLLBACK` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  void process_rollback();

  /// @brief Invoked when a DDL is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The XID of the transaction is extracted and added to the list of
  ///   internally coordinated transactions `m_internal_xids`.
  /// - If the XID already exists in the list, `m_is_malformed` is set to
  ///   true, indicating that the binary log is malformed.
  /// @param ev The `Query_log_event` to process
  void process_atomic_ddl(Query_log_event const &ev);

  /// @brief Invoked when an `XA COMMIT` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state COMMITTED.
  /// - If the XID already exists in the list associated with a state other
  ///   than `PREPARED`, `m_is_malformed` is set to true, indicating that the
  ///   binary log is malformed.
  /// @param query The query string to process
  void process_xa_commit(std::string const &query);

  /// @brief Invoked when an `XA ROLLBACK` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state ROLLEDBACK.
  /// - If the XID already exists in the list associated with a state other
  ///   than `PREPARED`, `m_is_malformed` is set to true, indicating that the
  ///   binary log is malformed.
  /// @param query The query string to process
  void process_xa_rollback(std::string const &query);

  /// @brief Parses the provided string for an XID and adds it to the
  /// externally coordinated transactions map, alongside the provided state.
  /// @param query The query to search and retrieve the XID from
  /// @param state The state to add to the map, alongside the XID
  void add_external_xid(std::string const &query,
                        enum_ha_recover_xa_state state);
};

}  // namespace binlog

#endif  // BINLOG_LOG_SANITIZER_H
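
The transaction-boundary rules described in the comments above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the MySQL implementation: `Boundary_tracker`, `Stmt` and `Xa_state` are hypothetical names, XIDs are plain strings instead of `XID` objects, `XA COMMIT ONE_PHASE` and atomic DDL are left out, and the choice to stop advancing the valid position once the log is flagged as malformed is the sketch's own.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins; none of these names exist in the MySQL sources.
enum class Xa_state { prepared, committed, rolledback };
enum class Stmt { begin, commit, rollback, xid, xa_prepare, xa_commit, xa_rollback };

class Boundary_tracker {
 public:
  // Feeds one statement/event. end_pos is the offset just past it;
  // xid is only consulted for the XA kinds.
  void handle(Stmt stmt, std::uint64_t end_pos, const std::string &xid = "") {
    switch (stmt) {
      case Stmt::begin:
        if (m_in_transaction) m_is_malformed = true;  // start inside a trx
        m_in_transaction = true;
        return;  // a start never yields a valid position
      case Stmt::commit:
      case Stmt::rollback:
      case Stmt::xid:
        if (!m_in_transaction) m_is_malformed = true;  // end without start
        break;
      case Stmt::xa_prepare:
        if (!m_in_transaction) m_is_malformed = true;
        record(xid, Xa_state::prepared);
        break;
      case Stmt::xa_commit:
        if (m_in_transaction) m_is_malformed = true;
        record(xid, Xa_state::committed);
        break;
      case Stmt::xa_rollback:
        if (m_in_transaction) m_is_malformed = true;
        record(xid, Xa_state::rolledback);
        break;
    }
    m_in_transaction = false;  // every remaining kind ends a transaction
    if (!m_is_malformed) {     // sketch's choice: freeze position on error
      m_valid_pos = end_pos;
      m_has_valid_pos = true;
    }
  }

  bool is_malformed() const { return m_is_malformed; }
  bool has_valid_pos() const { return m_has_valid_pos; }
  std::uint64_t valid_pos() const { return m_valid_pos; }

 private:
  // PREPARED may only replace a finished state; COMMITTED / ROLLEDBACK
  // may only replace PREPARED. An unseen XID is always accepted.
  void record(const std::string &xid, Xa_state next) {
    auto it = m_external_xids.find(xid);
    if (it != m_external_xids.end()) {
      const bool was_prepared = it->second == Xa_state::prepared;
      const bool ok = (next == Xa_state::prepared) ? !was_prepared : was_prepared;
      if (!ok) m_is_malformed = true;
    }
    m_external_xids[xid] = next;
  }

  bool m_in_transaction{false};
  bool m_is_malformed{false};
  bool m_has_valid_pos{false};
  std::uint64_t m_valid_pos{0};
  std::map<std::string, Xa_state> m_external_xids;
};
```

Feeding `begin` followed by `xid` leaves the tracker with a valid position at the end of the `xid` event, while a second `begin` before any terminating event flags the log as malformed. The same map-based check rejects an `XA COMMIT` for an XID that is already recorded as committed.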