MySQL 8.4.0
Source Code Documentation
compressor.h
1/* Copyright (c) 2019, 2024, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is designed to work with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have either included with
13 the program or referenced in the documentation.
14
15 This program is distributed in the hope that it will be useful,
16 but WITHOUT ANY WARRANTY; without even the implied warranty of
17 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 GNU General Public License, version 2.0, for more details.
19
20 You should have received a copy of the GNU General Public License
21 along with this program; if not, write to the Free Software
22 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
23
24#ifndef MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
25#define MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
26
27#include <cstddef>
28#include <tuple>
32#include "mysql/binlog/event/nodiscard.h" // NODISCARD
33
34#include <limits> // std::numeric_limits
35
37
39
40/// Abstract base class for compressors.
41///
42/// Each subclass normally corresponds to a compression algorithm, and
43/// maintains the algorithm-specific state for it.
44///
45/// An instance of this class can be reused to compress several
46/// *frames*. A frame is a self-contained segment of data, in the
47/// sense that it can be decompressed without knowing about other
48/// frames, and compression does not take advantage of patterns that
49/// repeat between frames.
50///
51/// Input for a frame can be provided in pieces. All pieces for a
52/// frame will be compressed together; the compressor will take
53/// advantage of patterns across different pieces within the frame.
54/// Providing a frame in pieces is useful when not all input is known
55/// at once.
56///
57/// To compress one frame, use the API as follows:
58///
59/// 1. Repeat as many times as needed:
60/// 1.1. Call @c feed to provide a piece of input.
61/// 1.2. Call @c compress to consume the piece of input and possibly
62/// produce a prefix of the output.
63/// 2. Choose one of the following:
64/// 2.1. Call @c finish to produce the remainder of the output for this
65/// frame.
66/// 2.2. Call @c reset to abort this frame.
67///
68/// @note After 1.2, although the compression library has read all
69/// input given so far, it may not have produced all corresponding
70/// output. It usually holds some data in internal buffers, since it
71/// may be more compressible when more data has been given. Therefore,
72/// step 2.1 is always necessary in order to complete the frame.
73///
74/// @note To reuse the compressor object for another input, repeat the
75/// above procedure as many times as needed.
76///
77/// This class requires that the user provides a @c
78/// mysql::binlog::event::compression::buffer::Managed_buffer_sequence to
79/// store output.
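80class Compressor {
81 public:
82 using Managed_buffer_sequence_t =
83 mysql::binlog::event::compression::buffer::Managed_buffer_sequence<>;
84 using Char_t = Managed_buffer_sequence_t::Char_t;
85 using Size_t = Managed_buffer_sequence_t::Size_t;
86 using Grow_constraint_t =
87 mysql::binlog::event::compression::buffer::Grow_constraint;
88 static constexpr Size_t pledged_input_size_unset =
89 std::numeric_limits<Size_t>::max();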
90
91 Compressor() = default;
92 Compressor(const Compressor &other) = delete;
93 Compressor(Compressor &&other) = delete;
94 Compressor &operator=(const Compressor &other) = delete;
95 Compressor &operator=(Compressor &&other) = delete;
96
97 virtual ~Compressor() = default;
98
99 /// @return the compression type.
100 type get_type_code() const;
101
102 /// Reset the frame.
103 ///
104 /// This cancels the current frame and starts a new one.
105 ///
106 /// This is allowed but unnecessary if the current frame has been
107 /// reset by @c finish or by an out_of_memory error from @c
108 /// compress.
109 void reset();
110
111 /// Submit data to be compressed.
112 ///
113 /// This will not consume any of the input; it should be followed by
114 /// a call to @c compress or @c finish.
115 ///
116 /// @note This object will not copy the input; the caller must
117 /// ensure that the input lives until it has been consumed or the
118 /// frame has been reset.
119 ///
120 /// @note Must not be called when there is still non-consumed input
121 /// left after a previous call to @c feed.
122 ///
123 /// @param input_data Data to be compressed. This object keeps only a
124 /// pointer to the data (it does not copy the bytes) and reads it in
125 /// subsequent calls to @c compress or @c finish.
126 ///
127 /// @param input_size Size of data to be compressed.
128 template <class Input_char_t>
129 void feed(const Input_char_t *input_data, Size_t input_size) {
130 feed_char_t(reinterpret_cast<const Char_t *>(input_data), input_size);
131 }
132
133 /// Consume all input previously given in the feed function.
134 ///
135 /// This will consume the input, but may not produce all output;
136 /// there may be output still in compression library buffers. Use
137 /// the @c finish function to flush the output and end the frame.
138 ///
139 /// @param out Storage for compressed bytes. This may grow, if
140 /// needed.
141 ///
142 /// @retval success All input was consumed.
143 ///
144 /// @retval out_of_memory The operation failed due to an out of
145 /// memory error. The frame has been reset.
146 ///
147 /// @retval exceeds_max_size The @c out buffer was already at its
148 /// max capacity, and filled, and there were more bytes left to
149 /// produce. The frame has not been reset and it is not guaranteed
150 /// that all input has been consumed. The caller may resume
151 /// compression e.g. after increasing the capacity, or resetting
152 /// the output buffer (perhaps after moving existing data
153 /// elsewhere), or using a different output buffer, or similar.
154 Compress_status compress(Managed_buffer_sequence_t &out);
155
156 /// Consume all input, produce all output, and end the frame.
157 ///
158 /// This will consume all input previously given by @c feed (it
159 /// internally calls @c compress). Then it ends the frame and
160 /// flushes the output, ensuring that all data that may reside in
161 /// the compression library's internal buffers gets compressed and
162 /// written to the output.
163 ///
164 /// The next call to @c feed will start a new frame.
165 ///
166 /// @param out Storage for compressed bytes. This may grow, if
167 /// needed.
168 ///
169 /// @retval success All input was consumed, all output was produced,
170 /// and the frame was reset.
171 ///
172 /// @retval out_of_memory The operation failed due to an out of
173 /// memory error, and the frame has been reset.
174 ///
175 /// @retval exceeds_max_size The @c out buffer was already at its
176 /// max capacity, and filled, and there were more bytes left to
177 /// produce. The frame has not been reset and it is not guaranteed
178 /// that all input has been consumed. The caller may resume
179 /// compression e.g. after increasing the capacity, or resetting
180 /// the output buffer (perhaps after moving existing data
181 /// elsewhere), or using a different output buffer, or similar.
182 Compress_status finish(Managed_buffer_sequence_t &out);
183
184 /// Return a `Grow_constraint` that may be used with the
185 /// Managed_buffer_sequence storing the output, in order to
186 /// optimize memory usage for a particular compression algorithm.
187 ///
188 /// This may be implemented by subclasses such that it depends on
189 /// the pledged input size. Therefore, for the best grow
190 /// constraint, call this after @c set_pledged_input_size.
191 Grow_constraint_t get_grow_constraint_hint() const;
192
193 /// Declare that the input size will be exactly as given.
194 ///
195 /// This may allow compressors and decompressors to use memory more
196 /// efficiently.
197 ///
198 /// This function may only be called if `feed` has never been
199 /// called, or if the compressor has been reset since the last call
200 /// to `feed`. The pledged size will be set back to
201 /// pledged_input_size_unset next time this compressor is reset.
202 ///
203 /// It is required that the total number of bytes passed to `feed`
204 /// before the call to `finish` matches the pledged number.
205 /// Otherwise, the behavior of `finish` is undefined.
206 void set_pledged_input_size(Size_t size);
207
208 /// Return the size previously provided to `set_pledged_input_size`,
209 /// or `pledged_input_size_unset` if no pledged size has been set.
210 Size_t get_pledged_input_size() const;
211
212 private:
213 /// Worker function for @c feed, requiring the correct Char_t type.
214 ///
215 /// @see feed.
216 void feed_char_t(const Char_t *input_data, Size_t input_size);
217
218 /// Implement @c get_type_code.
219 virtual type do_get_type_code() const = 0;
220
221 /// Implement @c reset.
222 virtual void do_reset() = 0;
223
224 /// Implement @c feed.
225 ///
226 /// This differs from @c feed in that it does not have to reset the
227 /// frame when returning out_of_memory; the caller does that.
228 virtual void do_feed(const Char_t *input_data, Size_t input_size) = 0;
229
230 /// Implement @c compress.
231 ///
232 /// This differs from @c compress in that it does not have to reset
233 /// the frame when returning out_of_memory; the caller does that.
234 virtual Compress_status do_compress(Managed_buffer_sequence_t &out) = 0;
236
237 /// Implement @c finish.
238 ///
239 /// This differs from @c finish in that it does not have to reset
240 /// the frame when returning out_of_memory; the caller does that.
241 ///
242 /// Implementations may assume that @c compress has been called,
243 /// since @c finish does that.
244 virtual Compress_status do_finish(Managed_buffer_sequence_t &out) = 0;
246
247 /// Implement @c get_grow_constraint_hint.
248 ///
249 /// In this base class, the function returns a default-constructed
250 /// Grow_constraint, i.e., one which does not limit the
251 /// Grow_calculator.
252 virtual Grow_constraint_t do_get_grow_constraint_hint() const;
253
254 /// Implement @c set_pledged_input_size.
255 ///
256 /// By default, this does nothing.
257 virtual void do_set_pledged_input_size([[maybe_unused]] Size_t size);
258
259 /// True when user has provided input that has not yet been consumed.
260 bool m_pending_input = false;
261
262 /// True when user has not provided any input since the last reset.
263 bool m_empty = true;
264
265 /// The number of bytes
267};
268
269} // namespace mysql::binlog::event::compression
270
271#endif // MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
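The call sequence described in the class comment (feed, compress, then finish or reset) can be illustrated with a small stand-alone sketch. Everything in it is an assumption made for illustration: `Toy_rle_compressor` and `compress_frame` are hypothetical names, the output goes to a `std::string` rather than a `Managed_buffer_sequence`, and run-length encoding stands in for a real compression algorithm. It is not the MySQL implementation; it only mirrors the documented protocol, including the point that `compress` may hold back output until `finish`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy class, NOT MySQL's Compressor: it run-length encodes
// bytes as "<count><byte>" and only mirrors the documented call protocol.
class Toy_rle_compressor {
 public:
  // Step 1.1: remember a piece of input without copying or consuming it.
  void feed(const char *data, std::size_t size) {
    assert(!m_pending_input);  // the previous piece must be consumed first
    m_data = data;
    m_size = size;
    m_pending_input = true;
  }

  // Step 1.2: consume the pending piece. The last run stays in internal
  // state, because the next piece may extend it.
  void compress(std::string &out) {
    for (std::size_t i = 0; i < m_size; ++i) {
      if (m_run_length > 0 && m_data[i] == m_run_char) {
        ++m_run_length;
      } else {
        flush_run(out);
        m_run_char = m_data[i];
        m_run_length = 1;
      }
    }
    m_data = nullptr;
    m_size = 0;
    m_pending_input = false;
  }

  // Step 2.1: consume pending input, flush the buffered run, end the frame.
  void finish(std::string &out) {
    compress(out);
    flush_run(out);
  }

  // Step 2.2: abort the frame, dropping pending input and buffered state.
  void reset() {
    m_data = nullptr;
    m_size = 0;
    m_pending_input = false;
    m_run_length = 0;
  }

 private:
  // Write the buffered run, if any, and clear it.
  void flush_run(std::string &out) {
    if (m_run_length == 0) return;
    out += std::to_string(m_run_length);
    out.push_back(m_run_char);
    m_run_length = 0;
  }

  const char *m_data = nullptr;
  std::size_t m_size = 0;
  bool m_pending_input = false;
  char m_run_char = '\0';
  std::size_t m_run_length = 0;
};

// Compress one frame whose input arrives in several pieces.
std::string compress_frame(const std::vector<std::string> &pieces) {
  Toy_rle_compressor compressor;
  std::string out;
  for (const std::string &piece : pieces) {
    compressor.feed(piece.data(), piece.size());
    compressor.compress(out);
  }
  compressor.finish(out);
  return out;
}
```

With the pieces "aaab" and "bbc", the first `compress` call emits only "3a"; the run of `b` spans both pieces and is written once, as "3b", and the final "1c" appears only when `finish` flushes the internal state. This is the reason the class comment says step 2.1 is always necessary to complete a frame.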
Abstract base class for compressors.
Definition: compressor.h:80
Size_t m_pledged_input_size
The number of bytes.
Definition: compressor.h:266
void reset()
Reset the frame.
Definition: compressor.cpp:30
virtual Compress_status do_compress(Managed_buffer_sequence_t &out)=0
Implement compress.
Grow_constraint_t get_grow_constraint_hint() const
Return a Grow_constraint that may be used with the Managed_buffer_sequence storing the output,...
Definition: compressor.cpp:71
Managed_buffer_sequence_t::Size_t Size_t
Definition: compressor.h:85
void set_pledged_input_size(Size_t size)
Declare that the input size will be exactly as given.
Definition: compressor.cpp:79
virtual Grow_constraint_t do_get_grow_constraint_hint() const
Implement get_grow_constraint_hint.
Definition: compressor.cpp:75
virtual type do_get_type_code() const =0
Implement get_type_code.
Compressor(const Compressor &other)=delete
mysql::binlog::event::compression::buffer::Managed_buffer_sequence<> Managed_buffer_sequence_t
Definition: compressor.h:83
virtual Compress_status do_finish(Managed_buffer_sequence_t &out)=0
Implement finish.
Compress_status finish(Managed_buffer_sequence_t &out)
Consume all input, produce all output, and end the frame.
Definition: compressor.cpp:56
Compressor & operator=(const Compressor &other)=delete
Size_t get_pledged_input_size() const
Return the size previously provided to set_pledged_input_size, or pledged_input_size_unset if no pled...
Definition: compressor.cpp:85
void feed_char_t(const Char_t *input_data, Size_t input_size)
Worker function for feed, requiring the correct Char_t type.
Definition: compressor.cpp:38
mysql::binlog::event::compression::buffer::Grow_constraint Grow_constraint_t
Definition: compressor.h:87
Compressor & operator=(Compressor &&other)=delete
bool m_empty
True when user has not provided any input since the last reset.
Definition: compressor.h:263
bool m_pending_input
True when user has provided input that has not yet been consumed.
Definition: compressor.h:260
type get_type_code() const
Definition: compressor.cpp:28
virtual void do_reset()=0
Implement reset.
Managed_buffer_sequence_t::Char_t Char_t
Definition: compressor.h:84
virtual void do_feed(const Char_t *input_data, Size_t input_size)=0
Implement feed.
Compress_status compress(Managed_buffer_sequence_t &out)
Consume all input previously given in the feed function.
Definition: compressor.cpp:47
static constexpr Size_t pledged_input_size_unset
Definition: compressor.h:88
void feed(const Input_char_t *input_data, Size_t input_size)
Submit data to be compressed.
Definition: compressor.h:129
virtual void do_set_pledged_input_size(Size_t size)
Implement set_pledged_input_size.
Definition: compressor.cpp:89
Description of a heuristic to determine how much memory to allocate.
Definition: grow_constraint.h:66
Owned, non-contiguous, growable memory buffer.
Definition: managed_buffer_sequence.h:115
typename Buffer_sequence_view_t::Char_t Char_t
Definition: rw_buffer_sequence.h:110
typename Buffer_sequence_view_t::Size_t Size_t
Definition: rw_buffer_sequence.h:111
Container class that provides a sequence of buffers to the caller.
Grow_status
Error statuses for classes that use Grow_calculator.
Definition: grow_status.h:38
#define NODISCARD
The function attribute [[NODISCARD]] is a replacement for [[nodiscard]] to workaround a gcc bug.
Definition: nodiscard.h:47
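The `exceeds_max_size` return value documented for `compress` and `finish` leaves the frame intact, so the caller can make room and call again. The following is a minimal sketch of that retry loop under stated assumptions: `Status`, `Bounded_buffer`, `Toy_copier`, and `compress_with_retry` are hypothetical names, the "compressor" merely copies bytes, and the bounded buffer is a plain `std::string` with a size cap rather than a `Managed_buffer_sequence`. It shows only the resume-after-draining pattern, not the real buffer or grow-constraint API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical status type with only the two outcomes this sketch needs.
enum class Status { success, exceeds_max_size };

// Hypothetical output buffer that refuses to grow beyond max_size bytes.
struct Bounded_buffer {
  std::string data;
  std::size_t max_size;
};

// Hypothetical stand-in for a compressor: it copies input to output, and
// stops without losing its position when the output buffer is full.
class Toy_copier {
 public:
  void feed(const char *data, std::size_t size) {
    m_data = data;
    m_size = size;
  }

  Status compress(Bounded_buffer &out) {
    while (m_size > 0) {
      if (out.data.size() == out.max_size) return Status::exceeds_max_size;
      out.data.push_back(*m_data);
      ++m_data;
      --m_size;
    }
    return Status::success;
  }

 private:
  const char *m_data = nullptr;  // unconsumed input; not owned
  std::size_t m_size = 0;        // number of unconsumed bytes
};

// Process `input`, moving filled output elsewhere whenever the bounded
// buffer reports exceeds_max_size, then resuming the same frame.
std::string compress_with_retry(const std::string &input,
                                std::size_t max_size) {
  assert(max_size > 0);  // a zero-capacity buffer could never make progress
  Toy_copier compressor;
  Bounded_buffer out{std::string(), max_size};
  std::string archive;
  compressor.feed(input.data(), input.size());
  while (compressor.compress(out) == Status::exceeds_max_size) {
    archive += out.data;  // move existing data elsewhere
    out.data.clear();     // reset the output buffer, then resume
  }
  archive += out.data;
  return archive;
}
```

The loop does not call `feed` again after `exceeds_max_size`: the unconsumed input is still pending, which matches the documented rule that `feed` must not be called while earlier input remains unconsumed.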