MySQL 9.1.0
Source Code Documentation
compressor.h
1/* Copyright (c) 2019, 2024, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is designed to work with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have either included with
13 the program or referenced in the documentation.
14
15 This program is distributed in the hope that it will be useful,
16 but WITHOUT ANY WARRANTY; without even the implied warranty of
17 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 GNU General Public License, version 2.0, for more details.
19
20 You should have received a copy of the GNU General Public License
21 along with this program; if not, write to the Free Software
22 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
23
24#ifndef MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
25#define MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
26
27#include <cstddef>
28#include <tuple>
30#include "mysql/containers/buffers/grow_constraint.h" // Grow_constraint
31#include "mysql/containers/buffers/managed_buffer_sequence.h" // Managed_buffer_sequence
32#include "mysql/utils/nodiscard.h" // NODISCARD
33
34#include <limits> // std::numeric_limits
35
36namespace mysql::binlog::event::compression {
37
38using Compress_status = mysql::containers::buffers::Grow_status;
39
40/// Abstract base class for compressors.
41///
42/// Each subclass normally corresponds to a compression algorithm, and
43/// maintains the algorithm-specific state for it.
44///
45/// An instance of this class can be reused to compress several
46/// *frames*. A frame is a self-contained segment of data, in the
47/// sense that it can be decompressed without knowing about other
48/// frames, and compression does not take advantage of patterns that
49/// repeat between frames.
50///
51/// Input for a frame can be provided in pieces. All pieces for a
52/// frame will be compressed together; the compressor will take
53/// advantage of patterns across different pieces within the frame.
54/// Providing a frame in pieces is useful when not all input is known
55/// at once.
56///
57/// To compress one frame, use the API as follows:
58///
59/// 1. Repeat as many times as needed:
60/// 1.1. Call @c feed to provide a piece of input.
61/// 1.2. Call @c compress to consume the piece of input and possibly
62/// produce a prefix of the output.
63/// 2. Choose one of the following:
64/// 2.1. Call @c finish to produce the remainder of the output for this
65/// frame.
66/// 2.2. Call @c reset to abort this frame.
67///
68/// @note After 1.2, although the compression library has read all
69/// input given so far, it may not have produced all corresponding
70/// output. It usually holds some data in internal buffers, since it
71/// may be more compressible when more data has been given. Therefore,
72/// step 2.1 is always necessary in order to complete the frame.
73///
74/// @note To reuse the compressor object for another input, repeat the
75/// above procedure as many times as needed.
76///
77/// This class requires that the user provides a @c
78/// mysql::containers::buffers::Managed_buffer_sequence to
79/// store output.
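80class Compressor {
81 public:
82 using Managed_buffer_sequence_t =
83 mysql::containers::buffers::Managed_buffer_sequence<>;
84 using Char_t = Managed_buffer_sequence_t::Char_t;
85 using Size_t = Managed_buffer_sequence_t::Size_t;
86 using Grow_constraint_t = mysql::containers::buffers::Grow_constraint;
87 static constexpr Size_t pledged_input_size_unset =
88 std::numeric_limits<Size_t>::max();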
89
90 Compressor() = default;
91 Compressor(const Compressor &other) = delete;
92 Compressor(Compressor &&other) = delete;
93 Compressor &operator=(const Compressor &other) = delete;
94 Compressor &operator=(Compressor &&other) = delete;
95
96 virtual ~Compressor() = default;
97
98 /// @return the compression type.
99 type get_type_code() const;
100
101 /// Reset the frame.
102 ///
103 /// This cancels the current frame and starts a new one.
104 ///
105 /// This is allowed but unnecessary if the current frame has been
106 /// reset by @c finish or by an out_of_memory error from @c
107 /// compress.
108 void reset();
109
110 /// Submit data to be compressed.
111 ///
112 /// This will not consume any of the input; it should be followed by
113 /// a call to @c compress or @c finish.
114 ///
115 /// @note This object will not copy the input; the caller must
116 /// ensure that the input lives until it has been consumed or the
117 /// frame has been reset.
118 ///
119 /// @note Must not be called when there is still non-consumed input
120 /// left after a previous call to @c feed.
121 ///
122 /// @param input_data Data to be compressed. This object will keep a
123 /// shallow copy of the data and use it in subsequent calls to @c
124 /// compress or @c finish.
125 ///
126 /// @param input_size Size of data to be compressed.
127 template <class Input_char_t>
128 void feed(const Input_char_t *input_data, Size_t input_size) {
129 feed_char_t(reinterpret_cast<const Char_t *>(input_data), input_size);
130 }
131
132 /// Consume all input previously given in the feed function.
133 ///
134 /// This will consume the input, but may not produce all output;
135 /// there may be output still in compression library buffers. Use
136 /// the @c finish function to flush the output and end the frame.
137 ///
138 /// @param out Storage for compressed bytes. This may grow, if
139 /// needed.
140 ///
141 /// @retval success All input was consumed.
142 ///
143 /// @retval out_of_memory The operation failed due to an out of
144 /// memory error. The frame has been reset.
145 ///
146 /// @retval exceeds_max_size The @c out buffer was already at its
147 /// max capacity, and filled, and there were more bytes left to
148 /// produce. The frame has not been reset and it is not guaranteed
149 /// that all input has been consumed. The caller may resume
150 /// compression e.g. after increasing the capacity, or resetting
151 /// the output buffer (perhaps after moving existing data
152 /// elsewhere), or using a different output buffer, or similar.
153 Compress_status compress(Managed_buffer_sequence_t &out);
154
155 /// Consume all input, produce all output, and end the frame.
156 ///
157 /// This will consume all input previously given by @c feed (it
158 /// internally calls @c compress). Then it ends the frame and
159 /// flushes the output, ensuring that all data that may reside in
160 /// the compression library's internal buffers gets compressed and
161 /// written to the output.
162 ///
163 /// The next call to @c feed will start a new frame.
164 ///
165 /// @param out Storage for compressed bytes. This may grow, if
166 /// needed.
167 ///
168 /// @retval success All input was consumed, all output was produced,
169 /// and the frame was reset.
170 ///
171 /// @retval out_of_memory The operation failed due to an out of
172 /// memory error, and the frame has been reset.
173 ///
174 /// @retval exceeds_max_size The @c out buffer was already at its
175 /// max capacity, and filled, and there were more bytes left to
176 /// produce. The frame has not been reset and it is not guaranteed
177 /// that all input has been consumed. The caller may resume
178 /// compression e.g. after increasing the capacity, or resetting
179 /// the output buffer (perhaps after moving existing data
180 /// elsewhere), or using a different output buffer, or similar.
181 Compress_status finish(Managed_buffer_sequence_t &out);
182
183 /// Return a `Grow_constraint` that may be used with the
184 /// Managed_buffer_sequence storing the output, in order to
185 /// optimize memory usage for a particular compression algorithm.
186 ///
187 /// This may be implemented by subclasses such that it depends on
188 /// the pledged input size. Therefore, for the best grow
189 /// constraint, call this after set_pledged_input_size.
190 Grow_constraint_t get_grow_constraint_hint() const;
191
192 /// Declare that the input size will be exactly as given.
193 ///
194 /// This may allow compressors and decompressors to use memory more
195 /// efficiently.
196 ///
197 /// This function may only be called if `feed` has never been
198 /// called, or if the compressor has been reset since the last call
199 /// to `feed`. The pledged size will be set back to
200 /// pledged_input_size_unset next time this compressor is reset.
201 ///
202 /// It is required that the total number of bytes passed to `feed`
203 /// before the call to `finish` matches the pledged number.
204 /// Otherwise, the behavior of `finish` is undefined.
205 void set_pledged_input_size(Size_t size);
206
207 /// Return the size previously provided to `set_pledged_input_size`,
208 /// or `pledged_input_size_unset` if no pledged size has been set.
209 Size_t get_pledged_input_size() const;
210
211 private:
212 /// Worker function for @c feed, requiring the correct Char_t type.
213 ///
214 /// @see feed.
215 void feed_char_t(const Char_t *input_data, Size_t input_size);
216
217 /// Implement @c get_type_code.
218 virtual type do_get_type_code() const = 0;
219
220 /// Implement @c reset.
221 virtual void do_reset() = 0;
222
223 /// Implement @c feed.
224 ///
225 /// This differs from @c feed in that it does not have to reset the
226 /// frame when returning out_of_memory; the caller does that.
227 virtual void do_feed(const Char_t *input_data, Size_t input_size) = 0;
228
229 /// Implement @c compress.
230 ///
231 /// This differs from @c compress in that it does not have to reset
232 /// the frame when returning out_of_memory; the caller does that.
233 virtual Compress_status do_compress(Managed_buffer_sequence_t &out) = 0;
235
236 /// Implement @c finish.
237 ///
238 /// This differs from @c finish in that it does not have to reset
239 /// the frame when returning out_of_memory; the caller does that.
240 ///
241 /// Implementations may assume that @c compress has been called,
242 /// since @c finish does that.
243 virtual Compress_status do_finish(Managed_buffer_sequence_t &out) = 0;
245
246 /// Implement @c get_grow_constraint_hint.
247 ///
248 /// In this base class, the function returns a default-constructed
249 /// Grow_constraint, i.e., one which does not limit the
250 /// Grow_calculator.
251 virtual Grow_constraint_t do_get_grow_constraint_hint() const;
252
253 /// Implement @c set_pledged_input_size.
254 ///
255 /// By default, this does nothing.
256 virtual void do_set_pledged_input_size([[maybe_unused]] Size_t size);
257
258 /// True when user has provided input that has not yet been consumed.
259 bool m_pending_input = false;
260
261 /// True when user has not provided any input since the last reset.
262 bool m_empty = true;
263
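264 /// The number of bytes pledged through @c set_pledged_input_size,
264 /// or pledged_input_size_unset if no size has been pledged.
265 Size_t m_pledged_input_size = pledged_input_size_unset;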
266};
267
268} // namespace mysql::binlog::event::compression
269
270#endif // MYSQL_BINLOG_EVENT_COMPRESSION_COMPRESSOR_H
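
The frame protocol described in the class comment (feed, compress, then finish) can be sketched with a toy stand-in. Everything below is hypothetical and only models the calling sequence: `Toy_rle_compressor`, `Status`, and `compress_two_pieces` are illustration names, a `std::string` replaces the Managed_buffer_sequence, and run-length encoding replaces a real compression algorithm. It is not the MySQL implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for Compress_status; not the MySQL type.
enum class Status { success, exceeds_max_size, out_of_memory };

// Toy run-length encoder that follows the feed/compress/finish protocol.
class Toy_rle_compressor {
 public:
  // Keep a shallow pointer to the input; nothing is consumed yet.
  void feed(const char *data, std::size_t size) {
    assert(!m_pending_input);  // previous piece must have been consumed
    m_data = data;
    m_size = size;
    m_pending_input = true;
  }

  // Consume the pending piece. The last run stays in the internal
  // state, because the next piece may extend it.
  Status compress(std::string &out) {
    for (std::size_t i = 0; i < m_size; ++i) {
      if (m_run_length > 0 && m_data[i] == m_run_char) {
        ++m_run_length;
      } else {
        flush_run(out);
        m_run_char = m_data[i];
        m_run_length = 1;
      }
    }
    m_data = nullptr;
    m_size = 0;
    m_pending_input = false;
    return Status::success;
  }

  // Consume pending input, flush the buffered run, and end the frame.
  Status finish(std::string &out) {
    const Status status = compress(out);
    if (status != Status::success) return status;
    flush_run(out);
    reset();
    return Status::success;
  }

  // Drop all state so that the next feed starts a new frame.
  void reset() {
    m_data = nullptr;
    m_size = 0;
    m_pending_input = false;
    m_run_length = 0;
  }

 private:
  void flush_run(std::string &out) {
    if (m_run_length == 0) return;
    out += std::to_string(m_run_length);
    out += m_run_char;
    m_run_length = 0;
  }

  const char *m_data = nullptr;
  std::size_t m_size = 0;
  bool m_pending_input = false;
  char m_run_char = '\0';
  std::size_t m_run_length = 0;
};

// Compress one frame whose input arrives in two pieces.
std::string compress_two_pieces(const std::string &a, const std::string &b) {
  Toy_rle_compressor compressor;
  std::string out;
  compressor.feed(a.data(), a.size());
  compressor.compress(out);  // may emit only a prefix of the output
  compressor.feed(b.data(), b.size());
  compressor.compress(out);
  compressor.finish(out);    // emits whatever was still buffered
  return out;
}
```

In this sketch, `compress_two_pieces("aaab", "bbcc")` yields `"3a3b2c"`: the run of `b` spans both pieces, and the final run is only written by `finish`, which mirrors the note that step 2.1 is needed to complete the frame.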
Abstract base class for compressors.
Definition: compressor.h:80
Size_t m_pledged_input_size
The number of bytes pledged through set_pledged_input_size, or pledged_input_size_unset if no size has been pledged.
Definition: compressor.h:265
void reset()
Reset the frame.
Definition: compressor.cpp:30
virtual Compress_status do_compress(Managed_buffer_sequence_t &out)=0
Implement compress.
Grow_constraint_t get_grow_constraint_hint() const
Return a Grow_constraint that may be used with the Managed_buffer_sequence storing the output,...
Definition: compressor.cpp:71
Managed_buffer_sequence_t::Size_t Size_t
Definition: compressor.h:85
void set_pledged_input_size(Size_t size)
Declare that the input size will be exactly as given.
Definition: compressor.cpp:79
virtual Grow_constraint_t do_get_grow_constraint_hint() const
Implement get_grow_constraint_hint.
Definition: compressor.cpp:75
virtual type do_get_type_code() const =0
Implement get_type_code.
Compressor(const Compressor &other)=delete
virtual Compress_status do_finish(Managed_buffer_sequence_t &out)=0
Implement finish.
Compress_status finish(Managed_buffer_sequence_t &out)
Consume all input, produce all output, and end the frame.
Definition: compressor.cpp:56
Compressor & operator=(const Compressor &other)=delete
Size_t get_pledged_input_size() const
Return the size previously provided to set_pledged_input_size, or pledged_input_size_unset if no pled...
Definition: compressor.cpp:85
void feed_char_t(const Char_t *input_data, Size_t input_size)
Worker function for feed, requiring the correct Char_t type.
Definition: compressor.cpp:38
Compressor & operator=(Compressor &&other)=delete
bool m_empty
True when user has not provided any input since the last reset.
Definition: compressor.h:262
bool m_pending_input
True when user has provided input that has not yet been consumed.
Definition: compressor.h:259
type get_type_code() const
Definition: compressor.cpp:28
virtual void do_reset()=0
Implement reset.
Managed_buffer_sequence_t::Char_t Char_t
Definition: compressor.h:84
virtual void do_feed(const Char_t *input_data, Size_t input_size)=0
Implement feed.
mysql::containers::buffers::Grow_constraint Grow_constraint_t
Definition: compressor.h:86
mysql::containers::buffers::Managed_buffer_sequence<> Managed_buffer_sequence_t
Definition: compressor.h:83
Compress_status compress(Managed_buffer_sequence_t &out)
Consume all input previously given in the feed function.
Definition: compressor.cpp:47
static constexpr Size_t pledged_input_size_unset
Definition: compressor.h:87
void feed(const Input_char_t *input_data, Size_t input_size)
Submit data to be compressed.
Definition: compressor.h:128
virtual void do_set_pledged_input_size(Size_t size)
Implement set_pledged_input_size.
Definition: compressor.cpp:89
Description of a heuristic to determine how much memory to allocate.
Definition: grow_constraint.h:66
Owned, non-contiguous, growable memory buffer.
Definition: managed_buffer_sequence.h:115
typename Buffer_sequence_view_t::Size_t Size_t
Definition: rw_buffer_sequence.h:111
typename Buffer_sequence_view_t::Char_t Char_t
Definition: rw_buffer_sequence.h:110
Container class that provides a sequence of buffers to the caller.
Definition: base.cpp:27
Grow_status
Error statuses for classes that use Grow_calculator.
Definition: grow_status.h:38
size_t size(const char *const c)
Definition: base64.h:46
#define NODISCARD
The function attribute [[NODISCARD]] is a replacement for [[nodiscard]] to work around a gcc bug.
Definition: nodiscard.h:47
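
The comments on do_compress and do_finish say that the implementation does not have to reset the frame on out_of_memory because the caller does that. One way such a split between a public wrapper and a private virtual worker could look is sketched below. This is an assumption about the shape of the pattern, not the contents of compressor.cpp; `Compressor_sketch`, `Scripted_compressor`, `Status`, and `resets_after_compress` are hypothetical names, and the output buffer argument is omitted for brevity.

```cpp
#include <cassert>

// Hypothetical stand-in for Compress_status; not the MySQL type.
enum class Status { success, exceeds_max_size, out_of_memory };

// Public non-virtual functions wrap private virtual workers.
class Compressor_sketch {
 public:
  virtual ~Compressor_sketch() = default;

  void reset() { do_reset(); }

  // The wrapper, not the worker, resets the frame on out_of_memory.
  Status compress() {
    const Status status = do_compress();
    if (status == Status::out_of_memory) reset();
    return status;
  }

 private:
  virtual Status do_compress() = 0;
  virtual void do_reset() = 0;
};

// Test double: returns a fixed status and counts resets.
class Scripted_compressor : public Compressor_sketch {
 public:
  explicit Scripted_compressor(Status status) : m_status(status) {}
  int reset_count() const { return m_reset_count; }

 private:
  Status do_compress() override { return m_status; }
  void do_reset() override { ++m_reset_count; }

  Status m_status;
  int m_reset_count = 0;
};

// Number of resets triggered by a single compress call.
int resets_after_compress(Status scripted_status) {
  Scripted_compressor compressor(scripted_status);
  const Status returned = compressor.compress();
  assert(returned == scripted_status);
  return compressor.reset_count();
}
```

Under these assumptions, only an out_of_memory result triggers a reset; exceeds_max_size leaves the frame intact so that the caller can enlarge or swap the output buffer and resume, as the documentation of compress describes.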