MySQL 9.1.0
Source Code Documentation
|
A Sharded_bitset<SHARDS_COUNT>(n) is like a bitset<n> in that it represents a vector of n bits, which can be set(pos) or reset(pos) for 0<=pos<n. More...
#include <ut0sharded_bitset.h>
Public Member Functions | |
Sharded_bitset (size_t n, ut::PSI_memory_key_t mem_key) | |
Initializes a data structure capable of storing n bits. More... | |
void | set (size_t pos) |
Sets pos-th bit. More... | |
void | reset (size_t pos) |
Resets pos-th bit. More... | |
size_t | find_set_in_this_shard (size_t start_pos) |
Finds a smallest position which is set and belongs to the same shard as start_pos, and is not smaller than start_pos. More... | |
Private Types | |
using | words_allocator = typename decltype(words)::allocator_type |
Private Member Functions | |
size_t | words_per_shard () const |
How many words are devoted to each shard in words[]. More... | |
Bitset | get_shard (size_t shard_id) const |
Private Attributes | |
ut::vector< uint64_t > | words |
The bits for each shard are stored separately to avoid data-races, false-sharing, and make linear scans within shard faster. More... | |
A Sharded_bitset<SHARDS_COUNT>(n) is like a bitset<n> in that it represents a vector of n bits, which can be set(pos) or reset(pos) for 0<=pos<n.
One difference is that the size n is specified at runtime via constructor. The other difference is that SHARDS_COUNT specifies sharding policy used and enforced by the caller:
shard_id = pos % SHARDS_COUNT
which is then respected by this class, by ensuring no data-race is possible, as long as the caller ensures concurrent calls for two different pos from same shard are not possible. For example, if one thread calls set(pos1) and another calls set(pos2) in parallel, then it is fine, as long as
pos1 % SHARDS_COUNT != pos2 % SHARDS_COUNT
This is the main reason to prefer this data structure to bitset, which doesn't give you such guarantee as bit from different shards can be in the same word. Another reason to use this data structure is because it has a very fast implementation of find_set_in_this_shard(pos) which finds smallest pos+SHARDS_COUNT*i which is set (for 0<=i). This is useful to quickly iterate over all bits which are set, while also respecting sharding policy. For example: for (size_t shard=0; shard < SHARDS_COUNT; ++shard) { mutex_enter(mutex[shard]); for (size_t pos = find_set_in_this_shard(shard); pos < n; pos = find_set_in_this_shard(pos+SHARDS_COUNT)) {
cout << pos << " is set !" << endl; } mutex_exit(mutex[shard]); }
|
private |
|
inline |
Initializes a data structure capable of storing n bits.
Initializes all bits to unset.
[in] | n | The maximum position + 1. |
[in] | mem_key | PSI memory key used to track allocations. |
|
inline |
Finds a smallest position which is set and belongs to the same shard as start_pos, and is not smaller than start_pos.
[in] | start_pos | The position, such as passed to set(pos) |
|
inlineprivate |
|
inline |
Resets pos-th bit.
[in] | pos | The position of the bit to be reset to 0 |
|
inline |
Sets pos-th bit.
[in] | pos | The position of the bit to be set to 1 |
|
inlineprivate |
How many words are devoted to each shard in words[].
All shards get equal fragment of words[], even if n is not divisible by SHARDS_COUNT, and thus the words_per_shard() might be slightly larger than necessary as we round up. The region of words[] for a given shard is words[shard*words_per_shard()]...words[(shard+1)*words_per_shard()-1]
|
private |
The bits for each shard are stored separately to avoid data-races, false-sharing, and make linear scans within shard faster.
For similar reasons they are packed to aligned uint64_t words. But, they are allocated as one big array, so instead of indirect pointers, we can just compute where each shard starts or ends,