sharder: avoid small tail shards
A container is typically sharded when it has grown to have an object count of shard_container_threshold + N, where N << shard_container_threshold. If sharded using the default rows_per_shard of shard_container_threshold / 2 then this would previously result in 3 shards: the tail shard would typically be small, having only N rows. This behaviour caused more shards to be generated than desirable.

This patch adds a minimum-shard-size option to swift-manage-shard-ranges, and a corresponding option in the sharder config, which can be used to avoid small tail shards. If set to greater than one then the final shard range may be extended to more than rows_per_shard in order to avoid a further shard range with less than minimum-shard-size rows. In the example given, if minimum-shard-size is set to M > N then the container would shard into two shards having rows_per_shard rows and rows_per_shard + N respectively.

The default value for minimum-shard-size is rows_per_shard // 5. If all options have their default values this results in minimum-shard-size being 100000.

Closes-Bug: #1928370
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: I3baa278c6eaf488e3f390a936eebbec13f2c3e55
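
The effect described in the commit message can be illustrated with a small standalone sketch. This is not the sharder's actual code: the function name plan_shard_sizes and the object count used below are hypothetical, chosen only to reproduce the "threshold + N" example with the default rows_per_shard of 500000. It works on row counts alone and ignores how real shard range bounds are chosen.

```python
def plan_shard_sizes(object_count, rows_per_shard, minimum_shard_size=1):
    """Return a list of shard row counts for a container (illustrative only).

    Hypothetical helper, not part of Swift. A full shard of rows_per_shard
    rows is cut only if at least minimum_shard_size rows would remain for a
    further shard; otherwise the current shard absorbs everything left.
    """
    sizes = []
    remaining = object_count
    while remaining > 0:
        if remaining - rows_per_shard < minimum_shard_size:
            # The leftover tail would be too small: extend this final shard.
            sizes.append(remaining)
            remaining = 0
        else:
            sizes.append(rows_per_shard)
            remaining -= rows_per_shard
    return sizes


# Assumed example values: threshold of 1000000 plus N = 10 extra rows.
object_count = 1000000 + 10
rows_per_shard = 500000

# With a minimum of 1 the small tail shard is still produced.
old_behaviour = plan_shard_sizes(object_count, rows_per_shard, 1)

# With the default minimum (rows_per_shard // 5) the tail is absorbed.
new_behaviour = plan_shard_sizes(
    object_count, rows_per_shard, rows_per_shard // 5)

print(old_behaviour)
print(new_behaviour)
```

With a minimum of 1 the sketch yields three shards, the last holding only the 10 extra rows; with the default minimum it yields two shards, the second holding rows_per_shard + 10 rows.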
9 changed files with 236 additions and 36 deletions
@@ -329,6 +329,18 @@ rows_per_shard 500000 This defines the initial
containers. The default
is shard_container_threshold // 2.
minimum_shard_size 100000 Minimum size of the final
shard range. If this is
greater than one then the
final shard range may be
extended to more than
rows_per_shard in order
to avoid a further shard
range with less than
minimum_shard_size rows.
The default value is
rows_per_shard // 5.
shrink_threshold This defines the
object count below which
a 'donor' shard container