proxy: use cooperative tokens to coalesce updating shard range requests into backend
The cost of memcache misses can be severe. For example, on an updating shard range cache miss, a PUT request must query the backend to figure out which shard to send the object update to. When many such requests hit the backend at the same time, they can easily overload the root containers and cause a flood of 500/503 errors; and when the proxy-servers receive the 200 responses from all those backend shard range queries, they may in turn try to write the same shard range data into the memcached servers at the same time, causing memcached to return OOM failures as well. We have seen frequent cache misses on the updating shard range cache in production, due to memcached out-of-memory errors and cache evictions.

To cope with these situations, a memcached-based cooperative token mechanism is added to the proxy-server to coalesce many in-flight backend requests into a few: on an updating shard range cache miss, only the first few requests acquire global cooperative tokens and are allowed to fetch updating shard ranges from the backend container servers. Subsequent cache-miss requests wait for the cache fill to finish instead of all querying the backend container servers. This prevents a flood of backend requests from overloading both the container servers and the memcached servers.

Drive-by fix: when memcache is not available, the object controller only needs to retrieve from the container server the specific shard range to send the update request to.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Yan Xiao <yanxiao@nvidia.com>
Co-Authored-By: Shreeya Deshpande <shreeyad@nvidia.com>
Signed-off-by: Jianjian Huo <jhuo@nvidia.com>
Change-Id: I38c11b7aae8c4112bb3d671fa96012ab0c44d5a2
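The token-acquisition idea described above can be sketched as follows. This is a minimal illustration, not Swift's actual implementation: the `FakeMemcache` class, key layout, and `try_acquire_token` helper are all hypothetical, standing in for a shared memcached counter that every proxy increments.

```python
# Minimal in-memory stand-in for a shared memcached client; the class,
# key layout, and helper names below are illustrative, not Swift's code.
class FakeMemcache:
    def __init__(self):
        self.store = {}

    def incr(self, key, delta=1):
        # memcached's atomic incr; here a plain dict suffices for the sketch
        self.store[key] = self.store.get(key, 0) + delta
        return self.store[key]


TOKENS_PER_SESSION = 3  # mirrors namespace_cache_tokens_per_session


def try_acquire_token(cache, shard_key):
    """Return True if this request may query the backend directly."""
    # All proxies increment the same counter for this shard's token
    # session; only the first few increments win a token.
    count = cache.incr('token/' + shard_key)
    return count <= TOKENS_PER_SESSION


cache = FakeMemcache()
results = [try_acquire_token(cache, 'AUTH_test/c') for _ in range(10)]
# Only the first TOKENS_PER_SESSION cache-miss requests go to the
# backend; the rest wait for the cache to be filled instead.
print(results.count(True))  # → 3
```

In the real mechanism the counter would live in memcached with a TTL (the token session), so the atomicity of `incr` across all proxy-servers is what bounds the number of concurrent backend fetches.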
11 changed files with 1199 additions and 85 deletions
@@ -207,6 +207,19 @@ use = egg:swift#proxy
# container_listing_shard_ranges_skip_cache_pct = 0.0
# account_existence_skip_cache_pct = 0.0
#
# Use cooperative tokens on the updating namespace cache to coalesce requests
# that fetch updating namespaces from the backend and set them in memcached.
# This is the number of cooperative tokens per token session; 0 disables
# cooperative tokens, so every request talks directly to the backend and
# memcache.
# namespace_cache_tokens_per_session = 3
#
# The average time spent fetching updating namespaces from the container
# servers. Cooperative tokens use this as the basic unit for the retry
# intervals of requests that did not acquire a token and are waiting for
# other requests to fill in the cache; a cooperative token session
# (`token_ttl`) will be 10 times this value.
# namespace_avg_backend_fetch_time = 0.3
#
# object_chunk_size = 65536
# client_chunk_size = 65536
#
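The waiter side of the two options above can be sketched like this. The polling loop, helper name, and fake cache are assumptions for illustration; only the two constants mirror the sample config options (`namespace_avg_backend_fetch_time` and the 10x `token_ttl` rule stated in the comments).

```python
import time

# Constants mirror the sample config; the polling logic is illustrative.
NAMESPACE_AVG_BACKEND_FETCH_TIME = 0.3   # namespace_avg_backend_fetch_time
TOKEN_TTL = 10 * NAMESPACE_AVG_BACKEND_FETCH_TIME  # one token session


def wait_for_cache_fill(cache_get, key,
                        interval=NAMESPACE_AVG_BACKEND_FETCH_TIME,
                        ttl=TOKEN_TTL, sleep=time.sleep):
    """Poll the cache until a token holder fills it or the session ends."""
    waited = 0.0
    while waited < ttl:
        value = cache_get(key)
        if value is not None:
            return value
        sleep(interval)       # retry roughly every avg-backend-fetch-time
        waited += interval
    return None  # session expired: fall back to querying the backend


# Simulate a cache that a token holder fills after two polls.
calls = {'n': 0}


def fake_get(key):
    calls['n'] += 1
    return 'shard-ranges' if calls['n'] >= 3 else None


result = wait_for_cache_fill(fake_get, 'ns/AUTH_test/c',
                             sleep=lambda s: None)
print(result)  # → shard-ranges
```

Tying the retry interval to the measured average backend fetch time means waiters re-check the cache at roughly the rate a token holder can fill it, rather than hammering memcache.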