52730e1037563ad8ba0e09da93886c856ed9e875
693 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Alistair Coles
|
18f20daf38 |
Add absolute values for shard shrinking config options
Add two new sharder config options for configuring shrinking behaviour: - shrink_threshold: the size below which a shard may shrink - expansion_limit: the maximum size to which an acceptor shard may grow The new options match the 'swift-manage-shard-ranges' command line options and take absolute values. The new options provide alternatives to the current equivalent options 'shard_shrink_point' and 'shard_shrink_merge_point', which are expressed as percentages of 'shard_container_threshold'. 'shard_shrink_point' and 'shard_shrink_merge_point' are deprecated and will be overridden by the new options if the new options are explicitly set in a config file. The default values of the new options are the same as the values that would result from the default 'shard_container_threshold', 'shard_shrink_point' and 'shard_shrink_merge_point' i.e.: - shrink_threshold: 100000 - expansion_limit: 750000 Change-Id: I087eac961c1eab53540fe56be4881e01ded1f60e |
||
|
Alistair Coles
|
f7fd99a880 |
Use ContainerSharderConf class in sharder and manage-shard-ranges
Change the swift-manage-shard-ranges default expansion-limit to equal the sharder daemon default merge_size i.e 750000. The previous default of 500000 had erroneously differed from the sharder default value. Introduce a ContainerSharderConf class to encapsulate loading of sharder conf and the definition of defaults. ContainerSharder inherits this and swift-manage-shard-ranges instantiates it. Rename ContainerSharder member vars to match the equivalent vars and cli options in manage_shard_ranges: shrink_size -> shrink_threshold merge_size -> expansion_limit split_size -> rows_per_shard (This direction of renaming is chosen so that the manage_shard_ranges cli options are not changed.) Rename ContainerSharder member vars to match the conf file option name: scanner_batch_size -> shard_scanner_batch_size Remove some ContainerSharder member vars that were not used outside of the __init__ method: shrink_merge_point shard_shrink_point Change-Id: I8a58a82c08ac3abaddb43c11d26fda9fb45fe6c1 |
||
|
Zuul
|
5ec3826246 | Merge "Quarantine stale EC fragments after checking handoffs" | ||
|
Alistair Coles
|
46ea3aeae8 |
Quarantine stale EC fragments after checking handoffs
If the reconstructor finds a fragment that appears to be stale then it will now quarantine the fragment. Fragments are considered stale if insufficient fragments at the same timestamp can be found to rebuild missing fragments, and the number found is less than or equal to a new reconstructor 'quarantine_threshold' config option. Before quarantining a fragment the reconstructor will attempt to fetch fragments from handoff nodes in addition to the usual primary nodes. The handoff requests are limited by a new 'request_node_count' config option. 'quarantine_threshold' defaults to zero i.e. no fragments will be quarantined. 'request node count' defaults to '2 * replicas'. Closes-Bug: 1655608 Change-Id: I08e1200291833dea3deba32cdb364baa99dc2816 |
||
|
Matthew Oliver
|
4ce907a4ae |
relinker: Add /recon/relinker endpoint and drop progress stats
To further benefit the stats capturing for the relinker, drop partition progress to a new relinker.recon recon cache and add a new recon endpoint: GET /recon/relinker To gather get live relinking progress data: $ curl http://127.0.0.3:6030/recon/relinker |python -mjson.tool { "devices": { "sdb3": { "parts_done": 523, "policies": { "1": { "next_part_power": 11, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 1630, "hash_dirs": 1630, "linked": 1630, "policies": 1, "removed": 0 }, "timestamp": 1618998730.24672, "total_parts": 1029, "total_time": 5.400741815567017 }}, "start_time": 1618998724.845946, "stats": { "errors": 0, "files": 836, "hash_dirs": 836, "linked": 836, "removed": 0 }, "timestamp": 1618998730.24672, "total_parts": 523, "total_time": 5.400741815567017 }, "sdb7": { "parts_done": 506, "policies": { "1": { "next_part_power": 11, "part_power": 10, "parts_done": 506, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 794, "hash_dirs": 794, "linked": 794, "removed": 0 }, "step": "relink", "timestamp": 1618998730.166175, "total_parts": 506, "total_time": 5.320528984069824 } }, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 794, "hash_dirs": 794, "linked": 794, "removed": 0 }, "timestamp": 1618998730.166175, "total_parts": 506, "total_time": 5.320528984069824 } }, "workers": { "100": { "drives": ["sda1"], "return_code": 0, "timestamp": 1618998730.166175} }} Also, add a constant DEFAULT_RECON_CACHE_PATH to help fix failing tests by mocking recon_cache_path, so that errors are not logged due to dump_recon_cache exceptions. Mock recon_cache_path more widely and assert no error logs more widely. Change-Id: I625147dadd44f008a7c48eb5d6ac1c54c4c0ef05 |
||
|
Tim Burke
|
c374a7a851 |
Allow floats for all intervals
Change-Id: I91e9bc02d94fe7ea6e89307305705c383087845a |
||
|
Zuul
|
e8580f0346 | Merge "s3api: Add config option to return 429s on ratelimit" | ||
|
Tim Burke
|
abfa6bee72 |
relinker: Parallelize per disk
Add a new option, workers, that works more or less like the same option from background daemons. Disks will be distributed across N worker sub-processes so we can make the best use of the I/O available. While we're at it, log final stats at warning if there were errors. Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Change-Id: I039d2b8861f69a64bd9d2cdf68f1f534c236b2ba |
||
|
Zuul
|
7594d97f38 | Merge "relinker: retry links from older part powers" | ||
|
Alistair Coles
|
3bdd01cf4a |
relinker: retry links from older part powers
If a previous partition power increase failed to cleanup all files in their old partition locations, then during the next partition power increase the relinker may find the same file to relink in more than one source partition. This currently leads to an error log due to the second relink attempt getting an EEXIST error. With this patch, when an EEXIST is raised, the relinker will attempt to create/verify a link from older partition power locations to the next part power location, and if such a link is found then suppress the error log. During the relink step, if an alternative link is verified and if a file is found that is neither linked to the next partition power location nor in the current part power location, then the file is removed during the relink step. That prevents the same EEXIST occuring again during the cleanup step when it may no longer be possible to verify that an alternative link exists. For example, consider identical filenames in the N+1th, Nth and N-1th partition power locations, with the N+1th being linked to the Nth: - During relink, the Nth location is visited and its link is verified. Then the N-1th location is visited and an EEXIST error is encountered, but the new check verifies that a link exists to the Nth location, which is OK. - During cleanup the locations are visited in the same order, but files are removed so that the Nth location file no longer exists when the N-1th location is visited. If the N-1th location still has a conflicting file then existence of an alternative link to the Nth location can no longer be verified, so an error would be raised. Therefore, the N-1th location file must be removed during relink. The error is only suppressed for tombstones. The number of partition power location that the relinker will look back over may be configured using the link_check_limit option in a conf file or --link-check-limit on the command line, and defaults to 2. Closes-Bug: 1921718 Change-Id: If9beb9efabdad64e81d92708f862146d5fafb16c |
||
|
Alistair Coles
|
71a4aea31a |
Update docs to discourage policy names being numbers
There are times when it is convenient to specify a policy by name or by index (see Related-Change), but policy names can unfortunately collide with indexes. Using a number as a policy name should at least be discouraged. Change-Id: I0cdd3b86b527d6656b7fb50c699e3c0cc566e732 Related-Change: Icf1517bd930c74e9552b88250a7b4019e0ab413e |
||
|
Tim Burke
|
e35365df51 |
s3api: Add config option to return 429s on ratelimit
Change-Id: If04c083ccc9f63696b1f53ac13edc932740a0654 |
||
|
Zuul
|
310298a948 | Merge "s3api: Allow CORS preflight requests" | ||
|
Tim Burke
|
27a734c78a |
s3api: Allow CORS preflight requests
Unfortunately, we can't identify the user, so we can't map to an account, so we can't respect whatever CORS metadata might be set on the container. As a result, the allowed origins must be configured cluster-wide. Add a new config option, cors_preflight_allow_origin, for that; default it to blank (ie, deny preflights from all origins, preserving existing behavior), but allow either a comma-separated list of origins or * (to allow all origins). Change-Id: I985143bf03125a05792e79bc5e5f83722d6431b3 Co-Authored-By: Matthew Oliver <matt@oliver.net.au> |
||
|
Matthew Oliver
|
fb186f6710 |
Add a config file option to swift-manage-shard-ranges
While working on the shrinking recon drops, we want to display numbers that directly relate to how tool should behave. But currently all options of the s-m-s-r tool is driven by cli options. This creates a disconnect, defining what should be used in the sharder and in the tool via options are bound for failure. It would be much better to be able to define the required default options for your environment in one place that both the sharder and tool could use. This patch does some refactoring and adding max_shrinking and max_expanding options to the sharding config. As well as adds a --config option to the tool. The --config option expects a config with at '[container-sharder]' section. It only supports the shard options: - max_shrinking - max_expanding - shard_container_threshold - shard_shrink_point - shard_merge_point The latter 2 are used to generate the s-m-s-r's: - shrink_threshold - expansion_limit - rows_per_shard Use of cli arguments take precedence over that of the config. Change-Id: I4d0147ce284a1a318b3cd88975e060956d186aec |
||
|
Zuul
|
5c3eb488f2 | Merge "Report final in_progress when sharding is complete" | ||
|
Matthew Oliver
|
1de9834816 |
Report final in_progress when sharding is complete
On every sharder cycle up update in progress recon stats for each sharding container. However, we tend to not run it one final time once sharding is complete because the DB state is changed to SHARDED and therefore the in_progress stats never get their final update. For those collecting this data to monitor, this makes sharding/cleaving shards never complete. This patch, adds a new option `recon_shared_timeout` which will now allow sharded containers to be processed by `_record_sharding_progress()` after they've finished sharding for an amount of time. Change-Id: I5fa39d41f9cd3b211e45d2012fd709f4135f595e |
||
|
Zuul
|
0c2cc63b59 | Merge "tempauth: Add .reseller_reader group" | ||
|
Tim Burke
|
53c0fc3403 |
relinker: Add option to ratelimit relinking
Sure, you could use stuff like ionice or cgroups to limit relinker I/O, but sometimes a nice simple blunt instrument is handy. Change-Id: I7fe29c7913a9e09bdf7a787ccad8bba2c77cf995 |
||
|
Tim Burke
|
cf4f320644 |
tempauth: Add .reseller_reader group
Change-Id: I8c5197ed327fbb175c8a2c0e788b1ae14e6dfe23 |
||
|
Zuul
|
0c072e244c | Merge "relinker: Allow conf files for configuration" | ||
|
Pete Zaitcev
|
98a0275a9d |
Add a read-only role to keystoneauth
An idea was floated recently of a read-only role that can be used for cluster-wide audits, and is otherwise safe. It was also included into the "Consistent and Secure Default Policies" effort in OpenStack, where it implements "reader" personas in system, domain, and project scopes. This patch implements it for system scope, where it's most useful for operators. Change-Id: I5f5fff2e61a3e5fb4f4464262a8ea558a6e7d7ef |
||
|
Tim Burke
|
1b7dd34d38 |
relinker: Allow conf files for configuration
Swap out the standard logger stuff in place of --logfile. Keep --device as a CLI-only option. Everything else is pretty standard stuff that ought to be in [DEFAULT]. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I32f979f068592eaac39dcc6807b3114caeaaa814 |
||
|
Zuul
|
48b26ba833 | Merge "docs: Clarify that encryption should not be in reconciler pipeline" | ||
|
Tim Burke
|
13c0980e71 |
docs: Clarify that encryption should not be in reconciler pipeline
UpgradeImpact ============= Operators should verify that encryption is not enabled in their reconciler pipelines; having it enabled there may harm data durability. For more information, see https://launchpad.net/bugs/1910804 Change-Id: I1a1d78ed91d940ef0b4eba186dcafd714b4fb808 Closes-Bug: #1910804 |
||
|
Alistair Coles
|
6896f1f54b |
s3api: actually execute check_pipeline in real world
Previously, S3ApiMiddleware.check_pipeline would always exit early because the __file__ attribute of the Config instance passed to check_pipeline was never set. The __file__ key is typically passed to the S3ApiMiddleware constructor in the wsgi config dict, so this dict is now passed to check_pipeline() for it to test for the existence of __file__. Also, the use of a Config object is replaced with a dict where it mimics the wsgi conf object in the unit tests setup. UpgradeImpact ============= The bug prevented the pipeline order checks described in proxy-server.conf-sample being made on the proxy-server pipeline when s3api middleware was included. With this change, these checks will now be made and an invalid pipeline configuration will result in a ValueError being raised during proxy-server startup. A valid pipeline has another middleware (presumed to be an auth middleware) between s3api and the proxy-server app. If keystoneauth is found, then a further check is made that s3token is configured after s3api and before keystoneauth. The pipeline order checks can be disabled by setting the s3api auth_pipeline_check option to False in proxy-server.conf. This mitigation is recommended if previously operating with what will now be considered an invalid pipeline. The bug also prevented a check for slo middleware being in the pipeline between s3api and the proxy-server app. If the slo middleware is not found then multipart uploads will now not be supported, regardless of the value of the allow_multipart_uploads option described in proxy-server.conf-sample. In this case a warning will be logged during startup but no exception is raised. Closes-Bug: #1912391 Change-Id: I357537492733b97e5afab4a7b8e6a5c527c650e4 |
||
|
Tim Burke
|
10d9a737d8 |
s3api: Make allowable clock skew configurable
While we're at it, make the default match AWS's 15 minute limit (instead of our old 5 minute limit). UpgradeImpact ============= This (somewhat) weakens some security protections for requests over the S3 API; operators may want to preserve the prior behavior by setting allowable_clock_skew = 300 in the [filter:s3api] section of their proxy-server.conf Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I0da777fcccf056e537b48af4d3277835b265d5c9 |
||
|
Zuul
|
d5bb644a17 | Merge "Use cached shard ranges for container GETs" | ||
|
Zuul
|
8c611be876 | Merge "Memcached client TLS support" | ||
|
Grzegorz Grasza
|
6930bc24b2 |
Memcached client TLS support
This patch specifies a set of configuration options required to build a TLS context, which is used to wrap the client connection socket. Closes-Bug: #1906846 Change-Id: I03a92168b90508956f367fbb60b7712f95b97f60 |
||
|
Alistair Coles
|
077ba77ea6 |
Use cached shard ranges for container GETs
This patch makes four significant changes to the handling of GET requests for sharding or sharded containers: - container server GET requests may now result in the entire list of shard ranges being returned for the 'listing' state regardless of any request parameter constraints. - the proxy server may cache that list of shard ranges in memcache and the requests environ infocache dict, and subsequently use the cached shard ranges when handling GET requests for the same container. - the proxy now caches more container metadata so that it can synthesize a complete set of container GET response headers from cache. - the proxy server now enforces more container GET request validity checks that were previously only enforced by the backend server, e.g. checks for valid request parameter values With this change, when the proxy learns from container metadata that the container is sharded then it will cache shard ranges fetched from the backend during a container GET in memcache. On subsequent container GETs the proxy will use the cached shard ranges to gather object listings from shard containers, avoiding further GET requests to the root container until the cached shard ranges expire from cache. Cached shard ranges are most useful if they cover the entire object name space in the container. The proxy therefore uses a new X-Backend-Override-Shard-Name-Filter header to instruct the container server to ignore any request parameters that would constrain the returned shard range listing i.e. 'marker', 'end_marker', 'includes' and 'reverse' parameters. Having obtained the entire shard range listing (either from the server or from cache) the proxy now applies those request parameter constraints itself when constructing the client response. When using cached shard ranges the proxy will synthesize response headers from the container metadata that is also in cache. To enable the full set of container GET response headers to be synthezised in this way, the set of metadata that the proxy caches when handling a backend container GET response is expanded to include various timestamps. The X-Newest header may be used to disable looking up shard ranges in cache. Change-Id: I5fc696625d69d1ee9218ee2a508a1b9be6cf9685 |
||
|
Samuel Merritt
|
b971280907 |
Let developers/operators add watchers to object audit
Swift operators may find it useful to operate on each object in their cluster in some way. This commit provides them a way to hook into the object auditor with a simple, clearly-defined boundary so that they can iterate over their objects without additional disk IO. For example, a cluster operator may want to ensure a semantic consistency with all SLO segments accounted in their manifests, or locate objects that aren't in container listings. Now that Swift has encryption support, this could be used to locate unencrypted objects. The list goes on. This commit makes the auditor locate, via entry points, the watchers named in its config file. A watcher is a class with at least these four methods: __init__(self, conf, logger, **kwargs) start(self, audit_type, **kwargs) see_object(self, object_metadata, data_file_path, **kwargs) end(self, **kwargs) The auditor will call watcher.start(audit_type) at the start of an audit pass, watcher.see_object(...) for each object audited, and watcher.end() at the end of an audit pass. All method arguments are passed as keyword args. This version of the API is implemented on the context of the auditor itself, without spawning any additional processes. If the plugins are not working well -- hang, crash, or leak -- it's easier to debug them when there's no additional complication of processes that run by themselves. In addition, we include a reference implementation of plugin for the watcher API, as a help to plugin writers. Change-Id: I1be1faec53b2cdfaabf927598f1460e23c206b0a |
||
|
Zuul
|
ebfc3a61fa | Merge "Use socket_timeout kwarg instead of useless eventlet.wsgi.WRITE_TIMEOUT" | ||
|
Zuul
|
cd228fafad | Merge "Add a new URL parameter to allow for async cleanup of SLO segments" | ||
|
Tim Burke
|
918ab8543e |
Use socket_timeout kwarg instead of useless eventlet.wsgi.WRITE_TIMEOUT
No version of eventlet that I'm aware of hasany sort of support for eventlet.wsgi.WRITE_TIMEOUT; I don't know why we've been setting that. On the other hand, the socket_timeout argument for eventlet.wsgi.Server has been supported for a while -- since 0.14 in 2013. Drive-by: Fix up handling of sub-second client_timeouts. Change-Id: I1dca3c3a51a83c9d5212ee5a0ad2ba1343c68cf9 Related-Change: I1d4d028ac5e864084a9b7537b140229cb235c7a3 Related-Change: I433c97df99193ec31c863038b9b6fd20bb3705b8 |
||
|
Tim Burke
|
e78377624a |
Add a new URL parameter to allow for async cleanup of SLO segments
Add a new config option to SLO, allow_async_delete, to allow operators to opt-in to this new behavior. If their expirer queues get out of hand, they can always turn it back off. If the option is disabled, handle the delete inline; this matches the behavior of old Swift. Only allow an async delete if all segments are in the same container and none are nested SLOs, that way we only have two auth checks to make. Have s3api try to use this new mode if the data seems to have been uploaded via S3 (since it should be safe to assume that the above criteria are met). Drive-by: Allow the expirer queue and swift-container-deleter to use high-precision timestamps. Change-Id: I0bbe1ccd06776ef3e23438b40d8fb9a7c2de8921 |
||
|
Zuul
|
2593f7f264 | Merge "memcache: Make error-limiting values configurable" | ||
|
Tim Burke
|
aff65242ff |
memcache: Make error-limiting values configurable
Previously these were all hardcoded; let operators tweak them as needed. Significantly, this also allows operators to disable error-limiting entirely, which may be a useful protection in case proxies are configured with a single memcached server. Use error_suppression_limit and error_suppression_interval to mirror the option names used by the proxy-server to ratelimit backend Swift servers. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: Ife005cb8545dd966d7b0e34e5496a0354c003881 |
||
|
Zuul
|
b9a404b4d1 | Merge "ec: Add an option to write fragments with legacy crc" | ||
|
Clay Gerrard
|
b05ad82959 |
Add tasks_per_second option to expirer
This allows operators to throttle expirers as needed. Partial-Bug: #1784753 Change-Id: If75dabb431bddd4ad6100e41395bb6c31a4ce569 |
||
|
Tim Burke
|
599f63e762 |
ec: Add an option to write fragments with legacy crc
When upgrading from liberasurecode<=1.5.0, you may want to continue writing legacy CRCs until all nodes are upgraded and capabale of reading fragments with zlib CRCs. Starting in liberasurecode>=1.6.2, we can use the environment variable LIBERASURECODE_WRITE_LEGACY_CRC to control whether we write zlib or legacy CRCs, but for many operators it's easier to manage swift configs than environment variables. Add a new option, write_legacy_ec_crc, to the proxy-server app and object-reconstructor; if set to true, ensure legacy frags are written. Note that more daemons instantiate proxy-server apps than just the proxy-server. The complete set of impacted daemons should be: * proxy-server * object-reconstructor * container-reconciler * any users of internal-client.conf UpgradeImpact ============= To ensure a smooth liberasurecode upgrade: 1. Determine whether your cluster writes legacy or zlib CRCs. Depending on the order in which shared libraries are loaded, your servers may already be reading and writing zlib CRCs, even with old liberasurecode. In that case, no special action is required and WRITING LEGACY CRCS DURING THE UPGRADE WILL CAUSE AN OUTAGE. Just upgrade liberasurecode normally. See the closed bug for more information and a script to determine which CRC is used. 2. On all nodes, ensure Swift is upgraded to a version that includes write_legacy_ec_crc support and write_legacy_ec_crc is enabled on all daemons. 3. On each node, upgrade liberasurecode and restart Swift services. Because of (2), they will continue writing legacy CRCs which will still be readable by nodes that have not yet upgraded. 4. Once all nodes are upgraded, remove the write_legacy_ec_crc option from all configs across all nodes. After restarting daemons, they will write zlib CRCs which will also be readable by all nodes. Change-Id: Iff71069f808623453c0ff36b798559015e604c7d Related-Bug: #1666320 Closes-Bug: #1886088 Depends-On: https://review.opendev.org/#/c/738959/ |
||
|
Clay Gerrard
|
754defc39c |
Client should retry when there's just one 404 and a bunch of errors
During a rebalance, it's expected that we may get a 404 for data that does exist elsewhere in the cluster. Normally this isn't a problem; the proxy sees the 404, keeps digging, and one of the other primaries will serve the response. Previously, if the other replicas were heavily loaded, the proxy would see a bunch of timeouts and the fresh (empty) primary, treat the 404 as good, and send that on to the client. Now, have the proxy throw out that first 404 (provided it doesn't have a timestamp); it will then return a 503 to the client, indicating that it should try again. Add a new (per-policy) proxy-server config option, rebalance_missing_suppression_count; operators may use this to increase the number of 404-no-timestamp responses to discard if their rebalances are going faster than replication can keep up, or set it to zero to return to the previous behavior. Change-Id: If4bd39788642c00d66579b26144af8f116735b4d |
||
|
Zuul
|
cca5e8b1de | Merge "Make all concurrent_get options per-policy" | ||
|
Zuul
|
20e1544ad8 | Merge "Extend concurrent_gets to EC GET requests" | ||
|
Clay Gerrard
|
f043aedec1 |
Make all concurrent_get options per-policy
Change-Id: Ib81f77cc343c3435d7e6258d4631563fa022d449 |
||
|
Zuul
|
7015ac2fdc | Merge "py3: Work with proper native string paths in crypto meta" | ||
|
Clay Gerrard
|
8f60e0a260 |
Extend concurrent_gets to EC GET requests
After the initial requests are started, if the proxy still does not have enough backend responses to return a client response additional requests will be spawned to remaining primaries at the frequency configured by the concurrency_timeout. A new tunable concurrent_ec_extra_requests allows operators to control how many requests to backend fragments are started immediately with a client request to an object stored in an EC storage policy. By default the minimum ndata backend requests are started immediately, but operators may increase concurrent_ec_extra_requests up to nparity which is similar in effect to a concurrency_timeout of 0. Change-Id: Ia0a9398107a400815be2e0097b1b8e76336a0253 |
||
|
Zuul
|
a4f2252e2b | Merge "proxy-logging: Be able to configure log_route" | ||
|
Zuul
|
50800aba37 | Merge "Update SAIO & docker image to use 62xx ports" | ||
|
Tim Burke
|
7d429318dd |
py3: Work with proper native string paths in crypto meta
Previously, we would work with these paths as WSGI strings -- this would work fine when all data were read and written on the same major version of Python, but fail pretty badly during and after upgrading Python. In particular, if a py3 proxy-server tried to read existing data that was written down by a py2 proxy-server, it would hit an error and respond 500. Worse, if an un-upgraded py2 proxy tried to read data that was freshly-written by a py3 proxy, it would serve corrupt data back to the client (including a corrupt/invalid ETag and Content-Type). Now, ensure that both py2 and py3 write down paths as native strings. Make an effort to still work with WSGI-string metadata, though it can be ambiguous as to whether a string is a WSGI string or not. The heuristic used is if * the path from metadata does not match the (native-string) request path and * the path from metadata (when interpreted as a WSGI string) can be "un-wsgi-fied" without any encode/decode errors and * the native-string path from metadata *does* match the native-string request path then trust the path from the request. By contrast, we usually prefer the path from metadata in case there was a pipeline misconfiguration (see related bug). Add the ability to read and write a new, unambiguous version of metadata that always has the path as a native string. To support rolling upgrades, a new config option is added: meta_version_to_write. This defaults to 2 to support rolling upgrades without configuration changes, but the default may change to 3 in a future release. UpgradeImpact ============= When upgrading from Swift 2.20.0 or Swift 2.19.1 or earlier, set meta_version_to_write = 1 in your keymaster's configuration. Regardless of prior Swift version, set meta_version_to_write = 3 after upgrading all proxy servers. When switching from Python 2 to Python 3, first upgrade Swift while on Python 2, then upgrade to Python 3. Change-Id: I00c6693c42c1a0220b64d8016d380d5985339658 Closes-Bug: #1888037 Related-Bug: #1813725 |