swift



c26c7b8edd464d1fcf1d219f7c3fb040914d9da3

687 Commits

Author	SHA1	Message	Date
Zuul	e10c2bafcb	Merge "proxy-logging: create field for access_user_id"	2025-08-26 03:43:46 +00:00
Vitaly Bordyug	32eaab20b1	proxy-logging: create field for access_user_id Added the new field to be able to log the access key during the s3api calls, while reserving the field to be filled with auth relevant information in case of other middlewares. Added respective code to the tempauth and keystone middlewares. Since s3api creates a copy of the environ dict for the downstream request object when translating the s3req.to_swift_req the environ dict that is seen/modifed in other mw module is not the same instance seen in proxy-logging - using mutable objects get transfered into the swift_req.environ. Change the assert in test_proxy_logging from "the last field" to the index 21 in the interests of maintainability. Also added some regression tests for object, bucket and s3 v4 apis and updated the documentation with the details about the new field. Signed-off-by: Vitaly Bordyug <vbordug@gmail.com> Change-Id: I0ce4e92458e2b05a4848cc7675604c1aa2b64d64	2025-08-26 01:14:37 +00:00
Tim Burke	ae062f8b09	ring: Introduce a v2 ring format There's a bunch of moving pieces here: - Add a new RingWriter class. Stick it in a new swift.common.ring.io module. You can use it like the old gzip file, but you can also define named sections which can be referenced later on read. Section names may be arbitrary strings, but the "swift/" prefix is reserved for upstream use. Sections must contain a single length-value encoded BLOB. If sections are used, an additional BLOB is written at the end containing a JSON section-index, followed by an uncompressed offset for the index. Move RingReader to ring/io.py, too. - Clean up some ring metadata handling: - Drop MD5 tracking in RingReader. It was brittle at best anyway, and nothing uses it. YAGNI - Fix size/raw_size attributes when loading only metadata. - Add the ability to seek within RingReaders, though you need to know what you're doing and only seek to flush points. - Let RingBuilder objects change how wide their replica2part2dev_id arrays are. Add a dev_id_bytes key to serialized ring metadata. dev_id_bytes may be either 2 or 4, but 4 requires v2 rings. We considered allowing dev_id_bytes of 1, but dropped it as unnecessary complexity for a niche use case. - swift-ring-builder version subcommand added, which takes a ring. This lets operators see the serialization format of a ring on disk: $ swift-ring-builder object.ring.gz version object.ring.gz: Serialization version: 2 (2-byte IDs), build version: 54 Signed-off-by: Tim Burke <tim.burke@gmail.com> Change-Id: Ia0ac4ea2006d8965d7fdb6659d355c77386adb70	2025-07-21 11:37:15 -07:00
Tim Burke	74030236ad	tempauth: Support fernet tokens Tempauth fernet tokens use a secret shared among all proxies to encrypt user group information. Because they are encrypted, clients can neither view nor edit this information; it is an opaque bearer token similar to the existing memcached-backed tokens (just much longer). Note that tokens still expire after the configured token_life. Add a new set of config options of the form fernet_key_<keyid> = <32 url-safe base64-encoded bytes> Any of the configured keys will be used to attempt to decrypt tokens starting with "ftk" and extract group information. Another new config option active_fernet_key_id = <keyid> dictates which key should be used when minting tokens. Such tokens will start with "ftk" to distinguish them from memcached-backed tokens (which continue to start with "tk"). If active_fernet_key_id is not configured, memcached-backed tokens continue to be used. Together, these allow seamless transitions from memcached-backed tokens to fernet tokens, as well as transitions from one fernet key to another: 1. Add a new fernet_key_<keyid> entry. 2. Ensure all proxies have the new config with fernet_key_<keyid>. 3. Set active_fernet_key_id = <keyid>. 4. Ensure all proxies have the new config with the new active_fernet_key_id. This is similar to the key-rotation process for the encryption feature, except that old keys may be pruned following a token_life period. Additionally, opportunistically compress groups before minting tokens. Compressed tokens will begin with "zftk" but otherwise behave just like "ftk" tokens. Change-Id: I0bdc98765d05e91f872ef39d4722f91711a5641f	2025-04-25 14:49:12 -07:00
Clay Gerrard	0e2791a88a	Remove deprecated statsd label_mode Hopefully if we never do a release that supports signalfx no one will ever use it and we won't have to maintain it. Drive-by: refactor label model dispatch to fix a weird bug where a config name could be a class attribute and blow up weird. Change-Id: I2c67b59820c5ca094077bf47628426f4b0445ba0	2025-04-04 13:02:37 +01:00
Tim Burke	7e5235894b	stats: API for native labeled metrics Introduce a LabeledStatsdClient API; no callers yet. Include three config options: - statsd_label_mode, which specifies which label format to use - statsd_emit_legacy, which dictates whether to emit old-style metrics dotted metrics - statsd_user_label_<name> = <value>, which supports user defined labels in restricted ASCII characters Co-Authored-By: yanxiao@nvidia.com Co-Authored-By: alistairncoles@gmail.com Change-Id: I115ffb1dc601652a979895d7944e011b951a91c1	2025-04-03 14:26:08 -04:00
Clay Gerrard	b69a2bef45	Deprecate expirer options The following configuration options are deprecated: * expiring_objects_container_divisor * expiring_objects_account_name The upstream maintainers are not aware of any clusters where these have been configured to non-default values. UpgradeImpact: Operators are encouraged to remove their "container_divisor" setting and use the default value of 86400. If a cluster was deployed with a non-standard "account_name", operators should remove the option from all configs so they are using a supported configuration going forward, but will need to deploy stand-alone expirer processes with legacy expirer config to clean-up old expiration tasks from the previously configured account name. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Co-Authored-By: Jianjian Huo <jhuo@nvidia.com> Change-Id: I5ea9e6dc8b44c8c5f55837debe24dd76be7d6248	2025-02-07 08:33:34 -08:00
Tim Burke	ae6300af86	wsgi: Reap stale workers (after a timeout) following a reload Add a new tunable, `stale_worker_timeout`, defaulting to 86400 (i.e. 24 hours). Once this time elapses following a reload, the manager process will issue SIGKILLs to any remaining stale workers. This gives operators a way to configure a limit for how long old code and configs may still be running in their cluster. To enable this, the temporary reload child (which waits for the reload to complete then closes the accept socket on all the old workers) has grown the ability to send state to the re-exec'ed manager. Currently, this is limited to just the set of pre-re-exec child PIDs and their reload times, though it was designed to be reasonably extensible. This allows the new manager to recognize stale workers as they exit instead of logging Ignoring wait() result from unknown PID ... With the improved knowledge of subprocesses, we can kick the log level for the above message up from info to warning; we no longer expect it to trigger in practice. Drive-by: Add logging to ServersPerPortStrategy.register_worker_exit that's comparable to what WorkersStrategy does. Change-Id: I8227939d04fda8db66fb2f131f2c71ce8741c7d9	2025-01-16 13:44:21 +11:00
Zuul	94d3a5dee8	Merge "obj: Add option to tune down etag validation in object-server"	2025-01-08 20:59:29 +00:00
Tim Burke	3d8fb046cb	obj: Add option to tune down etag validation in object-server Historically, the object-server would validate the ETag of an object whenever it was streaming the complete object. This minimizes the possibility of returning corrupted data to clients, but - Clients that only ever make ranged requests get no benefit and - MD5 can be rather CPU-intensive; this is especially noticeable in all-flash clusters/policies where Swift is not disk-constrained. Add a new `etag_validate_pct` option to tune down this validation. This takes values from 100 (default; all whole-object downloads are validated) down to 0 (none are). Note that even with etag validation turned off, the object-auditor should eventually detect and quarantine corrupted objects. However, transient read errors may cause clients to download corrupted data. Hat-tip to Jianjian for all the profiling work! Co-Authored-By: Jianjian Huo <jhuo@nvidia.com> Change-Id: Iae48e8db642f6772114c0ae7c6bdd9c653cd035b	2025-01-08 18:21:30 +00:00
Tim Burke	a55a48ffc8	docs: Call out that xprofile is not intended for production Change-Id: I1e9d4d5df403040d69db93a08647cd0abe1b8037	2024-12-10 15:17:11 -08:00
Jianjian Huo	ea1d84c1d7	Object-server: add periodic greenthread yielding during file write Currently, when object-server serves PUT request and DiskFile writer write file chunks to disk, there is no explicit eventlet sleep called. When network outpace the slow disk IO, it's possible one large and slow PUT request could cause eventlet hub not to schedule any other green threads for a long period of time. To improve this, this patch enable the configurable yield parameter 'cooperative_period' into object server controller write path. Related-Change: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4 Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Change-Id: I1c0aba9830433f093d024b4c39cd3a3b2f0d69f1	2024-11-29 13:59:25 +00:00
Zuul	7662cde704	Merge "Add oldest failed async pending tracker"	2024-11-05 08:22:10 +00:00
Chinemerem	0a5348eb48	Add oldest failed async pending tracker In the past we have had some async pendings that repeatedly fail for months at a time. This patch adds an OldestAsyncPendingTracker class which manages the tracking of the oldest async pending updates for each account-container pair. This class maintains timestamps for pending updates associated with account-container pairs. It evicts the newest pairs when the max_entries is reached. It supports retrieving the N oldest pending updates or calculating the age of the oldest pending update. Change-Id: I6d9667d555836cfceda52708a57a1d29ebd1a80b	2024-11-01 15:49:53 -07:00
Clay Gerrard	df22032d79	object-expirer: add round_robin_cache_size option Drive-Bys: * DRY out redundent configuration examples in expiring objects overview documentation. * Add missing delay_reaping man page docs. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I8879dbd13527233c878dff764ec411ce9619ee39	2024-11-01 09:54:54 +00:00
Tim Burke	ef8764cb06	logging: Add UPDATE to valid http methods We introduced this a while back, but forgot to add it then. Related-Change: Ia13ee5da3d1b5c536eccaadc7a6fdcd997374443 Change-Id: Ib65ddf50d7f5c3e27475626000943eb18e65c73a	2024-10-09 08:18:49 -07:00
Alistair Coles	d555755423	proxy_logging config: unit tests and doc pointers Add unit tests to verify the precedence of access_log_ and log_ prefixes to options. Add pointers from proxy_logging sections in other sample config files to the proxy-server.conf-sample file. Change-Id: Id18176d3790fd187e304f0e33e3f74a94dc5305c	2024-07-16 11:33:58 +01:00
Thomas Goirand	90da23c7d2	kms_keymaster: allow specifying barbican_endpoint Under a multi-region deployment with a single Keystone server, specifying the Keystone auth credentials isn't enough. Indeed, Castellan succeeds when logging-in, but may use the wrong Barbican endpoint (if there are 2 Barbican deployed). This is what happened to us, when deploying our 2nd region. They way to fix it would be to tell Castellan what region to use, unfortunately, there's no such option in Castellan. Though we may specify the barbican_endpoint, which is what this patch allows. Change-Id: Ib7f4219ef5fdef65e9cfd5701e28b5288741783e	2024-06-14 12:17:07 +02:00
Zuul	d1aa735a37	Merge "backend ratelimit: support per-method rate limits"	2024-05-13 16:11:19 +00:00
Zuul	bf206ed2fe	Merge "backend ratelimit: support reloadable config file"	2024-05-11 20:26:40 +00:00
Zuul	937af35e62	Merge "object-expirer: add example to delay_reaping sample config"	2024-04-26 15:14:34 +00:00
indianwhocodes	11eb17d3b2	support x-open-expired header for expired objects If the global configuration option 'enable_open_expired' is set to true in the config, then the client will be able to make a request with the header 'x-open-expired' set to true in order to access an object that has expired, provided it is in its grace period. If this config flag is set to false, the client will not be able to access any expired objects, even with the header, which is the default behavior unless the flag is set. When a client sets a 'x-open-expired' header to a true value for a GET/HEAD/POST request the proxy will forward x-backend-open-expired to storage server. The storage server will allow clients that set x-backend-open-expired to open and read an object that has not yet been reaped by the object-expirer, even after the x-delete-at time has passed. The header is always ignored when used with temporary URLs. Co-Authored-By: Anish Kachinthaya <akachinthaya@nvidia.com> Related-Change: I106103438c4162a561486ac73a09436e998ae1f0 Change-Id: Ibe7dde0e3bf587d77e14808b169c02f8fb3dddb3	2024-04-26 10:13:40 +01:00
Alistair Coles	ce619137db	object-expirer: add example to delay_reaping sample config Add an example of a delay_reaping config option with quoted key. Change-Id: I0c7ead6795822ea0fb0e81abc1e4685d7946942c Related-Change: I106103438c4162a561486ac73a09436e998ae1f0	2024-04-26 09:37:40 +01:00
Mandell Degerness	5961ba0ca7	expirer: account and container level delay_reaping The object expirer can be configured to delay the reaping of objects from disk after their expiration time using account and container level delay_reaping values. The delay_reaping value of accounts and containers in seconds is configured in the object server config. The object expirer references these configured values to only reap objects from specified accounts and containers after their corresponding delays. The goal of the delay_reaping feature is to prevent accidental or premature data loss if an object marked for deletion with the 'x-delete-at' feature should not be reaped immediately, for whatever reason. Configuring the delay_reaping value at a granular account and container level is beneficial for being able to keep storage capacity consumption in control while maintaining a desired data recovery window. This patch also adds a sample configuration, documentation, and tests for bad configurations and grace period functionality. Co-Authored-By: Anish Kachinthaya <akachinthaya@nvidia.com> Change-Id: I106103438c4162a561486ac73a09436e998ae1f0	2024-04-25 13:59:36 -07:00
Alistair Coles	3517ca453e	backend ratelimit: support per-method rate limits Add support for config options such as: head_requests_per_device_per_second = 100 Change-Id: I2936f799b6112155ff01dcd8e1f985849a1af178	2024-03-12 15:54:54 +00:00
Alistair Coles	e9abfd76ee	backend ratelimit: support reloadable config file Add support for a backend_ratelimit_conf_path option in the [filter:backend_ratelimit] config. If specified then the middleware will give precedence to config options from that file over config options from the [filter:backend_ratelimit] section. The path defaults to /etc/swift/backend-ratelimit.conf. The config file is periodically reloaded and any changed options are applied. The middleware will log a warning the first time it fails to load a config file that had previously been successfully loaded. The middleware also logs at info level when it first successfully loads a config file that had previously failed to be loaded. Otherwise, the middleware will log when a config file is loaded that results in the config being changed. Change-Id: I6554e37c6ab5b0a260f99b54169cb90ab5718f81	2024-03-11 18:10:24 +00:00
Tim Burke	6a426f7fa0	sharder: Add periodic_warnings_interval to example config Change-Id: Ie3c64646373580b70557f2720a13a5a0c5ef7097	2024-03-11 10:35:13 -07:00
Zuul	07c8e8bcdc	Merge "Object-server: add periodic greenthread yielding during file read."	2024-02-27 04:03:00 +00:00
Jianjian Huo	d5877179a5	Object-server: add periodic greenthread yielding during file read. Currently, when object-server serves GET request and DiskFile reader iterate over disk file chunks, there is no explicit eventlet sleep called. When network outpace the slow disk IO, it's possible one large and slow GET request could cause eventlet hub not to schedule any other green threads for a long period of time. To improve this, this patch add a configurable sleep parameter into DiskFile reader, which is 'cooperative_period' with a default value of 0 (disabled). Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Change-Id: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4	2024-02-27 11:24:41 +11:00
Alistair Coles	2500fbeea9	proxy: don't use recoverable_node_timeout with x-newest Object GET requests with a truthy X-Newest header are not resumed if a backend request times out. The GetOrHeadHandler therefore uses the regular node_timeout when waiting for a backend connection response, rather than the possibly shorter recoverable_node_timeout. However, previously while reading data from a backend response the recoverable_node_timeout would still be used with X-Newest requests. This patch simplifies GetOrHeadHandler to never use recoverable_node_timeout when X-Newest is truthy. Change-Id: I326278ecb21465f519b281c9f6c2dedbcbb5ff14	2024-02-26 09:54:36 +00:00
Takashi Kajinami	bd64748a03	Document allowed_digests for formpost middleware The allowed_digests option were added to the formpost middleware in addition to the tempurl middleware[1], but the option was not added to the formpost section in the example proxy config file. [1] `2d063cd61f` Change-Id: Ic885e8bde7c1bbb3d93d032080b591db1de80970	2023-12-25 17:17:39 +09:00
Tim Burke	0c9b545ea7	docs: Clean up proxy logging docs Change-Id: I6ef909e826d3901f24d3c42a78d2ab1e4e47bb64	2023-08-04 11:30:42 -07:00
Jianjian Huo	cb1e584e64	Object-server: keep SLO manifest files in page cache. Currently, SLO manifest files will be evicted from page cache after reading it, which cause hard drives very busy when user requests a lot of parallel byte range GETs for a particular SLO object. This patch will add a new config 'keep_cache_slo_manifest', and try keeping the manifest files in page cache by not evicting them after reading if config settings allow so. Co-Authored-By: Tim Burke <tim.burke@gmail.com> Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I557bd01643375d7ad68c3031430899b85908a54f	2023-07-07 12:48:24 -07:00
Tim Burke	469c38e9fb	wsgi: Add keepalive_timeout option Clients sometimes hold open connections "just in case" they might later pipeline requests. This can cause issues for proxies, especially if operators restrict max_clients in an effort to improve response times for the requests that do get serviced. Add a new keepalive_timeout option to give proxies a way to drop these established-but-idle connections without impacting active connections (as may happen when reducing client_timeout). Note that this requires eventlet 0.33.4 or later. Change-Id: Ib5bb84fa3f8a4b9c062d58c8d3689e7030d9feb3	2023-04-18 11:49:05 -07:00
Zuul	5fae344ef4	Merge "internal_client: Remove allow_modify_pipeline option"	2023-04-14 17:30:51 +00:00
Matthew Oliver	e5105ffa09	internal_client: Remove allow_modify_pipeline option The internal client is suppose to be internal to the cluster, and as such we rely on it to not remove any headers we decide to send. However if the allow_modify_pipeline option is set the gatekeeper middleware is added to the internal client's proxy pipeline. So firstly, this patch removes the allow_modify_pipeline option from the internal client constructor. And when calling loadapp allow_modify_pipeline is always passed with a False. Further, an op could directly put the gatekeeper middleware into the internal client config. The internal client constructor will now check the pipeline and raise a ValueError if one has been placed in the pipeline. To do this, there is now a check_gatekeeper_loaded staticmethod that will walk the pipeline which called from the InternalClient.__init__ method. Enabling this walking through the pipeline, we are now stashing the wsgi pipeline in each filter so that we don't have to rely on 'app' naming conventions to iterate the pipeline. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: Idcca7ac0796935c8883de9084d612d64159d9f92	2023-04-14 10:37:40 +01:00
Tim Burke	cbba65ac91	quotas: Add account-level per-policy quotas Reseller admins can set new headers on accounts like X-Account-Quota-Bytes-Policy-<policy-name>: <quota> This may be done to limit consumption of a faster, all-flash policy, for example. This is independent of the existing X-Account-Meta-Quota-Bytes header, which continues to limit the total storage for an account across all policies. Change-Id: Ib25c2f667e5b81301f8c67375644981a13487cfe	2023-03-21 17:27:31 +00:00
Zuul	0470994a03	Merge "slo: Default allow_async_delete to true"	2022-12-01 19:25:50 +00:00
Jianjian Huo	4ed2b89cb7	Sharder: warn when sharding appears to have stalled. This patch add a configurable timeout after which the sharder will warn if a container DB has not completed sharding. The new config is container_sharding_timeout with a default of 172800 seconds (2 days). Drive-by fix: recording sharding progress will cover the case of shard range shrinking too. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I6ce299b5232a8f394e35f148317f9e08208a0c0f	2022-10-14 08:54:06 -07:00
Zuul	8ab6af27c5	Merge "proxy: Add a chance to skip memcache for get_*_info calls"	2022-09-26 19:08:11 +00:00
Zuul	b05b27c0b6	Merge "Add note about rsync_bwlimit suffixes"	2022-08-30 22:53:02 +00:00
Tim Burke	5c6407bf59	proxy: Add a chance to skip memcache for get_*_info calls If you've got thousands of requests per second for objects in a single container, you basically NEVER want that container's info to ever fall out of memcache. If it *does*, all those clients are almost certainly going to overload the container. Avoid this by allowing some small fraction of requests to bypass and refresh the cache, pushing out the TTL as long as there continue to be requests to the container. The likelihood of skipping the cache is configurable, similar to what we did for shard range sets. Change-Id: If9249a42b30e2a2e7c4b0b91f947f24bf891b86f Closes-Bug: #1883324	2022-08-30 18:49:48 +10:00
Zuul	24acc6e56b	Merge "Add backend rate limiting middleware"	2022-08-30 07:18:57 +00:00
Tim Burke	a9177a4b9d	Add note about rsync_bwlimit suffixes Change-Id: I019451e118d3bd7263a52cf4bf354d0d0d2b4607	2022-08-26 08:54:06 -07:00
Tim Burke	f6196b0a22	AUTHORS/CHANGELOG for 2.30.0 Change-Id: If7c9e13fc62f8104ccb70a12b9c839f78e7e6e3e	2022-08-17 22:21:45 -07:00
Zuul	5ff37a0d5e	Merge "DB Replicator: Add handoff_delete option"	2022-07-22 01:45:31 +00:00
Matthew Oliver	bf4edefce4	DB Replicator: Add handoff_delete option Currently the object-replicator has an option called `handoff_delete` which allows us to define the the number of replicas which are ensured in swift. Once a handoff node ensures that many successful responses it can go ahead and delete the handoff partition. By default it's 'auto' or rather the number of primary nodes. But this can be reduced. It's useful in draining full disks, but has to be used carefully. This patch adds the same option to the DB replicator and works the same way. But instead of deleting a partition it's done at the per DB level. Because it's done in the DB Replicator level it means the option is now available to both the Account and Container replicators. Change-Id: Ide739a6d805bda20071c7977f5083574a5345a33	2022-07-21 13:35:24 +10:00
Zuul	73b2730f71	Merge "Add ring_ip option to object services"	2022-06-06 21:04:48 +00:00
Clay Gerrard	12bc79bf01	Add ring_ip option to object services This will be used when finding their own devices in rings, defaulting to the bind_ip. Notably, this allows services to be containerized while servers_per_port is enabled: * For the object-server, the ring_ip should be set to the host ip and will be used to discover which ports need binding. Sockets will still be bound to the bind_ip (likely 0.0.0.0), with the assumption that the host will publish ports 1:1. * For the replicator and reconstructor, the ring_ip will be used to discover which devices should be replicated. While bind_ip could previously be used for this, it would have required a separate config from the object-server. Also rename object deamon's bind_ip attribute to ring_ip so that it's more obvious wherever we're using the IP for ring lookups instead of socket binding. Co-Authored-By: Tim Burke <tim.burke@gmail.com> Change-Id: I1c9bb8086994f7930acd8cda8f56e766938c2218	2022-06-02 16:31:29 -05:00
Zuul	5398204f22	Merge "tempurl: Deprecate sha1 signatures"	2022-06-01 15:54:25 +00:00
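
Commit 0a5348eb48 in the list above describes an OldestAsyncPendingTracker that keeps the oldest pending-update timestamp per account-container pair, evicts the newest pairs once max_entries is reached, and can report the N oldest entries or the age of the oldest. The following is only a minimal sketch of that behaviour as the commit message states it, not Swift's actual code: the class name OldestAsyncPendingSketch and the method names add_update, get_n_oldest and get_oldest_age are hypothetical, chosen for illustration.

```python
import heapq
import time


class OldestAsyncPendingSketch:
    """Hypothetical illustration only; not the class added by the commit."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        # (account, container) -> oldest pending-update timestamp seen so far
        self.oldest = {}

    def add_update(self, account, container, timestamp):
        key = (account, container)
        current = self.oldest.get(key)
        # Keep only the oldest timestamp for each account-container pair.
        if current is None or timestamp < current:
            self.oldest[key] = timestamp
        # Over capacity: evict the pair whose oldest update is the newest,
        # so the entries that have been failing longest are retained.
        if len(self.oldest) > self.max_entries:
            newest_key = max(self.oldest, key=self.oldest.get)
            del self.oldest[newest_key]

    def get_n_oldest(self, n):
        # Returns up to n ((account, container), timestamp) tuples,
        # ordered from oldest to newest.
        return heapq.nsmallest(n, self.oldest.items(), key=lambda kv: kv[1])

    def get_oldest_age(self, now=None):
        # Age in seconds of the oldest tracked update, or None if empty.
        if not self.oldest:
            return None
        now = time.time() if now is None else now
        return now - min(self.oldest.values())
```

Evicting by scanning for the maximum is linear in max_entries; that keeps the sketch short, and a real implementation may well choose a different data structure.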


openstack/swift - swift - OpenDev: Free Software Needs Free Tools
