11125 Commits

Author SHA1 Message Date
Tim Burke
0dfa38d025 docs: Fix version call-out for stale_worker_timeout
Related-Change: I8227939d04fda8db66fb2f131f2c71ce8741c7d9
Change-Id: I149a2df2d942bba02049947865b000c9cf1a89bc
2025-01-16 14:21:55 -08:00
Tim Burke
ae6300af86 wsgi: Reap stale workers (after a timeout) following a reload
Add a new tunable, `stale_worker_timeout`, defaulting to 86400 (i.e. 24
hours). Once this time elapses following a reload, the manager process
will issue SIGKILLs to any remaining stale workers.
This gives operators a way to configure a limit for how long old code
and configs may still be running in their cluster.
To enable this, the temporary reload child (which waits for the reload
to complete then closes the accept socket on all the old workers) has
grown the ability to send state to the re-exec'ed manager. Currently,
this is limited to just the set of pre-re-exec child PIDs and their
reload times, though it was designed to be reasonably extensible.
This allows the new manager to recognize stale workers as they exit
instead of logging
 Ignoring wait() result from unknown PID ...
With the improved knowledge of subprocesses, we can kick the log level
for the above message up from info to warning; we no longer expect it
to trigger in practice.
Drive-by: Add logging to ServersPerPortStrategy.register_worker_exit
that's comparable to what WorkersStrategy does.
Change-Id: I8227939d04fda8db66fb2f131f2c71ce8741c7d9
2025-01-16 13:44:21 +11:00
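The reaping behaviour described for `stale_worker_timeout` can be sketched roughly as follows. This is a minimal illustration, not Swift's actual implementation: the function names, the `stale_workers` mapping of PID to reload time, and the injectable `kill` and `now` parameters are all assumptions made for the example.

```python
import os
import signal
import time


def find_expired_workers(stale_workers, stale_worker_timeout, now=None):
    """Return the PIDs whose reload happened at least `timeout` seconds ago.

    `stale_workers` is a hypothetical {pid: reload_time} mapping.
    """
    now = time.time() if now is None else now
    return sorted(
        pid for pid, reload_time in stale_workers.items()
        if now - reload_time >= stale_worker_timeout
    )


def reap_stale_workers(stale_workers, stale_worker_timeout,
                       kill=os.kill, now=None):
    """SIGKILL every expired stale worker and forget about it."""
    expired = find_expired_workers(stale_workers, stale_worker_timeout, now)
    for pid in expired:
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the worker exited on its own in the meantime
        stale_workers.pop(pid, None)
    return expired
```

Passing `kill` and `now` as parameters keeps the sketch testable without signalling real processes.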
Zuul
734ed9cdd8 Merge "Remove py2-only code paths" 2025-01-15 22:39:06 +00:00
Zuul
768571d807 Merge "CI: document bandit tests by reference" 2025-01-15 22:04:57 +00:00
Tim Burke
a630e76d6c versioning: 411 PUTs with neither content-length nor transfer-encoding
... just like we would do in a normal container. Previously, we'd try to
read a byte from the client which, due to a bug in eventlet HTTP framing,
would either hang until we hit a timeout or, worse, read from the next
pipelined request.
This required that we reset a (repeatedly-reused!) request in s3api to
have an empty body, or it would start triggering 411s, too.
See also: https://github.com/eventlet/eventlet/pull/985
Closes-Bug: #2081103
Change-Id: I56c1ecc4edb953c0bade8744e4bed584099f29c7
2025-01-15 12:52:09 -08:00
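The check described in the versioning commit above amounts to rejecting a PUT that carries neither framing header, rather than attempting to read a body. A minimal sketch, assuming a hypothetical helper that receives the method and a dict of lower-cased header names (this is not the middleware's real code):

```python
HTTP_LENGTH_REQUIRED = 411


def needs_length_required(method, headers):
    """Return True when a PUT has no way to frame its request body.

    `headers` is assumed to be a dict keyed by lower-cased header names.
    """
    if method != 'PUT':
        return False
    return ('content-length' not in headers
            and 'transfer-encoding' not in headers)


def status_for(method, headers):
    """Pick 411 for unframed PUTs; None means 'carry on as usual'."""
    if needs_length_required(method, headers):
        return HTTP_LENGTH_REQUIRED
    return None
```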
Zuul
9b4d008390 Merge "Leave updater per-device stats in recon for debugging" 2025-01-15 18:31:47 +00:00
Alistair Coles
aa0429ce00 CI: document bandit tests by reference
The available bandit tests change with time (e.g. the
Related-Change). We shouldn't try to maintain the list.
Related-Change: Ie668d49a56c0a6542d28128656cfd44f7c089ec4
Change-Id: I6eb106abbac28ffbb9a3f64e8aa60218cbe75682
2025-01-14 09:34:46 +00:00
Tim Burke
f95315b711 CI: Remove B320 and B410 bandit skips
They were removed upstream recently, so now Bandit is complaining about
the unknown test.
See https://github.com/PyCQA/bandit/pull/1212
Change-Id: Ie668d49a56c0a6542d28128656cfd44f7c089ec4
2025-01-13 15:20:16 -08:00
Tim Burke
128124cdd8 Remove py2-only code paths
Change-Id: Ic66b9ae89837afe31929ce07cc625dfc28314ea3
2025-01-13 13:36:41 -08:00
Zuul
94d3a5dee8 Merge "obj: Add option to tune down etag validation in object-server" 2025-01-08 20:59:29 +00:00
Tim Burke
3d8fb046cb obj: Add option to tune down etag validation in object-server
Historically, the object-server would validate the ETag of an object
whenever it was streaming the complete object. This minimizes the
possibility of returning corrupted data to clients, but
- Clients that only ever make ranged requests get no benefit and
- MD5 can be rather CPU-intensive; this is especially noticeable
 in all-flash clusters/policies where Swift is not disk-constrained.
Add a new `etag_validate_pct` option to tune down this validation.
This takes values from 100 (default; all whole-object downloads are
validated) down to 0 (none are).
Note that even with etag validation turned off, the object-auditor
should eventually detect and quarantine corrupted objects. However,
transient read errors may cause clients to download corrupted data.
Hat-tip to Jianjian for all the profiling work!
Co-Authored-By: Jianjian Huo <jhuo@nvidia.com>
Change-Id: Iae48e8db642f6772114c0ae7c6bdd9c653cd035b
2025-01-08 18:21:30 +00:00
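One plausible way to read `etag_validate_pct` is as a sampling rate applied to whole-object downloads. The sketch below assumes random sampling, which the commit message does not confirm; the function name and the `rng` parameter are hypothetical.

```python
import random


def should_validate_etag(etag_validate_pct, is_whole_object,
                         rng=random.random):
    """Decide whether to checksum this download.

    Assumption: validation is sampled at random. Ranged requests are
    never validated, matching the historical behaviour described above.
    """
    if not is_whole_object:
        return False
    # rng() is in [0.0, 1.0), so 100 always validates and 0 never does.
    return rng() * 100 < etag_validate_pct
```

With this shape, a value of 25 would validate roughly a quarter of whole-object downloads and leave the rest to the object-auditor.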
Zuul
4d4e65904a Merge "Improve get_logger tests re. statsd prefix" 2025-01-08 17:18:18 +00:00
Zuul
7828c233a1 Merge "tests: use a method to test a method" 2025-01-08 10:56:34 +00:00
Alistair Coles
1bc0507c13 Improve get_logger tests re. statsd prefix
Add test for mutilated statsd client
Drive-by: revert unnecessary whitespace change in Related-Change.
Related-Change: I3a677bb67c5700da48f89c847f652b4610ba47c2
Co-Authored-By: Shreeya Deshpande <shreeyad@nvidia.com>
Co-Authored-By: Clay Gerrard <cgerrard@nvidia.com>
Change-Id: Id262bbcf0b233f9728f55be208bee5bc146c053d
2025-01-08 09:41:10 +00:00
Zuul
c10ca639e6 Merge "tests: relocate some logging related unit tests" 2025-01-07 23:46:14 +00:00
Clay Gerrard
6a633d06fb tests: use a method to test a method
Change-Id: I14ea6bbd55512e8798e5cadf4f6e68f95961206b
2025-01-07 16:38:08 -06:00
Alistair Coles
7d3a32d107 tests: relocate some logging related unit tests
The Related-Changes moved logging and statsd components into new
modules and decoupled statsd from the logs module. This patch attempts
to re-locate the related unit tests to appropriate modules.
* tests for utils functions such as get_logger should be in
 test_utils.py.
* tests for log functions such as get_swift_logger should be in
 test_logs.py.
* tests for statsd client functions should be in
 test_statsd_client.py.
* tests related to patching a SwiftLogAdapter with a StatsdClient
 interface should be in test_utils.py.
Change-Id: I3a677bb67c5700da48f89c847f652b4610ba47c2
Related-Change: I44694b92264066ca427bb96456d6f944e09b31c0
Related-Change: I8988c0add6bb4a65cc8be38f0bf527f141aac48a
Related-Change: Ie73988edf6be0e38d9004bee04ff46c906a759ff
Related-Change: I5ae2cc5c257fb8d7eab885977d9d9cf602224ec7
Related-Change: I4b5b12a3b0288b696a39903264741bc862a94ad7
Related-Change: Ic4b5005e3efffa8dba17d91a41e46d5c68533f9a
2025-01-07 10:22:44 -05:00
Clay Gerrard
e39a0d1959 Leave updater per-device stats in recon for debugging
It's confusing and unnecessary to have the last cycle's per-device
object-updater stats reaped from recon immediately during aggregation;
doing so makes it impossible to debug stats aggregation.
Drive-by: fix some bugs with stats aggregation
Change-Id: I9df7c2d1c31646a3200614b629598576eb9e64c0
2025-01-06 17:11:31 -06:00
Zuul
95b9e6e335 Merge "CI: Add Dalmatian upgrade job" 2025-01-03 19:08:35 +00:00
Zuul
06e09b9ece Merge "Drop py2 support" 2025-01-01 07:20:05 +00:00
Tim Burke
1f0777d96c tests: Enforce sorted listdir results in test_updater
Previously, we were relying on some xfs-specific return order.
Change-Id: If9a0fdb3749a18a9479f20fb174e0c1908a783bb
2024-12-30 21:54:37 -08:00
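The `test_updater` fix above rests on the fact that `os.listdir` returns entries in arbitrary, filesystem-dependent order. A small sketch of making that order deterministic follows; the helper names are hypothetical and this is not the test code from the commit.

```python
import os
import tempfile


def listdir_sorted(path):
    """List a directory in a stable, filesystem-independent order."""
    return sorted(os.listdir(path))


def demo():
    """Create files out of order and list them deterministically."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ('b', 'c', 'a'):
            with open(os.path.join(tmp, name), 'w'):
                pass
        return listdir_sorted(tmp)
```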
Tim Burke
1f35e0c10f CI: Add Dalmatian upgrade job
Change-Id: Ia028624bded221c3bf03a8d3dac94183d4388431
2024-12-29 09:27:11 -08:00
Elod Illes
3c9838101a [CI] Remove old experimental rolling upgrade job
This patch removes swift-multinode-rolling-upgrade-victoria job for
multiple reasons:
- victoria is very old and unmaintained
- job is defined only on unmaintained branches
- py38 + CentOS Stream 8 are EOL'd and the job is based on these
Change-Id: I3d6a679e6553534e937303b5210125a6ef8365af
2024-12-22 12:06:41 +01:00
Tim Burke
7367907c58 Drop py2 support
* Remove py2 gate jobs.
* Build non-universal, py3-only wheels.
* Specify minimum python version in package metadata.
* Clean up requirements/constraints/bindep (a little, anyway).
Change-Id: I53153c4fde043e964e1daa7bbf2089e0471dede2
2024-12-20 09:11:14 -08:00
Zuul
5d1dbbccbe Merge "docs: Changed OS version to RHEL 9 and CentOS Stream 9." 2024-12-20 16:28:23 +00:00
ngcjny
17f77b2d76 docs: Changed OS version to RHEL 9 and CentOS Stream 9.
Changed OS version from RHEL 7 and CentOS 7 to RHEL 9 and
CentOS Stream 9.
Changed python to python3.
Changed yum command to dnf command.
Change-Id: Ie1e815c0434255e77ef5e9103576f85d9d6490ae
2024-12-20 16:11:33 +00:00
Zuul
b7c228c234 Merge "trivial: Enable a couple off-by-default hacking checks" 2024-12-20 13:02:38 +00:00
Zuul
78fd4e6bfa Merge "Require that updater_workers be a positive integer" 2024-12-20 00:23:16 +00:00
Chinemerem
fbfdc89df5 Require that updater_workers be a positive integer
Previously, it was possible for updater_workers to be a negative integer
or zero. This change enforces that updater_workers should be a positive
integer.
Change-Id: Ie40194b406aeedcf8c38a3c273ab768e2b643a5d
2024-12-19 21:54:32 +00:00
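The `updater_workers` constraint above could be enforced with a check along these lines. The helper name `positive_int` is hypothetical and is defined here only for illustration; it is not taken from the commit.

```python
def positive_int(value, name):
    """Parse a config value and insist that it is an integer >= 1."""
    try:
        result = int(value)
    except (TypeError, ValueError):
        raise ValueError('%s must be a positive integer, not %r'
                         % (name, value))
    if result < 1:
        raise ValueError('%s must be a positive integer, not %r'
                         % (name, value))
    return result
```

For example, `positive_int('4', 'updater_workers')` returns `4`, while `'0'` or `'-1'` raises `ValueError` at config-load time instead of misbehaving later.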
Chinemerem
5281af5cf2 Add object_updater_last stat
Change-Id: I22674f2e887bdeeffe325efd2898fb90faa4235f
2024-12-19 11:10:52 -08:00
Zuul
ea06ed4494 Merge "Aggregate per-disk recon stats" 2024-12-19 18:35:24 +00:00
Tim Burke
4b3696003c trivial: Enable a couple off-by-default hacking checks
H106 and H904 were already passing anyway.
Change-Id: Ic386e09e40a49b661f30ea40e2c737d59100d086
2024-12-19 09:57:44 -08:00
Zuul
3944630007 Merge "CI: Clean up deps for various doc builds" 2024-12-19 16:49:42 +00:00
Chinemerem
af57922cd8 Aggregate per-disk recon stats
Address an issue where `OldestAsyncManager` instances created before forking resulted in each child process maintaining its own isolated copy-on-write stats, leaving the parent process with an empty/unused instance. This caused the final `dump_recon` call at the end of `run_forever` to report no meaningful telemetry.
The fix aggregates per-disk recon stats collected by each child process. This is done by loading recon cache data from all devices, consolidating key metrics, and writing the aggregated stats back to the recon cache.
Change-Id: I70a60ae280e4fccc04ff5e7df9e62b18d916421e
2024-12-19 02:02:41 -08:00
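The aggregation step described in the recon stats commit above (load per-device data, consolidate key metrics, write one summary back) might look roughly like this. The stat keys `successes`, `failures`, and `oldest_timestamp` are hypothetical placeholders rather than the real recon cache schema.

```python
def aggregate_device_stats(per_device_stats):
    """Fold {device: stats} written by child processes into one summary.

    Counters are summed; the oldest (smallest) timestamp wins.
    """
    totals = {'successes': 0, 'failures': 0, 'oldest_timestamp': None}
    for stats in per_device_stats.values():
        totals['successes'] += stats.get('successes', 0)
        totals['failures'] += stats.get('failures', 0)
        ts = stats.get('oldest_timestamp')
        if ts is not None and (totals['oldest_timestamp'] is None
                               or ts < totals['oldest_timestamp']):
            totals['oldest_timestamp'] = ts
    return totals
```

Because each child only ever writes its own device's entry, the parent can aggregate after the children exit without sharing memory with them.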
Zuul
155b759714 Merge "Bring py3-constraints.txt more in line with global u-c" 2024-12-18 21:26:13 +00:00
Zuul
0b534d5846 Merge "Up-rev hacking" 2024-12-18 02:19:09 +00:00
Zuul
4faa3523c9 Merge "CI: Configure bandit better" 2024-12-17 10:55:51 +00:00
Tim Burke
13197af6e3 CI: Clean up deps for various doc builds
- Define a single location for doc-build deps
 - As a side-effect, use constraints for api-ref builds
- Remove test-requirements.txt from those deps
Change-Id: If6cc8702e89f5110ad89ba933f55641de02550e9
2024-12-12 15:02:27 -08:00
Zuul
fe7928ea8a Merge "Add unit test for object-updater recon dump" 2024-12-11 21:22:32 +00:00
Zuul
d3eb11625d Merge "Refactor FormPost to use WSGIContext" 2024-12-11 08:56:20 +00:00
nathang15
404edeb7fa Refactor FormPost to use WSGIContext
... instead of self-handling subrequests manually.
Closes-Bug: #1523401
Change-Id: I85b5302c2416de1793599385b791fcd3ec3b4da0
2024-12-11 06:37:10 +00:00
Tim Burke
e576c5cee0 CI: Configure bandit better
Declare the tests to skip, rather than the tests to run. This ensures
that we pick up new bandit checks automatically.
I recently noticed a use of md5() without the usedforsecurity=False
kwarg. Confused about why this wasn't caught in the gate, I eventually
traced it back to B303 (which we explicitly enabled) being largely
superseded by B324 (which did not exist when we wrote down the tests
to enable).
Flag a bunch of false-positives with "# nosec" comments, resolve two
other errors, and skip some more-pervasive errors, to be resolved later.
Change-Id: Ia054e4f7c9e5bf29064a66933e27830adbc107d3
2024-12-10 15:18:12 -08:00
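For reference, the `usedforsecurity=False` keyword mentioned in the bandit commit above is accepted by `hashlib` constructors on Python 3.9 and later. It marks the digest as a plain checksum rather than a security primitive. The `etag_for` wrapper below is a hypothetical name used only for illustration.

```python
import hashlib


def etag_for(data):
    """Compute an MD5 checksum that is explicitly not security-relevant."""
    return hashlib.md5(data, usedforsecurity=False).hexdigest()
```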
Tim Burke
a55a48ffc8 docs: Call out that xprofile is not intended for production
Change-Id: I1e9d4d5df403040d69db93a08647cd0abe1b8037
2024-12-10 15:17:11 -08:00
Tim Burke
199aa78fbe xprofile: Stop using eval()
All we need is int(). Using eval() on user-provided data (or really at
all) is a Bad Idea.
Closes-Bug: #2091124
Change-Id: I39bb87f9d8e27f2f88410a087a120a0e9be1a243
2024-12-10 15:16:41 -08:00
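The difference behind the xprofile fix above is easy to show in isolation: `int()` accepts only numeric text, whereas `eval()` would execute whatever expression the caller supplies. The helper name below is hypothetical and is not the xprofile code itself.

```python
def parse_int_param(raw):
    """Parse a user-supplied query parameter as an integer.

    Raises ValueError for anything that is not plain numeric text,
    instead of evaluating it as Python code.
    """
    return int(raw)
```

A value such as `"__import__('os').getcwd()"` simply fails with `ValueError` here, where `eval()` would have run it.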
Zuul
b371c38fc5 Merge "Clarify ContainerBroker tests re expirer queue items" 2024-12-10 18:56:34 +00:00
Alistair Coles
3a5bbcd7a6 Clarify ContainerBroker tests re expirer queue items
Add some commentary as a reminder that whilst normal object updates to
the ContainerBroker cannot have content-type timestamp older than data
timestamp, expirer queue updates can.
Change-Id: I6d8ad06c645f25497dc15173460430fd93747afa
Related-Change: Ie4b25f1bd16def4069878983049b83de06f68e54
2024-12-10 10:50:38 +00:00
Alistair Coles
cde99ff660 Add unit test for object-updater recon dump
Related-Change: Iba43783e880e0860357ba8b9f0a11f28abf87555
Change-Id: I1e096dab9a97956bf786ccbcd37c20f9a3a5429e
2024-12-10 10:26:48 +00:00
Zuul
f9a3f142ab Merge "Make OldestAsyncPendingTracker timestamp float" 2024-12-09 23:05:20 +00:00
Zuul
cffa7dea77 Merge "CI: Consistently use TOX_CONSTRAINTS_FILE" 2024-12-09 23:05:16 +00:00
Zuul
9efaae78a5 Merge "Up-rev hacking" 2024-12-09 23:05:13 +00:00