e39a0d1959b2167022fdb5dab6fed462d51b2fe8
5327 Commits

Author SHA1 Message Date
Clay Gerrard
e39a0d1959 Leave updater per-device stats in recon for debugging
It's confusing and unnecessary to have the last cycle's per-device
object-updater stats reaped from recon immediately during aggregation,
making it impossible to debug stats aggregation.
Drive-by: fix some bugs with stats aggregation
Change-Id: I9df7c2d1c31646a3200614b629598576eb9e64c0
2025-01-06 17:11:31 -06:00
Tim Burke
1f0777d96c tests: Enforce sorted listdir results in test_updater
Previously, we were relying on some xfs-specific return order.
Change-Id: If9a0fdb3749a18a9479f20fb174e0c1908a783bb
2024-12-30 21:54:37 -08:00
Zuul
78fd4e6bfa Merge "Require that updater_workers be a positive integer" 2024-12-20 00:23:16 +00:00
Chinemerem
fbfdc89df5 Require that updater_workers be a positive integer
Previously, it was possible for updater_workers to be a negative integer
or zero. This change enforces that updater_workers should be a positive
integer.
Change-Id: Ie40194b406aeedcf8c38a3c273ab768e2b643a5d
2024-12-19 21:54:32 +00:00
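For fbfdc89df5, a minimal sketch of the kind of check the message describes, not the actual Swift code. The helper name `positive_int` and its error text are hypothetical.

```python
def positive_int(value, name="updater_workers"):
    """Return value as an int, or raise ValueError unless it is >= 1."""
    try:
        result = int(value)
    except (TypeError, ValueError):
        raise ValueError("%s must be a positive integer, not %r"
                         % (name, value))
    if result < 1:
        # Zero and negative worker counts are rejected outright.
        raise ValueError("%s must be a positive integer, not %r"
                         % (name, value))
    return result
```

With a check like this, a config value of "0" or "-2" fails at startup instead of being accepted silently.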
Chinemerem
5281af5cf2 Add object_updater_last stat
Change-Id: I22674f2e887bdeeffe325efd2898fb90faa4235f
2024-12-19 11:10:52 -08:00
Chinemerem
af57922cd8 Aggregate per-disk recon stats
Address an issue where `OldestAsyncManager` instances created before forking resulted in each child process maintaining its own isolated copy-on-write stats, leaving the parent process with an empty/unused instance. This caused the final `dump_recon` call at the end of `run_forever` to report no meaningful telemetry.
The fix aggregates per-disk recon stats collected by each child process. This is done by loading recon cache data from all devices, consolidating key metrics, and writing the aggregated stats back to the recon cache.
Change-Id: I70a60ae280e4fccc04ff5e7df9e62b18d916421e
2024-12-19 02:02:41 -08:00
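For af57922cd8, a rough sketch of the consolidation step, assuming each child process has already written a small stats dict per device. The keys `failures` and `oldest_age` and the function name are hypothetical. The real recon cache layout is not shown in this log.

```python
def aggregate_device_stats(per_device):
    """Combine per-device stats dicts into one summary dict.

    per_device maps a device name to a dict such as
    {"failures": 2, "oldest_age": 10.0} (hypothetical keys).
    """
    total_failures = 0
    oldest_age = 0.0
    for stats in per_device.values():
        # Counters are summed; ages are reduced to the maximum seen.
        total_failures += stats.get("failures", 0)
        oldest_age = max(oldest_age, stats.get("oldest_age", 0.0))
    return {
        "failures": total_failures,
        "oldest_age": oldest_age,
        "devices": len(per_device),
    }
```

The parent process would call something like this after the children exit, since its own in-memory copy of the stats never sees the children's updates.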
Zuul
fe7928ea8a Merge "Add unit test for object-updater recon dump" 2024-12-11 21:22:32 +00:00
Tim Burke
199aa78fbe xprofile: Stop using eval()
All we need is int(). Using eval() on user-provided data (or really at
all) is a Bad Idea.
Closes-Bug: #2091124
Change-Id: I39bb87f9d8e27f2f88410a087a120a0e9be1a243
2024-12-10 15:16:41 -08:00
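For 199aa78fbe, a minimal sketch of the idea, not the xprofile code itself: int() accepts only an integer literal, whereas eval() would execute whatever the client sent. The function name `parse_limit` and its default are hypothetical.

```python
def parse_limit(raw, default=-1):
    """Parse a user-supplied value as an integer without evaluating it."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        # Not an integer literal: fall back rather than executing anything.
        return default
```

A value such as "__import__('os')" simply fails to parse here. Passed to eval(), it would have run as code.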
Zuul
b371c38fc5 Merge "Clarify ContainerBroker tests re expirer queue items" 2024-12-10 18:56:34 +00:00
Alistair Coles
3a5bbcd7a6 Clarify ContainerBroker tests re expirer queue items
Add some commentary as a reminder that whilst normal object updates to
the ContainerBroker cannot have content-type timestamp older than data
timestamp, expirer queue updates can.
Change-Id: I6d8ad06c645f25497dc15173460430fd93747afa
Related-Change: Ie4b25f1bd16def4069878983049b83de06f68e54
2024-12-10 10:50:38 +00:00
Alistair Coles
cde99ff660 Add unit test for object-updater recon dump
Related-Change: Iba43783e880e0860357ba8b9f0a11f28abf87555
Change-Id: I1e096dab9a97956bf786ccbcd37c20f9a3a5429e
2024-12-10 10:26:48 +00:00
Zuul
9efaae78a5 Merge "Up-rev hacking" 2024-12-09 23:05:13 +00:00
Tim Burke
992d70198c Up-rev hacking
Change-Id: I9473fad7c46ac03bbc71328c17e988af9d21386c
2024-12-09 09:41:56 -08:00
Zuul
6e37329bd6 Merge "Remove duplicate definition of empty string etag" 2024-12-03 18:09:52 +00:00
Alistair Coles
5e800e328e Remove duplicate definition of empty string etag
Change-Id: Ib8196fe24f8a999315af469a435bd639378c78a9
2024-12-03 12:56:34 +00:00
Zuul
61b2350ead Merge "tests: Use format=plain instead of format=txt" 2024-12-03 10:49:34 +00:00
Tim Burke
ace2357c62 tests: Use format=plain instead of format=txt
Our API ref says "Valid values are json, xml, or plain. The default is
plain." There's no reason our tests ought to use an invalid value; it
will confuse anyone looking at tests for how to do a thing.
Note that tests were passing because invalid values are ignored, so
?format=txt behaves exactly like ?format=plain.
Change-Id: I6e119cc9c7297d8aade9736fa1d6f4a105466d77
2024-12-02 12:21:20 -08:00
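For ace2357c62, a small sketch of why ?format=txt went unnoticed, assuming invalid values simply fall back to the default as the message states. The names `VALID_FORMATS` and `listing_format` are hypothetical, not Swift's own.

```python
VALID_FORMATS = ("json", "xml", "plain")


def listing_format(params):
    """Return the listing format for a dict of query parameters."""
    requested = params.get("format", "plain")
    # Unknown values are ignored rather than rejected, so "txt" acts
    # exactly like "plain".
    return requested if requested in VALID_FORMATS else "plain"
```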
Zuul
0a6d20e388 Merge "Object-server: add periodic greenthread yielding during file write" 2024-11-29 16:32:45 +00:00
Jianjian Huo
ea1d84c1d7 Object-server: add periodic greenthread yielding during file write
Currently, when the object-server serves a PUT request and the DiskFile
writer writes file chunks to disk, there is no explicit
eventlet sleep called. When the network outpaces slow disk IO,
it's possible that one large and slow PUT request could cause
the eventlet hub not to schedule any other green threads for a
long period of time. To improve this, this patch enables the
configurable yield parameter 'cooperative_period' in the object
server controller write path.
Related-Change: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I1c0aba9830433f093d024b4c39cd3a3b2f0d69f1
2024-11-29 13:59:25 +00:00
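For ea1d84c1d7, a simplified sketch of periodic yielding in a write loop. It assumes 'cooperative_period' counts chunks between yields and that zero disables yielding. The function `write_chunks` is hypothetical, and `time.sleep` stands in for the `eventlet.sleep` a green-threaded server would use.

```python
import time


def write_chunks(chunks, write, cooperative_period=0, sleep=time.sleep):
    """Write each chunk, yielding every cooperative_period chunks.

    A cooperative_period of 0 (or less) disables yielding.
    Returns the total number of bytes written.
    """
    written = 0
    for count, chunk in enumerate(chunks, start=1):
        write(chunk)
        written += len(chunk)
        if cooperative_period > 0 and count % cooperative_period == 0:
            # sleep(0) hands control back to the scheduler so other
            # green threads can run during a long upload.
            sleep(0)
    return written
```

Passing the sleep function in as a parameter keeps the loop testable without a real hub.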
Zuul
51849df687 Merge "Remove statsd from the logs module" 2024-11-27 18:16:02 +00:00
Alistair Coles
8699af83c9 Remove statsd from the logs module
We would like to remove any statsd-related code from logs.py. That
requires that SwiftLogAdapter no longer provides a StatsdClient
interface by default. However, for backwards compatibility the main
utils.get_logger/get_prefixed_logger entrypoints must provide a
SwiftLogAdapter that does have a StatsdClient interface.
The new utils._patch_statsd_methods helper function is therefore used
to retrospectively patch a SwiftLogAdapter instance with the
StatsdClient interface when necessary. The _patch_statsd_methods helper is
used in get_logger, and again when we clone a logger in
get_prefixed_logger.
Co-Authored-By: Shreeya Deshpande <shreeyad@nvidia.com>
Change-Id: I44694b92264066ca427bb96456d6f944e09b31c0
2024-11-27 10:54:38 +00:00
Zuul
447079399d Merge "Add error msg validation for BadDigest" 2024-11-25 23:24:25 +00:00
ashnair
41c614bbb5 Add error msg validation for BadDigest
Change-Id: I5c9af73cc66e17dc662d8c4d97cbf71ded1fed6f
2024-11-25 18:21:12 +00:00
Clay Gerrard
4aadb54025 Ensure Content-Length in backend container/account HEAD response
A failing CORS test in the gate discovered that, when running with
eventlet==0.38.0, container and account HEAD requests returned
Content-Type of application/json to clients regardless of the requested
format. This was due to the backend HEAD response no longer having a
Content-Length header, causing the listing_formats middleware to not
modify the returned Content-Type (see Related-Change).
The Related-Change fixed the client facing issue by making
listing_formats middleware ensure the correct Content-Type is returned
to clients even when Content-Length is absent in the backend
response. The Related-Change also ensured that the 204 response to
clients always has a Content-Length header.
This patch directly fixes the problem of backend account and container
server HEADs no longer having 'Content-Length: 0' by adding it
explicitly. This violates the RFC prohibition of a 204 response having
a Content-Length header [1], but preserves Swift's historic behavior
and is consistent with the proxy-server's 204 response to clients.
[1] https://httpwg.org/specs/rfc7230.html#header.content-length
Related-Change: If724485e1425d1481d10b9255436301e346f07e8
Change-Id: Idacc59c5f43367926eff5221ee7fc417a9bc2d50
2024-11-25 11:13:42 +00:00
Clay Gerrard
fa889358ac Ensure correct content-type in container HEAD response
A failing CORS test in the gate discovered that we were responding
application/json to ?format=txt requests (which is maybe not even a
valid value for that qs param?), but only when running with
eventlet==0.38.0
This avoids the problem of backend container server HEADs no longer
having 'Content-Length: 0' by fixing the client HEAD resp headers before
we check for chunked-transfer resp.
Drive-By: refactor listing_formats to use HeaderKeyDict and always set
Content-Length explicitly
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: If724485e1425d1481d10b9255436301e346f07e8
2024-11-25 10:15:04 +11:00
Alistair Coles
ffbf17e47c Fix duplicate prefix in exception logging
This patch fixes a bug in the Related-Change that causes a prefix to
be added twice at the start of an 'exception' log message.
Previously, PrefixLoggerAdapter.exception would forward to the wrapped
LogAdapter.exception, and not call back to
PrefixLoggerAdapter.process. The prefix therefore needed to be added
in both PrefixLoggerAdapter.exception and PrefixLoggerAdapter.process
(for other log level methods).
SwiftLogAdapter.exception *does* call back to SwiftLogAdapter.process,
and so since the Related-Change it is not necessary to add the prefix
in both the SwiftLogAdapter.exception and SwiftLogAdapter.process
methods.
Related-Change: I8988c0add6bb4a65cc8be38f0bf527f141aac48a
Change-Id: Ia6e1f007989b0ef455b8dca8155b386a3fd9e8e1
2024-11-18 18:06:55 +00:00
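For ffbf17e47c, a sketch using the standard library's logging.LoggerAdapter rather than Swift's SwiftLogAdapter. In Python 3, LoggerAdapter.exception() routes through process(), so adding the prefix in process() alone is enough. The class `PrefixAdapterSketch` and the `demo` helper are hypothetical.

```python
import io
import logging


class PrefixAdapterSketch(logging.LoggerAdapter):
    """Add a prefix in process() only; all level methods go through it."""

    def process(self, msg, kwargs):
        return self.extra["prefix"] + msg, kwargs


def demo():
    """Log one exception and return the captured log output."""
    stream = io.StringIO()
    logger = logging.getLogger("prefix-sketch")
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    adapter = PrefixAdapterSketch(logger, {"prefix": "[obj] "})
    try:
        raise ValueError("boom")
    except ValueError:
        # exception() calls process() itself, so no extra prefixing here.
        adapter.exception("update failed")
    logger.removeHandler(handler)
    return stream.getvalue()
```

If exception() also prepended the prefix itself, the first log line would carry it twice, which is the symptom this commit describes.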
Zuul
aac4c14574 Merge "Differentiate unlinks and outdated unlinks" 2024-11-18 01:55:28 +00:00
Zuul
71696d3a83 Merge "Remove PrefixLoggerAdapter and SwiftLoggerAdapter" 2024-11-14 12:51:08 +00:00
Shreeya Deshpande
f88efdb4df Remove PrefixLoggerAdapter and SwiftLoggerAdapter
In order to modernize swift's statsd configuration we're working to
separate it from logging. This change is a pre-requisite for the
Related-Change in order to simplify the stdlib base logger instance
wrapping in a single extended SwiftLogAdapter (previously LogAdapter)
which supports all the features swift's servers/daemons need
from our logger instance interface.
Related-Change-Id: I44694b92264066ca427bb96456d6f944e09b31c0
Change-Id: I8988c0add6bb4a65cc8be38f0bf527f141aac48a
2024-11-13 15:40:41 -05:00
Zuul
90f75e33b7 Merge "reconciler: Record queue-clean-up later than old-policy-clean-up" 2024-11-13 05:47:54 +00:00
Tim Burke
3a43242e79 reconciler: Record queue-clean-up later than old-policy-clean-up
Previously, queue-clean-up would use the same timestamp/offset as the
tombstone written down in the old policy to clean up now-moved data.
This could lead to bad behaviors, as the reconciler-written tombstone
could itself be enqueued to be reconciled, at the same timestamp. If
that happened, replication would never bring the DB replicas to
consistency, causing the reconciler to get different answers for
whether there was work to do.
Now, add an extra offset bump between the tombstone in the old policy
and queue-clean-up. Also add an extra offset to the moved data in the
new policy, to keep it one ahead of queue-clean-up. New ordering looks
like:
- data/ts in old policy at t0 (as well as queue entry time to move it)
- tombstone in old policy at t0_1
- queue clean-up time at t0_2
- moved data/ts in new policy at t0_3
Closes-Bug: #2028175
Change-Id: Ib0dda0d338f48336d18d3d817a0c5994e201042e
2024-11-06 14:32:53 +00:00
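For 3a43242e79, the new ordering can be pictured with plain (time, offset) pairs. This assumes the t0_N notation in the message means "time t0 with offset N". The function and key names are hypothetical, and Swift's own timestamp class is not used.

```python
def reconciler_timestamps(t0):
    """Return the (time, offset) pair for each step of the new ordering."""
    return {
        "old_policy_data": (t0, 0),       # t0: data/ts and queue entry
        "old_policy_tombstone": (t0, 1),  # t0_1
        "queue_clean_up": (t0, 2),        # t0_2
        "new_policy_data": (t0, 3),       # t0_3
    }
```

Because tuples compare element by element, each step sorts strictly after the one before it, so the tombstone and the queue clean-up no longer share a timestamp.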
Zuul
7662cde704 Merge "Add oldest failed async pending tracker" 2024-11-05 08:22:10 +00:00
Chinemerem
0a5348eb48 Add oldest failed async pending tracker
In the past we have had some async pendings that repeatedly fail for months at a time. This patch adds an OldestAsyncPendingTracker class which manages the tracking of the oldest async pending updates for each account-container pair. This class maintains timestamps for pending updates associated with account-container pairs. It evicts the newest pairs when the max_entries is reached. It supports retrieving the N oldest pending updates or calculating the age of the oldest pending update.
Change-Id: I6d9667d555836cfceda52708a57a1d29ebd1a80b
2024-11-01 15:49:53 -07:00
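For 0a5348eb48, a simplified sketch of the behaviour described, not the real OldestAsyncPendingTracker. The class and method names are hypothetical, and eviction here happens once the number of tracked pairs exceeds max_entries.

```python
import time


class OldestPendingSketch:
    """Track the oldest pending timestamp per (account, container)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.oldest = {}  # (account, container) -> oldest timestamp seen

    def add_update(self, account, container, timestamp):
        key = (account, container)
        current = self.oldest.get(key)
        if current is None or timestamp < current:
            self.oldest[key] = timestamp
        if len(self.oldest) > self.max_entries:
            # Evict the pair whose oldest pending update is the newest.
            newest_key = max(self.oldest, key=self.oldest.get)
            del self.oldest[newest_key]

    def get_n_oldest(self, n):
        """Return up to n (pair, timestamp) items, oldest first."""
        return sorted(self.oldest.items(), key=lambda item: item[1])[:n]

    def get_oldest_age(self, now=None):
        """Return the age of the oldest pending update, or None."""
        if not self.oldest:
            return None
        now = time.time() if now is None else now
        return now - min(self.oldest.values())
```

Evicting the newest pairs keeps memory bounded while retaining the long-failing updates the tracker exists to surface.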
Chinemerem
39e4ae3076 Differentiate unlinks and outdated unlinks
This commit differentiates async pendings that are unlinked due to a successful object update (unlinks) from those async pendings that are unlinked due to a newer async_pending existing for the same object (outdated_unlinks).
Change-Id: I66e16207f3e368248617fc7ed3c6b5c80c54b1b5
2024-11-01 16:26:51 +00:00
Clay Gerrard
df22032d79 object-expirer: add round_robin_cache_size option
Drive-Bys:
 * DRY out redundant configuration examples in expiring objects overview
 documentation.
 * Add missing delay_reaping man page docs.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I8879dbd13527233c878dff764ec411ce9619ee39
2024-11-01 09:54:54 +00:00
Clay Gerrard
31ef443715 test-expirer: call tracking & exception stubs
This change enhances the FakeInternalClient to make it easier to use the
existing stubs to test error handling behaviors and make assertions
about the behaviors.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ib652a752ba91dbc791eef2a1d6940bcb0d16e36a
2024-11-01 09:37:01 +00:00
Anish Kachinthaya
4f69ab3c5d fix x-open-expired 404 on HEAD?part-number reqs
Fixes a bug with the x-open-expired feature where our magic header
was not copied when refetching manifests. This caused a 404 on HEAD
requests with a part-number=N query parameter, since the
object-server returns an empty response body and the
proxy needs to refetch. The fix also applies to segment GET
requests if the segments have expired.
Change-Id: If0382d433f73cc0333bb4d0319fe1487b7783e4c
2024-10-18 18:33:09 +00:00
Zuul
a427d2754f Merge "trivial: Default value for EUCLEAN" 2024-10-14 10:16:40 +00:00
Zuul
a2e8cf08d5 Merge "FakeSwift: capture unexpected calls" 2024-10-12 04:47:47 +00:00
Alistair Coles
056b2afbd7 FakeSwift: capture unexpected calls
Previously FakeSwift would raise a KeyError for a call that had no
matching registered response *before* capturing the call. This
prevents tests reliably asserting that only expected calls have been
made. In particular, if the code being tested handles the KeyError
gracefully then it was possible to write quite reasonable test
assertions that passed despite unexpected calls being made.
Tests in test/unit/container/test_reconciler.py seem to rely on this
behaviour, so this patch adds a capture_unexpected_calls option for
FakeSwift which defaults to True. This allows the reconciler tests to
opt out of the new stricter call capturing.
Change-Id: Idc6b6b5a2b665538e861700f5d0996fc39368f5b
2024-10-11 14:31:36 +01:00
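For 056b2afbd7, a stripped-down sketch of the ordering change, using a hypothetical `FakeAppSketch` class rather than the real FakeSwift. The point is only that the call is recorded before the lookup that may raise KeyError.

```python
class FakeAppSketch:
    """Record calls and return canned responses."""

    def __init__(self, capture_unexpected_calls=True):
        self.capture_unexpected_calls = capture_unexpected_calls
        self.responses = {}
        self.calls = []

    def register(self, method, path, response):
        self.responses[(method, path)] = response

    def call(self, method, path):
        key = (method, path)
        if self.capture_unexpected_calls or key in self.responses:
            # Capture first, so an unexpected call still shows up in
            # self.calls even if the caller swallows the KeyError.
            self.calls.append(key)
        return self.responses[key]  # KeyError if nothing was registered
```

With capture_unexpected_calls=False the sketch behaves like the older code path: an unregistered call raises without being recorded.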
Tim Burke
d0b190f64a trivial: Default value for EUCLEAN
Apparently errno.EUCLEAN is not available on OS X, which can complicate
running unit tests.
Change-Id: Iaa3d7756949b4a67d4afe8a53b242ed9f41e9374
2024-10-11 14:21:30 +01:00
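For d0b190f64a, one way to provide such a default is a getattr() fallback. This is a sketch, not necessarily how the patch does it. The fallback 117 is the value of EUCLEAN on Linux and is an assumption here.

```python
import errno

# Use the platform's constant when it exists; otherwise fall back to the
# Linux value (117) so code comparing against EUCLEAN still imports.
EUCLEAN = getattr(errno, "EUCLEAN", 117)
```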
Jianjian Huo
7980b6a0d3 Object-expirer: continue to process next container on listing errors
When the expirer is iterating task containers and doing listings, it's
possible that one of the container nodes hosting the task container may
become overloaded (or have a really low backend ratelimit set).
The object-expirer should expect UnexpectedResponse and continue to try
to list the task objects in the next container.
If the task container doesn't exist, the expirer should not try to
delete the non-existent container before continuing to work on
the next container.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Id1966fa22725a02471e2d7c5a42fb243b1cfcf6a
2024-10-04 13:47:55 -07:00
Zuul
02b17e43f6 Merge "Make object-expirer respect internal_client_conf_path" 2024-10-03 11:22:21 +00:00
Chinemerem
ee80f99aec Make object-expirer respect internal_client_conf_path
Previously the object-expirer would not respect an internal_client_conf_path option if its config was loaded from a "legacy" config filepath (e.g. **/object-expirer.conf). Legacy object-expirer config expected config sections for an internal-client to be included in the same file. However, "modern" object-expirer config files (e.g. **/object-server.conf) expect internal-client config to be loaded from a separate file specified by internal_client_conf_path and defaulting to '/etc/swift/internal-client.conf'.
"Legacy" expirer config is still used in production clusters but it is useful to use a shared internal-client config. This patch therefore makes the internal_client_conf_path option always be respected regardless of the path of the object-expirer conf file. If 'internal_client_conf_path' is not specified, "modern" config will continue to use the default '/etc/swift/internal-client.conf', but "legacy" config will default to the path of the expirer conf file (i.e. the same as the previous behavior).
Related-Change: Ib21568f9b9d8547da87a99d65ae73a550e9c3230
Change-Id: I24ec702cd2ed074ca9df084cefc896418cece394
2024-10-02 08:57:33 -07:00
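For ee80f99aec, the selection rules in the message can be sketched as a small function. The name `choose_internal_client_conf` and its argument layout are hypothetical. Only the option name and the two defaults come from the message.

```python
def choose_internal_client_conf(conf, expirer_conf_path, is_legacy_conf):
    """Pick the internal-client config path for the object-expirer."""
    explicit = conf.get("internal_client_conf_path")
    if explicit:
        # An explicit option is respected for legacy and modern config.
        return explicit
    if is_legacy_conf:
        # Legacy default: internal-client sections live in the expirer
        # conf file itself.
        return expirer_conf_path
    return "/etc/swift/internal-client.conf"
```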
Alistair Coles
6e4ecfc5dc proxy: fix is_useful_response for py2
py2 http responses don't have a 'headers' attribute, so don't try to
use it.
Unfortunately the unit test infrastructure's FakeConn class does
have a 'headers' attribute (whose value doesn't match the result of
calling its 'getheaders()' method!), so this bug was not caught by
tests in the Related-Change.
Related-Change: I96f28ab0b2b5f9374c399e8905ee240e7b093f8b
Change-Id: I2cd820280b8c69cafc5730183903c9d379d8dde5
2024-09-26 16:01:36 +01:00
Alistair Coles
ffdf962598 object-expirer: fix unused _make_internal_client arg
The Related-Change introduced a _make_internal_client() method with an
unused argument 'is_legacy_conf'. This patch completes the original
intention i.e. for the selection of internal client config path to
also be moved to the new method and use the 'is_legacy_conf' arg.
Change-Id: I5075cb446a15edc7f47e83f6aa038c626bd1dd82
Related-Change: Ia6e1e6a8b58a8476fa16a3c7d45e620c6d7f88e4
2024-09-11 10:13:23 +01:00
Zuul
8efb333872 Merge "diskfile: Treat EUCLEAN like ENODATA" 2024-09-10 10:23:03 +00:00
Zuul
98eb28d510 Merge "utils: paths with empty components are invalid" 2024-09-10 03:03:09 +00:00
Zuul
146bfeb643 Merge "proxy-logging: Clean up some timing assertions" 2024-09-09 13:39:36 +00:00
Tim Burke
015cbaac86 utils: paths with empty components are invalid
Note that you can still have a "//" in the path with rest_with_last, though.
Change-Id: I171afcd67b162634189b752ff92a4f43484bc12a
2024-09-06 14:51:44 -07:00