6261 Commits
**Zuul** `f39133055f`: Merge "sharder: make gap and overlap warning logs shorter"

**Zuul** `2e8338240f`: Merge "Fix recursion error in account_quota middleware"
**Christian Schwede** `06a6329793`: Fix recursion error in account_quota middleware

There is an infinite loop if multiple quota limits are set and exceeded,
eventually resulting in a 500 response due to a RecursionError ("maximum
recursion depth exceeded").
The issue is the delayed rejection, required to support container_acls.
If any quota is exceeded the middleware needs to return directly,
without proceeding to check other quota settings.
The fix is basically to add a "return self.app". However, there was quite
a lot of redundant code here, so it has been moved into its own method.
Another test with multiple exceeded quotas has been added, which fails
without the bugfix.
Closes-Bug: #2118758
Change-Id: I49ec4c5f6c83f36ce1d38f2f1687081c71488286
Signed-off-by: Christian Schwede <cschwede@redhat.com>
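The shape of the fix, as a minimal WSGI sketch (class and method names are hypothetical, not the actual account_quota code):

```python
# Minimal sketch of the short-circuit described above: once any quota
# check rejects the request, return that response directly instead of
# falling through to further checks.
class QuotaMiddlewareSketch:
    def __init__(self, app):
        self.app = app

    def _check_bytes_quota(self, environ):
        return None  # stub: return an error response app when exceeded

    def _check_count_quota(self, environ):
        return None  # stub

    def __call__(self, environ, start_response):
        for check in (self._check_bytes_quota, self._check_count_quota):
            resp = check(environ)
            if resp is not None:
                # Return directly: falling through to further quota checks
                # after a rejection is what previously led to recursion.
                return resp(environ, start_response)
        return self.app(environ, start_response)
```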
**Alistair Coles** `fd342b9190`: sharder: make gap and overlap warning logs shorter

Previously, when the audit process detected gaps and/or overlaps in a DB's shard ranges, it would log a warning that included a list of all the impacted shard ranges. The log message could grow long when gaps or overlaps involved many shard ranges: so long that syslog might raise an OSError ("Message too long"). This patch shortens these warning messages to include only a count of the gaps and/or overlaps. The count may still be useful for observing how a problem develops over time; the detailed information is better accessed using the swift-manage-shard-ranges repair command.

Change-Id: I055c40395807708de60882f53652d9533a495d09
Signed-off-by: Alistair Coles <alistairncoles@gmail.com>
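In outline, the change swaps a potentially unbounded list for two counts; a hedged sketch (function and variable names illustrative, not the sharder's actual code):

```python
import logging

logger = logging.getLogger('swift.container.sharder')

def warn_about_audit_findings(gaps, overlaps):
    # Log only counts: a full list of shard ranges could exceed syslog's
    # message size limit and raise OSError ("Message too long").
    if gaps or overlaps:
        logger.warning(
            'Found %d gap(s) and %d overlap(s) in shard ranges; run '
            'swift-manage-shard-ranges repair for details',
            len(gaps), len(overlaps))
```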
**Tim Burke** `ae062f8b09`: ring: Introduce a v2 ring format

There's a bunch of moving pieces here:

- Add a new RingWriter class; stick it in a new swift.common.ring.io module. You *can* use it like the old gzip file, but you can also define named sections which can be referenced later on read. Section names may be arbitrary strings, but the "swift/" prefix is reserved for upstream use. Sections must contain a single length-value encoded BLOB. If sections are used, an additional BLOB is written at the end containing a JSON section-index, followed by an uncompressed offset for the index. Move RingReader to ring/io.py, too.
- Clean up some ring metadata handling:
  - Drop MD5 tracking in RingReader. It was brittle at best anyway, and nothing uses it. YAGNI.
  - Fix size/raw_size attributes when loading only metadata.
- Add the ability to seek within RingReaders, though you need to know what you're doing and only seek to flush points.
- Let RingBuilder objects change how wide their replica2part2dev_id arrays are. Add a dev_id_bytes key to serialized ring metadata; dev_id_bytes may be either 2 or 4, but 4 requires v2 rings. We considered allowing dev_id_bytes of 1, but dropped it as unnecessary complexity for a niche use case.
- Add a swift-ring-builder version subcommand, which takes a ring. This lets operators see the serialization format of a ring on disk:

      $ swift-ring-builder object.ring.gz version
      object.ring.gz: Serialization version: 2 (2-byte IDs), build version: 54

Signed-off-by: Tim Burke <tim.burke@gmail.com>
Change-Id: Ia0ac4ea2006d8965d7fdb6659d355c77386adb70
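The on-disk layout that sections imply can be sketched like this: a toy illustration of length-value BLOBs plus a trailing JSON section-index, simplified from the description above (not the actual swift.common.ring.io code, which also deals with gzip framing and flush points):

```python
import json
import struct

def write_sections(path, sections):
    """Write named sections as length-value BLOBs, then a JSON index."""
    index = {}
    with open(path, 'wb') as fp:
        for name, blob in sections.items():
            index[name] = fp.tell()
            fp.write(struct.pack('!Q', len(blob)))  # length prefix
            fp.write(blob)                          # value
        index_offset = fp.tell()
        index_blob = json.dumps(index).encode('ascii')
        fp.write(struct.pack('!Q', len(index_blob)))
        fp.write(index_blob)
        # Fixed-width trailer so a reader can seek straight to the index.
        fp.write(struct.pack('!Q', index_offset))

def read_section(path, name):
    """Seek to a named section without reading the whole file."""
    with open(path, 'rb') as fp:
        fp.seek(-8, 2)  # the trailer is the last 8 bytes
        (index_offset,) = struct.unpack('!Q', fp.read(8))
        fp.seek(index_offset)
        (size,) = struct.unpack('!Q', fp.read(8))
        index = json.loads(fp.read(size))
        fp.seek(index[name])
        (size,) = struct.unpack('!Q', fp.read(8))
        return fp.read(size)
```

The index-at-the-end design lets a reader jump to any one section (say, just the ring metadata) without decoding everything before it.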
**Zuul** `e75e93f11c`: Merge "Drop support for old pickled rings"
**Tim Burke** `0417979ca5`: Drop support for old pickled rings

We stopped writing pickled rings more than twelve years ago. Any cluster that was going to upgrade from then already has, or can pick any of the multitude of intermediary releases to pause at and push new rings. We can also safely assume that regions will be present for devices; that change is nearly as old.

As a side effect, clean up some old tests that did nonsense things like having 7 assignments per row for a part-power-2 ring.

UpgradeImpact: removes the ability to read rings written by Swift <1.7.0, circa 2012.

Related-Change: I799b9a4c894d54fb16592443904ac055b2638e2d
Related-Change: Ifefbb839cdcf033e6c9201fadca95224c7303a29
Signed-off-by: Tim Burke <tim.burke@gmail.com>
Change-Id: Ic8322b18d51b40f586cb217a0d1b2f345e1d8df6
**Zuul** `8af485775a`: Merge "s3api: Add support for crc64nvme checksum calculation"

**Zuul** `a1f7a1e82d`: Merge "s3api: add more assertions w.r.t. S3 checksum BadDigest"
**Alistair Coles** `404e1f2732`: s3api: Add support for crc64nvme checksum calculation

Add anycrc as a soft dependency in case ISA-L isn't available. Plus, we'll want it later: when we start writing down checksums, we'll need it to combine per-part checksums for MPUs. As with crc32c, we won't provide any pure-Python version, as its CPU-intensiveness could present a DoS vector. Worst case, we return 501 as before.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Signed-off-by: Tim Burke <tim.burke@gmail.com>
Change-Id: Ia05e5677a8ca89a62b142078abfb7371b1badd3f
Signed-off-by: Alistair Coles <alistairncoles@gmail.com>
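The soft-dependency pattern looks roughly like this; a sketch where the helper and the anycrc model name are assumptions for illustration, not Swift's actual checksum code:

```python
def find_crc64nvme():
    """Return a crc64nvme function, or None if no implementation exists."""
    try:
        import anycrc  # soft dependency: not required at install time
    except ImportError:
        return None  # caller answers 501 Not Implemented, as before
    model = anycrc.Model('CRC64-NVME')  # assumed anycrc model name
    return lambda data: model.calc(data)
```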
**Zuul** `d9115f24d6`: Merge "s3api: add compat test sending too much body with checksum"

**Zuul** `2fc9209d47`: Merge "s3api: Validate additional checksums on upload"
**Alistair Coles** `61c0bfcf95`: s3api: add more assertions w.r.t. S3 checksum BadDigest

Assert that BadDigest responses due to a checksum mismatch do not include the expected or computed values.

Change-Id: Iaffa02c3c02fa3bc6922f51ecf28a39f4b24ccf2
Signed-off-by: Alistair Coles <alistairncoles@gmail.com>
**Alistair Coles** `351ee72790`: s3api: add compat test sending too much body with checksum

Adds a test that verifies that extra body content beyond the Content-Length is ignored, provided the checksum value matches that of the Content-Length bytes. Adds a comment to explain why this is the case.

Drive-by: add a clarifying comment to a unit test.

Change-Id: I8f198298a817be47223e2f45fbc48a6f393b3bef
Signed-off-by: Alistair Coles <alistairncoles@gmail.com>
**Tim Burke** `be56c1e258`: s3api: Validate additional checksums on upload

See https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html for background. This covers both "normal" objects and part-uploads for MPUs. Note that because we don't write down any client-provided checksums during initiate-MPU calls, we can't do any verification during complete-MPU calls. crc64nvme checksums are not yet supported; clients attempting to use them will get back 501s.

Adds crt as a boto3 extra to test-requirements; the extra lib provides crc32c and crc64nvme checksum support in boto3.

Co-Authored-By: Ashwin Nair <ashnair@nvidia.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Signed-off-by: Tim Burke <tim.burke@gmail.com>
Signed-off-by: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Id39fd71bc59875a5b88d1d012542136acf880019
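A hedged usage example of what this enables (endpoint and credentials are hypothetical; client-side CRC32C support comes from the boto3 crt extra mentioned above):

```python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://saio:8080',   # hypothetical s3api endpoint
    aws_access_key_id='test:tester',
    aws_secret_access_key='testing',
)
# boto3 computes the CRC32C client-side; s3api now validates it and
# returns BadDigest on a mismatch. ChecksumAlgorithm='CRC64NVME' would
# currently get back a 501.
s3.put_object(
    Bucket='bucket',
    Key='obj',
    Body=b'hello',
    ChecksumAlgorithm='CRC32C',
)
```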
**Zuul** `1428eb3b58`: Merge "Fix traceback in invalidate_hash"

**Zuul** `364cc6556f`: Merge "s3api: fix multi-upload BadDigest error"
**Alistair Coles** `1a27d1b83f`: s3api: fix multi-upload BadDigest error

S3 includes the expected base64 digest in a BadDigest response to a multipart-complete POST request.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ie20ccf10846854f375c29be1b0b00b8eaacc9afa
**Clay Gerrard** `53b66155a7`: test: use a tempdir in TestRingData

Change-Id: I88e2e743ccbd6292bc1570ae0efbdd45dcced8cc
**Tim Burke** `3dba681005`: Fix traceback in invalidate_hash

Change-Id: I80142c6c0654b65b5755e7e828bcc4969a10f4f1
**Zuul** `69bff25516`: Merge "Use built-in implementation to get utc timezone"
**Takashi Kajinami** `9754eff025`: Use built-in implementation to get utc timezone

datetime.timezone.utc [1] has been available throughout Python 3 and can be used instead of datetime.UTC, which is available only in Python >= 3.11.

[1] https://docs.python.org/3.13/library/datetime.html#datetime.timezone.utc

Change-Id: I92bc82a1b7e2bcb947376bc4d96fc603ad7d5b6c
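The substitution in question:

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc)  # works on every supported Python 3
# from datetime import UTC        # the UTC alias is Python >= 3.11 only,
# now = datetime.now(UTC)         # ... so this spelling is avoided
```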
**Takashi Kajinami** `005d69d1a9`: Drop remaining skip check for Python < 3

... because Python 2.x is no longer supported.

Change-Id: I3167a539b3e26ceb35976fbd7a2356ba59d4a5e4
**Zuul** `d2272833fe`: Merge "tests: Fix some connection-closed testing on OS X"
**Tim Burke** `2e14051cb6`: tests: Fix some connection-closed testing on OS X

Change-Id: I32fec7540bee70e77140964c5983d133a572fa7b
**Tim Burke** `28db9bbcdd`: tests: Fix test_LoggerFileObject_recursion

It used to be that when a logger encountered an error trying to handle/emit a message, it would attempt to log about the error via the same logger. Long ago this could lead to infinite recursion in Swift's LoggerFileObject, so we fixed it and added a test that included a demonstration of the bad stdlib behavior. Recently, stdlib made a similar fix so the recursion no longer happens; see

- https://github.com/python/cpython/issues/91555 and
- https://github.com/python/cpython/pull/131812

Of course, now our test breaks (at least on 3.13.4). Relax the assertion a little.

Related-Change: Ia9ecffc88ce43616977e141498e5ee404f2c29c4
Change-Id: Id2f490d4204b8eaf07857cb84ed783bec19b8511
**Zuul** `ad41dbeffe`: Merge "s3 compat tests: sanitize object listings"

**Zuul** `184641a754`: Merge "tests: Keep port number in valid range"
**Alistair Coles** `962084ded0`: s3 compat tests: sanitize object listings

Swift does not return all the parameters of objects in a listing (e.g. ChecksumType and ChecksumAlgorithm), so pop these from listings before making assertions.

Change-Id: Ieb7a9783731c11f1c08db398eae07ffafa127460
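The sanitizing step amounts to something like this; a sketch with an illustrative helper name, not the actual test code:

```python
# Listing fields AWS returns but Swift does not, per the message above.
IGNORED_KEYS = ('ChecksumType', 'ChecksumAlgorithm')

def sanitize_listing(contents):
    """Drop listing fields Swift does not return before comparing."""
    return [{key: value for key, value in obj.items()
             if key not in IGNORED_KEYS}
            for obj in contents]
```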
**Zuul** `afacfb6cea`: Merge "Object-server: change labeled timing metrics sample rate for debugging requests"
**Yan Xiao** `313959ae92`: Object-server: change labeled timing metrics sample rate for debugging requests

The labeled timing metrics previously had a sample_rate similar to that of the corresponding non-labeled metrics. However, the down-sampling has not been helpful when using labeled metrics to investigate customer issues, for example those related to object-server REPLICATE requests. This patch changes the labeled timing metrics to not have a sample_rate. The non-labeled metrics are unrelated to this effort and thus are not changed.

Related-Change: I05336b700120ab5fcf922590d6a12f73112edb50
Change-Id: Ia6e856ffaf8fd1b4a905e6976ebdc62ed5ddf32f
**Zuul** `1c244b3cd5`: Merge "Add config option for whether to skip s3_acl-requiring tests"
**Tim Burke** `3ff6b34a3b`: tests: Keep port number in valid range

Previously (at least on OSX) this could lead to errors like "socket.gaierror: [Errno 8] nodename nor servname provided, or not known" when we actually went and tried to connect to send the update (!!)

Change-Id: I86f6e731d1ee273c6772974ce597ac91be3937be
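One conventional way to keep test ports valid, shown as a hedged sketch (not necessarily the approach this patch takes): bind to port 0 and let the OS choose.

```python
import socket

def get_free_port():
    # Binding to port 0 makes the OS assign a free port in 1-65535,
    # avoiding hand-rolled port numbers that may fall out of range.
    with socket.socket() as sock:
        sock.bind(('127.0.0.1', 0))
        return sock.getsockname()[1]
```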
**Zuul** `f64269b981`: Merge "Object-server: add labeled timing metrics for other object server requests"

**Zuul** `579bf0cf8a`: Merge "tests: Speed up statsd test_methods_are_no_ops_when_not_enabled"

**Zuul** `74e09cdbbd`: Merge "tests: Reduce test time on OSX"
**Tim Burke** `aa5bc01982`: tests: Speed up statsd test_methods_are_no_ops_when_not_enabled

Change-Id: I0a27efe897b4e8ce2c21da1a3603a2a77c02eb69
**Tim Burke** `877c936e2f`: tests: Reduce test time on OSX

Locally, this reduced the test time from 240s to 0.16s.

Change-Id: I5a4786b6782c06f8e6bd9fab5d4dae683a970242
**Yan Xiao** `c66c0bfd25`: Object-server: add labeled timing metrics for other object server requests

Change-Id: I05336b700120ab5fcf922590d6a12f73112edb50
**Zuul** `1cd20f87de`: Merge "Add labeled metrics to proxy-logging"

**Zuul** `7d37076f9d`: Merge "s3api: more test cases for conditional writes."
**Jianjian Huo** `33b17742a7`: s3api: more test cases for conditional writes.

Change-Id: Id5583e3a1e4515ec3c8a972f647aaaabfba673bc
Related-Change: I2e57dacb342b5758f16b502bb91372a2443d0182
**Yan Xiao** `82b1964479`: Add labeled metrics to proxy-logging

Modified the code to use the native labeled-metrics API introduced by the Related-Change.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Shreeya Deshpande <shreeyad@nvidia.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Related-Change: I115ffb1dc601652a979895d7944e011b951a91c1
Change-Id: I5a96626862644fe23a0d4bc53e85989770c8ceec
**Alistair Coles** `b735b3d034`: object-server: return 503 not 404 if meta or data file unlinked

When a DiskFile is opened, its on-disk files are first listed and then xattr metadata is read. If a listed file no longer exists when its metadata is read, then a DiskFileNotExist exception was raised, which would previously cause the object server to return a 404.

It isn't appropriate to return 404 when the list of on-disk files suggested that the object existed (i.e. included a .data file) and there is no evidence that the object no longer exists: the missing file may well have been *replaced* by a concurrent request. For example, if a POST request races with another concurrent POST request, the object's .meta file may be replaced by a new .meta file, and reading the original .meta file will fail. Similarly, if a POST races with a concurrent PUT request, the PUT request may replace the .data file, causing the POST request handler to fail to read the original .data file. In neither case has the object ever been non-existent.

This issue was observed during a period of many concurrent POST requests to an object in a production system. A significant number of the POSTs returned 404 despite the object never being deleted.

This patch modifies DiskFile to raise a new DiskFileStateChanged exception when metadata cannot be read from a data, meta, or ts file that was listed in the object's datadir. The object server may translate this exception to a 503 response rather than a 404, depending on the request method:

- POSTs, GETs and HEADs that encounter this transient loss of a file will now return 503 rather than 404. The DiskFile may still exist.
- DELETEs that encounter this transient loss of a file would previously have proceeded, but will now return 503: a replacement file may contain an X-Delete-At that would have prevented the DELETE from proceeding.
- PUTs that encounter this transient loss of a file will continue to proceed as before: a replacement meta file may contain an X-Delete-At that requires updates to the expirer queue, but those updates would presumably be handled by the concurrent POST that replaced the meta file; a replacement .data file will remain on disk until the DiskFile is next opened, at which point one of the .data files will be cleaned up.

Closes-Bug: #2054791
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: I2f698c25ed65b236e851e5a307d48a12cef62b33
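The translation in the object server can be sketched as follows. This is illustrative, not the actual diskfile code: DiskFileStateChanged is re-declared here so the sketch is self-contained, while DiskFileNotExist and the swob response classes are Swift's real ones.

```python
from swift.common.exceptions import DiskFileNotExist
from swift.common.swob import HTTPNotFound, HTTPServiceUnavailable

class DiskFileStateChanged(Exception):
    """A file listed in the datadir vanished before it could be read."""

def open_diskfile_for_get(disk_file):
    try:
        return disk_file.open()
    except DiskFileNotExist:
        return HTTPNotFound()  # no evidence the object ever existed
    except DiskFileStateChanged:
        # On-disk state changed under us (e.g. a racing POST replaced the
        # .meta file): the object may still exist, so 503, not 404.
        return HTTPServiceUnavailable()
```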
**Zuul** `84a70769b1`: Merge "s3api: Allow PUT with if-none-match: *"

**Zuul** `f516065abd`: Merge "common: add http exception handling for base storage server timing stats"

**Zuul** `4f7b1a9b7d`: Merge "Object-server: add labeled timing metrics for object REPLICATE request"
**Tim Burke** `edd5eb29d7`: s3api: Allow PUT with if-none-match: *

Swift already supports that much, at least. AWS used to not support any conditional PUTs, but that's changed somewhat recently; see

- https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/
- https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-functionality-conditional-writes/

Drive-by: fix retry of a CompleteMultipartUpload with changed parts; it should 404 rather than succeed in writing the new manifest.

Change-Id: I2e57dacb342b5758f16b502bb91372a2443d0182
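A hedged usage example of the create-only semantics this header provides (plain HTTP for clarity; the endpoint is hypothetical and S3 request signing is omitted):

```python
import requests

resp = requests.put(
    'http://saio:8080/bucket/obj',    # hypothetical s3api endpoint
    data=b'payload',
    headers={'If-None-Match': '*'},   # only succeed if obj doesn't exist
)
# 200 on first write; 412 Precondition Failed if the object already exists.
print(resp.status_code)
```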
**Yan Xiao** `ec6e8bd203`: common: add http exception handling for base storage server timing stats

The timing-stats decorators are moved from utils to base_storage_server.py to avoid a circular import of HTTPException.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Idc4b52de0a04ebfc0e353162bd791d4e0e20eac3
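In outline, the decorator needs HTTPException so that handlers which raise an error response still get timed; a hedged sketch, illustrative rather than the actual decorator:

```python
import functools
import time

from swift.common.swob import HTTPException

def timing_stats(func):
    @functools.wraps(func)
    def wrapper(self, req):
        start = time.time()
        try:
            resp = func(self, req)
        except HTTPException as err:
            # swob's HTTPException doubles as a response object, so an
            # error raised by the handler is still a timeable result.
            resp = err
        self.logger.timing(func.__name__ + '.timing',
                           (time.time() - start) * 1000)
        return resp
    return wrapper
```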
**Zuul** `4e2d08041a`: Merge "tests: Use subTest"