Uses gzip or Brotli. Guide to enable compression
Performance: Enable text compression #845
@fnetX It seems it's rather trivial to enable gzip; if you want, we can share the responsibility.
Unfortunately, the reverse proxy we're using doesn't support brotli.
This is weird. We should have gzip compression enabled.
Just checked:
GET /Codeberg/Community/issues/845 HTTP/2
HTTP/2 200 OK
cache-control: no-store, no-transform
content-type: text/html; charset=UTF-8
set-cookie: ...
date: Thu, 19 Jan 2023 13:02:40 GMT
strict-transport-security: max-age=63072000; includeSubDomains; preload
permissions-policy: interest-cohort=()
x-frame-options: sameorigin
x-content-type-options: nosniff
content-security-policy-report-only: ...
X-Firefox-Spdy: h2
vs
GET /assets/stlview.js HTTP/2
HTTP/2 200 OK
accept-ranges: bytes
cache-control: private, max-age=21600
content-type: text/javascript; charset=utf-8
etag: W/"NDMwM3N0bHZpZXcuanNXZWQsIDE4IEphbiAyMDIzIDIyOjIyOjQ5IEdNVA=="
last-modified: Wed, 18 Jan 2023 22:22:49 GMT
date: Thu, 19 Jan 2023 11:43:57 GMT
strict-transport-security: max-age=63072000; includeSubDomains; preload
permissions-policy: interest-cohort=()
x-frame-options: sameorigin
x-content-type-options: nosniff
content-security-policy-report-only: ...
content-encoding: gzip
vary: Accept-Encoding
X-Firefox-Spdy: h2
For some reason, only certain routes seem to use gzip.
It looks like we explicitly don't compress HTML. I wonder if someone can elaborate why ... ?
Edit: Oh, we do ... No clue then.
https://docs.haproxy.org/2.7/configuration.html#9.2-filter%20compression
It is mandatory to explicitly use a filter line to enable the HTTP compression when at least one filter other than the cache or the fcgi-app is used for the same listener/frontend/backend. This is important to know the filters evaluation order.
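For reference, the directives involved would look roughly like this (a minimal sketch, not taken from the actual kampenwand config; the frontend name, certificate path, and MIME type list are placeholders):

frontend fe_https
    bind :443 ssl crt /etc/haproxy/certs/
    # an explicit "filter compression" line is needed once another filter is in use (see the quote above)
    filter compression
    compression algo gzip
    compression type text/html text/plain text/css text/javascript application/json image/svg+xml
    default_backend be_forgejo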
But I don't understand the config of kampenwand, maybe @gapodo?
But I don't understand the config of kampenwand, maybe @gapodo?
While I (believe I) understand the kampenwand config fairly well by now, I haven't really played with compression on HAProxy (I mostly use it as an L4 load balancer with SNI routing). If there is a need to investigate, please ping me, but I'm currently fairly busy, so it'll likely be a week or two until I get the time for this. It's also not high on my priority list: compression / lower bandwidth is nice, but not using CPU cycles for compression is also nice... and it "works for my bandwidth" :D I'd need to investigate this specifically, as compression is done on either nginx or Apache in my setups.
@gapodo It would indeed help if you could have a look at the highlighted lines and see if there's an obvious problem. Other than that, I think this issue is not the highest priority. It would be better to improve caching and file sizes in general.
And... I had some unexpected time to take a look at the lines in question (and a preliminary read into brotli)
After some digging, testing, and comparing, I could not find a smoking gun, but I noticed that the Cache-Control header differs between the two responses. After some more digging... I found it in the HAProxy docs...
HAProxy 2.6 Configuration - Compression (same in the docs for 2.0, 2.1, 2.2, 2.3, 2.4, and 2.5; just change the version in the link). Quote from the linked docs section, emphasis mine...
Compression will be activated depending on the Accept-Encoding request
header. With identity, it does not take care of that header.
If backend servers support HTTP compression, these directives
will be no-op: HAProxy will see the compressed response and will not
compress again. If backend servers do not support HTTP compression and
there is Accept-Encoding header in request, HAProxy will compress the
matching response.
Compression is disabled when:
- the request does not advertise a supported compression algorithm in the "Accept-Encoding" header
- the response message is not HTTP/1.1 or above
- HTTP status code is not one of 200, 201, 202, or 203
- response contain neither a "Content-Length" header nor a "Transfer-Encoding" whose last value is "chunked"
- response contains a "Content-Type" header whose first value starts with "multipart"
- the response contains the "no-transform" value in the "Cache-control" header
- User-Agent matches "Mozilla/4" unless it is MSIE 6 with XP SP2, or MSIE 7 and later
- The response contains a "Content-Encoding" header, indicating that the response is already compressed (see compression offload)
- The response contains an invalid "ETag" header or multiple ETag headers
Which is what happens here... Forgejo likely already sets "no-transform" in the Cache-Control header (which is the correct behavior for a dynamically changing page), and that disables compression.
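This is easy to verify from the response headers. A quick check against the two URLs from the header dumps above (a sketch; the exact output depends on the live config):

curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, br' https://codeberg.org/Codeberg/Community/issues/845 | grep -iE 'content-encoding|cache-control'
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, br' https://codeberg.org/assets/stlview.js | grep -iE 'content-encoding|cache-control'

Per the dumps above, the HTML response carries "no-transform" and no Content-Encoding, while the asset comes back with "content-encoding: gzip".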
Re. Brotli... HAProxy doesn't support brotli (and for a very good reason): it is comparatively memory-intensive (~6-10x gzip, which is already ~6x deflate). This causes exceptionally high memory utilization for compression, which is especially bad on high-traffic reverse proxies / load balancers (HAProxy is made to handle humongous amounts of traffic on a small resource budget). Brotli shines on pre-compressing static assets, but doesn't perform well for in-line compression, so adopting it should not be considered for dynamic assets, no matter what...
Brotli shines on pre-compressing static assets
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli; if you enable brotli over HTTP for JPEG, PNG, or WebP requests, you add a lot of computation for very little benefit, and you may even increase the response size. That's the reason why HTTP compression of images is usually not a general option (except for SVG, which is mainly text, unlike bitmap formats).
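A quick way to see this locally (a sketch; the file name is a placeholder):

# size of an already-compressed bitmap vs. the same file run through LZ compressors
wc -c < logo.png
gzip -9 -c logo.png | wc -c
brotli -q 11 -c logo.png | wc -c

For a typical PNG the last two numbers end up close to (or even above) the first; for an SVG they are usually much smaller.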
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli;
We may be talking about the same thing, but your double mention of brotli is somewhat messing with me... by pre-compressing static assets, I mean compress, store, and deliver the compressed item without in-line compression (basically send the .tgz (not really a .tgz, but it's an easy example) that's already stored on the disk).
Pre-compressing static assets may be meaningful even for some images, and it can be tested automatically, e.g. in the build step (which is what we do with assets for high-traffic sites at work). Since you only compress once, using the highest possible compression level (even if it takes ages) may be a feasible option, as it's done once rather than on every request (though this requires web servers capable of delivering pre-compressed data as if it was compressed in-line, which IIRC not that many servers can handle).
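A minimal build-step sketch of that idea (the paths, file types, and compression levels are assumptions, not what we actually run):

# pre-compress text assets once at the highest levels, keeping the originals alongside
find public/assets -type f \( -name '*.js' -o -name '*.css' -o -name '*.svg' \) \
    -exec gzip -k -9 {} \; \
    -exec brotli -k -q 11 {} \;

The web server then has to pick foo.js.br or foo.js.gz based on the request's Accept-Encoding header, which is the "deliver pre-compressed data as if it was compressed in-line" part.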
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli;
We may be talking about the same thing, but your double mention of brotli is somewhat messing with me...
Your statement that brotli "shines on pre-compressing" did the same to me XD
... by pre-compressing static assets, I mean compress, store, and deliver the compressed item without in-line compression (basically send the .tgz (not really a .tgz, but it's an easy example) that's already stored on the disk).
Yes, I understand it the same way, but in that case brotli doesn't shine, because it essentially does nothing in terms of compression: the data passes through as-is. So the "shine" is the source of my confusion, because gzip and deflate shine just the same in those cases.
Pre-compressing static assets may be meaningful even for some images, and it can be tested automatically, e.g. in the build step (which is what we do with assets for high-traffic sites at work). Since you only compress once, using the highest possible compression level (even if it takes ages) may be a feasible option, as it's done once rather than on every request (though this requires web servers capable of delivering pre-compressed data as if it was compressed in-line, which IIRC not that many servers can handle).
HTTP (in-line) compression is nothing more than compression of the HTTP body, but with an algorithm capable of a single pass for both compression and decompression (i.e. streaming (de)compression, in the server and the browser respectively), as the whole Lempel-Ziv (LZ) family and its derivatives do (deflate, gzip, GIF, PNG, brotli, ...). But not all of them are useful as stream compression on the server side (can you imagine a PNG built on demand?), which is why only deflate, gzip, and brotli are supported on both sides. Furthermore, they are only applied to text, because any LZ compression applied on top of other LZ-compressed data gives very little improvement at almost the same cost, and possibly a larger size. This bypassing is what "compression offload" refers to.
As a PoC, you can fetch a text response compressed with gzip or brotli using curl and pipe it through an off-line decompressor:
curl -H 'Accept-Encoding: gzip' https://www.example.org --output -
curl -H 'Accept-Encoding: gzip' https://www.example.org --output - | gunzip -c -
curl -H 'Accept-Encoding: br' https://www.newtenberg.com --output -
curl -H 'Accept-Encoding: br' https://www.newtenberg.com --output - | brotli -d -
I would be in favor of allowing precompressed static files (gzip, brotli, zstd). One advantage here is nudging users to precompress at maximum levels: servers compressing on the fly have to weigh the cost of spending CPU cycles against power usage and transfer speed, so they choose a balanced level instead of the strictly smallest size. The downstream user gets the smallest output possible when the files are precompressed. This does use more storage, and currently Git is the mechanism for uploading static files for whatever reason, despite Git not being great for blob-like files (transfer or storage), instead of using cURL to POST an archive to some endpoint.
The lack of compression is quite noticeable when loading issue lists. For example, https://codeberg.org/forgejo/forgejo/issues is half a megabyte and download takes ~130 ms on my connection.
Is there a specific reason why dynamic content isn't compressed? If related: has this ever been reconsidered since the more performant zstd became widely supported by browsers (mid 2024)?
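A rough way to measure that, following the same curl pattern as the PoC above (numbers will vary and depend on whether the server honors the header):

curl -s https://codeberg.org/forgejo/forgejo/issues | wc -c
curl -s -H 'Accept-Encoding: gzip' https://codeberg.org/forgejo/forgejo/issues --output - | wc -c

The second command prints the raw (possibly gzipped) body size, since curl does not decompress without --compressed.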
Do note that zstd largely isn't a win for the "web" case though (such as the example of the issues page) compared to the older, more ubiquitous Brotli. Brotli was made specifically for the web platform, with a built-in dictionary tuned for HTML+CSS+JS, minimizing data usage, and better streaming to show partially loaded content. Zstd still has a place for binary files, repository archives, etc., but it isn't the best option for serving web content.