Uses gzip or Brotli. Guide to enable compression
Performance: Enable text compression #845
@fnetX It seems it's rather trivial to enable gzip; if you want, we can share the responsibility.
Unfortunately, the reverse proxy we're using doesn't support brotli.
This is weird. We should have gzip compression enabled.
Just checked:
GET /Codeberg/Community/issues/845 HTTP/2
HTTP/2 200 OK
cache-control: no-store, no-transform
content-type: text/html; charset=UTF-8
set-cookie: ...
date: Thu, 19 Jan 2023 13:02:40 GMT
strict-transport-security: max-age=63072000; includeSubDomains; preload
permissions-policy: interest-cohort=()
x-frame-options: sameorigin
x-content-type-options: nosniff
content-security-policy-report-only: ...
X-Firefox-Spdy: h2
vs
GET /assets/stlview.js HTTP/2
HTTP/2 200 OK
accept-ranges: bytes
cache-control: private, max-age=21600
content-type: text/javascript; charset=utf-8
etag: W/"NDMwM3N0bHZpZXcuanNXZWQsIDE4IEphbiAyMDIzIDIyOjIyOjQ5IEdNVA=="
last-modified: Wed, 18 Jan 2023 22:22:49 GMT
date: Thu, 19 Jan 2023 11:43:57 GMT
strict-transport-security: max-age=63072000; includeSubDomains; preload
permissions-policy: interest-cohort=()
x-frame-options: sameorigin
x-content-type-options: nosniff
content-security-policy-report-only: ...
content-encoding: gzip
vary: Accept-Encoding
X-Firefox-Spdy: h2
For some reason, only certain routes seem to use gzip.
It looks like we explicitly don't compress HTML. I wonder if someone can elaborate why ... ?
Edit: Oh, we do ... No clue then.
https://docs.haproxy.org/2.7/configuration.html#9.2-filter%20compression
It is mandatory to explicitly use a filter line to enable the HTTP compression when at least one filter other than the cache or the fcgi-app is used for the same listener/frontend/backend. This is important to know the filters evaluation order.
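For reference, the directives involved would look roughly like this (a minimal sketch, not taken from the actual kampenwand config; the frontend name, certificate path, and MIME type list are placeholders):

frontend fe_https
    bind :443 ssl crt /etc/haproxy/certs/
    # an explicit "filter compression" line is needed once another filter is in use (see the quote above)
    filter compression
    compression algo gzip
    compression type text/html text/plain text/css text/javascript application/json image/svg+xml
    default_backend be_forgejo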
But I don't understand the config of kampenwand, maybe @gapodo?
But I don't understand the config of kampenwand, maybe @gapodo?
While I (believe I) understand the kampenwand config fairly well by now, I haven't really played with compression on HAProxy (I mostly use it as an L4 load balancer with SNI routing). If there is a need to investigate, please ping me, but I'm currently fairly busy, so it'll likely be a week or two until I get the time for this. It's also not high on my priority list: compression / lower bandwidth is nice, but not using CPU cycles for compression is also nice... and it "works for my bandwidth" :D I'd need to investigate this specifically, as compression is done on either nginx or Apache in my setups.
@gapodo It would indeed help if you could have a look at the highlighted lines and see if there's an obvious problem. Other than that, I think this issue is not the highest priority. It would be better to improve caching and file sizes in general.
And... I had some unexpected time to take a look at the lines in question (and a preliminary read into brotli)
After some digging, testing, and comparing, I could not find a smoking gun, but I noticed that the Cache-Control header differs between the two responses. After some more digging... I found it in the HAProxy docs...
HAProxy 2.6 Configuration - Compression (same in the docs for 2.0, 2.1, 2.2, 2.3, 2.4, and 2.5; just change the version in the link). Quote from the linked docs section, emphasis mine...
Compression will be activated depending on the Accept-Encoding request
header. With identity, it does not take care of that header.
If backend servers support HTTP compression, these directives
will be no-op: HAProxy will see the compressed response and will not
compress again. If backend servers do not support HTTP compression and
there is Accept-Encoding header in request, HAProxy will compress the
matching response.
Compression is disabled when:
- the request does not advertise a supported compression algorithm in the "Accept-Encoding" header
- the response message is not HTTP/1.1 or above
- HTTP status code is not one of 200, 201, 202, or 203
- response contain neither a "Content-Length" header nor a "Transfer-Encoding" whose last value is "chunked"
- response contains a "Content-Type" header whose first value starts with "multipart"
- the response contains the "no-transform" value in the "Cache-control" header
- User-Agent matches "Mozilla/4" unless it is MSIE 6 with XP SP2, or MSIE 7 and later
- The response contains a "Content-Encoding" header, indicating that the response is already compressed (see compression offload)
- The response contains an invalid "ETag" header or multiple ETag headers
Which is what happens here... Forgejo likely already sets "no-transform" in the Cache-Control header (which is the correct behavior for a dynamically changing page), and that disables compression.
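This is easy to verify from the response headers. A quick check against the two URLs from the header dumps above (a sketch; the exact output depends on the live config):

curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, br' https://codeberg.org/Codeberg/Community/issues/845 | grep -iE 'content-encoding|cache-control'
curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, br' https://codeberg.org/assets/stlview.js | grep -iE 'content-encoding|cache-control'

Per the dumps above, the HTML response carries "no-transform" and no Content-Encoding, while the asset comes back with "content-encoding: gzip".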
Re. Brotli... HAProxy doesn't support brotli (and for a very good reason): it is comparatively memory-intensive (~6-10x gzip, which is already ~6x deflate). This causes exceptionally high memory utilization for compression, which is especially bad on high-traffic reverse proxies / load balancers (HAProxy is made to handle humongous amounts of traffic on a small resource budget). Brotli shines on pre-compressing static assets, but doesn't perform well for in-line compression, so adopting it should not be considered for dynamic assets, no matter what...
Brotli shines on pre-compressing static assets
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli; if you enable brotli over HTTP for JPEG, PNG, or WebP requests, you add a lot of computation for very little benefit, and you may even increase the response size. That's the reason why HTTP compression of images is usually not a general option (except for SVG, which is mainly text, unlike bitmap formats).
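A quick way to see this locally (a sketch; the file name is a placeholder):

# size of an already-compressed bitmap vs. the same file run through LZ compressors
wc -c < logo.png
gzip -9 -c logo.png | wc -c
brotli -q 11 -c logo.png | wc -c

For a typical PNG the last two numbers end up close to (or even above) the first; for an SVG they are usually much smaller.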
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli;
We may be talking about the same thing, but your double mention of brotli is somewhat messing with me... by pre-compressing static assets, I mean compress, store, and deliver the compressed item without in-line compression (basically send the .tgz (not really a .tgz, but it's an easy example) that's already stored on the disk).
Pre-compressing static assets may be meaningful even for some images, and it can be tested automatically, e.g. in the build step (which is what we do with assets for high-traffic sites at work). Since you only compress once, using the highest possible compression level (even if it takes ages) may be a feasible option, as it's done once rather than on every request (though this requires web servers capable of delivering pre-compressed data as if it was compressed in-line, which IIRC not that many servers can handle).
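A minimal build-step sketch of that idea (the paths, file types, and compression levels are assumptions, not what we actually run):

# pre-compress text assets once at the highest levels, keeping the originals alongside
find public/assets -type f \( -name '*.js' -o -name '*.css' -o -name '*.svg' \) \
    -exec gzip -k -9 {} \; \
    -exec brotli -k -q 11 {} \;

The web server then has to pick foo.js.br or foo.js.gz based on the request's Accept-Encoding header, which is the "deliver pre-compressed data as if it was compressed in-line" part.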
For clarity: brotli shines on pre-compressing static assets that are compressed with brotli;
We may be talking about the same thing, but your double mention of brotli is somewhat messing with me...
Your statement that brotli "shines on pre-compressing" did the same to me XD
... by pre-compressing static assets, I mean compress, store, and deliver the compressed item without in-line compression (basically send the .tgz (not really a .tgz, but it's an easy example) that's already stored on the disk).
Yes, I understand it the same way, but in that case brotli doesn't shine, because it essentially does nothing in terms of compression: the data passes through as-is. So the "shine" is the source of my confusion, because gzip and deflate shine just the same in those cases.
Pre-compressing static assets may be meaningful even for some images, and it can be tested automatically, e.g. in the build step (which is what we do with assets for high-traffic sites at work). Since you only compress once, using the highest possible compression level (even if it takes ages) may be a feasible option, as it's done once rather than on every request (though this requires web servers capable of delivering pre-compressed data as if it was compressed in-line, which IIRC not that many servers can handle).
HTTP (in-line) compression is nothing more than compression of the HTTP body, but with an algorithm capable of a single pass for both compression and decompression (i.e. streaming (de)compression, in the server and the browser respectively), as the whole Lempel-Ziv (LZ) family and its derivatives do (deflate, gzip, GIF, PNG, brotli, ...). But not all of them are useful as stream compression on the server side (can you imagine a PNG built on demand?), which is why only deflate, gzip, and brotli are supported on both sides. Furthermore, they are only applied to text, because any LZ compression applied on top of other LZ-compressed data gives very little improvement at almost the same cost, and possibly a larger size. This bypassing is what "compression offload" refers to.
As a PoC, you can fetch a text response compressed with gzip or brotli using curl and pipe it through an off-line decompressor:
curl -H 'Accept-Encoding: gzip' https://www.example.org --output -
curl -H 'Accept-Encoding: gzip' https://www.example.org --output - | gunzip -c -
curl -H 'Accept-Encoding: br' https://www.newtenberg.com --output -
curl -H 'Accept-Encoding: br' https://www.newtenberg.com --output - | brotli -d -
I would be in favor of allowing precompressed static files (gzip, brotli, zstd). One advantage here is nudging users to precompress at maximum levels: servers compressing on the fly have to weigh the cost of spending CPU cycles against power usage and transfer speed, so they choose a balanced level instead of the strictly smallest size. The downstream user gets the smallest output possible when the files are precompressed. This does use more storage, and currently Git is the mechanism for uploading static files for whatever reason, despite Git not being great for blob-like files (transfer or storage), instead of using cURL to POST an archive to some endpoint.
The lack of compression is quite noticeable when loading issue lists. For example, https://codeberg.org/forgejo/forgejo/issues is half a megabyte and download takes ~130 ms on my connection.
Is there a specific reason why dynamic content isn't compressed? If related: has this ever been reconsidered since the more performant zstd became widely supported by browsers (mid 2024)?
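A rough way to measure that, following the same curl pattern as the PoC above (numbers will vary and depend on whether the server honors the header):

curl -s https://codeberg.org/forgejo/forgejo/issues | wc -c
curl -s -H 'Accept-Encoding: gzip' https://codeberg.org/forgejo/forgejo/issues --output - | wc -c

The second command prints the raw (possibly gzipped) body size, since curl does not decompress without --compressed.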
Do note that zstd largely isn't a win for the "web" case though (such as the example of the issues page) compared to the older, more ubiquitous Brotli. Brotli was made specifically for the web platform, with a built-in dictionary tuned for HTML+CSS+JS, minimizing data usage, and better streaming to show partially loaded content. Zstd still has a place for binary files, repository archives, etc., but it isn't the best option for serving web content.