So the system will honour my self-hosted avatar rather than giving me a digital drawing as in https://codeberg.org/strk ...
Enable Libravatar support #48
Do you know if it is possible to enable libravatar support without enabling gravatar.com support?
What I tried is to set:
DISABLE_GRAVATAR = true
ENABLE_FEDERATED_AVATAR = true
As soon as Gravatar is enabled, I see XHR requests to gravatar.com (as tested on codeberg-test.org).
> [...] rather than giving me a digital drawing as in https://codeberg.org/strk ...
Have you tried uploading your picture on your personal profile settings page?
This is what you want:
[picture]
DISABLE_GRAVATAR = false
GRAVATAR_SOURCE = https://seccdn.libravatar.org/avatar/
ENABLE_FEDERATED_AVATAR = true
Setting DISABLE_GRAVATAR = true, as you do, disables the whole concept of fetching avatars from anywhere other than the same host. I know the naming is not really correct, but it is set to that for historical reasons.
@strk
That seems to work. I did it on codeberg-test.org
I am not really sure if this is what I want.
Since this is federated, I think it might be better if we host our own Libravatar server/proxy (or whatever that is called).
I like the fact that when browsing Codeberg, uMatrix only shows codeberg.org as a source and no third-party service ;)
What do others think?
@hw ?
Once you set up your own Libravatar server you can change GRAVATAR_SOURCE, but the important thing here is ENABLE_FEDERATED_AVATAR, which would not even hit the GRAVATAR_SOURCE IFF the domain of the user's email is configured for Libravatar support.
Check how my avatar is served directly by my mail domain's service.
To see how it works, look at my avatar here with your browser's debug tools: https://gitea.com/strk/
@strk
on https://gitea.com/strk/ I saw that your avatar was served from avatars.kbt.io in my browser tools, after I allowed xhr requests to that domain.
I am really not sure how the federation is supposed to work, but it seems to me something that the avatar servers do - but not gitea itself. So running our own server seems best to me.
But then again, maybe I missed something...
...
Also I saw blocked calls to www.googletagmanager.com all over gitea.com - well .com meets .com - good match.
@strk
Sorry, I missed that avatars.kbt.io is your domain.
Well, I know this is debatable, but making all Codeberg users' browsers do XHR requests to different servers (e.g. avatars.kbt.io) allows the server owners to collect the IPs of Codeberg users. I don't think that this is good.
Is there something like a libravatar proxy we can host which does the federation - so codeberg users only communicate with codeberg, and their IP remains hidden to avatar servers?
@strk : in my personal opinion XHR requests for every avatar display in repo history or even discover page display don't seem right, as this implies significant abuse/tracking potential, which is clearly against our mission.
If avatar images were cached on disk (loaded once as the profile picture and then refreshed or updated either on request or at intervals that don't allow tracking), then I think enabling support is a no-brainer. Would you like to volunteer for such a patch that stores images to file to inhibit continuous tracking traffic to third-party servers?
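For anyone picking this up, here is a minimal sketch of the "cache on disk, refresh at intervals" idea from the comment above. It is not how Gitea works today: the class name `AvatarDiskCache`, the `fetch` callback and the 24-hour default are all assumptions made for illustration. The point is only that the external avatar server is contacted once per refresh interval, no matter how many visitors view the page.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable


class AvatarDiskCache:
    """Hypothetical helper: keep fetched avatars on disk, re-fetch after max_age."""

    def __init__(self, directory: Path, fetch: Callable[[str], bytes],
                 max_age: float = 24 * 3600,
                 clock: Callable[[], float] = time.time) -> None:
        self.directory = Path(directory)
        self.fetch = fetch        # assumption: downloads the image for an email hash
        self.max_age = max_age    # refresh interval in seconds
        self.clock = clock        # injectable so the behaviour can be tested
        self.fetched_at = {}      # email hash -> time of the last external fetch

    def get(self, email_hash: str) -> bytes:
        # email_hash is a hex digest, so it is safe to use as a file name.
        path = self.directory / email_hash
        stamp = self.fetched_at.get(email_hash)
        fresh = stamp is not None and self.clock() - stamp < self.max_age
        if not (fresh and path.exists()):
            # Only this branch talks to the external avatar server.
            path.write_bytes(self.fetch(email_hash))
            self.fetched_at[email_hash] = self.clock()
        return path.read_bytes()
```

With something like this in front of the avatar lookup, the remote server sees one request per avatar per interval, coming from the forge's own IP rather than from visitors.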
@ashimokawa yes you are right that avatar server owners will be able to track who views/fetches avatars from that server. In my case that's me tracking who views/fetches my avatar (the only avatar on my server). And this ability is given to anyone in control of their email domain.
Those who cannot control their email domain can still upload an image to the Gitea server OR to the avatar server of Gitea server owner's choice (I suggested libravatar.org servers, but you can of course install your own server, libre libravatar servers being available).
I'd like that users would also be able to directly specify an avatar URL in their profile settings, so to be able to decide where to get users to fetch their avatar from, but this is currently not available in Gitea (I might have submitted an issue for that)
@hw: to recap, the tracking potential is given to the user as a choice. Note that recent Libravatar server implementations do cache on disk the avatars which are looked up from the fallback (as opposed to those looked up from the email owner's configured domain). The browser itself will usually cache those images too. Again, when the user controls her mail domain we're talking about a 1st-party server (the server of the user) rather than a 3rd-party server. Just install a Libravatar server and you'll become the 2nd party (for the user), serving avatars for whoever doesn't use their own domain.
In reply to:
> Is there something like a libravatar proxy we can host which does the federation - so codeberg users only communicate with codeberg, and their IP remains hidden to avatar servers?
You can install a caching libravatar server with a fallback to libravatar.org instance.
The libravatar client (gitea) would then:
1. See if user's domain asks to fetch avatar from a specific server, and use that one in an `<img>` tag (no `XHR`) sent to browser.
2. Failing the above, use a "local" server (on the same domain as gitea) in the `<img>` tag sent to browser.
3. The local server would see if an avatar for that email hash exists locally, if not, it will fetch one from the fallback (in your case it could be libravatar.org) and cache it locally, goto 2.
4. IFF libravatar.org would receive the fetch request from point 3, it will see if an avatar for that email hash exists locally (either as first-class citizen or cache) and serve it. Failing to find it locally it will fetch it from its own fallback (usually gravatar.com) and cache it locally, returning it to the requestor.
So in the worst case tracking would be as follows:
1. The user's email name server would receive a request from the Gitea server (max once a day) to check if an avatar server is configured
2. [failing the above] avatar.codeberg.org would receive a request from the user's browser for the avatar associated with a hash
3. [failing to find in cache] cdn.libravatar.org would receive a request from avatar.codeberg.org
4. [failing to find in cache] realserver.libravatar.org would receive a request from cdn.libravatar.org
5. [failing to find in cache] gravatar.com would receive a request from realserver.libravatar.org
After one pass through the above pipeline, the user's avatar will be found (cached) in step 2.
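To make the chain easier to follow, here is a toy model of it. Everything in it is illustrative: `AvatarServer`, the `seen` list and the hash `abc123` are made up, and real Libravatar servers speak HTTP rather than calling each other's methods. It only shows which party sees which requester, and that a second lookup stops at the local server.

```python
from typing import Dict, List, Optional


class AvatarServer:
    """Toy avatar server with a local store and an optional fallback server."""

    def __init__(self, name: str, fallback: Optional["AvatarServer"] = None) -> None:
        self.name = name
        self.fallback = fallback
        self.store: Dict[str, bytes] = {}  # first-class avatars and cached copies
        self.seen: List[str] = []          # who contacted this server (illustration only)

    def get(self, email_hash: str, requester: str) -> Optional[bytes]:
        self.seen.append(requester)
        if email_hash in self.store:
            return self.store[email_hash]
        if self.fallback is None:
            return None
        # The fallback only ever sees this server, never the original requester.
        image = self.fallback.get(email_hash, requester=self.name)
        if image is not None:
            self.store[email_hash] = image  # cache locally for the next lookup
        return image


gravatar = AvatarServer("gravatar.com")
libravatar = AvatarServer("libravatar.org", fallback=gravatar)
local = AvatarServer("avatar.codeberg.org", fallback=libravatar)
gravatar.store["abc123"] = b"<image bytes>"

local.get("abc123", requester="browser-1")  # walks the whole chain once
local.get("abc123", requester="browser-2")  # answered from the local cache
```

After these two calls, `gravatar.seen` and `libravatar.seen` each hold a single entry naming the server one hop closer to the user, while only `local.seen` lists the browsers.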
Useful pointer: https://github.com/go-gitea/gitea/issues/6046
> yes you are right that avatar server owners will be able to track who views/fetches avatars from that server. In my case that’s me tracking who views/fetches my avatar (the only avatar on my server). And this ability is given to anyone in control of their email domain.
> [...]
> to recap, the tracking potential is given to the user as a choice.
I have to disagree here. Avatars are fetched all the time when someone sees that user in a commit list, user list, issue comment, whatever. Basically, an active Codeberg user can track a big chunk of users' IPs if he has control over his avatar URL, even if they do not visit his profile directly. I do not think you have such intentions, but I want to highlight the abuse potential.
> You can install a caching libravatar server with a fallback to libravatar.org instance.
> The libravatar client (gitea) would then:
> 1. See if user's domain asks to fetch avatar from a specific server, and use that one in an `<img>` tag (no `XHR`) sent to browser.
> 2. Failing the above, use a "local" server (on the same domain as gitea) in the `<img>` tag sent to browser.
> 3. The local server would see if an avatar for that email hash exists locally, if not, it will fetch one from the fallback (in your case it could be libravatar.org) and cache it locally, goto 2.
> 4. IFF libravatar.org would receive the fetch request from point 3, it will see if an avatar for that email hash exists locally (either as first-class citizen or cache) and serve it. Failing to find it locally it will fetch it from its own fallback (usually gravatar.com) and cache it locally, returning it to the requestor.
The problem lies in 1. If we can prevent that and proxy that through avatar.codeberg.org, I am convinced.
EDIT: To make that clear, I think it is okay if an avatar server owner sees anonymized codeberg request, but I do not think it is ok if an avatar server owner sees IP, browser, referer etc.
When any user visits a user list, the DNS lookup is done by the Gitea server, so the only IP received by the user's email domain name server is the one of the Gitea server.
You are right that the actual avatar server (discovered via DNS lookup) will receive the IP of the fetcher, so if the fetcher is the browser, it will receive the IPs of the users. If you want to restrict the number of servers receiving the user's IP, then the best choice would be to have Gitea fetch the avatar and serve it (proxy it) instead of redirecting the browser there. The cost of that would be higher traffic and disk occupation.
Disabling "federated avatars" and simply specifying a local server for GRAVATAR_SOURCE would prevent using user-determined servers, because the Gravatar API only takes a hash, so the server never gets the user's domain and is thus unable to do the DNS lookup. User-determined servers are what I would like to have (I don't want to be forced to upload my avatar all over the world).
You can refer to the issue link I pasted before to track or tackle implementation of that kind of proxying in Gitea itself.
> the best choice would be to have Gitea fetch the avatar and serve it
Yeah, that's exactly the "caching and storing avatar on disk"-model above.
> The cost of that would be higher traffic and disk occupation
Pretty much unchanged relative to "normal" user avatars, with the advantage that all data goes through the same HTTP connection (with keep-alive or HTTP2), so higher performance for all users?
> You can refer to the issue link I pasted before to track or tackle implementation of that kind of proxying in Gitea itself
Yes, this is on point. If the current behaviour would be changed to store the avatar file on disk and serve it just like normal avatar pictures, pretty much all issues above would be solved(*)...
(*) one exception tho, Codeberg.org's current ToS and privacy statement declare that no user data is shared with external services. As this is an opt-in, the disclaimer seems appropriate on the profile settings page right where a user is enabling the avatar service. Adjusting privacy statement and properly explaining in layman's terms how this works seems appropriate.
We would also have to ensure that external services are technically blocked and unable to set user cookies (with the current clientside/browser-based XHR avatar model they can), as ePrivacy regulation ("Cookie directive") allows only plain session cookies without explicit opt-in of users.
> as this implies significant abuse/tracking potential, which is clearly against our mission.
> privacy statement declare that no user data is shared with external services.
+1. I will simply block external requests if you add Gravatar.
Do I understand correctly that we are generally positive towards federated avatars as long as
- we don't use centralized proprietary services
- there are no requests to third parties (= a media proxy is in place)
?
Contribution welcome then, I guess.
Wouldn't this leak email hashes? If so it should be opt-in and with a warning.
I know this debate was a while ago, but I would like to throw my hat in the ring and second @n on this topic.
> Wouldn't this leak email hashes? If so it should be opt-in and with a warning.
It would send SHA256 hashes of e-mail addresses to libravatar servers, yes. I don't see that as a problem, since users of libravatar have opted in to matching the hash with the e-mail address and other users won't enable the "use libravatar" setting.
Not entirely sure if that is the issue to ask/discuss this or if there is a different issue (or one should be made), but would this solve commits from remote not showing your avatar?
Commits made through Codeberg do show my avatar but any made remotely through git only show a placeholder one... Not sure if I need to configure something as on GitHub this worked either way, so idk if it is something related to e-mail or some other identifier...
Any commits associated with verified email addresses on your account will appear as being authored by you in the web interface.
> Any commits associated with verified email addresses on your account will appear as being authored by you in the web interface.
Not entirely sure where I can check that.
If it is under the account tab, then I have my primary e-mail set there and it should also be my main e-mail in git.
It's worth noting that I created the account through GitHub Auth, so certain settings have been set for me already.
> Wouldn't this leak email hashes? If so it should be opt-in and with a warning.
> It would send SHA256 hashes of e-mail addresses to libravatar servers, yes. I don't see that as a problem, since users of libravatar have opted in to matching the hash with the e-mail address and other users won't enable the "use libravatar" setting.
Moreover in my case the "libravatar server" would basically be run by me so I'd be sending the SHA256 hash of MY EMAIL to MY SERVER, so there would be no privacy issue at all.
I'm still missing my avatar, as you can probably see here :)
> Contribution welcome then, I guess.
Where would such a contribution be sent? Is there an Ansible script that determines the configuration of the codeberg.org Gitea (or Forgejo) instance?
@strk
> on https://gitea.com/strk/ I saw that your avatar was served from avatars.kbt.io in my browser tools, after I allowed xhr requests to that domain. I am really not sure how the federation is supposed to work
It works like this: the service (Gitea / Gogs / Forgejo in this case) wants to display a picture associated with an email address. It queries DNS to find "the avatar server" for the domain part of the email address. Then it asks that server for the avatar associated with a hash of the email address.
This means:
- If you control the domain of your email address the hash of your email is only sent (by the visitor's browser) to a server you specify in your DNS zone.
- The avatar server (that you specified via DNS) will get an HTTP request every time any user visits a page on the Gitea/Gogs/Forgejo service which wants to show your avatar and you don't have it already cached.
In practical terms by enabling Libravatar you ALLOW users of your service to have a centralized control of their own avatar AND in the current implementation you also allow them to TRACK your service users.
I've filed an upstream ticket for fixing the tracking part here:
https://github.com/go-gitea/gitea/issues/23525
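As an illustration of the lookup described in this comment (DNS first, then a request keyed by the email hash), here is a rough sketch of how a client could build the avatar URL. It assumes the Libravatar convention of an `_avatars-sec._tcp` SRV record on the email domain and a SHA256 hash of the lower-cased address. The function name `federated_avatar_url` and the `srv_lookup` callback are hypothetical, and the DNS query itself is injected because Python's standard library has no SRV resolver. The default fallback is the host from the GRAVATAR_SOURCE value quoted earlier in this thread.

```python
import hashlib
from typing import Callable, Optional, Tuple

# Hypothetical resolver type: maps an SRV record name to (host, port), or None.
SrvLookup = Callable[[str], Optional[Tuple[str, int]]]


def federated_avatar_url(email: str, srv_lookup: SrvLookup,
                         fallback: str = "https://seccdn.libravatar.org") -> str:
    address = email.strip().lower()
    domain = address.rsplit("@", 1)[1]
    digest = hashlib.sha256(address.encode("utf-8")).hexdigest()

    # Ask the email domain whether it advertises its own avatar server.
    target = srv_lookup(f"_avatars-sec._tcp.{domain}")
    if target is None:
        base = fallback  # the domain knows nothing about avatars
    else:
        host, port = target
        base = f"https://{host}" if port == 443 else f"https://{host}:{port}"
    return f"{base}/avatar/{digest}"
```

Whoever fetches the returned URL is the party the avatar server gets to see: the visitor's browser if the URL is placed in an `<img>` tag, or the forge itself if it fetches and proxies the image.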
I already typed out a (very) long reply here earlier but Codeberg swallowed it with a 500 error when posting. 😭
Again, but very brief now:
@strk I asked internally how to proceed with this. I'm willing to try and help bring this forward but time is (always) limited.
Did I understand this correctly that we don't need to wait for Gitea itself to implement the media proxy cache, but can also run a Libravatar server ourselves and just always reference this?
If yes, then the plan would be to get this service set up and configure Codeberg-Test to test this setup. Afterwards, update our privacy policy to mention that email hashes will be sent to third-party servers when the option for this is enabled in the settings (this option already exists, right?)
We might have an additional complication here with Weblate (https://translate.codeberg.org) as well. That fetches avatars from codeberg.org with a custom patch. But if I'm understanding this correctly, neither Codeberg nor the Libravatar proxy will have all avatars stored if this is implemented? Maybe we do need Gitea to cache the avatars for this after all.
(Weblate also has Libravatar support and does implement this caching. But this isn't really helping us here either, I think.)
> Did I understand this correctly that we don't need to wait for Gitea itself to implement the media proxy cache, but can also run a Libravatar server ourselves and just always reference this?
Yes, you can run a libravatar server yourself.
The libravatar server of your own would query the DNS and query the DNS-advertised avatar server OR decide what to do (generate random?) in case DNS knows nothing about avatars.
> Afterwards update our privacy policy to mention that email hashes will be sent to third party servers when the option for this is enabled in the settings (this option already exists right?)
Option is available. It's the GRAVATAR URL, nothing specific to libravatar as the DNS lookup would be done by your server, not by gitea itself.
Out of curiosity, if a proxy of sorts is planned for Libravatar, would it also be viable to extend said support to Gravatar as well? I was intending to use Gravatar to more easily manage profile pictures across various stuff, but noticed that it's disabled here via the auto-generated avatar.
Tho it does seem that Libravatar on its own does already proxy Gravatar, but only on MD5; SHA256 seems to not be proxied.
Existing proxies usually already fall back to an external service for fetching the avatar when DNS does not contain other pointers. The libravatar.org web service itself does this, for example (not sure it is still available).
Upstream (Gitea) issue to implement functionality to have Gitea itself ACT as a proxy is here: https://github.com/go-gitea/gitea/issues/23525 but no news so far and I still think codeberg.org could just enable the DNS lookups and let users track visitors of their avatars.
Alternatively, codeberg.org could deploy their own libravatar service and point GRAVATAR_SOURCE there, see https://wiki.libravatar.org/running_your_own/
This issue is best discussed in the infrastructure repository: Codeberg-Infrastructure/build-deploy-forgejo#126