This issue summarizes the current state of the inaccessible captcha that is still enabled for registration on Codeberg.org. To get everyone on board about the problem:
- The captcha prevents people with limited vision (whether blind or otherwise having difficulty deciphering the distorted text) from registering at Codeberg.org without help from our admins. While we are happy to help, requiring assistance for the very first step on Codeberg discriminates against these users and is demotivating, especially when users only want to "quickly" report an issue to a project.
- The captcha is not very effective anyway, because it's rather easy to solve. However, it does reduce spam attacks. Without a captcha, mass-registering accounts on Codeberg is as easy as submitting the form and parsing the activation link from emails using a simple pattern. It is low-hanging fruit, and anyone can get started with this within a few minutes. With a captcha, you need to invest about an hour to integrate with a captcha solver, so spam is reduced to those who find the time and motivation to do this. This takes pressure off our moderation team, who then only need to deal with occasional spam waves.
- We have considered many options over the years, such as proof-of-work based throttling. However, most options are not very effective, not well maintained, still inaccessible (not by design, but because no one made them accessible), privacy-invasive, proprietary, or require JavaScript.
- We invested effort, for example into an mcaptcha integration, but we decided it was not a viable option (for some reason I personally don't remember).
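For readers unfamiliar with the term, proof-of-work based throttling can be sketched roughly as follows. This is a minimal illustration, not what Codeberg or mcaptcha actually implement: the function names, the SHA-256 choice and the `DIFFICULTY_BITS` value are all assumptions made for the example. The server hands out a random challenge, the client has to find a nonce whose hash starts with enough zero bits, and the server checks the result with a single hash.

```python
import hashlib
import secrets

# Hypothetical difficulty: about 2**16 hash attempts on average per registration.
DIFFICULTY_BITS = 16


def new_challenge() -> str:
    """Server side: issue a random challenge string for one registration attempt."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge: str, nonce: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits


def solve(challenge: str, bits: int = DIFFICULTY_BITS) -> int:
    """Client side: brute-force a nonce that satisfies the difficulty."""
    nonce = 0
    while not verify(challenge, nonce, bits):
        nonce += 1
    return nonce
```

Note that this only adds CPU cost per registration. A script can run `solve` just as well as a browser can, and in a browser the solving step would need JavaScript, which matches the drawbacks listed above.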
Any replacement needs to require a similar amount of engineering effort to automate registrations on Codeberg as the captcha does today, and our aim is to get rid of the captcha completely. Our most recent idea can be found at forgejo/forgejo#6966.
Message quotes from our Matrix channel about the problem:
The only possible solution is to drop the captcha entirely, and that has been both the goal and the plan for the past years.
We invested effort into other captcha solutions and implemented support for some others in Forgejo (e.g. mcaptcha), but nothing really worked out. You can trace the experiments back to https://codeberg.org/Codeberg-Infrastructure/CaptchaService :)
Dropping the captcha implies that we need better spam tools. We expected to get there by April 2025. The current progress is that we have a Forgejo Guardian which does an excellent job at containing SEO spam already.
As the recent spam waves showed, the current captcha is not hard to break. People can do this with a few hours of engineering effort.
However, we know that when we drop the captcha, the threshold is lowered by a lot, up to the point where spamming user accounts on Codeberg is as easy as filling in the registration form and extracting the activation link from the email. Unfortunately, that means we will see similar campaigns on a weekly or even daily basis. Basically each time someone doesn't like us and finds five minutes to write a shell script.
So the requirement for dropping the captcha is not necessarily another captcha, but something that forces people to spend at least an hour of work to implement OCR, send data to a service, or otherwise solve some technical difficulties.
If someone implements 100 different templates for the registration form and email that work the same but are hard to automate, this could also work :D
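As a rough illustration of that last idea, here is one narrow way a form could "work the same" for humans while changing shape for scripts: the submitted field names are derived per session, so a script cannot hard-code them. This is purely a hypothetical sketch. `SERVER_KEY`, `LOGICAL_FIELDS` and all function names are invented for the example and are not taken from Forgejo, and a determined attacker could still locate fields by their visible labels or position.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real service would load it from configuration.
SERVER_KEY = secrets.token_bytes(32)
# Hypothetical logical field names of the registration form.
LOGICAL_FIELDS = ("user_name", "email", "password")


def field_alias(session_id: str, field: str) -> str:
    """Derive the field name that is actually sent to the browser for this session."""
    mac = hmac.new(SERVER_KEY, f"{session_id}:{field}".encode(), hashlib.sha256)
    return "f_" + mac.hexdigest()[:12]


def render_field_names(session_id: str) -> dict[str, str]:
    """Map each logical field to its per-session alias when rendering the form."""
    return {field: field_alias(session_id, field) for field in LOGICAL_FIELDS}


def parse_submission(session_id: str, form: dict[str, str]) -> dict[str, str]:
    """Map a submitted form back to logical field names, ignoring unknown keys."""
    aliases = render_field_names(session_id)
    return {field: form[alias] for field, alias in aliases.items() if alias in form}
```

Visible labels stay untouched in this sketch, so screen readers would see the same form as before; only the machine-facing names change.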
I appreciate the effort and your help in bringing this forward. It is embarrassing that this topic has been open for so long. The difficulty is the rather complex dependencies on spam detection, moderation tooling, denial of service vectors and much more.