When I try to post a comment to gitnex/GitNex#57, the server responds with an HTTP 500 error page, and the comment does not get saved.
Attempting to post comment in a specific issue results in HTTP 500 error #1092
Server logs:
2023/06/30 00:29:53 ...rs/web/repo/issue.go:2851:NewComment() [E] [649e21fd-119] CreateIssueComment: Error 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Wow. This is probably related to forgejo/forgejo#220
How far back do you keep logs? Would it be possible to check if others have also failed to comment on this issue, or in the repo?
The problem appears to be serious, definitely. From the logs, it appears to happen in bursts, but my suspicion is that certain parts of the data structure are broken (as in: causing deadlocks), and people retry a few times, then give up.
I am currently trying to comment in the linked Forgejo thread, but it is ... well ... broken.
I see :/
> From the logs, it appears to happen in bursts
Does that mean that when it happens, it happens across the instance or some other large group of issues, and also that retrying later (e.g. in a few hours) it can succeed?
I matched the occurrences with the access logs. It happens on the same routes, so the bursts probably come from a single user who retries again and again and eventually gives up.
For example, a burst of problems appeared today, all in the GitNex repo. I suppose this was you :)
The problem is that the "Comment" and "Label" db code is quite complex and not transaction-safe (no strict locking order).
No easy fix at the moment, but in 1.21 the forms will be submitted by AJAX, so no content would be lost anymore (as the log says: restart the transaction by re-submitting).
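To make the "restart the transaction" idea concrete, here is a minimal Go sketch of a bounded retry on error 1213. It is not Gitea's or Forgejo's code: `dbError`, `isDeadlock` and `retryOnDeadlock` are hypothetical names made up for illustration, and a real driver would report the error number through its own error type.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// dbError is a hypothetical stand-in for a driver error that carries a
// MySQL/MariaDB error number. It is not a real driver type.
type dbError struct {
	Number  uint16
	Message string
}

func (e *dbError) Error() string {
	return fmt.Sprintf("Error %d: %s", e.Number, e.Message)
}

// errDeadlock is the error number seen in the server log (1213).
const errDeadlock = 1213

// isDeadlock reports whether err wraps a dbError with number 1213.
func isDeadlock(err error) bool {
	var dbErr *dbError
	return errors.As(err, &dbErr) && dbErr.Number == errDeadlock
}

// retryOnDeadlock runs fn up to maxAttempts times and doubles the delay
// between attempts. Any error other than a deadlock is returned at once.
func retryOnDeadlock(maxAttempts int, baseDelay time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn()
		if err == nil || !isDeadlock(err) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryOnDeadlock(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			// Simulated deadlock on the first two attempts.
			return &dbError{Number: errDeadlock, Message: "Deadlock found when trying to get lock"}
		}
		return nil
	})
	fmt.Println("attempts:", calls, "error:", err)
}
```

A retry like this only helps when the deadlock is transient. If the same request keeps deadlocking, the loop just gives up later.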
@wxiaoguang I think AJAX is no solution here. The issue persisted for about 20 retries over 10 minutes, so AJAX would need to be really stubborn and retry for a long time.
As an aside, will issue comments break for users without javascript? 🤔
> No easy fix at the moment, but in 1.21 the forms will be submitted by AJAX then no content would be lost anymore
I think no content should be lost even right now, because pressing the browser back button gets me back to the comments page, with my comment still in the editor. The browser saves the editor contents somewhere, probably.
But on the other hand, that change is very interesting, can't wait for it! Right now commenting (and other actions) means reloading the whole page, and with a long thread on a very slow connection, that takes time.
> @wxiaoguang I think AJAX is no solution here. The issue persisted for about 20 retries over 10 minutes, so AJAX would need to be really stubborn and retry for a long time.
Not just AJAX, but the user would also need to be patient enough not to close the tab when it's not working.
> I think AJAX is no solution here.
Yup, it's not the best solution, it's only a workaround for losing contents.
> The issue persisted for about 20 retries over 10 minutes, so AJAX would need to be really stubborn and retry for a long time.
There could be 2 cases IMO:
- A single user's submission causes the deadlock on its own; then there might be a serious problem in the code.
- Multiple users submit requests at the same time; then the non-strict transaction locking order causes the problem.
I have no idea what the root problem is at the moment.
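For the second case, a toy illustration of what a "strict locking order" buys may help. The Go sketch below uses in-memory mutexes instead of database row locks, and `row`, `updateBoth` and `runConcurrent` are hypothetical names. It only shows the general technique (always take locks in one global order) and is not how Gitea or Forgejo handle transactions.

```go
package main

import (
	"fmt"
	"sync"
)

// row is a hypothetical stand-in for a lockable record (an issue, a label, ...).
type row struct {
	id    int
	mu    sync.Mutex
	value int
}

// updateBoth changes two rows while holding both locks. It always locks the
// row with the lower id first, so two callers that pass the same rows in
// opposite order cannot each hold one lock while waiting for the other.
func updateBoth(a, b *row, delta int) {
	first, second := a, b
	if second.id < first.id {
		first, second = second, first
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	if second != first {
		second.mu.Lock()
		defer second.mu.Unlock()
	}
	a.value += delta
	b.value -= delta
}

// runConcurrent starts n pairs of goroutines that touch the same two rows
// in opposite argument order and returns the final values.
func runConcurrent(n int) (int, int) {
	x, y := &row{id: 1}, &row{id: 2}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); updateBoth(x, y, 1) }()
		go func() { defer wg.Done(); updateBoth(y, x, 1) }()
	}
	wg.Wait()
	return x.value, y.value
}

func main() {
	x, y := runConcurrent(1000)
	fmt.Println("finished without deadlock:", x, y)
}
```

If `updateBoth` instead locked `a` and then `b` in argument order, the two goroutines in each pair could block each other forever. That is the in-memory analogue of the database deadlock in the log.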
> As an aside, will issue comments break for users without javascript? 🤔
There are already many components depending on JS, I do not think a user could make "changes (POST)" in all cases without JS nowadays, but they could still view (read) the pages without JS in many cases.
For example, IIRC the "close comment" button has been 100% JS for a very long time.
I just got the same issue on the gadgetbridge repository. Tried to submit a new device request, and got a 500 error.
I can confirm that I also get a 500 error trying to open an issue in the Gadgetbridge repository.
The failed request was at 2023-07-04 17:24:00 UTC. I am assuming it's the same issue; could someone confirm from the logs?
@sploinga @joserebelo The only problem we can see is that your content is apparently too long. Did you try to post massive amounts of text? We cannot find a correlation to a deadlock. We'll investigate the other issue tomorrow.
@fnetX I copy-pasted the same issue message that @sploinga was trying to submit and did not even consider that. It does include a somewhat large log file inline, which is probably what causes the problem.
So it should not be related to this issue, apologies for the noise.
I also got a 500 error response when trying to comment at URL davidak/nixos-config#17.
Text was:
(55 characters, so not too long)
After some time and many tries, it worked. But sending this also failed often.
Someone needs to take a look at MariaDB's InnoDB transaction error log to see how the deadlock happens.
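One place to look is the output of the `SHOW ENGINE INNODB STATUS` statement, which has a `LATEST DETECTED DEADLOCK` section. As a rough sketch, assuming that output has been saved to a text file and that sections are delimited by lines of dashes, a small Go helper could pull out just that section. `latestDeadlock` and `isRule` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// isRule reports whether a line consists only of dashes, which is how
// section headers in the status output are assumed to be framed.
func isRule(line string) bool {
	t := strings.TrimSpace(line)
	return t != "" && strings.Trim(t, "-") == ""
}

// latestDeadlock returns the body of the "LATEST DETECTED DEADLOCK" section
// of a saved status dump, or false if the section is not present.
func latestDeadlock(status string) (string, bool) {
	lines := strings.Split(status, "\n")
	for i, line := range lines {
		if strings.TrimSpace(line) != "LATEST DETECTED DEADLOCK" {
			continue
		}
		var body []string
		headerClosed := false
		for _, l := range lines[i+1:] {
			if isRule(l) {
				if !headerClosed {
					headerClosed = true // rule that closes the section title
					continue
				}
				break // rule that opens the next section title
			}
			headerClosed = true
			body = append(body, l)
		}
		return strings.TrimSpace(strings.Join(body, "\n")), true
	}
	return "", false
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the path of a saved status dump")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	if section, ok := latestDeadlock(string(data)); ok {
		fmt.Println(section)
	} else {
		fmt.Println("no deadlock section found")
	}
}
```

The section normally lists the two transactions involved and the locks each held or waited for. That should show which statements in the comment and label code take locks in conflicting order.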