Codeberg/Community

migration from github seems to use incorrect pagination requests #1934

Open
opened 2025-05-15 11:05:17 +02:00 by martenson · 7 comments

Comment

I am in the process of creating a mirror for https://github.com/galaxyproject/galaxy/ and am encountering this issue, which seems to be caused by improper pagination handling for larger GitHub projects.

```
error while listing repos: GET https://api.github.com/repos/galaxyproject/galaxy/issues?direction=asc&page=218&per_page=45&sort=created&state=all: 422 Pagination with the page parameter is not supported for large datasets, please use cursor based pagination (after/before) []
```

Another problem with migrating issues, not sure if related to the above.

```
Migrating from https://github.com/galaxyproject/galaxy failed.
Error 1062 (23000): Duplicate entry '508493-1' for key 'UQE_issue_repo_index'
```

Is there a good reason to make a mirror for such a huge repository (including issues)? It's very likely to hit the [new storage limits](https://blog.codeberg.org/new-storage-limits-on-codeberg-what-you-need-to-know.html).

I guess the question is whether Codeberg wants to host large and complex software projects. I'm sure we'd be fine with paying for some extra storage or service, but if this is a hard limit and large projects are off the table, that would also be good to have confirmed.

@martenson wrote in #1934 (comment):

> I guess the question is whether codeberg wants to host large and complex software projects

I am not sure how that is related, unless Galaxy is in the process of moving to Codeberg or is considering it. This request read more like someone wanting to create a mirror of a large repository (this happens quite frequently with the Linux kernel) without any real purpose behind it.

I am one of the Galaxy maintainers and am exploring whether Codeberg could be a reasonable swap-in for our code infrastructure and what is missing. This is my private initiative; I have no mandate to speak for the project on this topic.

I see, thank you for your clarification. It should be possible to start a migration without migrating the issues and pull requests. I am not sure how easy it would be to modify the migration code to use this new cursor pagination for the GitHub migration.

Thank you for your report, and we would be happy to welcome you to Codeberg.

Without having looked very closely, it seems that GitHub has changed its limits on exporting large projects and now requires a different method. (The reasoning for this is not clear to me: I can't see a performance or other technical difference between specifying a page and page size and defining the start and end manually.)

The result is the same. I doubt that there are human resources available to modify our migration code in the near future, so I suppose you are trapped with GitHub for the moment and a migration to Codeberg is currently not possible. We hope you check back later, maybe in a year or two?
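To make the two request styles concrete, here is a small sketch comparing page-based and cursor-based listing over an in-memory SQLite table. Everything in it (the `issue` table, `by_page`, `by_cursor`, `walk_by_cursor`) is a hypothetical illustration; nothing in this thread says how GitHub or Forgejo actually implement pagination.

```python
import sqlite3


def make_db(n):
    """Create an in-memory table with issue ids 1..n (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE issue (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO issue (id, title) VALUES (?, ?)",
        [(i, f"issue {i}") for i in range(1, n + 1)],
    )
    return conn


def by_page(conn, page, per_page):
    """Page-based listing: skip (page - 1) * per_page rows, then take per_page."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT id FROM issue ORDER BY id LIMIT ? OFFSET ?", (per_page, offset)
    )
    return [r[0] for r in rows]


def by_cursor(conn, after, per_page):
    """Cursor-based listing: take per_page rows whose id is greater than `after`."""
    rows = conn.execute(
        "SELECT id FROM issue WHERE id > ? ORDER BY id LIMIT ?", (after, per_page)
    )
    return [r[0] for r in rows]


def walk_by_cursor(conn, per_page):
    """Collect every id by feeding the last id of each batch back in as the cursor."""
    seen, after = [], 0
    while True:
        batch = by_cursor(conn, after, per_page)
        if not batch:
            return seen
        seen.extend(batch)
        after = batch[-1]
```

On an unchanging table both styles return the same rows: `by_page(conn, 3, 10)` and `by_cursor(conn, 20, 10)` each yield ids 21 to 30. The difference is in how the server gets there. An offset query generally has to step past the skipped rows, while the cursor query can start from the given id. That is a commonly cited reason for APIs to prefer cursors on large datasets, though whether it is GitHub's reason is not confirmed here.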

If you want to try a migration sooner, your best options are:

  • Use a custom importer script. There are quite a few community tools around, in various programming languages and of varying quality, that allow migrating and synchronizing projects with special needs.
  • Try another platform as a stopgap. Maybe some other code hosting platform allows a migration from GitHub, and we allow a migration from there. I haven't checked the platforms recently, though.
  • Try to fix the issue and either submit a patch to Forgejo, or temporarily run your own instance with a hotfix for the issue. You can migrate from there to Codeberg as well.
  • Migrate without your issues. Often, a years-long list of issues that no one will actually implement could use a fresh start anyway. It lets you focus on the issues that matter most today. Priorities change over time.
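For anyone attempting the custom-script or hotfix route, here is a minimal sketch of one way to avoid building `page=N` URLs by hand: follow the `rel="next"` URL from each response's `Link` header and let the server supply the cursor. The helper names (`next_url`, `iter_items`) and the `fetch` callable are hypothetical, and this is not Forgejo's migration code. It assumes the server sends a `Link` header in the usual RFC 8288 form.

```python
import re
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Matches one `<url>; rel="..."` entry of a Link header.
_LINK_PART = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None if absent."""
    if not link_header:
        return None
    for url, rel in _LINK_PART.findall(link_header):
        if "next" in rel.split():
            return url
    return None


# Hypothetical transport: takes a URL, returns (decoded items, response headers).
# A real script would wrap its HTTP client here and may need a
# case-insensitive header lookup.
Fetch = Callable[[str], Tuple[List[dict], Dict[str, str]]]


def iter_items(first_url: str, fetch: Fetch) -> Iterator[dict]:
    """Yield items from every page, following rel="next" until it disappears."""
    url: Optional[str] = first_url
    while url is not None:
        items, headers = fetch(url)
        yield from items
        url = next_url(headers.get("Link"))
```

Because the loop never computes a page number itself, it works the same whether the next link carries `page=` or a cursor parameter such as `after=`. Passing `fetch` in as a parameter keeps the traversal logic testable without network access.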