Community Wishlist Survey 2023/Larger suggestions/Improve speed at which InternetArchiveBot archives links
From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Pbsouthwood (talk | contribs) at 18:00, 18 February 2023 (Support proposal). It may differ significantly from the current version.
This proposal is a larger suggestion that is out of scope for the Community Tech team. Participants are welcome to vote on it, but please note that regardless of popularity, there is no guarantee this proposal will be implemented. Supporting the idea helps communicate its urgency to the broader movement.
Improve speed at which InternetArchiveBot archives links
- Problem: citations suffer linkrot and aren't archived quickly enough
- Proposed solution: have a bot, built into the automatic citation feature, that archives links straight away
- Who would benefit: everyone who uses citations
- More comments: I already know that it is done automatically eventually, however, I feel that it is too slow
- Phabricator tickets:
- Proposer: HoHo3143 (talk) 22:28, 23 January 2023 (UTC)
Discussion
- I think an argument can be made that this would be an outstanding redundancy in times and places where news media and even academia may be an enemy of the state. Such a robust automatic backup could ensure that versions of events and information are retained before a takedown can occur. Not that it is necessarily overly common, but in an ever-changing geopolitical landscape, there is a certain utility to automatic duplicates. Foxtrot620 (talk) 02:33, 24 January 2023 (UTC)
- @HoHo3143: I've retitled this to make it clearer and more general. I hope the new title accurately reflects your intention. Note that you can manually submit pages to have the InternetArchiveBot process them (not that that necessarily solves your issue, but it's a useful workaround). There are also other tools, such as the Internet Archive's Wayback Machine browser extension, which allow instant archiving of any page or URL. SWilson (WMF) (talk) 03:50, 24 January 2023 (UTC)
- Maybe there could be some tools to automatically archive pages to the Internet Archive and add the archive link to the article. Thingofme (talk) 09:22, 24 January 2023 (UTC)
- @Thingofme: Isn't that what the InternetArchiveBot is though? SWilson (WMF) (talk) 01:14, 25 January 2023 (UTC)
- @HoHo3143 Pinging again in case the above question was missed. We're wondering if the manual archiving tool meets your needs, since it gives you a way to fetch archive URLs in real time, should the bot not have processed the page recently enough.
- I worry "making InternetArchiveBot go faster" may be out of scope. The bot is wholly maintained by a volunteer, and from what we understand it already edits essentially as quickly as it can. MusikAnimal (WMF) (talk) 21:42, 3 February 2023 (UTC)
- Ok, thank you for letting me know. There are large numbers of sources which haven't yet been archived, so I thought why not suggest speeding it up. If this isn't possible as it is volunteer-based, that is ok. HoHo3143 (talk) 04:25, 4 February 2023 (UTC)
- @HoHo3143 I wouldn't say it's impossible because it's volunteer-based; rather, it's just out of scope for our team since we know it to be a massive codebase and the production setup is quite complex. We'd rely wholly on the volunteer assisting us. I'm pretty sure making it faster isn't an infrastructural issue (which seemingly is something we could help with), but I could be wrong. I just know from reviewing the contributions that the bot already seems to go pretty dang fast. Maybe it could be run as a second bot to go even faster. Let's just ping the maintainer and ask: @Cyberpower678 Do you have any thoughts on this? MusikAnimal (WMF) (talk) 03:05, 6 February 2023 (UTC)
- Actually, there is an infrastructure issue. Cloud VPS was recently removed from rate limit exceptions and now the bot is being throttled by a webservice rate limit, not to be confused with the API rate limit. We've reached a scalability limit here. Of course, we are working on optimization to make it more efficient with the production servers, but IABot 3 is what will ultimately solve scaling and speed. IABot 3 is not around the corner though, and is still in the planning and design stages. I agree, the bot is too slow as it stands right now. —CYBERPOWER (Chat) 17:09, 17 February 2023 (UTC)
- Support: having a PDF of a source when it is used would be a great step forward. I have seen genealogy software like Family Search doing it. - Salgo60 (talk) 18:09, 10 February 2023 (UTC)
Voting
- Support Xbypass (talk) 20:08, 10 February 2023 (UTC)
- Support Tom Ja (talk) 21:43, 10 February 2023 (UTC)
- Support Significa liberdade (talk) 22:26, 10 February 2023 (UTC)
- Support ·addshore· talk to me! 00:16, 11 February 2023 (UTC)
- Support Hehua (talk) 02:36, 11 February 2023 (UTC)
- Support Tgr (talk) 04:14, 11 February 2023 (UTC)
- Support EpicPupper (talk) 05:39, 11 February 2023 (UTC)
- Support NMaia (talk) 06:00, 11 February 2023 (UTC)
- Support HoHo3143 (talk) 07:36, 11 February 2023 (UTC)
- Support Jurbop (talk) 08:04, 11 February 2023 (UTC)
- Support Arado Ar 196 (talk) 08:32, 11 February 2023 (UTC)
- Support SunDawn (talk) 12:39, 11 February 2023 (UTC)
- Support Bluerasberry (talk) 15:04, 11 February 2023 (UTC)
- Support CROIX (talk) 15:19, 11 February 2023 (UTC)
- Support Nicereddy (talk) 16:57, 11 February 2023 (UTC)
- Support Novak Watchmen (talk) 17:58, 11 February 2023 (UTC)
- Support Ivario (talk) 22:04, 11 February 2023 (UTC)
- Support Furfur ⁂ Discussion 00:06, 12 February 2023 (UTC)
- Support --NGC 54 (talk|contribs) 01:30, 12 February 2023 (UTC)
- Support Mauricio V. Genta (talk) 08:06, 12 February 2023 (UTC)
- Support Fvtvr3r (talk) 13:57, 12 February 2023 (UTC)
- Support Waldyrious (talk) 04:53, 13 February 2023 (UTC)
- Support JAn Dudík (talk) 22:01, 13 February 2023 (UTC)
- Support Ɱ (talk) 02:39, 14 February 2023 (UTC)
- Support SpacedShark (talk) 06:37, 14 February 2023 (UTC)
- Support: essential. Just N. (talk) 15:00, 14 February 2023 (UTC)
- Support Lousyd (talk) 18:58, 17 February 2023 (UTC)
- Support Fuchs B (talk) 20:18, 17 February 2023 (UTC)
- Support —(ping on reply)—CX Zoom (A/अ/অ) (let's talk|contribs) 08:31, 18 February 2023 (UTC)
- Support -- Ferien (talk) 16:17, 18 February 2023 (UTC)
- Support Vulcan ❯❯❯Sphere! 16:35, 18 February 2023 (UTC)
- Support · · · Peter (Southwood) (talk): 18:00, 18 February 2023 (UTC)