User talk:InternetArchiveBot
Connect with the developers and other users
Telegram
IRC (irc.libera.chat #iabot)
Operation status
For the most up to date information see the run pages or Wiki Operations Summary on Airtable
- 🟢 InternetArchiveBot is currently running on 300+ Wikimedia wikis.
- 🟢 We have moved the management interface to a new server. Please start using iabot.wmcloud.org instead of iabot.toolforge.org. Please let us know if anything broke during this process.
- 🟡 Testing is stalled on Alemannisch Wikipedia (als), Asturian Wikipedia (ast), and Japanese Wikipedia (ja).
- 🔴 Bot is approved but disabled on Hungarian Wikipedia (hu).
- 🔴 Bot is approved but disabled indefinitely pending software improvements on French Wikipedia (fr), MediaWiki.org, Norwegian Nynorsk Wikipedia (nn), Polish Wikipedia (pl), and Portuguese Wikipedia (pt).
Last updated: 03:57, 4 September 2024 (UTC)
How this page works
- Ask your question in any language. Questions in English or German will receive the fastest responses.
- Our team will try to respond within seven days.
- Seven days after our response we will mark the thread as resolved. This queues the thread for archiving.
- If our response does not answer your question, you are welcome to remove the "section resolved" tag and write an additional comment.
- Seven days after the thread is marked as resolved, it will be archived. Once a thread is archived, it should not be un-archived. Instead, create a new thread and link to the old one.
{{Section resolved|1=~~~~}}
after 7 days.
Dealing with redirect
Hi, is it possible for the bot to recognise a specific redirect as equivalent to a dead link? My use case is the British Film Institute which this week has closed down all its individual webpages for movies, which now all redirect to an info page. For example, https://www2.bfi.org.uk/films-tv-people/4ce2b6e822633 (for the movie Action Stations) redirects to https://www.bfi.org.uk/this-page-no-longer-exists, as do all other per-film pages. In many cases the original page will be on the Internet Archive (in this case https://web.archive.org/web/20201030232522/https://www2.bfi.org.uk/films-tv-people/4ce2b6e822633), so the link could be replaced. Can the bot be configured to do this? Many thanks for any advice. Tobyhoward (talk) 09:45, 1 October 2023 (UTC) Reply
- This could be a good task for User:GreenC's bot. Harej (talk) 21:18, 11 October 2023 (UTC) Reply
- I can do this (Enwiki only). Opened a request at w:Wikipedia:Link_rot/URL_change_requests#bfi.org.uk_soft-404s please follow up there. -- GreenC (talk) 21:19, 11 October 2023 (UTC) Reply
This request has been resolved mostly for now on enwiki. About 15k new archive URLs have been uploaded to IABot and marked dead, which will propagate to the other wikis. There is a lot more work to be done on Wikidata and other wikis, particularly with the BFI external link template used in a couple dozen wikis. It is beyond the scope of IABot, and probably me also. BFI has issues, more info at bottom of this discussion page. -- GreenC (talk) 16:09, 17 October 2023 (UTC) Reply
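The soft-404 idea discussed in this thread can be sketched in Python. This is only an illustration under stated assumptions, not the code used by GreenC's bot or by IABot: the names `is_bfi_soft_404` and `wayback_url` are hypothetical, the check only compares against the redirect target quoted above, and the timestamp is the one from the example snapshot.

```python
# Hypothetical sketch: treat a redirect to the BFI "no longer exists" page as a dead link.
SOFT_404_TARGET = "https://www.bfi.org.uk/this-page-no-longer-exists"


def is_bfi_soft_404(final_url: str) -> bool:
    """Return True if the URL a request ended up at is the BFI info page."""
    return final_url.rstrip("/") == SOFT_404_TARGET


def wayback_url(timestamp: str, original_url: str) -> str:
    """Build a Wayback Machine snapshot URL from a timestamp and the original URL."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


original = "https://www2.bfi.org.uk/films-tv-people/4ce2b6e822633"
replacement = None
# The final URL is passed in directly here; a real checker would follow the redirect.
if is_bfi_soft_404("https://www.bfi.org.uk/this-page-no-longer-exists"):
    replacement = wayback_url("20201030232522", original)
```

A real checker would also need to confirm that a usable snapshot exists before replacing the link, which this sketch does not attempt.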
IAB marked URL as dead, even though an archived copy was available
https://nl.wikipedia.org/w/index.php?title=Wind_Rose&diff=66078906&oldid=65892316
As you can see, it marked the URL to Masters of Rock as dead, which is kind of fair. But it didn't add the archived copy, even though it was available: http://web.archive.org/web/20190617195034/https://www.mastersofrock.cz/en/Wind-Rose
Seems like a bug to me. Mondo (talk) 09:12, 2 October 2023 (UTC) Reply
- Mondo, it's possible the bot did not add that archive link because it was not able to load it. I tried loading it in my browser just now and it would not load. If in the future that archival copy manages to load the bot will be able to add it as an archive link. Harej (talk) 21:27, 11 October 2023 (UTC) Reply
- Since I posted this, I have tried to load that archive link on several occasions in several browsers and it loads just fine. Is it so hard to believe the bot could have a bug? Mondo (talk) 11:11, 12 October 2023 (UTC) Reply
- Can confirm the Wayback Machine was having issues loading at the time we answered your request last week, but that isn't actually the point of the issue. What's going on here is that an editor borked the original URL on the template, which is very much dead, as it returns a 403, and that specific URL has no Wayback Machine copy. In other words, the Wayback snapshot you provided is technically different than the URL in the citation template, which is why the bot won't find it. I suggest stripping the accidental garbage from the end of the URL which appears to be a URL encoded snippet of the citation template itself. Right now the URL is https://www.mastersofrock.cz/en/Wind-Rose%20%7Ctitle=Wind%20Rose instead of https://www.mastersofrock.cz/en/Wind-Rose which actually loads, and is the original of the Wayback snapshot you suggested.
- In short, there is no bot bug here, just a case of GIGO. —CYBERPOWER (Chat) 20:23, 18 October 2023 (UTC) Reply
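The cleanup suggested above (stripping the URL-encoded template snippet from the end of the URL) could look roughly like the following Python sketch. The function name `strip_template_residue` is hypothetical, and the sketch assumes that a pipe character never legitimately appears in the URL, which will not hold for every site.

```python
def strip_template_residue(url: str) -> str:
    """Cut a URL at the first encoded or literal pipe, then drop trailing spaces."""
    # "%7C" is the percent-encoded form of "|", the template parameter separator.
    for marker in ("%7C", "%7c", "|"):
        url = url.split(marker, 1)[0]
    # Remove percent-encoded or literal spaces left dangling at the end.
    while url.endswith("%20") or url.endswith(" "):
        url = url[:-3] if url.endswith("%20") else url[:-1]
    return url


broken = "https://www.mastersofrock.cz/en/Wind-Rose%20%7Ctitle=Wind%20Rose"
fixed = strip_template_residue(broken)
```

With the URL from the citation, `fixed` is the address that matches the Wayback snapshot.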
ruwiki
Hi, botowner. Your bot does many edits like this or this. Please, stop it and cancel it. 91.197.junr3170 (talk) 18:35, 3 October 2023 (UTC) Reply
- 91.197.junr3170, can you describe what is wrong with these edits? Harej (talk) 21:22, 11 October 2023 (UTC) Reply
- ЛА-1978-№04 → %D0%9B%D0%90-1978-%E2%84%9604 91.197.junr3170 (talk) 19:40, 18 October 2023 (UTC) Reply
- 91.197.junr3170, while percent-encoding is turned off for Russian Wikipedia, it is still required for URLs because of a limitation the bot has. A future version of the bot will address this. Harej (talk) 20:25, 18 October 2023 (UTC) Reply
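For readers unfamiliar with percent-encoding, the conversion reported above can be reproduced with Python's standard library. This only illustrates the encoding itself and says nothing about how the bot implements it; note that the escape `%E2%84%96` is the UTF-8 encoding of the numero sign `№`.

```python
from urllib.parse import quote, unquote

readable = "ЛА-1978-№04"
encoded = quote(readable)   # non-ASCII characters become UTF-8 percent escapes
decoded = unquote(encoded)  # decoding restores the readable form
```

Both forms identify the same resource; the encoded one is simply restricted to ASCII characters.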
url containing percent-encoded spaces not processed correctly
Hello. The bot found an archive copy at https://web.archive.org/web/20160304000852/http://theblues.chelseafc.com/cgi-bin/playersearch.pl?Bill%20H%20ROBERTSON but did not copy the whole of it to the archive-url parameter. Instead it stopped at the first percent-encoded space, so it only copied https://web.archive.org/web/20160304000852/http://theblues.chelseafc.com/cgi-bin/playersearch.pl?Bill
I've undone it for now. Struway2 (talk) 07:55, 12 October 2023 (UTC) Reply
- Thank you for your report Struway2. We have filed a bug report on Phabricator. Harej (talk) 20:38, 18 October 2023 (UTC) Reply
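The cause of this bug is not stated in the thread. As a general illustration only, the Python sketch below shows one way this kind of truncation can arise: if a URL is percent-decoded before being handled by code that treats whitespace as the end of a URL, everything after the first space is lost. The helper `first_token` is hypothetical and is not taken from the bot.

```python
from urllib.parse import unquote

archive_url = (
    "https://web.archive.org/web/20160304000852/"
    "http://theblues.chelseafc.com/cgi-bin/playersearch.pl?Bill%20H%20ROBERTSON"
)


def first_token(text: str) -> str:
    """Return the text up to the first whitespace, as a whitespace-delimited parser would."""
    return text.split()[0]


kept = first_token(archive_url)          # encoded form contains no whitespace, so it survives
cut = first_token(unquote(archive_url))  # decoded form is cut at the first literal space
```

Here `cut` ends at `playersearch.pl?Bill`, matching the truncated value described in the report, while `kept` is the full archive URL.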
Rollback or ..
Hi. The bot marked a link as dead in some articles in 2021. The link is no longer dead; how can these edits be rolled back?
Example: "SEDS" link in NGC_6246
You can find a list of these articles; the first four characters of each name are the same: "NGC ". Bikar (talk) 11:28, 12 October 2023 (UTC) Reply
- Bikar, our scan logs don't go back to 2021 so we could not tell you the context there. I recommend keeping the archive links in place in case the website goes down again. If it went down a first time it could go down again. Harej (talk) 20:41, 18 October 2023 (UTC) Reply
Generated link marked as dead
Generated URL improperly marked as dead:
https://svgtranslate.toolforge.org/{{urlencode:{{FULLPAGENAME}}|WIKI}}
- https://commons.wikimedia.org/w/index.php?title=File:2022_Russian_invasion_of_Ukraine.svg&diff=prev&oldid=811551228
Glrx (talk) 16:47, 13 October 2023 (UTC) Reply
- Glrx, we have opened a ticket on Phabricator. Harej (talk) 20:47, 18 October 2023 (UTC) Reply
InternetArchiveBot keeps adding blank archive save
Hi. Can InternetArchiveBot at English Wiki please stop adding this blank archive save to en:2020 NBL1 season. It is not a useful archive link as it does not load the content of the page that was there in March 2020. I have checked all the saves of that page from Wayback machine and none of them are useful as NBL1.com.au pages have traditionally not saved properly. Thanks. DaHuzyBru (talk) 02:01, 14 October 2023 (UTC) Reply
- DaHuzyBru, that archive URL has been removed from the database. Harej (talk) 20:50, 18 October 2023 (UTC) Reply
Article size limit redux
Hi @Harej! I see that phab:T342168 has been resolved. Does that mean it's now possible to increase the size limit of articles that the bot can handle? {{u|Sdkb }} talk 03:19, 14 October 2023 (UTC) Reply
- Sdkb, we have just now removed the limit. Thank you for checking in! Harej (talk) 21:05, 18 October 2023 (UTC) Reply
- Fantastic; thanks! Cheers, {{u|Sdkb }} talk 22:27, 18 October 2023 (UTC) Reply
Percent encoding of URLs
Hello. Do not convert URLs into these endless gibberish things, please. Sneeuwschaap (talk) 13:02, 14 October 2023 (UTC) Reply
- Sneeuwschaap, unfortunately, while percent-encoding of article text is turned off for Russian Wikipedia, it is still required for URLs because of a limitation the bot has. A future version of the bot will address this. Harej (talk) 21:06, 18 October 2023 (UTC) Reply
On the Esperanto Wikipedia, the bot sometimes adds archive URLs when they are already present
This results in multiple pointers to the same archive page appearing, when there really only should be one.
Here is an example (which I later manually corrected): Rozalia Zamenhof Mayhair (talk) 06:12, 15 October 2023 (UTC) Reply
- Mayhair, we have adjusted the template configuration on our end to also accept English-language parameters. However, I'd like to note the "cite web" call on that page is using Esperanto-language parameters even though it only has English parameters. I would recommend updating that template call to use the Esperanto variant of that template. Harej (talk) 21:18, 18 October 2023 (UTC) Reply
Time out
The bot has timed out today and yesterday. TwoScars (talk) 21:19, 15 October 2023 (UTC) Reply
- TwoScars, we've been trying to reproduce the underlying issue but unfortunately have not been able to. If you have any additional information about when the bot times out, such as loading a particular page, that would be useful. If it just occasionally goes offline without anything in particular prompting it to, we are still looking into that. Harej (talk) 01:00, 19 October 2023 (UTC) Reply
- I just got it to work. It took a while, but did not time out. The file was User:TwoScars/sandbox. Also got it to work on Mambourg Glass Company. TwoScars (talk) 16:41, 19 October 2023 (UTC) Reply
Bot down since this morning
Hello team! I went to run the archive bot for an article and found that the site is currently returning a 500 error. I double-checked on "Down for Everyone or Just Me" and it does not appear to be exclusive to me. Sock (talk) 17:04, 23 October 2023 (UTC) Reply
- Thank you for your report, Sock. We are still working on figuring out the underlying cause; we have been unable to exactly reproduce what causes it. Harej (talk) 20:08, 25 October 2023 (UTC) Reply
How do I ask the bot not to edit pages
Basically, I’ve been re-writing an article which has lots of old magazine articles on web archive. I’ve used "archive-url" to link them as I couldn’t find a better parameter within the citation template, but the bot has today linked to the websites of the magazines (linked to allow archive-url). How do I stop it editing this specific page so I don’t waste its time (weaker point) and also that I don’t have to change it every time it may do so in the future (stronger point).
I’m aware this can be done for other bots, but they’ve generally helped & I’m wondering how to do so for this bot specifically.
Thanks EPEAviator (talk) 12:04, 26 October 2023 (UTC) Reply
- I’ve just found the FAQ page I was looking for after checking a second time. Sometimes I need to remember to wear my glasses... EPEAviator (talk) 12:07, 26 October 2023 (UTC) Reply
October 2023
I would like to report an issue with InternetArchiveBot, which is currently unable to archive content from https://shodhganga.inflibnet.ac.in . I kindly request your assistance in resolving this problem. Thanks.–Owais Al Qarni (talk) 07:59, 28 October 2023 (UTC) Reply