Commons:Bots/Work requests
- Gadget Cat-a-lot can be used for most category adding, removing and moving
- VisualFileChange can be used for mass changes to one author's uploaded files or to the files in a category: creating a mass deletion request, inserting tags into file description pages (even copying from EXIF/metadata), and performing find-and-replace operations with or without JavaScript regular expressions (regexp). A common use case is: "I've changed my username". It is a web tool and can be launched directly.
- AutoWikiBrowser can be used for a large number of supervised and automatic edits
- Commons:Batch uploading page can be used to request mass image uploads
Move "Historical images of" to "History of"
Per the note at Category:Historical images by country (the conclusion of Commons:Categories for discussion/2019/09/Category:Historical images), the content of the categories at Special:PrefixIndex/Category:Historical images of should be moved to "History of". This seems to involve more than 10'000 categories, see PetScan:29034509. I think the resulting redirects could afterwards be tagged for speedy deletion. Enhancing999 (talk) 18:59, 2 August 2024 (UTC)
- i don't think it's a good idea to handle this problem without human supervision.
- i would rather do these instead:
- prohibit new categories with the word from being created.
- let users slowly move the files to the appropriate categories (by time).
- RZuo (talk) 20:42, 2 August 2024 (UTC) [reply ]
- "history of ..." is not any better. everything is history. RZuo (talk) 20:43, 2 August 2024 (UTC) [reply ]
- Right, any cutoff for "history" will change every second/minute/hour/week/month/year/century/millennium. See also Commons:Categories for discussion/2024/08/Category:History by country. — 🇺🇦Jeff G. ツ please ping or talk to me🇺🇦 10:20, 4 August 2024 (UTC) [reply ]
- there's specific interest related to "history" of something.
- for example, historians of asian history should go under "history of asia".
- but dumping files into "history of xx" is no better than dumping them in "xx" or "historical images of xx". all files of xx can perfectly fit into all three of those variations.
- most of these "historical images of xx" basically contain all photographs before the advent of digital photography, especially black and white photographs.
- so i'd rather users move these cats to or create for example "xx in the 19th/20th century". RZuo (talk) 12:49, 4 August 2024 (UTC) [reply ]
- i have an idea of a bot moving files according to the time/date, but i need probably 1 or 2 years to code something like that up. RZuo (talk) 12:53, 4 August 2024 (UTC) [reply ]
- I don't think this is the place to re-discuss the CfD. If you think the closure is problematic, ask an admin to re-open it. Enhancing999 (talk) 12:57, 4 August 2024 (UTC) [reply ]
- There is just no way this can be done manually. If there are cases you think would be problematic, please state them here. Enhancing999 (talk) 20:56, 2 August 2024 (UTC) [reply ]
- Support Per Enhancing999. There are currently 33,732 categories for "historical images", which is way too many for anyone to deal with manually. This also isn't the place to relitigate the CfD. Nor do I think doing so would go anywhere anyway, since it was open for 4 years and has been closed since last year. So there has been plenty of time for people to raise concerns about it. Most of these categories only contain a couple of images to begin with, and they aren't "historical" either. The idea that we should let users slowly move the files to the appropriate categories when it's only a couple of images per category to begin with is totally ridiculous and would just waste everyone's time. There's no reason people can't better categorize the images once they are moved to "history of xx" categories. That's where most of the images were in the first place. Regardless, this should totally be done by a bot instead of forcing users to waste time doing it manually. --Adamant1 (talk) 07:27, 10 August 2024 (UTC)
- Instead of tagging the redirects for speedy deletion, a bot may rename the categories without leaving a redirect if the corresponding category 'history of...' does not yet exist. Wikiwerner (talk) 12:12, 17 November 2024 (UTC) [reply ]
- If a cat "historical images of xx" has < 5 (or a similarly small number) files, all files should be moved to "xx". it's not necessary to make a separate subcat for just a handful of files. RoyZuo (talk) 14:14, 17 November 2024 (UTC) [reply ]
- e.g. Category:Historical images of Sababurg which has 6 files, while Category:Sababurg has only ~50 files. RoyZuo (talk) 14:40, 17 November 2024 (UTC) [reply ]
- But how does a bot decide when a category should be kept? Wikiwerner (talk) 17:09, 18 November 2024 (UTC)
- This request probably requires hundreds of thousands of edits. Is "History of ..." a better categorization? Category:History is the subject of a CfD too: see Commons:Categories for discussion/2024/06/Category:History. We had better wait for a verdict there. Otherwise another few hundred thousand edits might be necessary after that verdict. Wikiwerner (talk) 19:59, 21 November 2024 (UTC)
- I have subscribed here, but @Wikiwerner: et al., if there is a decision to move forward with this, please ping me. If I have time, I can see about adding this to my plate at that time.
- Wikiwerner is also right that if there was to be a collapsing of smaller categories ground rules would need to be clearly defined to make that codeable and have clear expectations. A blind move is far easier, a bot cannot make judgement calls outside of potentially "< 5 members? move". But it gets messy and rather complicated very quickly. It cannot determine exceptions etc, though a list could certainly be compiled of categories with fewer than X members. TheSandDoctor (talk) 06:12, 14 March 2025 (UTC) [reply ]
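The two options discussed above (a blind rename, or a rename with an upmerge rule for small categories) can be sketched in Python. This is only an illustration, not an agreed procedure: the function names and the `min_files` threshold of 5 are assumptions taken from the "< 5 members? move" remark, and the file counts would have to come from a real query.

```python
from __future__ import annotations

PREFIX = "Category:Historical images of "


def plan_move(category: str, file_count: int, min_files: int = 5) -> tuple[str, str]:
    """Return (action, target) for one category.

    Hypothetical rule: categories with fewer than `min_files` files are
    upmerged into "Category:<subject>"; all others are renamed to
    "Category:History of <subject>".
    """
    if not category.startswith(PREFIX):
        return ("skip", category)
    subject = category[len(PREFIX):]
    if file_count < min_files:
        return ("upmerge", f"Category:{subject}")
    return ("rename", f"Category:History of {subject}")


def small_categories(counts: dict[str, int], min_files: int = 5) -> list[str]:
    """List the categories below the threshold, for human review."""
    return sorted(name for name, n in counts.items() if n < min_files)
```

With the Sababurg example (6 files) the default threshold yields a rename, so any exceptions would still need the human-reviewed list from `small_categories`.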
Generate a daily database report equivalent of Special:UncategorizedCategories
Initial request and related discussion:
Generate a daily database report equivalent of Special:UncategorizedCategories
For each page, output:
Ideally formatted in a template. Enhancing999 (talk) 14:27, 24 August 2024 (UTC)
- Updated request (the reports were created a while ago and manually updated)
The following reports should be updated by bot:
- Commons:Report_Special:UncategorizedCategories (based on Quarry:query/86077, takes >10 minutes to run)
- Commons:Report_UncategorizedCategories_with_infobox (Quarry:query/85877, takes ∼1 minute to run)
Notes:
- When updating, after running the query, the resulting categories need to be null-edited and then the queries run again. Otherwise we get false positives due to template based categorizations (notably {{Wikidata Infobox}}).
- The count by user is added when it's formatted.
- The lines should be in a template for easier formatting.
- If it's easier to update, I could merge the two reports.
- Ideally, the reports are updated 6AM and 6PM UTC, so Europeans and Americans don't get too many entries that have already been dealt with.
The reports may appear short now, but not too long ago they were at 4000 categories total. I think this was partially due to Special:UncategorizedCategories having run only once a month.
The reports would be similar to w:Wikipedia:Database_reports/Uncategorized_categories.
∞∞ Enhancing999 (talk) 12:08, 29 September 2024 (UTC) [reply ]
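The null-edit note above amounts to a two-pass update, which can be sketched as follows. This is a minimal sketch only: `run_query` and `null_edit` are hypothetical callables standing in for whatever runs the Quarry query and touches a page, and nothing here talks to the wiki.

```python
from typing import Callable, Iterable, List


def refresh_report(run_query: Callable[[], Iterable[str]],
                   null_edit: Callable[[str], None]) -> List[str]:
    """Two-pass update: query, null-edit every hit, then query again.

    The second pass drops false positives whose categories only appear
    once template-based categorization has been refreshed.
    """
    first_pass = list(run_query())
    for title in first_pass:
        null_edit(title)
    # Only the second result set is meant to be written to the report.
    return list(run_query())
```

Passing the two steps in as functions keeps the scheduling question (for example 6AM and 6PM UTC) separate from the update logic.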
- You can choose to download the results as a wikitable. Does that resemble the desired output? Wikiwerner (talk) 17:46, 20 November 2024 (UTC) [reply ]
- A bit (compare with the pages). If you can automate that part, it would be a good start.
∞∞ Enhancing999 (talk) 19:36, 24 November 2024 (UTC)
- I have given it a try. I let a script request the wikitable download URL and perform two regex replacements. (And now I see that you piped the Wikidata search link, unlike my script. That's fixed easily next time.) Wikiwerner (talk) 20:29, 27 November 2024 (UTC)
- Looks good. Thanks!
∞∞ Enhancing999 (talk) 22:41, 27 November 2024 (UTC)
- The next step is running the query again. How do you do that? Wikiwerner (talk) 14:21, 1 December 2024 (UTC)
- If it's your own query, you have to log in to Quarry and click "submit query". There is a feature that makes forking other people's queries easy.
- As for pybot, I asked at mw:Topic:Yd8qqsrjykawj9v9. I couldn't get far with the "superset" solution mentioned there.
- In the meantime, I found that loading the most recent run is possible per m:Research:Quarry#Downloading_a_resultset.
- If you ask for access to toolserver, you could use m:Research:Quarry#Querying_ToolsDB_public_databases.
∞∞ Enhancing999 (talk) 14:35, 1 December 2024 (UTC)
- Thank you very much. Now I can run the same script, with the same HTTP request, after each query run. The only thing we need is a way to trigger a new query run... Wikiwerner (talk) 17:02, 1 December 2024 (UTC)
- I have discovered how to do that. I have run the query and updated the report. Wikiwerner (talk) 11:21, 8 December 2024 (UTC)
Report update request (#2)
- Please also update these new reports with a bot:
- I suggest that these are updated twice a month at first. Frequency could be increased as needed.
- Here's how I update the reports manually (info on how this is done for the two reports above doesn't seem to be included): I go to the query page, click Download data and select csv. Then I open the csv in VSCodium (Visual Studio Code) and use this to add "[[:Category:" to the start and "]]," to the end of every line, as well as replacing all linebreaks. There also is a page 2 with only the first 500 items. I requested the queries here, so thanks to Matěj Suchánek. Changing the output to be ordered alphabetically would improve it. "redcats" refers to nonexisting categories; further explanations are at the top of these reports.
- By the way, I think "the resulting categories need to be null-edited" is too unclear. Prototyperspective (talk) 16:45, 7 October 2024 (UTC)
- @Wikiwerner: could you also look into these reports? I started this thread as a subthread of the thread right above where you participated. Prototyperspective (talk) 17:29, 31 January 2025 (UTC)
- I am running the query now, which I can post-process to update the report. I do not yet have a tool to update the word count. Does that matter? Wikiwerner (talk) 17:16, 2 February 2025 (UTC)
- Thank you. No, the paragraph about top word counts is not important. Enhancing999 added it afterwards. If you update the report, please simply remove that part. Prototyperspective (talk) 17:27, 2 February 2025 (UTC)
- Well, I have saved the output and removed the word count. The second report you mentioned already appears to be updated every two weeks. Wikiwerner (talk) 19:03, 2 February 2025 (UTC)
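The manual CSV step described in this thread could be scripted roughly as below. This is a sketch under assumptions: the category title is taken to be in the first CSV column, underscores are turned into spaces, and the links are joined with ", " because the thread does not say what the linebreaks are replaced with. Sorting reflects the wish for alphabetical output.

```python
import csv
import io


def csv_to_category_links(csv_text: str, has_header: bool = True) -> str:
    """Turn a one-column CSV of category titles into a line of wikilinks."""
    rows = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(rows, None)  # drop the column-name row
    # Deduplicate and sort alphabetically; underscores become spaces.
    titles = sorted({row[0].replace("_", " ") for row in rows if row and row[0]})
    return ", ".join(f"[[:Category:{title}]]" for title in titles)
```

For example, `csv_to_category_links("cat_title\nFoo_bar\nAlpha\n")` gives `[[:Category:Alpha]], [[:Category:Foo bar]]`.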
Monuments database in Russia
Per discussion at Commons:Village pump#Monuments database in Russia.
There are >25K sub-categories of Category:Galleries of cultural heritage monuments in Russia (and about 275 in its subcategory, Category:Galleries of cultural heritage monuments in Crimea) named in the format (for example) Category:WLM/1010021052. That example duplicates Category:Threshing barn from Berezovaya Selga. The corresponding Wikidata item, Threshing barn from Berezovaya Selga (Q106488771), has a Wiki Loves Monuments ID (P2186) value of RU-1010021052 (note the "RU-" prefix). That Wikidata item is linked to the alphanumerically named, not numbered, category.
For each of those 25K categories, we need a bot to do the following:
1. Find the Wikidata item with the Wiki Loves Monuments ID (P2186) value (e.g. RU-1010021052)
   1.1. If no Wikidata item is found, write a log entry and skip to the next category
2. Find the Commons category that the Wikidata item is linked to
   2.1. If no Commons category is found, or if the linked category is of the numeric type, write a log entry and skip to the next category
3. Redirect the numeric category (e.g. Category:WLM/1010021052) to the latter category (e.g. Category:Threshing barn from Berezovaya Selga)
4. Ensure that the latter category transcludes {{Wikidata infobox}}
An alternative at 1.1 would be to create a Wikidata item; populating with data from e.g. https://ru-monuments.toolforge.org/wikivoyage.php?id=1010021052 - but this could be done later. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:14, 24 September 2024 (UTC) [reply ]
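The decision logic in the steps above can be sketched in Python without touching the wiki. Everything here is illustrative: the function names are hypothetical, the SPARQL text is one possible way to look up the item and its Commons sitelink, and the redirect is assumed to be written with {{Category redirect}}.

```python
import re
from typing import Optional, Tuple

NUMERIC_CAT = re.compile(r"^Category:WLM/(\d+)$")


def wlm_id(category: str) -> Optional[str]:
    """Map "Category:WLM/1010021052" to the P2186 value "RU-1010021052"."""
    match = NUMERIC_CAT.match(category)
    return f"RU-{match.group(1)}" if match else None


def sparql_for(wlm: str) -> str:
    """Query text for the item with this P2186 value and its Commons page."""
    return (
        "SELECT ?item ?commonsCat WHERE {\n"
        f'  ?item wdt:P2186 "{wlm}" .\n'
        "  OPTIONAL {\n"
        "    ?page schema:about ?item ;\n"
        "          schema:isPartOf <https://commons.wikimedia.org/> ;\n"
        "          schema:name ?commonsCat .\n"
        "  }\n"
        "}"
    )


def plan(item_found: bool, linked_category: Optional[str]) -> Tuple[str, str]:
    """Return ("log", reason) or ("redirect", wikitext) for one category."""
    if not item_found:
        return ("log", "no Wikidata item")
    if linked_category is None or NUMERIC_CAT.match(linked_category):
        return ("log", "no usable Commons category")
    target = linked_category.split(":", 1)[1]
    return ("redirect", "{{Category redirect|" + target + "}}")


def needs_infobox(wikitext: str) -> bool:
    """True if the target category does not yet transclude the infobox."""
    return "{{wikidata infobox" not in wikitext.lower()
```

Keeping the planning step separate from the editing step means the log entries can be reviewed before any redirect is saved.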
- @Pigsonthewing How does this diff look on the WLM cat with redirect? -- DaxServer (talk) 11:23, 19 January 2025 (UTC) [reply ]
- @DaxServer: Thank you. Looks good to me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:44, 19 January 2025 (UTC) [reply ]
Sending automated messages to users who upload suspicious SVG files claiming to be their authors
I've noticed there are a *lot* of users who mistakenly think that making a vector version of a file grants them authorship status. I was thinking that, as a means to reduce how often this happens, it would be helpful to have a bot that would promptly flag likely cases and inform the uploader about relevant Commons policies.
I'd like there to be a bot that monitors new uploads to Commons:
- For each new SVG file that is uploaded:
- Check the file description to see if ANY of the following are true: Author field is equal to [[User:UPLOADERNAME|UPLOADERNAME]], license tag contains self (e.g. {{PD-self}}, {{self|cc-by-sa-4.0}})
- If any of those cases are True:
- Check to see if there is a file already on Commons with the same name but different file extension (e.g. File:MyCityFlag.svg was just uploaded, and File:MyCityFlag.jpg already exists)
- If there is a filename match, add an appropriate tag on the image's page and leave automated message on user's talkpage about it
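The checks listed above can be sketched as two small Python functions. This is a rough illustration with assumed names: it works on wikitext and a set of titles that a bot would already have fetched, and it ignores details such as underscores versus spaces in usernames.

```python
from __future__ import annotations

import re


def claims_own_work(wikitext: str, uploader: str) -> bool:
    """True if the author field is the uploader or a "self" license is used."""
    text = wikitext.lower()
    user = re.escape(uploader.lower())
    author_is_uploader = re.search(
        r"\|\s*author\s*=\s*\[\[user:" + user + r"\|" + user + r"\]\]", text
    ) is not None
    # Template names are whatever follows "{{" up to the first "|" or "}".
    template_names = re.findall(r"\{\{\s*([^{}|]+)", text)
    self_license = any("self" in name for name in template_names)
    return author_is_uploader or self_license


def raster_twins(svg_title: str, existing_titles: set[str]) -> list[str]:
    """Existing files with the same name but a different extension."""
    stem = svg_title.rpartition(".")[0]
    return sorted(
        title for title in existing_titles
        if title != svg_title and title.rpartition(".")[0] == stem
    )
```

A bot would only tag the file and notify the uploader when `claims_own_work` is true and `raster_twins` returns at least one match.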
There are a couple of possible tags that could be added to the file's page. I'm thinking {{Disputed}} or {{Wrong license}} but there may be something more suitable.
As for the automated message to leave on users' talk pages, I'm thinking something along the lines of:
Dear USERNAME, thank you for your contribution to Commons! I am an automated bot and am responding to your upload of FILENAME. It appears you have listed yourself as the author, or used a license tag that implies you are the copyright holder of this file.
Making a vector version of an existing work of art is, in copyright terms, seen as a derivative work. While creating a vector version requires skill and effort, it is still legally considered derivative.
It appears your upload is a vector version of RASTER-FILENAME. The description of UPLOAD should be updated to ensure the original designer of the work is credited as author, and that the license tag reflects the copyright status held by the original designer.
You can credit yourself as the vectorizer using the Igen template. For example, add: |other fields={{Igen|Inkscape|+|u=[[User:USERNAME|USERNAME]]}} (replace Inkscape with relevant software as needed, information at Template:Image generation).
Please update the description of FILENAME promptly. Your contribution to Commons is appreciated!
I don't have any experience with running bots on wikis so I'm afraid I don't know how technically difficult this will be. The automated message should probably be refined - it's just a first draft. But I'm hoping this sort of thing will help new users figure this out much sooner and reduce how many files with inappropriate authorship/licensing need to be fixed. Intervex (talk) 21:37, 26 November 2024 (UTC) [reply ]
- I think better yet since *other* users might not know the name of the authors:
Dear USERNAME, thank you for your contribution to Commons! I am an automated bot and am responding to your upload of FILENAME.
It appears you have listed yourself as the author, or used a license tag that implies you are the copyright holder of this file.
Making a vector version of an existing work of art is, in copyright terms, seen as a derivative work.
While creating a vector version requires skill and effort, it is still legally considered derivative. It appears your upload is a vector version of RASTER-FILENAME.
The description of UPLOAD should be updated to ensure the original designer of the work is credited as author, and that the license tag reflects the copyright status held by the original designer.
You can credit yourself as the vectorizer using the Igen template. For example, add: |other fields={{Igen|Inkscape|+|u=[[User:USERNAME|USERNAME]]}} (replace Inkscape with relevant software as needed, information at Template:Image generation).
If you are unable to find the name of the author, feel free to add: Template:Unknown author.
Please update the description of FILENAME promptly. Your contribution to Commons is appreciated!
SpinnerLaserzthe2nd (talk) 02:30, 28 November 2024 (UTC) [reply ]
Adding cat "Animated GIF files" to all instances of such
Category:Animated GIF files is a fairly flat category that probably contains more than half of all animated GIF files, including most of those in use. However, I noticed it's quite unreliable and is missing a large fraction of animated GIF files. Could a bot please add this inferable category to all files with the GIF filetype that are animated?
This can be useful to later have a filter for animated GIF files, to complete Animations of xyz categories, for deepcategory searches and PetScans for animated GIF files specifically, and for allowing searching of all animated GIF files (e.g. via the category search box at the top of that page).
I don't know how one could check whether a GIF file is animated or not, but there probably is a way (maybe using a machine vision package, though quite possibly in a much easier way). If somebody knows how that could be done, please add info about that here.
See this search query.
Prototyperspective (talk) 12:10, 2 December 2024 (UTC) [reply ]
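On the question of how to check whether a GIF is animated: no machine vision is needed, because a GIF with more than one image block is animated. The sketch below counts image blocks in the raw bytes using only the Python standard library. It assumes the file bytes are already available locally, and the function names are made up for this example.

```python
def _skip_sub_blocks(data: bytes, pos: int) -> int:
    """Advance past a chain of length-prefixed sub-blocks."""
    while pos < len(data):
        size = data[pos]
        pos += 1
        if size == 0:  # a zero-length block terminates the chain
            break
        pos += size
    return pos


def count_gif_frames(data: bytes, stop_after: int = 2) -> int:
    """Count image descriptors, stopping early once `stop_after` is reached."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    packed = data[10]
    pos = 13  # 6-byte header + 7-byte logical screen descriptor
    if packed & 0x80:  # global color table present
        pos += 3 * (2 ** ((packed & 0x07) + 1))
    frames = 0
    while pos < len(data):
        block = data[pos]
        pos += 1
        if block == 0x3B:  # trailer
            break
        if block == 0x21:  # extension: skip the label byte
            pos += 1
        elif block == 0x2C:  # image descriptor: one frame
            frames += 1
            if frames >= stop_after or pos + 9 > len(data):
                break
            packed = data[pos + 8]
            pos += 9
            if packed & 0x80:  # local color table present
                pos += 3 * (2 ** ((packed & 0x07) + 1))
            pos += 1  # LZW minimum code size
        else:
            raise ValueError("unexpected block type")
        pos = _skip_sub_blocks(data, pos)
    return frames


def is_animated_gif(data: bytes) -> bool:
    return count_gif_frames(data, stop_after=2) > 1
```

Because the count stops at the second frame, the check does not need to walk the rest of a long animation.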
- According to Google it's fairly straightforward in most programming languages to check if a gif file is animated. Is there a way or wiki API call that can tell if a Gif is animated without downloading it first? --Schlurcher (talk) 17:21, 2 December 2024 (UTC) [reply ]
- I'll probably implement this during the Christmas break. --Schlurcher (talk) 14:36, 12 December 2024 (UTC) [reply ]
- Sounds great! I don't know if there is an API to check whether or not it's animated but that service may need to download the full thing as well (don't know if it can just selectively download some of its metadata). Prototyperspective (talk) 15:55, 12 December 2024 (UTC) [reply ]
- Short status update. I've now updated my bot to add instance of (P31) → animated GIF (Q11201061) to all animated GIF files it touches. However, adding the category Category:Animated GIF files sounds straightforward, but it is not. For one, it is not a flat category, as it has subcategories that should be excluded. I also think that structured data is the correct way to proceed here. Any thoughts? --Schlurcher (talk) 12:46, 15 December 2024 (UTC)
- Thanks for that. If there is a way to auto-add categories, it does need a way to exclude subcategories (so as not to add a category above a category that's already set). However, that may be the only thing that's needed, and such a way would be very useful. Maybe just having some structured data set would be best for metadata like this that is about the kind of file. However, currently I think it's not, because
- that SD is not yet added automatically, and otherwise people need a way to quickly and conveniently add categories using HotCat or Cat-a-lot to specify this (maybe this could be changed with the bot)
- one can't search / filter via the SD as far as I know, which is I think the current main use of this cat: one can do things like deepcategory:"Animated_GIF_files" time-lapse -deepcategory:"Time-lapse animations" or use the search box at the top of the Animated_GIF_files cat.
- Prototyperspective (talk) 13:43, 15 December 2024 (UTC) [reply ]
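The subcategory exclusion mentioned above could look like this in Python. It is only a sketch: the function name is hypothetical, and the set of descendant categories is assumed to have been collected beforehand (for example from a category tree query).

```python
from __future__ import annotations


def should_add_category(file_categories: set[str], parent: str,
                        descendants: set[str]) -> bool:
    """Add `parent` only if neither it nor any of its subcategories is set."""
    if parent in file_categories:
        return False
    # isdisjoint is True when the file is in none of the subcategories.
    return file_categories.isdisjoint(descendants)
```

This avoids placing a file in the parent category when it already sits in a more specific one.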
Mass changing WD statements about files
I have my own bot, but I need to manage the WD statements linked to files; they aren't stored in wikitext, so my bot can't change them. For several categories, I need to remove one WD property and add another, with a different value for each category. Could someone here do that? MBH 08:23, 12 December 2024 (UTC)
- @Schlurcher or Mike Peel: Could you help out with this? — 🇺🇦Jeff G. ツ please ping or talk to me🇺🇦 09:35, 12 December 2024 (UTC) [reply ]
- Some more detail would be needed. Also sounds like a job that could be done with AC/DC gadget. --Schlurcher (talk) 14:32, 12 December 2024 (UTC) [reply ]
- @Schlurcher my case is described at phab:T381945. In categories like Category:Views_from_The_First_Tower_observation_deck, P1071 needs to be set for all files and P180 removed where present, because earlier I uploaded such batches with the name of the summit set in P180 instead of P1071. MBH 02:50, 13 December 2024 (UTC)
- @PMG it looks like you're doing this automatically; how did you do it? MBH 02:54, 13 December 2024 (UTC)
- These edits were done with Help:Gadget-ACDC which was also my first suggestion. --Schlurcher (talk) 06:50, 13 December 2024 (UTC) [reply ]
- @MBH - I am using AC/DC. There is also an option to remove properties, so you can both remove and add something. PMG (talk) 19:12, 15 December 2024 (UTC)
- I've not figured out bot editing of SDC yet, I suggest asking @Multichill: . Thanks. Mike Peel (talk) 17:41, 12 December 2024 (UTC) [reply ]
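For anyone who prefers a script over AC/DC, the change described in this thread (move a value from P180 to P1071) could be prepared as a Wikibase edit payload like the sketch below. The function name is hypothetical, the input is assumed to be the "statements" part of a file's structured data as returned by the API, and only P180 claims that point at the given item are removed, which is an interpretation of the request rather than something stated in it.

```python
def build_sdc_edit(statements: dict, summit_qid: str) -> dict:
    """Build a claims payload: drop matching P180 claims, add P1071 if missing."""

    def target_of(claim: dict) -> str:
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        return value.get("id", "")

    claims = []
    for claim in statements.get("P180", []):
        if target_of(claim) == summit_qid:
            # An existing claim is removed by sending its id with "remove".
            claims.append({"id": claim["id"], "remove": ""})

    already_set = any(target_of(c) == summit_qid
                      for c in statements.get("P1071", []))
    if not already_set:
        claims.append({
            "mainsnak": {
                "snaktype": "value",
                "property": "P1071",
                "datavalue": {
                    "type": "wikibase-entityid",
                    "value": {
                        "entity-type": "item",
                        "numeric-id": int(summit_qid[1:]),
                        "id": summit_qid,
                    },
                },
            },
            "type": "statement",
            "rank": "normal",
        })
    return {"claims": claims}
```

The returned dictionary is meant to be serialized as JSON for an entity edit; sending it is left out here because it depends on the bot framework in use.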
US GOV accounts on Flickr
I am requesting an upload of all USGOV Flickr accounts.
Unfortunately, many of them are locked behind the copyright tag. The C copyright tag is (unfortunately) the default tag on Flickr, and most likely it was never changed to the proper public domain tag for USGOV work. The tag has to be changed manually, which no one ever did.
I say this because there was a recent change in administration that seemed to aim to gut the govt, including shuttering US A.I.D. I'm just concerned the Flickr images will get deleted. Thank you. SeichanGant (talk) 17:57, 17 February 2025 (UTC)
- I would suggest asking for a bot to review the account and if there is a copyright tag, make a list somewhere on site and let a human examine it. Otherwise download the details. Leave the bot operator to handle the bot tasks and let us humans do what we can. Ricky81682 (talk) 20:33, 17 February 2025 (UTC) [reply ]
- Flagging to Don-vip's attention as Don-vip runs OptimusPrimeBot and rather than someone like myself re-inventing the wheel, this might be a task suited to its skillset. TheSandDoctor (talk) 19:18, 13 March 2025 (UTC) [reply ]
- It's tricky. Sometimes the works are public domain because they were made by federal employees and are wrongly licensed as copyrighted on Flickr. But often the content really is copyrighted, because even when published on an official Flickr account, it may have been created by someone else. So the license must be checked individually for each file, which is tedious. For now my bot imports a lot of US Gov pictures released under a free license; I would like to categorize all these pictures before starting to look into the "copyrighted" ones. Help appreciated :) vip (talk) 23:09, 13 March 2025 (UTC)
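The split suggested in this thread (import what is clearly free, list the "copyrighted" items for a human) can be sketched as a small triage step. The license numbers below are Flickr's license IDs as I understand them and should be checked against Flickr's own license list before use; the function name and the three bucket names are made up for this example.

```python
from __future__ import annotations

# Assumed Flickr license IDs (verify before relying on them):
# 0 = All Rights Reserved, 4 = CC BY, 5 = CC BY-SA,
# 8 = United States Government Work, 9 = CC0, 10 = Public Domain Mark.
FREE_LICENSE_IDS = {4, 5, 8, 9, 10}
ALL_RIGHTS_RESERVED = 0


def triage(photos: list[tuple[str, int]]) -> dict[str, list[str]]:
    """Sort (photo_id, license_id) pairs into import / human_review / skip."""
    buckets: dict[str, list[str]] = {"import": [], "human_review": [], "skip": []}
    for photo_id, license_id in photos:
        if license_id in FREE_LICENSE_IDS:
            buckets["import"].append(photo_id)
        elif license_id == ALL_RIGHTS_RESERVED:
            # Possibly a mislabelled federal work: a human has to decide.
            buckets["human_review"].append(photo_id)
        else:
            buckets["skip"].append(photo_id)
    return buckets
```

The "human_review" list is the part that would be posted on-wiki, since the license of each of those files has to be checked individually.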
Adding deletion request notification to multiple image files
I am requesting adding the following notification to all the files linked at Commons:Deletion requests/Akhtar Aly Kureshy: {{delete|reason=The following files are not in scope of Commons, as their subject is not any really notable person, and are a result of a promotional campaign to an Akhtar Aly Kureshy. For details see the linked nomination page.|subpage=Akhtar Aly Kureshy|year=2025|month=March|day=29}}
and also
{{subst:idw||Akhtar Aly Kureshy|plural}}
to the talk pages of their uploaders. Thanks very much! -- Jan Kameníček (talk) 22:57, 29 March 2025 (UTC) [reply ]