Talk:Spam blacklist/Archives/2017-09
Proposed additions
cross wiki spam
kickass2.nz
isohunt.tv
torrentproject2.com
isohunt2.org
Per @Berean Hunter; see discussion --Dirk Beetstra T C (en: U, T) 12:56, 23 September 2017 (UTC)
- @Beetstra: Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 12:56, 23 September 2017 (UTC)
Proposed removals
youtu.be
youtu.be
I wonder whether it is really wise to block youtu.be: it is not a commercial shortener but YouTube's official one, and youtube.com itself gives a youtu.be URL when you ask it for a link to a specific time in the video. -- Camion (talk) 09:25, 7 September 2017 (UTC)
- @Camion: Declined, there are several youtube.com movies blacklisted, and there really is no reason to use the redirect (which we then would have to blacklist as well). Moreover, there is ongoing XWiki spamming of the shortener by spambots, this keeps that at bay. Dirk Beetstra T C (en: U, T) 10:39, 7 September 2017 (UTC)
- Definitely not to be unblocked. It is a shortener, and it is grossly abused. You should see the hits here and especially at Commons. — billinghurst sDrewth 14:19, 7 September 2017 (UTC)
citibank.co.in
citibank.co.in
Could you please remove this site from your blacklist, as I am trying to add a highly successful alumnus (CEO of Citi India) of Simon Business School to its Notable alumni and students section?—The preceding unsigned comment was added by Astro5665 (talk)
- @Astro5665: Declined, please consider local whitelisting of the exact document on this server on the wiki where you want to use it. Dirk Beetstra T C (en: U, T) 13:19, 14 September 2017 (UTC)
Primaltrek.com
primaltrek.com
Regarding Primal Trek: very simply, I have no COI with that website in any way. All the links I added were content additions, and I simply sourced the content appropriately. I want (read: need) this source/reference/link removed from the blacklist because I still need it to write articles in (Simple) English, Dutch, and Gronings. To compare how an article looks with and without this link, please see:
w:nl:Gebruiker:Donald Trung/Koreaanse mun zonder Primal Trek
- Vs.
w:nl:Gebruiker:Donald Trung/Koreaanse mun met Primal Trek
(I know fully well that no-one here speaks Dutch, but anyone can see how much information is dependent on that one link 🔗.)
The only conflict of interest I have is that this website's blacklisting conflicts with my interest in writing and expanding articles on subjects it covers. Those IPs are not my "sockpuppets": I had clearly written on my Commonswiki and En-Wiki user pages that I was those IPs and where I added Primal Trek. Half of the pages on that list are also articles I created, and often 25~50% of their content is sourced to Primal Trek.
Sent from my Microsoft Lumia 950 XL with Microsoft Windows 10 Mobile 📱. --Donald Trung (talk) 09:14, 22 September 2017 (UTC)
- @Donald Trung: Declined, not blacklisted on meta. Dirk Beetstra T C (en: U, T) 12:59, 23 September 2017 (UTC)
fiverr.com
fiverr.com
On EN WP there is consensus that we can link to Wikipedia job postings from such sites to help us deal with undisclosed paid editing. As such is there a way we can allow this link in talk / Wikipedia space?
Doc James (talk · contribs · email) 17:20, 21 September 2017 (UTC)
- @Doc James: We can look at numerous options, ranging from total removal (which I think would be problematic) to whitelisting at individual wikis by those wikis. If it is only the occasional URL, then definitely not a global removal. -- billinghurst sDrewth 08:19, 22 September 2017 (UTC)
- @Billinghurst: It is strange that we have some of these types of sites in the blacklist but not others (such as Upwork or Elance). Do you know where one can look at the issues that occurred and why it was added? Doc James (talk · contribs · email) 14:32, 22 September 2017 (UTC)
- @Doc James: that is not strange, it means that fiverr.com was spammed and that we did not observe that with the others. It was blacklisted after being caught by COIBot, see User:COIBot/XWiki/fiverr.com - spambots and spammers.
- I don't understand the consensus to 'link to Wikipedia job postings' .. is a mere mentioning of the job posting with a non-working link not enough, it is just on talkpages, not in content namespaces, right? --Dirk Beetstra T C (en: U, T) 12:51, 23 September 2017 (UTC)
- User:Beetstra Yes, just in talk and Wikipedia space. Happy to see it still blocked in article space. Non-working links are a pain. Doc James (talk · contribs · email) 18:00, 23 September 2017 (UTC)
- @Doc James: I am sorry, for 'convenience' use we are not going to allow spammers again, hence delisting Declined. I fully, totally agree that it is a pain, but either blame that on the spammers or on WMF (the latter having ignored a complete overhaul of the spam-blacklist extension for several years now, I guess editor retention is more important). --Dirk Beetstra T C (en: U, T) 05:10, 24 September 2017 (UTC)
- User:Beetstra Is there a way to block links from article space but allowing it on Wikipedia and User space? Or is that one of the improvements needed? Maybe we can get improvements on the next community tech team list Doc James (talk · contribs · email) 05:15, 24 September 2017 (UTC)
- @Doc James: It would only be possible by removing it from the spam blacklist and adding it to a content-namespace-only filter. However, that type of filtering is very heavy on the server. People have been suggesting an overhaul of the whole spam blacklist system, and I have suggested making it more edit-filter-like: task T6459. --Dirk Beetstra T C (en: U, T) 06:08, 24 September 2017 (UTC)
- By the way, I don't know if you want to allow this on talkspaces .. you might still see editors spamming this to get the jobs, just not in mainspace but on user talkpages .. --Dirk Beetstra T C (en: U, T) 06:12, 24 September 2017 (UTC)
- Also note that removing it from the blacklist applies to all wikis. enWP has all the power to whitelist the domain and write spam filters to manage their exceptions. -- billinghurst sDrewth 10:48, 24 September 2017 (UTC)
- True, but early tests of such filters on one wiki did not give much hope for scaling up... the filter tends to be rather heavy, and I don't think the convenience of being allowed to link on talkpages is sufficient justification for that. --Dirk Beetstra T C (en: U, T) 11:22, 25 September 2017 (UTC)
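As a rough illustration of the "content-namespace-only filter" idea raised in the thread above, the following is a toy sketch in Python, not how the spam blacklist or any MediaWiki filter is actually implemented. The function name, the rule list, and the choice of which namespaces count as content are all assumptions made for the example; the only fact relied on is that namespace 0 is the main (article) namespace.

```python
import re

# Assumption for this sketch: only the main namespace (0) counts as content.
CONTENT_NAMESPACES = {0}

# Hypothetical rule list: domains to block in content namespaces only.
CONTENT_ONLY_RULES = [re.compile(r"\bfiverr\.com\b", re.IGNORECASE)]


def is_edit_blocked(namespace: int, added_urls: list) -> bool:
    """Return True if an edit adding these URLs should be rejected.

    Links are checked only when the page is in a content namespace, so the
    same URL can be saved on talk or project pages.
    """
    if namespace not in CONTENT_NAMESPACES:
        return False
    return any(
        rule.search(url) for rule in CONTENT_ONLY_RULES for url in added_urls
    )
```

The concern about server load is not modelled here: in this sketch every rule is evaluated against every added URL on every content edit, which is the kind of per-edit cost that grows with the rule list.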
Troubleshooting and problems
Discussion
History-of-China.com
history-of-china.com
Seems to have some useful information 🛈 on the Mongol Empire period in China (the Yuan Dynasty), no idea why it's blacklisted. 🤔 --Donald Trung (Talk 🤳🏻) (My global lock 😒🌏🔒) (My global unlock 😄🌏🔓) 09:37, 26 September 2017 (UTC)
- not globally blocked. Please address your concerns to English Wikipedia where there is a local block. — billinghurst sDrewth 12:19, 4 October 2017 (UTC)
- I actually need it for Dutch Wikipedia, well that's good news. --Donald Trung (Talk 🤳🏻) (My global lock 😒🌏🔒) (My global unlock 😄🌏🔓) 09:34, 5 October 2017 (UTC)
- This section was archived on a request by: — billinghurst sDrewth 12:00, 4 January 2018 (UTC)
Make all groups non-capturing
Currently, there are almost 200 capturing groups used in the blacklist. Because these capture, the regex engine has to devote extra resources to them, and because nothing is done with the groups, this extra expenditure is pointless. These groups can be made non-capturing by adding ?: just after the opening parenthesis. Note that if any group already has a ? following the opening parenthesis, the group shouldn't be touched. (The current coding is not actively problematic to my knowledge, so this is more an ounce-of-prevention, best practices/consistency thing; there are already ~160 groups that are non-capturing in the list.) Dinoguy1000 (talk) 21:28, 3 September 2017 (UTC)
- @Dinoguy1000: I noticed that User:Billinghurst has adapted all those regexes. --Dirk Beetstra T C (en: U, T) 05:01, 14 September 2017 (UTC)
- started, and noted to where. Will complete when I have a little more time. — billinghurst sDrewth 10:11, 14 September 2017 (UTC)
- Cool, I'd actually already forgotten about this request. If you're comfortable enough with regexes, I also noticed one or two groups that could be reduced to character sets (though I'd have to look through the list again to find them). There's probably other potential optimizations lurking in the list, too (though most of them are probably so minor as to not be worth worrying about from a purely optimization perspective). Dinoguy1000 (talk) 12:14, 14 September 2017 (UTC)
- @Dinoguy1000: You could just copy the whole page into a user-sandbox of yourself, and adapt them. A diff between the current version of the spam blacklist and your sandbox then gives us the opportunity to see what changed, and decide to copy it back into the blacklist if we don't see any problems. We do similar things with cleaning up old regexes that can go, or maintenance-type combination of multiple regexes into one. --Dirk Beetstra T C (en: U, T) 13:17, 14 September 2017 (UTC)
- I'm not broken up enough about it to do so at this time, I think, though I'll definitely keep the option in mind in the future. Dinoguy1000 (talk) 13:25, 15 September 2017 (UTC)
- This section was archived on a request by: —MarcoAurelio (talk) 11:12, 26 February 2018 (UTC)
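The conversion requested in this section (add ?: after an opening parenthesis unless a ? already follows it) can be sketched in Python as below. This is a minimal sketch, not the procedure that was actually used on the blacklist: the function name make_non_capturing is hypothetical, and Python's re syntax stands in for whatever regex engine the blacklist runs on, which may differ in details. Escaped parentheses and parentheses inside character classes are left alone.

```python
import re

# One alternative per token we need to recognise while scanning a pattern:
#   1. an escaped character such as \( or \b (kept unchanged),
#   2. a character class such as [(] or [^)] (kept unchanged),
#   3. an opening parenthesis NOT followed by "?" (a plain capturing group).
_TOKEN = re.compile(r"\\.|\[\^?\]?(?:\\.|[^\]\\])*\]|\((?!\?)")


def make_non_capturing(pattern: str) -> str:
    """Rewrite plain capturing groups "(...)" as non-capturing "(?:...)".

    Groups that already start with "(?" (non-capturing, lookaheads, and so
    on) are not touched.
    """
    def fix(match):
        token = match.group(0)
        # Only a bare "(" is rewritten; escapes and classes pass through.
        return "(?:" if token == "(" else token

    return _TOKEN.sub(fix, pattern)


if __name__ == "__main__":
    before = r"\b(foo|bar)\.example\.(com|org)\b"
    after = make_non_capturing(before)
    print(after)
    print(re.compile(before).groups, re.compile(after).groups)
```

Comparing the group counts before and after, as the last line does, is one way to check that no capturing groups remain; a diff of the list against the converted copy, as suggested in the replies above, would show exactly which entries changed.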
Global Email blacklist
Hello, FYI: Email blacklist is available now. --Steinsplitter (talk) 13:10, 13 September 2017 (UTC)
- This section was archived on a request by: —MarcoAurelio (talk) 11:12, 26 February 2018 (UTC)