Talk:Spam blacklist/Archives/2014-07

From Meta, a Wikimedia project coordination wiki
Warning! Please do not post any new comments on this page. This is a discussion archive first created in July 2014, although the comments contained were likely posted before and after this date. See current discussion or the archives index.

Proposed additions

This section is for completed requests that a website be blacklisted

vin-decoder.com


    vin-decoder.com


    See LinkReports. Additional spam in [1] and [2].
    Added. -- seth (talk) 23:35, 2 July 2014 (UTC)

    youtube.com/v/


    youtube.com


    This can be used to bypass the SBL. Will this block things that shouldn't actually be blocked? --Glaisher [talk] 17:48, 22 May 2014 (UTC)

    Which youtube.com/v/ would bypass which rule on the blacklist? I'm not sure what you mean. You are not talking about youtu.be, right? --Dirk Beetstra T C (en: U, T) 12:07, 26 May 2014 (UTC)
    @Beetstra: Usually youtube links are in the format youtube.com/watch?v=code but youtube.com/v/code can also be used. For instance, XePjp-H3TBI video is currently blacklisted by the regex \byoutube\.com/watch\?.*\bv=(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b. However if youtube.com/v/XePjp-H3TBI is used, it will not be rejected. --Glaisher [talk] 16:21, 3 June 2014 (UTC)
    Well, it's not the only way around that restriction. You could just use youtube.com/embed/XePjp-H3TBI too, for example. Or even youtube.com/watch/?v=foo would get around that. And that's not even including sites like youtube "repeaters", etc. - e.g. yourepeat.com/watch?v=XePjp-H3TBI. This is not something we can ever cover 100% in my opinion. PiRSquared17 (talk) 20:48, 3 June 2014 (UTC)
    For youtube.com/v - I would just change the regex that is there: '\byoutube\.com\/(?:watch\?v=|embed\/)(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b' (not sure about the regex).
    For the repeaters - if they are true repeaters, and hence practically just a 'redirect' for the same movie - I would not have a lot of mercy on them, either pre-emptively but at the very least a 'one strike and they are completely out' (just like we do with normal redirect sites). --Dirk Beetstra T C (en: U, T) 06:00, 4 June 2014 (UTC)
    That regex doesn't blacklist youtube.com/v/XePjp-H3TBI, let alone youtube.com///v//XePjp-H3TBI. Maybe it would be easier to just block any URL containing the video ID... I'm not sure. PiRSquared17 (talk) 23:42, 5 June 2014 (UTC)
    Hi!
    The mentioned methods could probably be summarized by the following regexp:
    \byoutube\.com/+(?:watch/*\?.*\bv=|embed/+|v/+)(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b
    or, being even more restrictive
    \byoutube\.com/.*(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b
    If we just blocked the IDs (without the condition of being related to youtube), this could, in rare cases, lead to unwanted blocking of other URLs that contain one of the same IDs by chance.
    BTW: tqedszqxxzs and khM48EQyVdc are unavailable videos, so I guess they could be removed from the blacklist (if the spamming is more than a year old). -- seth (talk) 15:42, 15 June 2014 (UTC)
    /video/ redirects to watch?v= PiRSquared17 (talk) 21:33, 15 June 2014 (UTC)
    The best thing probably is to block \byoutube\.com/.*(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)\b. I'll change that now. -- seth (talk) 22:20, 29 June 2014 (UTC)
    Closing as Added for archiving. --Glaisher (talk) 16:19, 11 July 2014 (UTC)
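The trade-offs discussed in this thread can be checked with a small sketch. It runs the patterns quoted above against the URL forms mentioned by the participants, using Python's `re` module as a stand-in. This is an assumption-laden simplification: the blacklist extension wraps entries in its own larger expression, which is not modelled here, and the names `PATTERNS`, `URLS`, `blocked_by` and `coverage` as well as the `example.org` URL are hypothetical and only serve the illustration.

```python
import re

# Video IDs exactly as they appear in the blacklist entry quoted in the thread.
IDS = r"(?:tqedszqxxzs|XePjp-H3TBI|khM48EQyVdc|A4jgXQQns8A|oVBOnv\-xrEY)"

# Patterns proposed during the discussion, plus an ID-only variant.
PATTERNS = {
    "original": r"\byoutube\.com/watch\?.*\bv=" + IDS + r"\b",
    "watch_or_embed": r"\byoutube\.com\/(?:watch\?v=|embed\/)" + IDS + r"\b",
    "enumerated": r"\byoutube\.com/+(?:watch/*\?.*\bv=|embed/+|v/+)" + IDS + r"\b",
    "broad": r"\byoutube\.com/.*" + IDS + r"\b",
    "id_only": r"\b" + IDS + r"\b",
}

# URL forms mentioned in the thread; the last one is a made-up unrelated URL
# that happens to contain a listed ID.
URLS = [
    "youtube.com/watch?v=XePjp-H3TBI",
    "youtube.com/v/XePjp-H3TBI",
    "youtube.com/embed/XePjp-H3TBI",
    "youtube.com/watch/?v=XePjp-H3TBI",
    "youtube.com///v//XePjp-H3TBI",
    "youtube.com/video/XePjp-H3TBI",
    "yourepeat.com/watch?v=XePjp-H3TBI",
    "example.org/files/XePjp-H3TBI.pdf",  # hypothetical false positive
]


def blocked_by(pattern_name, url):
    """Return True if the named pattern matches anywhere in the URL."""
    return re.search(PATTERNS[pattern_name], url) is not None


def coverage():
    """Map each pattern name to the sample URLs it would catch."""
    return {name: [u for u in URLS if blocked_by(name, u)] for name in PATTERNS}


if __name__ == "__main__":
    for name, hits in coverage().items():
        print(f"{name}: {len(hits)} of {len(URLS)} sample URLs matched")
        for url in hits:
            print("   ", url)
```

In this sketch the broad youtube.com pattern catches every youtube.com form listed, including /video/, while leaving the repeater site and the unrelated URL alone. The ID-only pattern catches everything, including the unrelated URL, which illustrates the false-positive concern raised above.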

    Spambots


      maxmanpoweradvice.com



        sejour-caraibes.com


        Multiple additions by spambot accounts and IPs. --Glaisher (talk) 16:14, 11 July 2014 (UTC)

        Added. --Glaisher (talk) 16:15, 11 July 2014 (UTC)

        French travel site


        voyage-de-noces.org



        sejour-saint-martin.com



        vacancestop.com



        sejoursaintmartin.com


        A couple variations are already blocked, and all resolve to 82.165.21.151. — billinghurst sDrewth 01:30, 24 July 2014 (UTC)

        Added. — billinghurst sDrewth 01:30, 24 July 2014 (UTC)

        Cross-wiki spam


        hisosoccer.com



        hisosoccer.blogspot.com


        Multiple additions xwiki.--Glaisher (talk) 11:21, 26 July 2014 (UTC)

        Added. --Glaisher (talk) 11:22, 26 July 2014 (UTC)

        Proposed removals

        This section is for archiving proposals that a website be unlisted.

        meloteca.com


          meloteca.com


          It was added back in 2008 without any reason or discussion. According to the logs at User:COIBot/LinkReports/meloteca.com, there was a single instance of small-scale spamming of a particular page on that site across a few wikipedias, on 27 March 2008, and the whole site was blocked the next day without any recorded discussion. Cross-wiki linksearch reveals it's in use in several articles, pointing to relevant pages of the website. --Waldir (talk) 14:36, 29 June 2014 (UTC)

          @Waldir: Digging through the history (and I needed to dig out a deleted page) ... Cross-wiki link additions by 89.155.235.205. --COIBot 23:12, 27 March 2008 (UTC). So it might be worth looking at the XWiki page. I will keep digging. Looking at the page, it doesn't look problematic. — billinghurst sDrewth 09:28, 7 July 2014 (UTC)
          Removed. — billinghurst sDrewth 11:23, 7 July 2014 (UTC)
          Thanks, billinghurst :) --Waldir (talk) 21:08, 7 July 2014 (UTC)

          Remove my site foxylex.dk from spam list


          foxylex.dk


          Please remove my site foxylex.dk from the spam list. This is a Danish law site, not spam; the listing was the result of a confusion. Thanks for understanding. — The preceding unsigned comment was added by 130.204.156.232 (talk)

          Removed. Comfortable with the removal, though I would hope that you look at the appropriate addition of weblinks prior to your next edits. — billinghurst sDrewth 09:33, 7 July 2014 (UTC)

          Ascender Corporation


          ascenderfonts.com



          ascendercorp.com


          Ascender Corporation is a typeface foundry that was involved in the design of fonts like Droid, Liberation and several others. Their domains were presumably blacklisted in 2010 due to spamming, but it also prevents linking to them for sourcing. I don't see a reason to keep this global blacklist entry anymore. Don Cuan (talk) 11:03, 5 June 2014 (UTC)

          Talk:Spam_blacklist&oldid=1999210#ascenderfonts.com, I am not averse to delisting as it has been ~ four years, though I would expect that we would monitor and put them back pretty quickly if it recurs. You can always ask for a local whitelisting at the wiki where you are trying to reference with the url. — billinghurst sDrewth 04:01, 6 June 2014 (UTC)
          Removed. — billinghurst sDrewth 12:31, 23 July 2014 (UTC)

          uservoice.com


            uservoice.com


            Was added by User:Mike.lifeguard as a URL shortener. However, UserVoice is not a URL shortener; it is used for feedback collection by many large software products. This notably includes Microsoft products like Visual Studio and the .NET Framework, Windows Phone, and Microsoft Office and SharePoint.

            The filter should be removed, so that Wikipedia can provide helpful links to the appropriate feedback sites within the articles of the given products.

            MovGP0 (talk) 12:27, 12 June 2014 (UTC)

            I am not sure that it is not open to abuse if it is removed. I would also think that it is quite possible for us to point back to each manufacturer's site, and they have the ability to point onwards to their relevant pages. Remember that the WPs are not a directory listing but an encyclopaedia. If you wish to take it forward, I would think that trying for a whitelisting at a WP where you wish to add the links is the place to start. If the site has no issues with exploitation of the link then we can look to remove it from the blacklist, or you could look to the next WP. — billinghurst sDrewth 13:12, 12 June 2014 (UTC)
            Declined inactive request, no follow-up from initial inquiry — billinghurst sDrewth 12:33, 23 July 2014 (UTC)

            cypress.com


            cypress.com


            I suggest taking cypress.com off the blacklist.

            The Wikipedia:WP:ELYES guideline recommends that an article about some subject should link to the subject's official site, if any. And so the Wikipedia:Cypress Semiconductor article should link the official website of Cypress Semiconductor. (Their official website is cypress.com , right?)

            Since the manufacturer of a chip is generally viewed as a reliable source of technical information about that chip, certain pages on that manufacturer's website are good references in articles like Wikipedia:List of common microcontrollers and Wikipedia:PSoC.

            The cypress.com regex was apparently added to this blacklist 19 June 2010,[3] in response to a request on this talk page.[4]

            --DavidCary (talk) 18:49, 23 June 2014 (UTC)

            @DavidCary:, Hi. This is a mess. When the URL was blacklisted, it would have been standard process to first remove all the links, because a blacklisted link blocks editing a page containing a link, unless the user identifies the link and removes it. There is a link in the Wikipedia article, and it's been standing since before the blacklisting. The immediate fix, for Wikipedia, is to ask for local whitelisting, either of the entire domain or of specific pages. So I looked for history of this, and found that there will be, ah, kicking and screaming if one goes for the whole site:
            • removal request June 2011 on en.wiki while it was blacklisted here, so the requests were in the wrong place, see the comments from regular antispammers and links to spam reports.
            • removal request June 2013 on en.wiki ditto.
            • As to whitelist requests on en.wiki,[5], there have been six. The most recent request links to the others; it expired without action, which is common. One request was granted. It looks like nobody requested a whitelisting for the raw URL for the company web site, which probably would not be spammed. Beetstra? You've done these on enwiki before. This is an obvious legitimate usage for the company article.
            • the November 2013 request, denied
            • If you want to whitelist there, David, be aware that it can take a long time. However, if you place a complete request, showing a need for the link and not merely a desire, and nobody responds for, say, a week, you can go to w:WP:AN and ask for any admin to do it.
            On one of the denied requests, what the requestor really wanted to do wasn't appropriate for Wikipedia, but would have been perfect for Wikiversity, and we get whitelistings there, on the occasions they are needed, usually in a day. I don't see that many Wikipedians understand what Wikiversity is for. Some real-world classes use Wikiversity for article projects that later get transwiki'd to Wikipedia. --Abd (talk) 02:50, 24 June 2014 (UTC)
            You should seek a whitelisting for the relevant "about" page at the wikipedia of interest, and possibly discuss a broader whitelisting. A wiki's rule about a url is a local rule, to which stewards will pay heed, though it will never redeem the issues of the spamming. That said, a successful whitelisting at a wiki with no corresponding spam issues provides an evidence base for the removal from a blacklist. — billinghurst sDrewth 11:19, 24 June 2014 (UTC)

            Note that this was blacklisted in 2009, removed afterwards, and re-blacklisted in 2010 because the spamming continued. Nonetheless, I would not be completely against removing this and trying again.

            Regarding the sourcing - it would be primary source for data, generally it is better to find secondary sourcing stating the same. I know that that is not always acceptable or possible, but since this will, likely, only affect a couple of pages per wiki, I would consider to whitelist those where there is nothing else. Obviously, there is nothing against the local whitelisting of, e.g., an index.htm or the about page (we do not generally whitelist the raw-url, the regex becomes complex and/or editors would be able to spam/abuse the base-url again). --Dirk Beetstra T C (en: U, T) 08:31, 25 June 2014 (UTC)

            Removed. — billinghurst sDrewth 12:34, 23 July 2014 (UTC)

            Troubleshooting and problems

            This section is for archiving Troubleshooting and problems.

            t.co incorrectly blocking


            t.co


            t.co is a URL shortener, so blocking this is necessary. But some domains that use .co as a second-level domain are also caught by this blacklist entry. For example, the Japanese company Kinki Nippon Tourist Individual Tour Sales Co., Ltd. (近畿日本ツーリスト個人旅行販売) has the domain www.knt-t.co.jp, but this cannot be linked now.--Jkr2255 (talk) 03:41, 25 February 2014 (UTC)

            Status: Done
            I was able to make the regex more specific for the shortener — billinghurst sDrewth 09:28, 25 February 2014 (UTC)

            User:Billinghurst - this needs to be done differently, as t.co was now linkable: see diff. I have undone this adaptation and returned to \bt\.co\b for now, please adapt it to something that does solve the problem. --Dirk Beetstra T C (en: U, T) 16:19, 27 April 2014 (UTC)

            https://www.mediawiki.org/wiki/Extension:SpamBlacklist#Usage is obviously wrong. t.co has been added 70 times since the change of the rule; the ones I checked were typical redirects which should have been blocked. --Dirk Beetstra T C (en: U, T) 16:29, 27 April 2014 (UTC)

            bugzilla:64541 — billinghurst sDrewth 11:08, 28 April 2014 (UTC)
            The bugzilla suggested (?<!-)\bt\.co\b but this did not prevent the addition of twitter links. — billinghurst sDrewth 13:10, 23 July 2014 (UTC)
            Closed. Should be possible to add it now. Special:Diff/9315971. --Glaisher (talk) 16:56, 26 July 2014 (UTC)
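For readers following the regex side of this thread, here is a minimal sketch of why \bt\.co\b also catches www.knt-t.co.jp, and of what the lookbehind variant quoted from the bugzilla report does when the pattern is tried on its own. It uses Python's `re` module as a stand-in and does not model how the blacklist extension embeds its entries, so it does not reproduce the on-wiki behaviour reported above (where the lookbehind variant did not stop the links). The names `PLAIN`, `WITH_LOOKBEHIND` and `matches`, and the path `abc123`, are hypothetical.

```python
import re

# The rule as restored on 27 April 2014, and the variant suggested in bugzilla.
PLAIN = re.compile(r"\bt\.co\b")
WITH_LOOKBEHIND = re.compile(r"(?<!-)\bt\.co\b")


def matches(pattern, url):
    """Return True if the compiled pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    samples = [
        "t.co/abc123",         # shortener link (path is made up)
        "http://t.co/abc123",  # same, with a scheme
        "www.knt-t.co.jp",     # legitimate domain from the report
    ]
    for url in samples:
        print(url, matches(PLAIN, url), matches(WITH_LOOKBEHIND, url))
```

The hyphen in knt-t.co.jp is a non-word character, so \b sees a word boundary just before the final "t", and the plain pattern matches there. The negative lookbehind (?<!-) rejects exactly that position while still accepting "t.co" at the start of a host name.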

            Discussion

            This section is for archiving Discussions.
