Can a machine be taught to flag spam automatically?

TL;DR: We did it, so... yes.


What is this?

Charcoal is the organization behind the SmokeDetector bot and other nice things. The bot scans new posts across the entire network for spam and reports them to various chatrooms where people can act on them. If a post has been created or edited anywhere on the network, we've probably seen it. The bot draws on our knowledge of how spammers work and what they have previously posted to build common patterns and rules for detecting spam in new and edited posts. You've likely seen SmokeDetector if you visit chatrooms such as Tavern on the Meta, Charcoal HQ, SO Close Vote Reviewers, and others across the network. Over time, the bot has become very accurate.
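
To make "patterns and rules" concrete, here is a toy sketch of a pattern-based reason. The patterns and names below are invented for illustration; they are not SmokeDetector's actual rules, which are far more numerous and refined.

    import re

    # Invented example patterns, purely for illustration.
    REASONS = {
        "pharma keyword in body": re.compile(r"\b(?:viagra|cialis)\b", re.IGNORECASE),
        "phone number in title": re.compile(r"\+?\d[\d \-]{8,}\d"),
    }

    def match_reasons(post):
        """Return the name of every reason that fires on a post.

        `post` is assumed to be a dict with "title" and "body" text fields.
        """
        text = post["title"] + "\n" + post["body"]
        return [name for name, pattern in REASONS.items() if pattern.search(text)]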

Now we are leveraging the years of data and accuracy to automatically cast spam flags. With approximately 58,000 posts to draw from and over 46,000 true positives, we have a vast trove of data to utilize.

What problem does this address?

To put it simply, spam. Stack Exchange is one of the most popular networks of websites on the Internet, and all of it gets spammed at some point. Our statistics show that we see about 100 spam posts per day that get past the system filters.

A decent chunk of this isn't the type you'd want to see at work (or at all). The faster we can get this off the home page, the better for all involved. Unfortunately, it's not unheard of for spam to last several hours, even on the larger sites such as Graphic Design.

Over the past three years, efforts with Smokey have significantly cut the time it takes for spam to be deleted. This project is an extension of that, and it's now well within reach to delete spam within seconds of it being posted.

What are we doing?

For over three years, SmokeDetector has reported potential spam across the Stack Exchange network so that users can flag the posts as appropriate. Users tell the bot whether each detection was correct (we call this "feedback"), and that feedback is stored in our web dashboard, metasmoke (code). Over time, we've used it to evaluate our detection patterns ("reasons") and improve our accuracy. Several of our reasons are over 99.9% accurate.
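
As a concrete illustration, a reason's accuracy is simply the share of its reports that feedback confirmed as spam. This sketch assumes a simplified report record, not metasmoke's actual schema.

    def reason_accuracy(reports):
        """Fraction of a reason's reports that feedback confirmed as spam.

        Each report is assumed to be a dict with a boolean
        "is_true_positive" field derived from user feedback.
        """
        if not reports:
            return 0.0
        confirmed = sum(1 for r in reports if r["is_true_positive"])
        return confirmed / len(reports)

    # A reason confirmed on 4,996 of 5,000 reports scores 0.9992, i.e. one
    # of the "over 99.9% accurate" reasons mentioned above.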

Early last year, after getting a baseline accuracy figure from jmac (thank you!), we realized we could use the system to automatically cast spam flags. On Stack Overflow, users flagging spam posts are currently 85.7% accurate; across the rest of the network, they are 95.4% accurate. We determined we could beat those numbers and eliminate spam from Stack Overflow and the rest of the network even faster.

Without going into too much detail (if you really want it, it's available on our website), we leverage the accuracy of each existing reason to compute a weight indicating how certain the system is that a post is spam. If this weight exceeds a specific threshold, the system casts up to three spam flags on the post, using a number of different users' accounts and the Stack Exchange API. Via metasmoke, users can enable their accounts to be used for flagging spam (you can too, if you've made it this far). When a post's weight exceeds the threshold an individual user has set, that user's account joins the pool; accounts are then randomly selected from the pool to cast a single flag each, up to a maximum of three per post, so that we never unilaterally nuke something. (For the same reason, accounts with moderator privileges on a site aren't selected to cast automatic spam flags there, and only one flag is cast on sites with a deletion threshold of three flags.)
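
As a rough sketch of that flow (the function names, the exact scoring, and the user-record fields below are illustrative assumptions; metasmoke's real weighting differs):

    import random

    def post_weight(matched_reason_accuracies):
        """Combine the historical accuracy of each matched reason into a
        single certainty score. Summing is an illustrative stand-in for
        metasmoke's actual combination.
        """
        return sum(matched_reason_accuracies)

    def pick_flaggers(post, users, site_deletion_threshold=6):
        """Randomly choose accounts to flag a post, honoring the rules above:
        at most three flags, never a moderator's account on that site, and
        only one flag where three spam flags would delete a post outright.
        """
        max_flags = 1 if site_deletion_threshold <= 3 else 3
        weight = post_weight(post["reason_accuracies"])
        pool = [u for u in users
                if weight >= u["threshold"]        # the user's own certainty preference
                and not u["is_moderator_here"]]    # moderator flags are binding; skip
        return random.sample(pool, min(max_flags, len(pool)))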

What are our safety checks?

We designed the entire system with accuracy and sanity checks in mind. Our design collaborations are available for your browsing pleasure (RFC 1, RFC 2, and RFC 3). The major things that make this system safe and sane are:

  • We give users a choice as to how accurate they want their automatic flags to be. Before casting any flags, we check that the preferences the user has set yield a spam detection accuracy of over 99.5%¹ on a sample of at least 1,000 posts (a sketch of this check follows the list). Remember, human flaggers are currently 85.7% accurate on Stack Overflow and 95.4% accurate network-wide.
  • We do not unilaterally spam nuke a post, regardless of how sure we are it is spam. This means that a human must be involved to finish off a post, even on the few sites with lower spam thresholds.
  • We've designed the system to be fault tolerant: if there's a malfunction anywhere in the system, any user with access to SmokeDetector, including all network moderators, can immediately halt all automatic flagging. If this happens, a system administrator must step in to re-enable flags.
  • We've discussed this with a community manager and have their blessing on the project.
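
The per-user accuracy check from the first bullet can be illustrated like this (a toy version; the data layout is assumed, and the real check lives in metasmoke):

    MIN_ACCURACY = 0.995   # raised to 0.9975 on 5 March 2018 (see the footnote)
    MIN_SAMPLE = 1000

    def conditions_are_safe(matched_posts):
        """Decide whether a user's chosen conditions are accurate enough.

        `matched_posts` is assumed to be the historical posts those
        conditions would have flagged, each carrying human feedback.
        """
        if len(matched_posts) < MIN_SAMPLE:
            return False                     # not enough history to judge
        true_positives = sum(1 for p in matched_posts if p["is_spam"])
        return true_positives / len(matched_posts) > MIN_ACCURACY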

Results

We have been casting an average of 60-70 automatic flags per day for over two months, for a total of just over 6,000 flags network-wide, cast by 22 different users. In that time, we've had four false positives. We would like to retract those flags automatically, but that isn't currently possible, so we've created a feature request to allow retracting flags via the API. In the meantime, such flags are either manually retracted by the user or declined by a moderator.

Weights and Accuracy

The graph above plots the weight of our reasons against their overall volume of reports and accuracy. As the minimum weight increases, accuracy (yellow line, rightmost Y-axis) increases. The green line represents the total number of reports (possible spam posts) and the blue line the number of true positives as verified by user feedback; both are read on the left-hand scale.
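
The curve itself can be reproduced from raw report data roughly like this (a sketch; the field names are assumed):

    def sweep_min_weight(reports, thresholds):
        """For each minimum weight, count the reports at or above it, the
        true positives among them, and the resulting accuracy.
        """
        points = []
        for t in thresholds:
            kept = [r for r in reports if r["weight"] >= t]
            tps = sum(1 for r in kept if r["is_true_positive"])
            accuracy = tps / len(kept) if kept else 0.0
            points.append((t, len(kept), tps, accuracy))
        return points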

Automatic Flags per day

This graph shows the number of posts we've automatically flagged per day over the last month. The jump on February 15th is due to increasing the number of automatic flags from one per post to three per post. You can see a live version of this graph on metasmoke's autoflagging page.

Spam Hours

Spam arrives on Stack Exchange in waves, and it is easy to see the times of day when many spam reports come in. The hours above are in UTC. The busiest period is the eight-hour block between 4 am and noon, which we have affectionately named "spam hour" in the chat room.
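
Computing that distribution from report timestamps is straightforward (a sketch, assuming timezone-aware datetimes for each report):

    from collections import Counter
    from datetime import timezone

    def reports_per_utc_hour(report_times):
        """Count reports falling in each UTC hour of the day."""
        return Counter(t.astimezone(timezone.utc).hour for t in report_times)

    # "Spam hour" shows up as elevated counts for hours 4 through 11.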

Average Time to Deletion

Our goal is to delete spam quickly and accurately. The graph shows the time it takes for a reported spam post to be removed from the network, with three trend lines showing the averages. The first, red, section covers the period when we were simply reporting posts to chatrooms and all flags had to come from users. Time to removal was fairly constant during this period: on average, just over five minutes.

The green trend line covers the period when we were issuing a single automatic flag. At implementation, we eliminated a full minute from the time to deletion, and after a month we had eliminated two full minutes compared to no automatic flags.

The last section, in orange, covers the period after we enabled three automatic flags on most sites. This was rolled out only last week, but it has already dramatically improved the time to deletion: we are now seeing deletion times of between one and two minutes.

As mentioned above, spam arrives in waves. The dashed and dotted lines on the graph show the average deletion time during two different parts of the day: the dashed lines cover 4 am to noon UTC ("spam hour"), and the dotted lines cover the rest of the 24-hour period. Interestingly, before we cast any automatic flags, time to deletion was higher during spam hour; spam was removed faster outside of it. That reversed when we started issuing a single auto-flag: spam-hour time to deletion became slightly lower than the average. Comparing the two periods, though, non-spam-hour deletion times at the end of the no-flagging period and at the end of the single-flag period are roughly the same.
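
The spam-hour versus non-spam-hour split can be computed along these lines (a sketch; the post fields are assumed):

    from datetime import timezone

    def mean_minutes_to_deletion(posts, during_spam_hours=True):
        """Average minutes from creation to deletion for posts created
        inside (or, with during_spam_hours=False, outside) 04:00-12:00 UTC.

        Each post is assumed to carry timezone-aware "created" and
        "deleted" datetimes.
        """
        def in_window(p):
            hour = p["created"].astimezone(timezone.utc).hour
            return (4 <= hour < 12) == during_spam_hours

        deltas = [(p["deleted"] - p["created"]).total_seconds() / 60
                  for p in posts if in_window(p)]
        return sum(deltas) / len(deltas) if deltas else 0.0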

We'll update these graphs in a few weeks to better show the trend we are seeing with three automatic flags.

Discussion

We are confident in SmokeDetector and the three years of history it has. We've had many talented developers assist us over the years and many more users have provided feedback to improve our detection rules. Let us know what you want us to elaborate on, features you're wondering about or would like to see added, or things we might have missed in the process or the tooling. Take a look at the feature we'd really like Stack Exchange to consider so that we can further improve this system (and some of the other community built systems). We'll have Charcoal members hanging around and answering your questions. Alternatively, feel free to drop into Charcoal HQ and have a chat.


¹ As of 5 March 2018, the accuracy threshold is 99.75%, instead of 99.5%.

Comments

  • Moderators are currently not allowed to autoflag on their own sites, because we never want to unilaterally nuke something. We want all of Charcoal to see whether a post needs action; having a mod nuke it would negate that. Commented Feb 20, 2017 at 20:28
  • That's a step we're not willing to take yet. Sending flags from mortals allows us to send signal to the system; sending flags from a moderator account would be a whole new step. If we were to do that, it'd be extremely limited in scope and with a lot of staff consultation. Commented Feb 20, 2017 at 20:28
  • @PetterFriberg The main difference is that using a moderator account could nuke a post within a few seconds of it being posted, bringing a 100-rep penalty and SpamRam fun, all without human eyes ever needing to be set upon it. Commented Feb 20, 2017 at 20:36
  • @PetterFriberg We may expand the system, but we're not going to jump from 3 flags to moderator flags; it'll be staged. If 3 flags goes well, maybe we move to 4. If that goes well, maybe 5, and maybe on to 6. We'll be talking to Stack Exchange staff throughout that process, and moderator flags aren't even something we'd consider until we're ready to use 6 flags anyway. Commented Feb 20, 2017 at 20:36
  • As a moderator, I feel that it is inappropriate to give someone else (even an automated system) access to my moderator privileges to take any sort of action on my behalf. As a moderator, my flags are immediate, with few checks. Commented Feb 20, 2017 at 21:35
  • I don't think giving a bot access to a moderator account is a good idea. It might even be a violation of the moderator agreement; it certainly is if non-moderators have access to the bot. Moderator accounts have access to PII. If a bot were to cast binding flags, it should happen via an SE-provided API that doesn't require giving out the full access a real moderator has. Commented Feb 20, 2017 at 21:42
  • @PetterFriberg No, we will never ask to give Smokey employee access, nor would SE ever let us. It would effectively allow anyone with access to the account (including non-mods such as me) to use employee-only tools (nuke every question ever created? why not?). Commented Feb 24, 2017 at 7:28
  • Letting regular users do this is already irresponsible. Setting accuracy aside, it misrepresents what the user is doing and who or what is doing the flagging. Another consideration: flag weight still exists, IIRC, just behind the scenes; this allows the user to use a bot to inflate their own trustworthiness in the eyes of the system, giving greater weight to their own manual flags. Commented Feb 24, 2017 at 23:00
  • @MatthewRead The first step in getting something integrated with SE is to prove it can be done without their systems. Commented Feb 25, 2017 at 0:12
  • @ɥʇǝS I doubt that. Regardless, the first step backward is abusing their systems and user account access. I can't and won't speak for SE; they might be totally OK with this, but it conflicts with everything I know as a mod. Commented Feb 25, 2017 at 0:14
  • @MatthewRead A number of Charcoal people are mods, myself included. We've had a chat about this with a CM over several months as it's been moving towards implementation, and have been given permission to do this. Commented Feb 25, 2017 at 0:16
  • We are also moving towards tighter integration with SE, as Pops said in a comment above somewhere, but that takes both time and developer effort on their part, which is spread thin right now. Both of these options have advantages and disadvantages, but we believe the benefits of getting spam deleted faster outweigh the negatives of this system. Commented Feb 25, 2017 at 0:18
  • @MatthewRead I'm actually a little confused as to the nature of your general objection, as SpamRam is essentially already a fully automated post-nuking system that you are presumably OK with. Smokey merely expands the spam filter with some extra rules and, as a bonus, still requires human confirmation, unlike the usual spam filter. If Smokey's ruleset were simply integrated into the existing fully automated system, it does not seem like you would have the same objection; rather, you would likely appreciate SE improving their existing bot that unilaterally nukes posts (i.e., their spam filters). Commented Feb 25, 2017 at 0:38
  • @MatthewRead To address "this allows the user to use a bot to inflate their own trustworthiness in the eyes of the system, giving greater weight to their own manual flags": there are more than 100 people signed up right now. We issue ~230 flags per day. That load is balanced (randomly distributed, actually) across those 100 users, then across a dozen or so high-spam sites. (230/100)/12 is a very small number. It's not going to win you an election. Commented Feb 25, 2017 at 0:48
  • I don't see how my comments could possibly be interpreted as being against automation. I am against abusing user accounts for this. It's true that I would have no objection to it being integrated into the system. "It's not going to win you an election" is a straw man. Commented Feb 28, 2017 at 18:52
