Naked Security

New algorithm could auto-squash trolls

Researchers have come up with a tool that spots troll behaviour 80% of the time. That's 20% "oops." Worth it?

Ah trolls. A species we know well – those people who bounce around in comments sections flinging language dung all over the intertubes.

Well, that language dung comes in handy when trying to spot a troll, it turns out.

Researchers have found that bad writing is one of several characteristics that can be crunched by a new algorithm to predict which commenters will end up banished for trollery.

The researchers, from Stanford and Cornell Universities, say in their paper that their algorithm can pinpoint future banned users (FBUs) with an AUC of 80% (Area Under the ROC Curve: a performance score that accounts for the trade-off between true positives and false positives).
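To make the metric concrete: AUC equals the probability that a randomly chosen positive example (here, a future banned user) is scored higher by the classifier than a randomly chosen negative one. The scores below are made up purely for illustration; this is not the researchers' classifier.

```python
def auc(pos_scores, neg_scores):
    """Probability that a positive example outranks a negative one.

    Ties count as half a win. Equivalent to the area under the ROC curve.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores: higher means "more troll-like".
banned_scores = [0.9, 0.8, 0.6, 0.7]
never_banned_scores = [0.2, 0.4, 0.6, 0.1, 0.3]

print(auc(banned_scores, never_banned_scores))
```

An AUC of 0.8, then, means that roughly four times out of five the classifier ranks a genuine future-banned user above an innocent one.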

The researchers analysed troll behaviour in the comments sections of three online news communities: CNN.com (general news), Breitbart.com (political news), and IGN.com (computer gaming).

Those sites all have a list of users who’ve been banned for antisocial behavior: a total of over 10,000 antisocial lab rats.

The sites also have all of the messages posted by the banned users throughout their period of online activity, giving the researchers a handy pool of subject material, they said:

Such individuals are clear instances of antisocial users, and constitute 'ground truth' in our analyses.

The algorithm compares messages posted by users who were ultimately banned against messages posted by users who were never banned, managing to spot FBUs after analysing as few as 5 to 10 posts.

They found clear differences between the two groups:

    • Trolls’ posts are more garbled. The researchers used several readability tests, including the Automated Readability Index (ARI), to gauge how easy it is to read a given chunk of text. They found that nearly all of the 10,000 FBUs studied displayed a lower perceived standard of literacy and/or clarity than the median for their host groups, with even that lackluster standard dropping as they neared their ultimate ban.
    • Trolls swear more. Not only do they swear more, they’re also pretty decisive. They don’t tell others to “perhaps” go P off and F themselves, since they don’t tend to use conciliatory/tentative words such as “could”, “perhaps”, or “consider” – words that research has found tend to minimise conflict.
    • Trolls are not sunshiney people. At least, they tend to stay away from positive words.
    • Trolls tend to wander. They have a tendency to veer off-topic.
    • Trolls like to dig in for protracted flame wars. This behaviour differs by community – on Breitbart and IGN, FBUs tend to reply to others’ posts, but on CNN, they’re more likely to start new discussions. But across all communities, they like to drag others into fruitless discussion, getting significantly higher replies than regular users and protracting the discussion by chiming in far more frequently per thread than normal people.
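On the readability point: the Automated Readability Index the researchers cite is a simple formula over character, word, and sentence counts, producing roughly a US school grade level (higher means harder to read). The sketch below shows only the standard ARI formula, not the researchers' full pipeline, and the tokenisation is a naive assumption of mine.

```python
import re


def automated_readability_index(text):
    """Automated Readability Index:
    4.71 * (chars/words) + 0.5 * (words/sentences) - 21.43

    Naive splitting: sentences end at . ! or ?; words are
    whitespace-separated; only letters and digits count as characters.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    chars = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return (4.71 * (chars / len(words))
            + 0.5 * (len(words) / len(sentences))
            - 21.43)
```

Very short, simple sentences score low (even below zero), while long words and run-on sentences push the score up, which is the opposite of what you might expect from a troll's "garbled" posts; the paper's finding is about low readability relative to each community's norm, not a raw score.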

The communities themselves aren’t entirely off the hook when it comes to being turned into troll playgrounds, the researchers say.

[Communities] may play a part in incubating antisocial behavior. In fact, users who are excessively censored early in their lives are more likely to exhibit antisocial behavior later on. Furthermore, while communities appear initially forgiving (and are relatively slow to ban these antisocial users), they become less tolerant of such users the longer they remain in a community. This results in an increased rate at which their posts are deleted, even after controlling for post quality.

The researchers say the algorithm should be of high practical importance to those who maintain the communities.

But given that 80% figure, as many as one in five flagged commenters could be unfairly tarred and feathered, they admitted.

Fed-up, out-of-patience communities themselves throw gas on the fire by overreacting to minor infractions – which can come off as unfair and cause FBUs to behave even more badly, the researchers say.

As well, the classification of a given user as troll or non-troll could stand to be a lot more nuanced. Feigning ignorance, for example, and asking naive questions might be a troll tactic too subtle to show up on the algorithm’s radar.

All of which suggests that patience might be a better approach than auto-squashing trolls, at least for now:

Though average classifier precision is relatively high (0.80), one in five users identified as antisocial are nonetheless misclassified. Whereas trading off overall performance for higher precision and [having] a human moderator approve any bans is one way to avoid incorrectly blocking innocent users, a better response may instead involve giving antisocial users a chance to redeem themselves.

Readers, what do you think? Are trolls redeemable? Please tell us your thoughts in the comments section below.

Image of troll courtesy of Shutterstock.


troll-squashing 80% innocents being tarred and feathered 20%. Worth the risk methinks…


I agree that a human moderator needs to make the final (subject to appeal?) decision, but this tool should give the moderator a very good starting point.


How does it react when presented with different, but correct spellings? For example, US English often uses different spelling and grammar to that used for UK English. Some words commonly used colloquially in a country may not be used in other countries, for example ‘outwith’ is not uncommon in Scotland but almost never used in England or Wales. Likewise the natural element we in the UK spell as ‘sulphur’ is apparently now given as ‘sulfur’ in the US. None of these variations are indicative of a troll at work.


I goofed when I mentioned misspelling. From what I gather, readability indexes don’t check for spelling errors. The paper mentions using Mechanical Turk to farm out readability assessment but does not say that misspellings were (or were not) assessed by those reviewers. As well, they mention that “Dictionary-based approaches may miss non-dictionary words,” such as Obummer, FWIW. I’m not sure whether non-dictionary words correlated with trollery, but my guess is that they don’t. [article corrected to remove reference to misspelling -Ed]


The paper mentions it needs 5-10 posts to make a classification. I would imagine that a tool to ban trolls would use a confidence level to perform automatic bans. I wouldn’t want to scare away users because I start banning after 5 posts. However, I wouldn’t want trolls to be allowed to make 100 posts before being banned either. If this algorithm is implemented into a tool, I’d like to see a way to adjust the minimum post count required before the tool makes a decision. That seems like the most logical approach. It’s the same reason more dots on a line-graph give us higher degrees of accuracy.
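The commenter's idea of gating any action on both a minimum post count and a confidence threshold could be sketched as follows. Everything here, the function name, the thresholds, and the flag/review split, is a hypothetical illustration of the comment, not anything from the published algorithm.

```python
def moderation_action(post_count, troll_confidence,
                      min_posts=10, flag_at=0.8, review_at=0.95):
    """Decide what to do about a user, given the classifier's confidence.

    All thresholds are illustrative and would need tuning per community.
    """
    if post_count < min_posts:
        return "wait"                    # too little evidence either way
    if troll_confidence >= review_at:
        return "queue_for_human_review"  # never auto-ban outright
    if troll_confidence >= flag_at:
        return "flag"
    return "ok"
```

Routing even high-confidence cases to a human reviewer, rather than auto-banning, matches the researchers' own suggestion of keeping a moderator in the loop to avoid blocking innocent users.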


As a moderator myself, with a couple of forums, I’d expect that the algorithm would react barely, if at all, faster than the “report to moderator” link or whatever that’s typically included in most boards/forums/etc. Definitely the only “action” the algorithm should take is to fire off a warning message to a human admin, rather than take it upon itself to do the banning or whatever, to more or less objectively confirm complaints from others.

It’s generally agreed that the best working policy is “Don’t feed the trolls” … discourage any kind of reply, and let the topic “die” by slipping back a few pages in the index for lack of activity.

