Ah, trolls. A species we know well – those people who bounce around in comments sections flinging language dung all over the intertubes.
Well, that language dung comes in handy when trying to spot a troll, it turns out.
Researchers have found that bad writing is one of several behaviours that a new algorithm can crunch to predict which commenters will be banished for trollery.
The researchers, from Stanford and Cornell Universities, say in their paper that their algorithm can pinpoint future banned users (FBUs) with an AUC of 80%. (AUC, the Area Under the ROC Curve, is a score that weighs correct detections against false positives: 50% is no better than a coin toss, 100% is perfect.)
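For a feel of what an AUC figure measures, here is a minimal Python sketch. It relies on the fact that AUC equals the chance that a randomly picked banned user gets a higher troll score than a randomly picked never-banned user. The function name, scores and labels are made up for illustration; none of it comes from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison.

    scores: troll scores from some classifier (higher = more troll-like).
    labels: 1 if the user was later banned, 0 if never banned.
    """
    banned = [s for s, y in zip(scores, labels) if y == 1]
    never_banned = [s for s, y in zip(scores, labels) if y == 0]
    if not banned or not never_banned:
        raise ValueError("need at least one user of each kind")

    wins = 0.0
    for b in banned:
        for n in never_banned:
            if b > n:
                wins += 1.0   # banned user correctly ranked higher
            elif b == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(banned) * len(never_banned))


# Hypothetical scores for six users: three later banned, three not.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(example_scores, example_labels)
```

In this toy data, the banned user comes out on top in eight of the nine banned/never-banned pairings, so the AUC is 8/9, or roughly 89%.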
The researchers analysed troll behaviour in the comments sections of three online news communities: the general news site CNN.com, the political news site Breitbart.com, and the computer gaming site IGN.com.
Those sites all have a list of users who’ve been banned for antisocial behaviour: a total of over 10,000 antisocial lab rats.
The sites also have all of the messages posted by the banned users throughout their period of online activity, giving the researchers a handy pool of subject material, they said:
Such individuals are clear instances of antisocial users, and constitute 'ground truth' in our analyses.
The algorithm compares messages posted by users who were ultimately banned against messages posted by users who were never banned, managing to spot FBUs after analysing as few as 5 to 10 posts.
They found clear differences between the two groups:
- Trolls’ posts are more garbled. The researchers used several readability tests, including the Automated Readability Index (ARI), to gauge how easy it is to read a given chunk of text. They found that nearly all of the 10,000 FBUs studied displayed a lower perceived standard of literacy and/or clarity than the median for their host groups, with even that lackluster standard dropping as they neared their ultimate ban.
- Trolls swear more. Not only do they swear more, they’re also pretty decisive. They don’t tell others to “perhaps” go P off and F themselves, since they don’t tend to use conciliatory/tentative words such as “could”, “perhaps”, or “consider” – words that research has found tend to minimise conflict.
- Trolls are not sunshiney people. At least, they tend to stay away from positive words.
- Trolls tend to wander. They have a tendency to veer off-topic.
- Trolls like to dig in for protracted flame wars. This behaviour differs by community – on Breitbart and IGN, FBUs tend to reply to others’ posts, but on CNN, they’re more likely to start new discussions. But across all communities, they like to drag others into fruitless discussion, drawing significantly more replies than regular users and protracting the discussion by chiming in far more often per thread than normal people.
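Two of the signals in the list above are easy to sketch in Python: the Automated Readability Index and the share of tentative words in a post. This is only an illustration, not the researchers' code. The ARI formula is the standard published one, but the function names, the crude word and sentence splitting, and the three-word hedge list (taken from the examples above) are assumptions.

```python
import re

# Assumption: just the three example words quoted above, not the paper's lexicon.
HEDGES = {"could", "perhaps", "consider"}

# A word is a run of letters/digits, optionally with inner apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def automated_readability_index(text):
    """Standard ARI: 4.71 * chars/words + 0.5 * words/sentences - 21.43.

    The score approximates a US school grade level, so a higher
    number suggests text that is harder to read.
    """
    words = WORD_RE.findall(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    # Count letters and digits only, not apostrophes.
    chars = sum(len(w) - w.count("'") for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def hedge_rate(text):
    """Fraction of words in the text that are tentative words."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    return sum(w in HEDGES for w in words) / len(words)


example_ari = automated_readability_index("The cat sat. The dog ran.")
example_hedges = hedge_rate("Perhaps you could consider it")
```

A real system would feed scores like these, along with many other features, into a trained classifier rather than judging a commenter on any one of them.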
The communities themselves aren’t entirely off the hook when it comes to being turned into troll playgrounds, the researchers say.
[Communities] may play a part in incubating antisocial behavior. In fact, users who are excessively censored early in their lives are more likely to exhibit antisocial behavior later on. Furthermore, while communities appear initially forgiving (and are relatively slow to ban these antisocial users), they become less tolerant of such users the longer they remain in a community. This results in an increased rate at which their posts are deleted, even after controlling for post quality.
The researchers say the algorithm should be of high practical importance to those who maintain the communities.
But with a precision of 0.80, one in five of the users the algorithm flags as antisocial would be unfairly tarred and feathered, they admitted.
Fed-up, out-of-patience communities themselves throw gas on the fire by overreacting to minor infractions – which can come off as unfair and cause FBUs to behave even worse, the researchers say.
What’s more, the classification of a given user as troll or non-troll could stand to be a lot more nuanced. Feigning ignorance and asking naive questions, for example, might be a troll tactic too subtle to show up on the algorithm’s radar.
All of which suggests that patience might be a better approach than auto-squashing trolls, at least for now:
Though average classifier precision is relatively high (0.80), one in five users identified as antisocial are nonetheless misclassified. Whereas trading off overall performance for higher precision and [having] a human moderator approve any bans is one way to avoid incorrectly blocking innocent users, a better response may instead involve giving antisocial users a chance to redeem themselves.
Readers, what do you think? Are trolls redeemable? Please tell us your thoughts in the comments section below.