#WebSci 20 – Paper Session: Hate Speech and Propaganda by Ashton Kingdon

Posted on behalf of Ashton Kingdon

Paper 1: DeepHate: Hate Speech Detection via Multi-Faceted Text Representations
Rui Cao, Roy Ka-Wei Lee and Tuan-Anh Hoang

This paper acknowledges that, whilst there are many traditional machine learning and deep learning methods for automatically detecting hate speech on social media fora, many of these methods consider only a single type of textual feature and consequently neglect richer textual information that could be utilised to improve detection. Tech giants have made a substantial effort to combat the spread of hate speech on their platforms, providing clear policies on hateful conduct, implementing mechanisms for users to report hate speech, and employing content moderators to actively detect it. However, the researchers recognise that such approaches are labour intensive and time consuming, and thus neither scalable nor sustainable in the long term. Indeed, the gravity of the issue and the limitations of manual approaches have motivated the search for automatic hate speech detection methods, and, in recent years, researchers from the data mining and natural language processing fields have proposed several strategies. Hence, the focus of this research is the proposal of novel deep learning technologies that utilise multi-faceted text representations for automatic hate speech detection.

Paper 2: Russian trolls speaking Russian: Regional Twitter operations and MH17
Alexandr Vesselkov, Benjamin Finley and Jouko Vankka

This paper focuses on the role of social media in promoting media pluralism and the governmental manipulation of social media to spread propaganda and disinformation. In particular, focus is placed on Russia's alleged system of professional ‘trolls’ operating both domestically and internationally. In 2018, to improve transparency regarding these potential ‘trolls’, Twitter released longitudinal data on the accounts identified as Russian trolls and the tweets they were disseminating. Whilst the foreign-targeted English-language operations of these ‘trolls’ have received significant attention, research has yet to analyse their Russian-language activities targeted at domestic and regional audiences, despite the fact that half of the tweets released are in Russian. This paper addresses this gap by characterising the Russian-language operations of Russian trolls, utilising both descriptive and temporal analysis, and then zooming in to focus specifically on ‘troll’ operations relating to the crash of Malaysia Airlines Flight MH17, one of the deadliest incidents in the conflict in Ukraine. Key findings include: Russian-language trolls have run 164 hashtag campaigns; 29% of these were political statements praising Russia and Putin; 26% criticised Ukraine; 9% criticised the United States and the Obama Administration. The ‘troll’ accounts were also found to be actively re-sharing information in periodic temporal patterns, suggesting the trolls were using automation tools for posting.

Paper 3: Still Out There: Modelling and Identifying Russian Troll Accounts on Twitter
Jane Im, Eshwar Chandrasekharan, Jackson Sargent, Paige Lighthammer, Taylor Denby, Ankit Bhargava, Libby Hemphill, David Jurgens and Eric Gilbert

This paper also focused on Russia, more specifically on the Internet Research Agency and its alleged attempt to interfere with the 2016 US election by running fake accounts on Twitter, referred to as Russian trolls. The paper develops machine learning models that predict whether or not a specific Twitter account is a Russian troll. By utilising a dataset of 170,000 control accounts, the researchers demonstrate that it is possible to use this model to find active accounts on Twitter that are likely still acting on behalf of the Russian state. The use of behavioural and linguistic features provides evidence that it is possible to distinguish between a troll and a non-troll with a precision of 78.5%.
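For readers less familiar with the metric, precision is the share of accounts flagged as trolls that really are trolls. The short sketch below shows the calculation; the counts are hypothetical, chosen only so that the result matches the reported 78.5%, and are not taken from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of accounts flagged as trolls that are genuinely trolls."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no accounts were flagged, so precision is undefined")
    return true_positives / flagged


# Hypothetical counts for illustration only (not the paper's data):
# of 1,000 flagged accounts, 785 are real trolls and 215 are not.
troll_precision = precision(true_positives=785, false_positives=215)
print(f"precision = {troll_precision:.1%}")
```

Precision says nothing about trolls the model missed; that is measured separately by recall.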

Paper 4: Measuring and Characterizing Hate Speech on News Websites
Savvas Zannettou, Mai Elsherief, Elizabeth Belding, Shirin Nilizadeh and Gianluca Stringhini

This paper addressed the type of content that was attracting hateful discourse and the possible effects of social networks on the commenting activity on different news articles. The researchers performed a large-scale quantitative analysis of 125 million comments posted on 412,000 articles over the course of 19 months. The research sought to address the following research questions:

  1. Is hateful commenting activity correlated with real world events?
  2. Can we find important differences between the users that are posting on news sites according to their partisanship?
  3. Can we find linguistic differences in articles that attract substantial numbers of hateful comments when compared to articles that do not?
  4. Do news articles attract more hate comments after they are posted on other web communities like 4chan and Reddit?

The content was analysed using temporal, user-based, and linguistic analysis to uncover what elements attract hateful comments on news articles. The research found that there were statistically significant increases in hateful commenting activity around real-world divisive events like the Unite the Right rally in Charlottesville and political events such as the 2016 American presidential election.

Paper 5: ACT: Automatic Fake News Classification Through Self-Attention
Nujud Aloshban

The focus of this research was automatic fact-checking, for which different approaches have been proposed based upon traditional machine learning methods using hand-crafted lexical features. The paper specifically identifies the weakness of analysing text claims without considering the facts that are not explicitly given but can be derived from them. Hence, the researchers propose an end-to-end framework titled Automatic Fake News Classification Through Self-Attention (ACT), which exploits different supporting articles for each claim, mimicking manual fact-checking processes. The model computes the credibility of a claim by aggregating over the predictions generated for every pair of claim and retrieved article.
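The aggregation step can be pictured with a minimal sketch. It assumes each claim-article pair has already been scored with a credibility value between 0 and 1, and it combines the scores with a simple mean; the function name, the threshold and the use of a mean are all illustrative assumptions, and the aggregation used in ACT itself may differ.

```python
from statistics import mean


def aggregate_claim_credibility(pair_scores, threshold=0.5):
    """Combine per-article scores for one claim into a single verdict.

    pair_scores: one credibility score in [0, 1] per (claim, article) pair.
    threshold:   hypothetical cut-off above which the claim counts as credible.
    """
    if not pair_scores:
        raise ValueError("at least one claim-article pair is required")
    credibility = mean(pair_scores)  # assumption: a plain average
    label = "credible" if credibility >= threshold else "not credible"
    return credibility, label


# Made-up scores for one claim checked against three retrieved articles.
scores = [0.9, 0.7, 0.2]
credibility, label = aggregate_claim_credibility(scores)
print(f"{credibility:.2f} -> {label}")
```

Averaging over several articles means that a single misleading source cannot decide the verdict on its own, which is the intuition behind mimicking a human fact-checker who consults multiple reports.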

 
