Posted on behalf of Robert Thorburn
The sixth paper session at the 2020 Web Science conference covered a broad scope under the collective title “Text, Topics and Trends”. Presented by academics from Europe to Japan, the papers in this session tackled topics ranging from an exploration of linked subcultures to a classification of information exchanged during disasters. Although these areas might at first seem somewhat unrelated, there is significant overlap in what is being studied and how such studies are carried out. This is because language use within a set group or period is the central area of investigation, and techniques such as Natural Language Processing are commonly used. Consequently, when the first presenter described their work as “trying to understand how communities differ from the broader hegemony”, that sentiment could be seen echoed through all the papers.
There were, however, also significant differences between some of the groups whose language was analysed in the various papers. For instance, cultural subgroups were brought together by their pre-existing beliefs, often deterministic in nature, while other groups were brought together by circumstances which were often short-lived. This was reflected in the nature of the language used, with groups coalescing around pre-existing beliefs or shared topics of interest having a high tendency towards creating or employing neologisms. A site like The Urban Dictionary is a prime example of this, since it not only hosts such new words but also presents lexicological entries for them. Of course, a fair few of these entries are fake or humorous. There is certainly also a case to be made that such activities could, over time, influence linguistic shifts.
The level of involvement of such sites may be incidental though, with individual influencers having much more sway with their followers than a platform in general. Accordingly, agency in such shifts rests with leadership figures, while the platforms concerned are “simply” points of congregation for those interested in, or affected by, the topic at hand. These points of congregation do, however, make things notably easier for researchers, since there is a clear point for data collection. This allowed data from set time periods to be collected after the fact and processed as needed, which reduces the technological overhead for such a study. As a result, even modestly funded projects could cope with their computational needs. The researchers did, however, generally agree that more resources would be needed to conduct such research at scale, which is significant since larger projects are needed to drive the research forward.