Simon Schweighofer and David Garcia will present their findings at the “Emotion and culture” session of the ISRE 2019 Conference, which takes place from July 11 to 13, 2019, in Amsterdam.
Abstract
Text-based sentiment detection methods presuppose that emotional expressions are distributed uniformly within a text. We tested this assumption by analyzing more than 17 million public online messages from five social media platforms in English, German, and Chinese. We applied established sentiment lexica, including the English, German, and Chinese versions of the Linguistic Inquiry and Word Count dictionaries, and Affective Norms lexica of word valence and arousal in English and German.
Our results show clear positional patterns of emotional expression at the level of both sentences and whole messages. First, shorter messages and sentences contain more intense and more frequent emotional terms. Second, emotional terms are clustered at the beginning and at the end of sentences and messages. Third, negative terms are preferentially found at the end of sentences and messages. These patterns are reflected in both the frequency and the intensity of emotional terms, are stable across multiple languages and corpora, and are observable across several orders of magnitude of message length. This suggests that these patterns might be cultural universals.
We offer an explanation of these patterns in terms of the well-known serial position effect: if we assume that speakers prioritize the transmission of emotions to their audience, it makes sense that they communicate emotions at the beginning and end of messages to take advantage of the greater retention rate in short-term memory. Our results show how the computerized analysis of large-scale datasets can reveal patterns of emotional expression that are stable across contexts and cultures.
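
To make the idea of a positional analysis concrete, the following is a minimal sketch of how the frequency of emotional terms could be measured by relative position within a message. It is not the authors' pipeline: the function name `positional_emotion_rate`, the toy lexicon, the regular-expression tokenizer (which only handles lowercase Latin letters, so Chinese text would need a proper word segmenter), and the choice of equal-width position bins are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Set


def positional_emotion_rate(
    messages: Iterable[str], lexicon: Set[str], n_bins: int = 5
) -> List[float]:
    """Share of tokens found in `lexicon`, per relative-position bin.

    Bin 0 covers the start of each message, bin n_bins - 1 the end.
    Hypothetical helper for illustration, not the method used in the study.
    """
    hits = [0] * n_bins
    totals = [0] * n_bins
    for message in messages:
        # Naive tokenizer (assumption): runs of ASCII letters and apostrophes.
        tokens = re.findall(r"[a-z']+", message.lower())
        n = len(tokens)
        for i, token in enumerate(tokens):
            # Map token index i in [0, n) to a bin index in [0, n_bins).
            b = i * n_bins // n
            totals[b] += 1
            if token in lexicon:
                hits[b] += 1
    # Bins that received no tokens get a rate of 0.0.
    return [h / t if t else 0.0 for h, t in zip(hits, totals)]


# Toy data (invented for this example, not taken from the study's corpora).
toy_lexicon = {"great", "awful", "love"}
toy_messages = ["great day but the ending was awful", "love it"]
rates = positional_emotion_rate(toy_messages, toy_lexicon, n_bins=2)
```

Under these assumptions, a uniform distribution of emotional terms would yield roughly equal rates in every bin, whereas clustering at the beginning and end would show up as higher rates in the first and last bins. An intensity-based variant could sum per-word valence or arousal scores instead of counting lexicon matches.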