Analysing emotion and sentiment data

This post is an extract from the chapter “Approaches to emotion and sentiment analysis” that I contributed to The SAGE Handbook of Digital and Social Media Marketing, which was edited by Annmarie Hanlon and Tracy L. Tuten.

Once data have been collected using the relevant data collection tool, researchers can analyse those inputs looking for ratings or other metrics of emotion or sentiment, in the case of surveys and experiments. Alternatively, they may look at terms, phrases or expressions that reflect sentiment, as in the case of interview transcripts or online conversations. In the latter case, and given that the first purpose of emotion and sentiment analysis is to identify the valence of the underlying emotion, the data analysis process usually involves the conversion of qualitative data into quantitative data. For instance, terms associated with a positive emotion would be converted into “+1”, neutral ones into “0”, and negative ones into “-1”. This is followed by measuring the frequency with which each type of emotion is displayed.
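The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the word list `VALENCE` and the sample comments are invented for the example, and tokenisation is reduced to splitting on spaces.

```python
from collections import Counter

# Hypothetical mini-lexicon (illustration only): term -> valence code.
VALENCE = {"love": 1, "great": 1, "okay": 0, "slow": -1, "awful": -1}

def code_valence(text):
    """Return the valence codes (+1, 0, -1) of the lexicon terms found in a text."""
    tokens = text.lower().split()
    return [VALENCE[t] for t in tokens if t in VALENCE]

def valence_frequencies(texts):
    """Count how often each valence code appears across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(code_valence(text))
    return counts

# Invented sample comments.
comments = ["love the coffee great staff", "service was slow", "it was okay"]
frequencies = valence_frequencies(comments)
print(frequencies)
```

The resulting counter holds the frequency of positive, neutral and negative terms, which is the quantity that the paragraph above refers to.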

If our goal were to also identify strength of emotion in order to distinguish between different types of positive or negative emotion (e.g., sadness vs anger), then we would go a step further by attributing weights to particular attributes. For instance, the negative term “furious” would be weighted more heavily than the term “annoyed”, to convey that it refers to a high level of arousal.
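One simple way to picture such weighting is sketched below. The weights in `WEIGHTS` are assumptions made up for this example (the sign stands for valence, the magnitude for arousal); real studies would take them from a validated lexicon or derive them empirically.

```python
# Hypothetical weights (illustration only): sign = valence, magnitude = arousal.
WEIGHTS = {"annoyed": -1.0, "furious": -3.0, "pleased": 1.0, "thrilled": 3.0}

def weighted_score(text):
    """Sum the weights of the known terms in a text; unknown terms count as 0."""
    tokens = text.lower().split()
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)

mild = weighted_score("I am annoyed")
strong = weighted_score("I am furious")
print(mild, strong)
```

Both sentences are negative, but the second receives a score further from zero, reflecting the higher level of arousal.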

This section provides an overview of the key elements of the process of analysing digital data related to emotion and sentiment. First, we highlight the importance of pre-processing the data. Then, we distinguish between top down and bottom up approaches, before considering the most popular methods in each. 

            1. Pre-processing

Before emotion and sentiment data can be analysed, they need to be pre-processed – namely, cleaned and organised. If the main input is text (e.g., collected through online interviews or online conversations), then the input data will need to be organised into sentences, with the various forms of a word (e.g., like, likes, liked…) linked so that they can be processed as one[1]. Likewise, it is a good idea to convert negated expressions into their opposite, to ensure that we capture the right emotion or sentiment. For instance, “not good” would be converted to “bad”, so that the phrase in question is accurately recorded as having negative valence. In addition, we need to remove common but irrelevant words that work as pauses or as connecting words, such as “er” and “hum” or “a” and “the”; otherwise, we will end up with very high counts of terms that do not provide any insight regarding either the prevalent emotion, or sentiment, and its causes. Finally, we may want to classify key words according to some relevant attribute (e.g., noun, adjective, technical term, …), so that, later, we can filter the data according to those attributes.
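The text pre-processing steps above can be illustrated with a short Python sketch. Everything in it is a simplifying assumption: the `LEMMAS`, `NEGATIONS` and `STOP_WORDS` tables are tiny, hand-made stand-ins for the linguistic resources that a real project would use, and sentences are split on punctuation only.

```python
import re

# Hypothetical, hand-made lookup tables (illustration only).
LEMMAS = {"likes": "like", "liked": "like", "liking": "like"}
NEGATIONS = {("not", "good"): "bad", ("not", "bad"): "good"}
STOP_WORDS = {"er", "hum", "a", "the"}

def preprocess(text):
    """Split a text into sentences, then link word forms, resolve negations and drop filler words."""
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    cleaned = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence)
        # Link the various forms of a word to a single form.
        tokens = [LEMMAS.get(t, t) for t in tokens]
        # Replace negated pairs such as "not good" with their opposite.
        merged, i = [], 0
        while i < len(tokens):
            pair = tuple(tokens[i:i + 2])
            if pair in NEGATIONS:
                merged.append(NEGATIONS[pair])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        # Remove pause and connecting words.
        cleaned.append([t for t in merged if t not in STOP_WORDS])
    return cleaned

result = preprocess("Er, the coffee was not good. She liked the cake!")
print(result)
```

The output is a list of sentences, each a list of cleaned terms, ready for counting or classification.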

If the data are in quantitative format (e.g., collected through online questionnaires or experiments), then we will need to look out for missing values (e.g., parts of the questionnaire that were not completed). If the values missing refer to a variable that is not part of the model (for instance, if location is not important), then we can simply ignore those answers. However, if there are missing inputs for important variables, then the missing data will have to be handled[2]. In the case of experiments, it is also customary to conduct manipulation checks in order to assess whether the respondent is engaged with the experiment, and reading the questions carefully and fully. Manipulation checks should also be used to assess whether the manipulated variable is producing the intended effect. For instance, whether watching a scary film trailer induces anxiety as intended, or, instead, induces boredom. Finally, manipulation checks can be used to test for mediating effects, such as whether watching the scary movie produces a state of anxiety due to empathy with the main character as hypothesized by the researchers, or for some other reason.
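As a rough illustration of the first step, the sketch below separates responses that are complete on the model variables from those that are not, while ignoring gaps in variables outside the model. The variable names and the responses are invented, and setting incomplete responses aside is only one of several possible ways of handling missing data.

```python
def split_by_completeness(responses, model_variables):
    """Separate responses with all model variables present from those with gaps."""
    complete, incomplete = [], []
    for response in responses:
        missing = [v for v in model_variables if response.get(v) is None]
        if missing:
            incomplete.append(response)
        else:
            complete.append(response)
    return complete, incomplete

# Invented questionnaire data; None marks an unanswered question.
responses = [
    {"anxiety": 4, "enjoyment": 2, "location": None},
    {"anxiety": None, "enjoyment": 5, "location": "Leeds"},
    {"anxiety": 3, "enjoyment": 3, "location": "York"},
]
# "location" is not part of the model, so its missing value is ignored.
complete, incomplete = split_by_completeness(responses, ["anxiety", "enjoyment"])
print(len(complete), len(incomplete))
```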

It is also possible that the data will be in visual format. For instance, social media posts often include photos, gifs, videos, emojis and other non-text characters. As a first step in pre-processing these images, the researchers may want to classify the images in terms of type (e.g., photo vs video) to allow for similar types of images to be treated together. Another common step is to transform these images – e.g., resizing or rotating – in order to help with the visualisation of the image’s content. In the case of emojis, it may be necessary to adjust for visual distortions induced by the operating system, or the social media platform where the image was collected. A study by Miller and colleagues found that classification of emojis as representing a positive, neutral or negative emotion could vary by as much as 25% depending on whether the image had been collected from an Apple iPhone or a Google Nexus phone, thus showing the need to consider such technical aspects when analysing emojis to study the expression of emotions and sentiments.

In summary, pre-processing is a crucial – and, often, time consuming – step in the analysis of emotion or sentiment data, which can significantly impact the quality of the insight that can be generated.

            2. Approaches to emotion and sentiment analysis

When conducting emotion or sentiment analysis, we can follow three approaches: a top down approach, a bottom up approach, or a combination of the two.

            The top down approach starts with a list of terms that represent the constructs of interest, such as ‘happy’ for a positive emotion vs. ‘angry’ for a negative one, or ‘like’ for a positive sentiment vs. ‘unfavourable’ for a negative one. This is usually in the form of a word dictionary, but could also include other symbols, such as emoticons or quantitative ratings. For instance, in their analysis of customer reviews, Villarroel Ordenes and colleagues (2017) used the star ratings from the product review platform, the dictionary of emotion words from the LIWC program, and the emoticons dictionary from Pcnet.

            The bottom up approach, in contrast, searches the dataset to identify the terms (words, symbols, etc.) that are commonly used to express emotions or sentiment, in that particular research context. This could be done, for instance, by using semantic network analysis tools like Leximancer; or by using machine learning. An example of this approach can be found in Tirunillai and Tellis (2014). These two researchers used unsupervised machine learning to identify how users of product review websites expressed product quality in different markets, from mobile phones to toys.

            While published research on emotion and sentiment analysis tends to favour one approach or the other, it is also possible to combine both approaches. One example of such a combination is provided in Nisar et al (2020)’s investigation of the impact of online word of mouth on firms’ reputation. The researchers used unsupervised machine learning (bottom up approach) to identify prevalent topics, followed by predefined dictionaries (top down approach) to classify the emerging topics into positive, negative or neutral sentiment.

            3. Popular emotion and sentiment analysis methods

This section provides a brief overview of popular emotion and sentiment analysis methods in management research. It is beyond the scope of this chapter to provide detailed technical descriptions of each method. However, examples are offered for each method, and these provide an opportunity for further reading. We start by discussing methods that follow a top-down approach, before considering those that follow a bottom-up approach.

            Top down approaches aim to operationalise constructs within the specific setting of the research. For instance, we may want to understand the drivers of customer sentiment towards coffee. Following a top down approach, we take the expressions of such sentiment as the starting point, and then look for how they vary with the variables of interest, such as product features, the social context, the time of the day, or other parameters. Table 2 lists popular emotion and sentiment analysis methods that follow a top-down approach.

A common top-down method is the “predefined dictionary”, whereby we use an existing word list or dictionary for specific emotions (e.g., the LIWC dictionary); and, then, essentially, search through the dataset (e.g., selection of product reviews, or interview transcripts) to count how many times the words on the list appear in the dataset. If the dataset consists of images, then, instead of words, we would use a dictionary of images commonly associated with each emotion (e.g., hearts, smiles or thumbs up, to denote positive emotions).
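In code, the predefined dictionary method amounts to a lookup and a count. The sketch below assumes a small, invented `EMOTION_DICTIONARY`; it does not reproduce the contents or the interface of LIWC or of any other published dictionary.

```python
import re
from collections import Counter

# Hypothetical stand-in for a published dictionary (illustration only).
EMOTION_DICTIONARY = {
    "joy": {"happy", "delighted", "love"},
    "anger": {"angry", "furious", "hate"},
}

def count_dictionary_terms(documents, dictionary):
    """Count how many times the terms listed for each emotion appear in the documents."""
    counts = Counter()
    for document in documents:
        for token in re.findall(r"[a-z']+", document.lower()):
            for category, terms in dictionary.items():
                if token in terms:
                    counts[category] += 1
    return counts

# Invented sample reviews.
reviews = ["I love it, so happy!", "I hate waiting. Furious."]
emotion_counts = count_dictionary_terms(reviews, EMOTION_DICTIONARY)
print(emotion_counts)
```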

Alternatively, we can use the “customised dictionary” method, whereby we adapt or create a word list that is specific to the context of our research. For instance, the expression “salty” could refer to the amount of salt in the food, in one context; but be used to describe someone’s bad mood, in another. So, before proceeding with data analysis, we would need to add “salty” to our dictionary, to signify the intended meaning. Alternatively, we could change the classification of words in a pre-existing dictionary, to reflect the fact that they convey a different type of emotion or sentiment in the setting where we are conducting the research. For instance, “sick” is likely to refer to a negative emotion for adults, but a positive emotion for teenagers and young adults. Hence, if we are studying texts from the first group, this term receives a negative score. However, if we are studying inputs from the second group, this term needs to receive a positive score. Once the dictionary has been customised, we can use it to search through the dataset, and to count the frequency of appearance of different terms in the dataset.
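The “sick” and “salty” examples can be sketched as follows. The base dictionary, the overrides and the scores are all assumptions chosen for illustration; the point is only that the customised dictionary is a copy of the base one with entries added or re-classified.

```python
# Hypothetical base dictionary (illustration only): term -> valence score.
BASE_DICTIONARY = {"great": 1, "sick": -1, "awful": -1}

def customise(base, overrides):
    """Return a copy of the base dictionary with context-specific entries added or re-scored."""
    custom = dict(base)
    custom.update(overrides)
    return custom

def score(text, dictionary):
    """Sum the valence scores of the dictionary terms found in a text."""
    return sum(dictionary.get(t, 0) for t in text.lower().split())

# Assumed customisation for texts by teenagers: "sick" becomes positive,
# and "salty" is added to describe a bad mood.
YOUTH_DICTIONARY = customise(BASE_DICTIONARY, {"sick": 1, "salty": -1})

adult_score = score("that trick was sick", BASE_DICTIONARY)
youth_score = score("that trick was sick", YOUTH_DICTIONARY)
print(adult_score, youth_score)
```

The same sentence receives a negative score with the base dictionary and a positive score with the customised one, while the base dictionary itself is left unchanged for use with other groups.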

A third common top down method in emotion or sentiment analysis is the “rule based” method. Here, we combine various dictionaries, and classify the dataset based on the co-occurrence of terms from the dictionaries according to a specific set of rules. For example, Packard and Berger (2017) used a rule that flagged reviews containing a first-person pronoun, as well as words starting with “recomm” (e.g., recommend), “endorse” or “suggest”. An example of the use of such a rule would be “I suggest this book for everyone”. Once the rules regarding terms’ co-occurrence have been defined, we search through the dataset, and count the frequency of rule occurrence in the dataset.
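A rule of this kind could be expressed as in the sketch below. It is a simplified reading of the rule as described above, not the authors’ actual implementation; the pronoun list `FIRST_PERSON` in particular is an assumption.

```python
import re

# Assumed list of first-person pronouns (illustration only).
FIRST_PERSON = {"i", "me", "my", "we", "our"}
# Word beginnings named in the rule described above.
ENDORSEMENT_STEMS = ("recomm", "endorse", "suggest")

def matches_rule(text):
    """Flag a text that contains a first-person pronoun and an endorsement word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    has_pronoun = any(t in FIRST_PERSON for t in tokens)
    has_endorsement = any(t.startswith(ENDORSEMENT_STEMS) for t in tokens)
    return has_pronoun and has_endorsement

# Invented sample reviews.
reviews = [
    "I suggest this book for everyone",
    "This book is recommended",
    "I liked this book",
]
rule_count = sum(matches_rule(r) for r in reviews)
print(rule_count)
```

Only the first review satisfies both conditions; the second lacks a first-person pronoun and the third lacks an endorsement word.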

Table 2. Popular top-down analysis methods for construct operationalisation

| Method | Description | Example |
| --- | --- | --- |
| Predefined dictionary | Use an existing word list or dictionary (e.g., LIWC), to search through the dataset. Count occurrence of selected terms (words, images…) in the dataset. | Villarroel Ordenes et al (2017) |
| Customised dictionary | Create new terms’ list, or adapt existing one to the research context (e.g., by changing classification of words). Then, use this dictionary to search through dataset, and to count occurrence of selected terms (words, images…) in the dataset. | Marinova et al (2018) |
| Rule based | Define rules for combining the dictionaries, and for searching the dataset. Then, count occurrence of the selected combinations in the dataset. | Packard and Berger (2017) |

Adapted from Villarroel Ordenes & Zhang (2019)

Bottom up approaches can be used to operationalise constructs, too, though the terms are derived from classification of the dataset, rather than imposed from an external structure. There are two popular methods for doing this (see table 3). One common method is “supervised machine learning”. Using this method, we start by classifying the dataset according to emotion or sentiment (for instance, based on star reviews, or quantitative rating). Subsequently, we search the dataset to identify the terms (words, symbols, images…) associated with those emotions. Afterwards, the rules developed through this process can be used to predict emotion or sentiment in another dataset, based on the presence of the features identified through the supervised learning process.
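To make the logic concrete, the sketch below trains a small naive Bayes classifier (one of many possible supervised algorithms, chosen here for brevity) on reviews labelled from their star ratings, and then uses it to predict the sentiment of unseen text. The reviews, the star threshold and the function names are all invented for the example.

```python
import math
from collections import Counter

def train(labelled_texts):
    """Count words per label and documents per label from (text, label) pairs."""
    word_counts, doc_counts = {}, Counter()
    for text, label in labelled_texts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def predict(text, word_counts, doc_counts):
    """Return the label with the highest naive Bayes score (add-one smoothing)."""
    vocabulary = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for token in text.lower().split():
            if token in vocabulary:  # words never seen in training are ignored
                score += math.log((counts[token] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented reviews with star ratings; 4-5 stars = positive, otherwise negative.
rated = [("great coffee lovely staff", 5), ("awful coffee rude staff", 1),
         ("lovely cake great service", 4), ("rude service awful cake", 2)]
labelled = [(text, "positive" if stars >= 4 else "negative") for text, stars in rated]
word_counts, doc_counts = train(labelled)
print(predict("great cake", word_counts, doc_counts))
print(predict("rude staff", word_counts, doc_counts))
```

The star ratings play the role of the known outputs; the word counts learned for each label are the “rules” that are then applied to new text.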

Another bottom-up method for operationalising constructs is “deep learning”. Here, researchers start with a dataset classified for relevant inputs. For instance, for restaurant reviews, the relevant inputs might be i) food, ii) service, and iii) price. Then, using an algorithm (for instance, convolutional neural network), the researchers extract the relevant classes among the dataset, and calculate the predictive power of each component[3]. Using this method, Zhang and Luo (2019) found that restaurant survival is more influenced by the availability of photos of the food, than the text reviews; and by the total volume of user-generated content than the valence of the sentiment expressed in that content.

Table 3. Popular bottom-up analysis methods for construct operationalisation

| Method | Description | Example |
| --- | --- | --- |
| Supervised machine learning | Classify the dataset according to the known outputs. Then, search the input dataset to identify the terms associated with those outputs, and the rules connecting inputs and outputs. | Shirdastian et al. (2019) |
| Deep learning | Classify the dataset for relevant inputs. Then, extract the relevant classes among the dataset, and calculate the predictive power of each component. | Zhang & Luo (2019) |

Adapted from Villarroel Ordenes & Zhang (2019)

In addition to operationalising constructs, bottom up approaches can be used to identify latent topics. There are three popular methods for doing this (see table 4). The first such method is “Semantic network” which treats each term (e.g., word) as a node, and, then, analyses the dataset to identify common co-occurrences between nodes (Figure 2). These co-occurrences are referred to as links. This way, researchers can identify which target features are mostly associated with positive vs negative sentiment or emotions. They can also calculate the strength of the emotion or sentiment, by analysing the frequency of co-occurrence. Using this approach to study vaccine hesitancy, Kang and colleagues (2017) found that negative sentiment towards vaccination tended to be associated with generic topics such as the vaccine industry or doctors; whereas positive sentiment tended to be associated with specific diseases such as measles, HPV or meningococcal disease. 

Figure 2. Illustration of semantic network approach

Adapted from (Danwoski et al, 2021)
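The idea of nodes and links can be sketched with a simple co-occurrence count. In this illustration, two terms are treated as linked whenever they appear in the same document; the documents are invented toy data, and dedicated tools use more refined definitions of co-occurrence.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_links(documents):
    """Count, for each pair of terms (nodes), the number of documents in which both appear (links)."""
    links = Counter()
    for tokens in documents:
        # Sorting gives each pair a single, alphabetical representation.
        for a, b in combinations(sorted(set(tokens)), 2):
            links[(a, b)] += 1
    return links

# Invented, already pre-processed documents.
documents = [
    ["vaccine", "industry", "distrust"],
    ["vaccine", "measles", "protect"],
    ["industry", "distrust"],
]
links = cooccurrence_links(documents)
print(links.most_common(3))
```

The link counts give the strength of association between terms: here, “distrust” and “industry” co-occur more often than any other pair.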

Another popular bottom-up method for latent topic elicitation is “unsupervised machine learning”. Here, researchers use training datasets without labels. It is then incumbent on the algorithm to identify the best way of grouping (or clustering) the data points (words, images…), and to develop rules for how the different terms may be related. This was the method used by Tirunillai and Tellis (2014) to analyse product reviews, leading them to conclude that customers used different ways to express sentiment depending on the type of product. In turn, Arefieva et al (2021) demonstrate the application of this method to the analysis of images.
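The clustering step can be illustrated with k-means, a widely used unsupervised algorithm (the studies cited above may rely on different techniques). The sketch assumes that each review has already been turned into a pair of numeric features; the features, the values and the choice of two clusters are all hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical features per review, e.g. the share of words about two different themes.
reviews_as_points = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
assignments, centroids = kmeans(reviews_as_points, k=2)
print(assignments)
```

No labels are supplied: the algorithm groups the first and third reviews together, and the second and fourth together, purely on the basis of their similarity.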

A third, and final, method for latent topic elicitation is “word embeddings”. This method aims to identify, categorise and quantify words that have the same meaning, or are used in similar ways, in the dataset – e.g., recommend, suggest, advise… It is these clusters of terms with semantic similarity (rather than individual terms) that are used to measure emotion or sentiment in the dataset. This methodology is illustrated in Tang et al’s (2014) analysis of 10 million tweets.
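Semantic similarity between embedded words is commonly measured with the cosine of the angle between their vectors. The sketch below uses made-up three-dimensional vectors in `EMBEDDINGS` purely for illustration; real embeddings are learned from large text collections and have many more dimensions, and the 0.9 threshold is an arbitrary assumption.

```python
import math

# Hypothetical 3-dimensional word vectors (illustration only).
EMBEDDINGS = {
    "recommend": (0.9, 0.1, 0.0),
    "suggest": (0.8, 0.2, 0.1),
    "advise": (0.85, 0.15, 0.05),
    "refund": (0.0, 0.2, 0.9),
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similar_terms(seed, embeddings, threshold=0.9):
    """Return the terms whose vectors are close to the seed term's vector."""
    seed_vector = embeddings[seed]
    return sorted(term for term, vector in embeddings.items()
                  if term != seed and cosine(seed_vector, vector) >= threshold)

cluster = similar_terms("recommend", EMBEDDINGS)
print(cluster)
```

With these toy vectors, “suggest” and “advise” fall in the same cluster as “recommend”, while “refund” does not; sentiment would then be measured at the level of the cluster rather than the individual word.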

Table 4. Popular bottom-up analysis methods for latent topic elicitation

| Method | Description | Example |
| --- | --- | --- |
| Semantic network | Identify key terms (the nodes) associated with a subject of interest, and the common co-occurrences between each term (the links). | Kang et al (2017) |
| Unsupervised machine learning | Identify clusters of different terms used to express sentiment, and develop rules for how the different terms relate to each other. | Arefieva et al (2021) |
| Word embeddings | Identify clusters of terms with semantic similarity within a given dataset. Then, use these groupings to measure sentiment. | Tang et al (2014) |

Adapted from Villarroel Ordenes & Zhang (2019)


[1] In linguistics, this is referred to as “Lemmatisation”.

[2] A detailed explanation of the methods for doing so falls outside of the scope of this chapter. Readers are directed to a research methods book, such as Rose et al. (2014).

[3] A step-by-step example of classification of images, using deep learning techniques, is provided in https://engineeringblog.yelp.com/2015/10/how-we-use-deep-learning-to-classify-business-photos-at-yelp.html
