This post is an extract from the chapter “Approaches to emotion and sentiment analysis” that I contributed to The SAGE Handbook of Digital and Social Media Marketing, which was edited by Annmarie Hanlon and Tracy L. Tuten.
There are, broadly speaking, three approaches to collecting emotion and sentiment data online: 1) asking customers about emotionally charged episodes that they experienced in the past, via online interviews or questionnaires; 2) placing customers in emotionally charged situations via online experiments; or 3) observing customers when they experience an emotionally charged incident via online conversations. Each approach has its merits and limitations, as summarised in Table 1, and discussed next.
Table 1. Overview of methods for collection of sentiment data, online
| Method | Advantages | Disadvantages |
| --- | --- | --- |
| Online interviews and questionnaires | Can direct participants to specific aspects of interest to the researcher. Detailed understanding of causes (interviews), or the relationship between variables (survey). Cost effective. Suitable for participants that place a high value on their time, and for extended projects. | Quality of insight relies on participants’ ability to verbalise emotions. Participants may be unable to explain the causes and/or consequences of the emotion. Participants may be unwilling to revisit emotionally charged memories. Loss of non-verbal cues. |
| Online experiments | Suitable to study emotions. Simultaneous manipulation and measurement of variables. Observations in real time. Ability to test the relationship between emotions and behaviours, either directly (participant experiments) or indirectly (scenario-based experiments). | Unsuitable to study sentiment. Ethical concerns regarding mood induction and manipulation. Technical challenges of isolating independent and dependent effects. Limited ability to produce realistic experiments. Cost. |
| Online conversations | Abundance of data already available (but certain online communities may require permission to use data). Ability to study consumers at scale, in real time, and in a natural setting. Low cost. | Biased samples. Unsuitable for sensitive topics. Inability to confirm participants’ profile. Difficult to obtain informed consent. Social desirability bias. Limited control over content. Complexity of data handling and analysis. |
1. Online interviews and questionnaires
A common method used by marketing scholars and practitioners to collect emotion and sentiment data is to conduct online interviews or questionnaires, asking research participants to recall a situation that was emotionally charged, and then asking them questions about the phenomenon of interest to the researcher. In the former, the researcher obtains verbal accounts about the phenomena and behaviours of interest; whereas, in the latter, the researcher generates quantified, or quantifiable, data.
Interviews offer the researcher a detailed understanding of the observed behaviour, and its causes. For instance, Barnes and colleagues (2021) used online interviews to understand experiences of customer delight during the Covid-19 crisis. In turn, questionnaires can assist researchers in identifying relationships between the variables studied, or in measuring the effect of different factors on the phenomenon of interest. For instance, Wollebaek et al (2019) conducted an online survey to study how emotions such as anger and fear shape how people seek political information and debate political issues online.
In addition to the specific advantages of interviews and questionnaires, doing this type of data collection online can be very cost effective. It eliminates the need for travel, which saves time and money. It can also broaden the geographic reach of data collection, and makes it feasible to collect data across different time zones. The ability to conduct online interviews and questionnaires became particularly valuable during the Covid-19 pandemic, allowing researchers to continue collecting data despite the movement restrictions imposed in many countries to contain the spread of the virus. Rose et al (2014) recommend the use of technology-mediated interviews and questionnaires for collecting data from participants that place a high value on their time, such as managers, because it offers more flexibility than face-to-face interviewing; as well as for extended projects, or those aiming to capture the participants’ views in real time. For high volumes of data collection, it is also possible to use artificial intelligence. For example, chatbots with Natural Language Processing capabilities are being introduced to collect valuable sentiment information, in the context of customer support and job interviews.
However, studying emotions and sentiment via online interviews and questionnaires has some limitations. Interviews and questionnaires rely on research participants being able to clearly express their emotions and sentiments, and to accurately identify the causes and/or consequences of those feelings. For instance, some research participants may not recognise that the use of social media impacts on their mental health, even though the former has been shown to impact the latter. Moreover, research participants may be unwilling to discuss emotionally charged memories, especially those that caused distress. Indeed, researchers may find it difficult to obtain ethical clearance from their institutions’ ethics boards to conduct interviews or surveys which may cause emotional distress in participants. More generally, conducting technology-mediated interviews and questionnaires may result in the loss of cues such as body language or mannerisms, which provide important information to complement the words – particularly in terms of possible discomfort about the topic being discussed, or the extent to which the research participant is being open about their emotions or sentiment.
2. Online experiments
In experiments we test research participants’ emotional responses to specific scenarios. Given the nature of experiments, this approach is not suitable to study sentiment. The main advantage of experiments is that they allow for the manipulation and measurement of the impact of separate independent variables on the phenomenon of interest. For instance, a researcher might manipulate the temperature of a retail outlet to measure its impact on purchase behaviour. Unlike interviews and surveys, experiments do not rely on the research participants’ ability or willingness to accurately recall or explain the emotion being studied. Experiments are commonly used in research, particularly in the fields of behavioural psychology and consumer behaviour.
Online experiments are, typically, done in one of two ways. One way of conducting online experiments is to place the research participants in situations where they are likely to experience a particular emotion. For instance, by manipulating the amount of emotional content displayed in Facebook’s News Feed, researchers tested the extent to which research participants’ emotions are influenced by observation of other Facebook users’ expression of emotions. This approach to online experiments has the advantage of directly testing the relationship between emotions on the one hand, and context or behaviour on the other. However, it presents some challenges in terms of the ability to manipulate moods. There are also ethical concerns around the elicitation of negative emotions, and the obtaining of informed consent.
The other common way of using experiments to study emotions is by placing research participants in the role of witnesses of an emotionally charged situation, for instance, by reading a description or by watching a video. Sands and colleagues used this method to test the effect of different types of service interaction on consumers’ satisfaction with the customer service provided, by asking research participants to read different scenarios where they tried to solve a hypothetical problem with their laptops. While this approach addresses some of the ethical concerns previously mentioned, there are serious questions about the extent to which the scenarios are realistic and salient to the participants; and the extent to which it is possible to induce the desired emotion simply by observing an exchange between third parties.
Manipulation of individual variables requires oversimplification of the situation. For instance, an experiment examining the impact of different types of service interaction on customer satisfaction can only consider one type of agent (e.g., chatbot vs human) and one type of script (e.g., educational vs entertainment) at a time; whereas actual customer interactions may have a blend of all these characteristics, plus many others.
3. Online conversations
The third and final approach to collecting emotion and sentiment data is by observing online conversations. There is a vast number and range of digitised texts publicly available online, such as chat threads, social media posts, discussion forums, product reviews, and other consumer generated content, which can be collected via data scraping software, or through simple coding applications. There are also private sources such as e-mails, apps, digital diaries and other forms of digital text. Not only are these resources already in digital format, but a significant share of them contain expressions of emotion, or expressions of sentiment about products and brands, thus providing rich insight into the role of consumers’ experiences. We can also observe how emotionally charged content spreads, both in time and across users, thus obtaining insight about new behaviours related to sentiment elicitation or amplification, such as “whispering” or “flaming”. For instance, Liu (2019) found that messages posted by social bots are not only more likely to be negative than those posted by other users, but they are also significantly more likely to go viral.
These platforms are so popular, and have become such a part of consumers’ lives, that they are now seen as a valuable mechanism for studying consumer behaviour at large scale, in a natural setting, and in a non-intrusive manner, thus obviating many of the limitations of online experiments. It has also been argued that when customers voice their emotions online, it is because the experience has been highly impactful. Moreover, because many of these data are generated in real time, they also obviate the technical and ethical limitations of online interviews and surveys. Thus, it is no surprise that many marketing managers, as well as scholars, routinely use digital conversations as sources of insight about consumer sentiment, as well as consumer emotion.
To collect such data, researchers need to be familiar with the usage practices and netiquette of the digital platform from which the content is collected, such as relevant abbreviations, conventions and hashtags, so that they can understand the context of the conversations. In some cases, researchers will also have to become members of the online community of interest themselves, and engage in activities such as creating a profile or an avatar, and posting content. Researchers also need to be aware that collecting digital texts from online communities (e.g., discussion boards in the parenting website, Netmums), and from within organisations (e.g., intranets) is likely to require special permission.
Despite the many advantages, and the popularity, of using online conversations for studying consumer emotions and sentiment, there are some limitations, too. It is important to remember that the profile of digital platform users differs from that of the general population. For instance, research by Hecht and Stephens (2014) suggests that, in most social media platforms, there is a skew towards urban users and urban perspectives. Hence, digital platforms are best suited for projects that investigate the behaviour, perceptions or attitudes of those platform users, as in the case of Liu’s (2019) study of virality on Twitter. They are not suited to study the general population or heterogeneous social groups. Moreover, they are not suited to study topics that require introspection, or which are of a sensitive nature, be it personal or commercial.
Moreover, there is less certainty about the identity, or even the profile, of the research participants than in interviews, surveys or experiments. For instance, it may not be possible to confirm whether online reviews were written by genuine customers; or indeed by a person at all. Social bots – i.e., software agents that mimic human communication – are widely present and active in many social media and other digital platforms. In such cases, McKenna and colleagues (2017) advise developing research questions where the identity of participants is not important.
It is also challenging, or even impossible, to obtain informed consent. Many researchers believe that data that are in the public domain, such as social media, can be used without obtaining informed consent. However, in many discussion forums, intranets and other online communities, users may expect some level of privacy and control over their conversations, especially where the subject of analysis is contentious.
In the case of public platforms, their open nature means that the content shared is open to scrutiny. Therefore, the users of those platforms may be inclined to say things or behave in ways that protect their ego, or that are socially acceptable, rather than in line with what they truly feel or think. For instance, Ke and colleagues (2020) found that users of review websites are more likely to post something after a friend has posted a review than after a stranger has done the same.
Furthermore, contrary to what happens with the other data collection approaches discussed in this chapter, researchers are likely to have limited control over the topic of discussion. The online conversations may not focus specifically on what the researcher is looking for, even if they are broadly related to the topic. For instance, customer reviews may focus on the product, whereas the researcher is more interested in the packaging.
Finally, the large volume and variety of unstructured data collected creates complexity in terms of handling and analysis. For instance, a product review or a tweet may have both text and image, or even a hyperlink to an external website, each requiring a different approach. The text may be analysed in terms of syntax or semantics; the images in terms of symbology or style; and the hyperlink in terms of site linked to, or impact on further sharing activity.
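As a rough illustration of this last point, the sketch below separates a single post into its text, image links and other hyperlinks, so that each part can be passed to a different type of analysis. It is a minimal example rather than a recommended pipeline: the names `split_post` and `PostComponents`, the list of image file extensions, and the assumption that images appear as links ending in those extensions are all hypothetical choices made for illustration. Real platforms usually deliver images and links as separate fields.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Assumption: links in a post start with http:// or https://
URL_PATTERN = re.compile(r"https?://\S+")
# Assumption: a link is treated as an image if its path ends in one of these
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


@dataclass
class PostComponents:
    """One post, split into the parts that need different analyses."""
    text: str
    image_links: list[str] = field(default_factory=list)
    other_links: list[str] = field(default_factory=list)


def split_post(raw: str) -> PostComponents:
    """Separate the plain text, image links and other hyperlinks in a post."""
    # Find every link and drop punctuation that trails it in the sentence
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(raw)]
    images = [
        u for u in urls
        if urlparse(u).path.lower().endswith(IMAGE_EXTENSIONS)
    ]
    others = [u for u in urls if u not in images]
    # Remove the links from the text and tidy up the remaining whitespace
    text = " ".join(URL_PATTERN.sub(" ", raw).split())
    return PostComponents(text=text, image_links=images, other_links=others)


post = split_post(
    "Love the new kettle! https://example.com/kettle.jpg "
    "More at https://example.com/review."
)
print(post.text)
print(post.image_links)
print(post.other_links)
```

Once the parts are separated, the text can go to a syntactic or semantic analysis, the images to a visual analysis, and the remaining links to an analysis of the sites linked to or of subsequent sharing.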
