Using online texts as sources of data in research

Ever since I came across the blog post from Helen Kara, arguing that “researchers should make as much use of secondary data as possible before we even think about gathering any primary data”, I have been looking for new opportunities to use pre-existing online data to complement, or even replace, fresh data collection efforts.

It’s in this spirit that, today, I am sharing with you the paper “Uniting the Tribes: Using Text for Marketing Insight”, authored by Jonah Berger, Ashlee Humphreys, Stephan Ludwig, Wendy W. Moe, Oded Netzer and David A. Schweidel. The paper was published in the Journal of Marketing, and you can find an open access version here. It provides an overview of possible uses of textual data for research in marketing, as well as some methodological steps for doing so.

Berger and colleagues outline two main uses of texts: as a reflection of the producer of the text, or as a shaper of attitudes and behaviour in the receiver of the text. So, for instance, a social media feed can be studied with a focus on the producer of the text, to look at questions such as: Who is writing? Why are they writing? What do they feel? Why do they feel that? What have they done? What do they want? And so on. An example of this approach is the examination of tweets by Blose, Umar, Squicciarini and Rajtmajer (2020), which sought to understand issues of concern to US residents during the Covid-19 pandemic. Alternatively, a social media feed may be studied with a focus on the receivers of the text, to answer questions such as: Which topics get the most attention? How does language shape reaction to the message? How engaged is the audience? What do the readers feel about the topic or the messenger? An example of this approach is the analysis by Wang, Zhang, Li, McLeay and Gupta (2021) of the impact of companies’ online responses to the Covid-19 crisis on consumers’ sentiment towards the brands.

There are numerous sources that discuss where to find good secondary data, as well as how to collect, analyse and present it. For instance, there is this video by Helen Kara on Finding And Using Secondary Data In Research, and this video by Marco Tulio Pires on free tools to collect, analyse and display (online) data. Admittedly, though, it is easy to fall into the habit of always using the same sources, and that is why I found the paper by Berger and colleagues so useful. Among other things, it includes the table below, which classifies texts in terms of producers and intended receivers, and provides various examples:

Image source

Have you come across any interesting sources or uses of data recently?

