Sentiment analysis: what is it?
Sentiment analysis aims to extract people’s opinions about products, brands, issues, or concepts from texts such as reviews, news articles or social media posts. It can be done in several ways. One way is to analyse texts manually, but this is usually laborious and doesn’t scale well.
Since the early 2000s, researchers and industry users have developed various approaches for automating sentiment analysis. Automated sentiment analysis can run on a large scale and provide insights for market and social research. The goal is to measure sentiments about a given research object accurately, quickly and at low cost.
Sentiment analysis: what are the challenges?
Sentiment analysis is a feature of many analytics apps, e.g., social media monitoring. But many solutions on the market oversimplify and return answers that are too vague or unreliable to be used as a basis for business decisions.
The closer you look, the uglier it gets. We came to this conclusion based on our experience with various software libraries for R and Python, cloud platform APIs and models embedded in social media monitoring tools. We tested their validity for automated sentiment analysis in small experimental projects and on the enormous datasets that our specialised social media monitoring tool Cosmention (www.cosmention.com) generates.
This series of four articles aims to make you aware of the difficult challenges a valid sentiment analysis must meet. It’s vital to look closely at what is being sold as sentiment analysis and to demand proof of its validity.
Hence, this series is a guide for market researchers and market research clients on what to watch out for when choosing sentiment analysis software or a provider.
We won’t discuss the general requirements for data analysis and social media monitoring, such as defining a clear analysis scope and removing irrelevant, duplicate and spam records from datasets. However, we’d like to point out that many projects already struggle with these requirements.
The 10 challenges of sentiment analysis: General aspects
1. Unclear target: what is a sentiment expression referring to?
To understand the meaning of a sentiment expression (i.e., what sentiment a series of words is expressing), you need to know its target. The sentence “I like the product” isn’t useful if we don’t know what “the product” is.
When analysing reviews, the target product name (the name of the product whose sentiment we want to understand) is part of the review’s metadata, which also includes the review author’s name, the website where the review was published and the time of publishing.
But social media posts or other texts don’t have metadata that clearly identifies a target. A text can be about many distinct products, brands, people and issues. And a sentiment expressed in it could refer to any of these.
Not having a clear target means that the expressions’ meaning is unclear. The fact that a keyword (the word whose sentiment we want to measure) appears in a text doesn’t necessarily mean that the sentiment expressions in that text refer to it. They may instead refer to a different target in the same text. Simply reading a few posts often shows that many sentiment expressions don’t refer to a keyword that happens to be elsewhere in the text. Such errors are more likely in long texts covering many topics.
Hence, clearly identifying the connection between a sentiment expression and its target is essential for any sentiment analysis. The solution is to make the target explicit. This can be done by combining named entity recognition (NER) and dependency parsing.
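The keyword problem described above can be illustrated with a minimal Python sketch. It is not taken from any particular tool: the tiny lexicon, the product name "Lumina" and both scoring functions are hypothetical and exist only to show how crediting every sentiment word in a post to a keyword can flip the result.

```python
import re

# Hypothetical toy lexicon; real sentiment lexicons are far larger.
LEXICON = {"love": 1, "great": 1, "rude": -1, "terrible": -1}


def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def document_level_score(text, keyword):
    """Naive approach: credit every sentiment word in the text to the keyword."""
    tokens = words(text)
    if keyword.lower() not in tokens:
        return None
    return sum(LEXICON.get(t, 0) for t in tokens)


def sentence_level_score(text, keyword):
    """Slightly stricter: only count sentences that mention the keyword."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    relevant = [s for s in sentences if keyword.lower() in words(s)]
    if not relevant:
        return None
    return sum(LEXICON.get(t, 0) for s in relevant for t in words(s))


# "Lumina" is an invented product name used only for this example.
post = ("I love the Lumina lipstick. "
        "The courier was rude and the packaging was terrible.")

print(document_level_score(post, "Lumina"))  # negative overall
print(sentence_level_score(post, "Lumina"))  # positive for the product sentence
```

Restricting the count to sentences that mention the keyword helps in this example, but it is still only a co-occurrence heuristic: a single sentence can contain several targets, so it does not replace an explicit link between expression and target.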
NER is a machine learning method that highlights entities like products or brands in a text. It works best when the entities have unique names, like “MAC Cosmetics”, but it is not well suited to identifying more general issues in discussion. The next step is dependency parsing, which analyses the grammar of a sentence and reveals how sentiment expressions relate to the identified entities.
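How the two steps fit together can be sketched in a few lines of Python. To keep the example self-contained, the parse and the entity span below are annotated by hand and merely stand in for the output of a real NER model and dependency parser; the dependency label names, the toy lexicon and the three linking rules are illustrative assumptions, not a complete method.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head (the root points to itself)
    dep: str   # dependency label (illustrative names)


# Hand-annotated parse of "I love MAC Cosmetics but the courier was rude".
# This is an assumption standing in for a real parser's output.
TOKENS = [
    Token("I", 1, "nsubj"),
    Token("love", 1, "ROOT"),
    Token("MAC", 3, "compound"),
    Token("Cosmetics", 1, "dobj"),
    Token("but", 1, "cc"),
    Token("the", 6, "det"),
    Token("courier", 7, "nsubj"),
    Token("was", 1, "conj"),
    Token("rude", 7, "acomp"),
]
# Entity spans as (start, end_exclusive, label), standing in for NER output.
ENTITIES = [(2, 4, "ORG")]
LEXICON = {"love": 1, "rude": -1}  # hypothetical toy lexicon


def children(tokens, i, dep):
    """Indices of tokens attached to token i with the given label."""
    return [j for j, t in enumerate(tokens)
            if t.head == i and t.dep == dep and j != i]


def find_target(tokens, i):
    """Follow simple grammatical rules from a sentiment word to its target."""
    tok = tokens[i]
    if tok.dep == "acomp":  # "X was rude" -> subject of the linking verb
        subjects = children(tokens, tok.head, "nsubj")
        return subjects[0] if subjects else None
    if tok.dep == "amod":   # "rude courier" -> the modified noun
        return tok.head
    objects = children(tokens, i, "dobj")  # "love X" -> direct object
    return objects[0] if objects else None


def entity_at(entities, tokens, i):
    """Return the entity text covering token i, or None."""
    for start, end, _label in entities:
        if start <= i < end:
            return " ".join(t.text for t in tokens[start:end])
    return None


def link_sentiment(tokens, entities, lexicon):
    """Pair each sentiment word with the entity it grammatically refers to."""
    results = []
    for i, tok in enumerate(tokens):
        score = lexicon.get(tok.text.lower())
        if score is None:
            continue
        target = find_target(tokens, i)
        name = entity_at(entities, tokens, target) if target is not None else None
        results.append((tok.text, score, name))
    return results


for word, score, entity in link_sentiment(TOKENS, ENTITIES, LEXICON):
    print(word, score, entity)
```

In this sketch, "love" is linked to the entity "MAC Cosmetics" through its direct object, while "rude" attaches to "courier", which is not a recognised entity, so the negative expression is not counted against the brand. A production system would need many more rules or a trained model to cover the range of real sentence structures.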