[This article first appeared in the Clarabridge Bridgepoints newsletter, Q1 2009.]
Feelings influence purchasers. Our attitudes determine what we buy, whom we buy from, and what we’re willing to pay. They are central to perceptions of quality, style, and value and sometimes matter as much in decision-making as facts such as features and price. They also determine what we say about products and services to our friends, on social networks, and publicly.
Sentiment analysis is the application of text-analytics technology to detect, extract, and assess attitudinal information – feelings, opinions, perceptions – expressed in human communications. It’s not enough to study behavioral data. While sentiment may be inferred from transactions and customer interactions, the richest sources of attitudinal data are customer (and shopper and reviewer and expert) voices expressed in blogs, social media, and forum postings, in e-mail, in contact-center transcripts, and in survey responses. These sources record sentiment in unstructured text.
Text analytics transforms these voices into data for automated processing. The technology is far faster, more efficient, and more consistent than manual analysis, allowing adopters to systematically mine previously inaccessible information sources. The benefits can be significant in applications that include customer-experience management (CEM), brand and reputation management, product and service quality management, financial and market analysis, and the like. The common thread is the value sentiment analysis brings both to finding and solving individual problems and to understanding trends and larger market demands. In the words of Chris Jones, manager of analytics at financial-software publisher Intuit, “Once we understand what motivates our customers, we can make improvements.” How do Intuit analysts react, according to Jones? “‘We’ve never gotten access to this data before. This is incredible; it’s awesome.’”
Sentiment Analysis by Example
Sentiment analysis may seem straightforward, but a few examples show how difficult the problem can get. Start with a snippet of a hotel review, copied from an on-line review site, emphasis added:
“From the moment I checked in to the moment I checked out, service was absolutely amazing. All of the staff were very friendly and accommodating – a far cry from the service I’ve received at other national chains in the city.”
A favorable review. We see that from a few positive words. A happy guest!
Note that the polarity of sentiment in this excerpt applies at both the review level and the topic level. Two particular aspects of the guest experience are rated, service quality and staff attitude. Software should be capable of discerning particular topics or concepts and the sentiment about them individually.
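To make topic-level scoring concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `ASPECT_TERMS` and `POLARITY` word lists are hypothetical toy lexicons, and the function simply credits each sentence's keyword score to every aspect named in that sentence.

```python
import re

# Hypothetical toy lexicons; a real system would use far richer resources.
ASPECT_TERMS = {"service": "service quality", "staff": "staff attitude"}
POLARITY = {"amazing": 1, "friendly": 1, "accommodating": 1,
            "rude": -1, "dirty": -1}


def aspect_sentiment(text):
    """Score each sentence, then credit that score to the aspects it names."""
    scores = {}
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = re.findall(r"[a-z']+", sentence.lower())
        score = sum(POLARITY.get(w, 0) for w in words)
        for w in words:
            if w in ASPECT_TERMS:
                aspect = ASPECT_TERMS[w]
                scores[aspect] = scores.get(aspect, 0) + score
    return scores


review = ("From the moment I checked in to the moment I checked out, "
          "service was absolutely amazing. All of the staff were very "
          "friendly and accommodating - a far cry from the service I've "
          "received at other national chains in the city.")
print(aspect_sentiment(review))
# {'service quality': 3, 'staff attitude': 2}
```

Both aspects come out positive, but the sketch also shows a weakness of sentence-level attribution: the second sentence mentions "service" in passing, so the praise for staff is credited to service quality as well.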
Note also the modifiers “absolutely” and “very,” which indicate strong sentiment intensity, as do the repetition and use of capitals and exclamation points in the title of another review:
“Loud. Very LOUD!!”
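The intensity cues named here (modifiers, repetition, capitals, exclamation points) are surface features that software can count directly. The following Python sketch does only that; the `INTENSIFIERS` list is a hypothetical stand-in for a real lexicon, and how to weigh the counts is left open.

```python
import re

# Hypothetical intensifier list, for illustration only.
INTENSIFIERS = {"absolutely", "very", "extremely", "really"}


def intensity_cues(text):
    """Count simple surface cues that suggest strong sentiment."""
    words = re.findall(r"[A-Za-z']+", text)
    lowered = [w.lower() for w in words]
    return {
        "intensifiers": sum(w in INTENSIFIERS for w in lowered),
        # Words of two or more letters written entirely in capitals.
        "all_caps": sum(len(w) > 1 and w.isupper() for w in words),
        "exclamations": text.count("!"),
        # Word occurrences beyond the first, ignoring case.
        "repeated_words": len(lowered) - len(set(lowered)),
    }


print(intensity_cues("Loud. Very LOUD!!"))
# {'intensifiers': 1, 'all_caps': 1, 'exclamations': 2, 'repeated_words': 1}
```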
We see that style and punctuation matter. Another example shows that picking up sentiment polarity via keywords is not enough. Consider:
These are essentially the same words as in the first example but the tone is different. The topic is staff friendliness and the opinion is mildly (intensity) negative (polarity). But human communications can be a lot more complicated than this. What if that snippet had said:
“Hotel staff could not have been more friendly.”
In this case, the modifier “not” doesn’t negate the “more friendly” sentiment. “Not” is part of a construction that intensifies the sentiment!
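A small Python sketch shows why this construction trips up simple negation handling. It assumes a toy `POSITIVE` lexicon, a hypothetical five-word negation window, and a single hand-written pattern for the "could not have been more X" idiom; real systems would need many such patterns or a learned model.

```python
import re

POSITIVE = {"friendly", "helpful", "clean"}  # hypothetical toy lexicon
NEGATION_WINDOW = 5  # assumed: how many preceding words a "not" can affect

# "could not have been more X" intensifies X rather than negating it.
INTENSIFYING_IDIOM = re.compile(r"\bcould not have been more (\w+)")


def polarity(text, use_idioms=True):
    """Return a crude polarity score; 2 marks a strongly positive idiom."""
    lowered = text.lower()
    if use_idioms:
        match = INTENSIFYING_IDIOM.search(lowered)
        if match and match.group(1) in POSITIVE:
            return 2
    words = re.findall(r"[a-z']+", lowered)
    score = 0
    for i, word in enumerate(words):
        if word in POSITIVE:
            # Naive rule: a nearby preceding "not" flips the polarity.
            window = words[max(0, i - NEGATION_WINDOW):i]
            score += -1 if "not" in window else 1
    return score


snippet = "Hotel staff could not have been more friendly."
print(polarity(snippet, use_idioms=False))  # -1: naive negation gets it wrong
print(polarity(snippet))                    # 2: the idiom check catches it
print(polarity("Hotel staff were not friendly."))  # -1: ordinary negation
```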
Now consider another snippet:
“Fortunately, this time when we returned from dinner, all problems were solved… or so we thought.”
“Fortunately” and “all problems were solved” read as good signs, that is, until we reach the negating clause, “or so we thought.” The idiomatic “X is Y… Not!” pattern is, even here, a relatively simple indicator of irony. It takes a fair amount of sophistication (for a computer program) to understand that “smooth as silk” is good while “clear as mud” is not, and that a phrase such as “a far cry from the service I’ve received at other national chains in the city,” from the first example above, can be positive or negative depending on the context of its use.
This last snippet came from a long review that related the guest’s hotel experience and interactions. In such cases, techniques such as discourse analysis, a look at narrative threads in a sequence of comments, come into play.
Boosting Analysis Ability
These examples hint at the difficulty of discerning and analyzing sentiment in text. But often, accompanying numerical data or external clues can provide hints that will boost sentiment-analysis ability.
For instance, a typical product or service review Web site will provide numerical ratings – one star for poor and five stars for excellent, for example – and it’s a good bet that attitudes captured in the review text will reflect those ratings. The travel-review site tripadvisor.com provides six detailed 1-to-5 ratings for hotels: Value, Rooms, Location, Cleanliness, Check in/front desk, and Service. Similarly, surveys will frequently intermix questions that take rating responses with questions that take free-text responses. Review-text or survey-verbatim sentiment will tend to correlate with the numerical ratings. When surveys include pick-lists, verbatims will likely expand on the topics selected from the lists.
Some review sites feature Pro and Con or Liked and Disliked subsections. These similarly constitute a sentiment-polarity gimme for the individual features or aspects listed.
In these cases, high-level document structure provides a head-start in breaking down sentiment by topic and topics by sentiment polarity.
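As a rough sketch of how such structural clues might be turned into data, the Python below maps star ratings to coarse polarity labels and Pro/Con lists to (feature, polarity) pairs. The record layout, the field names `ratings`, `pros`, and `cons`, and the sample values are all made up for illustration; treating the scale midpoint as neutral is also an assumption.

```python
def label_from_rating(stars, scale_max=5):
    """Map a 1-to-scale_max rating to a coarse polarity label."""
    midpoint = (1 + scale_max) / 2
    if stars > midpoint:
        return "positive"
    if stars < midpoint:
        return "negative"
    return "neutral"


def label_pro_con(review):
    """Turn Pro/Con subsections into (feature, polarity) pairs."""
    pairs = [(item, "positive") for item in review.get("pros", [])]
    pairs += [(item, "negative") for item in review.get("cons", [])]
    return pairs


# Hypothetical review record; the values are invented for the example.
review = {
    "ratings": {"Value": 4, "Rooms": 2, "Service": 5, "Location": 3},
    "pros": ["friendly staff"],
    "cons": ["street noise"],
}

print({topic: label_from_rating(s) for topic, s in review["ratings"].items()})
# {'Value': 'positive', 'Rooms': 'negative', 'Service': 'positive', 'Location': 'neutral'}
print(label_pro_con(review))
# [('friendly staff', 'positive'), ('street noise', 'negative')]
```

Labels obtained this way could serve as a starting point for checking, or for training, text-based sentiment scoring on the accompanying review text.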
Monitoring vs. Understanding
As a parting thought, consider the difference between monitoring and understanding. Quite a variety of media monitoring services have emerged in the last year or two, many with little beyond the ability to count brand or company mentions and track trends, perhaps with basic sentiment scoring capabilities. Those scores might simply aggregate positive and negative indicators at the document level rather than at a detailed feature (topic, concept, or named entity) level. Some tools will apply statistical techniques to produce topic or theme clusters for graphical display, leaving a heavy interpretive burden on the analyst.
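The gap between document-level and feature-level scoring is easy to see in a toy example. The Python sketch below assumes an upstream step has already produced (feature, polarity) pairs for one review; the pairs shown are invented for illustration.

```python
from collections import defaultdict

# Hypothetical output of an upstream extraction step for a single review.
mentions = [("service", 1), ("service", 1), ("noise", -1), ("noise", -1)]


def document_score(mentions):
    """Aggregate all indicators into one score for the whole document."""
    return sum(polarity for _, polarity in mentions)


def feature_scores(mentions):
    """Aggregate indicators separately for each feature."""
    totals = defaultdict(int)
    for feature, polarity in mentions:
        totals[feature] += polarity
    return dict(totals)


print(document_score(mentions))  # 0
print(feature_scores(mentions))  # {'service': 2, 'noise': -2}
```

The document-level score of zero suggests a neutral review, while the feature-level breakdown shows a guest who liked the service and disliked the noise.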
Monitoring and shallow analysis of this sort are helpful in early-warning and fast-reaction situations, but they suffer significant limitations. They do not provide deep understanding of the root causes behind issues or trends. Sometimes, they seem little better than divination, than reading tea leaves. And by pulling only from published, on-line sources, monitoring provides only a limited view of critical topics. Think how much more useful an on-line review or a survey would be if the views expressed could be matched with data from actual transactions or interactions – an individual’s hotel stay, purchase, service call, or warranty claim – or with demographic or market-segmentation data that classifies the reviewer or respondent by home-community median income or number of times eating out in the average month.
For this level of understanding, it helps to couple and integrate sentiment analysis with other forms of BI and analytics. The combination of methods provides lift for analyses: results that are more accurate and more usable than results delivered by any one method alone.
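One simple form of this coupling is a join between sentiment scores and transaction records, followed by a breakdown by segment. The Python sketch below is a minimal illustration under stated assumptions: the `stay_id` key, the `segment` field, and all of the sample records are hypothetical, and a real deployment would more likely do this inside a BI or database tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are illustrative assumptions.
review_scores = [
    {"stay_id": 101, "sentiment": 2},
    {"stay_id": 102, "sentiment": -1},
    {"stay_id": 103, "sentiment": -2},
]
stays = {
    101: {"segment": "business", "nights": 2},
    102: {"segment": "leisure", "nights": 5},
    103: {"segment": "leisure", "nights": 3},
}


def sentiment_by_segment(review_scores, stays):
    """Join review sentiment to stay records and average it per segment."""
    grouped = defaultdict(list)
    for review in review_scores:
        stay = stays.get(review["stay_id"])
        if stay is None:
            continue  # no matching transaction; skip rather than guess
        grouped[stay["segment"]].append(review["sentiment"])
    return {segment: mean(values) for segment, values in grouped.items()}


print(sentiment_by_segment(review_scores, stays))
# {'business': 2, 'leisure': -1.5}
```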
Sentiment analysis can yield deep insights into customer and market views. It applies text analytics to allow analysts to see beyond transactions to the root causes behind customer perceptions and satisfaction, to the motivations behind purchases and other behaviors. It is an important tool for understanding customer and market needs, however and wherever they are expressed. It can benefit any organization concerned with reputation, customer satisfaction, product and service quality, responsiveness, and the like, issues that are central to competitiveness and profitability.