When someone remarks “It goes without saying,” he’s about to explain, possibly in detail, something the listener is supposed already to know. Similarly, a speaker “who needs no introduction” is often about to get one. When you’re negotiating with someone who says “It’s not about the money,” the sticking point is probably just that: the money. And that classic dump-your-boyfriend line, “It’s not you, it’s me,” means “you’re not smart/good looking/caring/interesting enough for me.”
Say What You Mean
Why don’t we say what we mean? Actually, I’d say we usually do. It’s just that our meaning is other than the literal reading. This indirection presents a huge challenge for automated natural-language processing (NLP). Meaning is complicated by sarcasm, irony, idiom, and other bits of sentiment, subjectivity, and ambiguity that make the language-understanding challenge yet more difficult. Ambiguity is easily illustrated. Is “cheap” positive, negative, neutral, or… situational? I’m happy to find a cheap airfare because I’m getting the same service as others at a lower price than most, but a cheap appliance may suffer from quality issues. Context matters.
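To make the “cheap” example concrete, here is a minimal sketch of a context-sensitive polarity lookup. The lexicon entries and the function name `polarity_in_context` are my own illustrative assumptions, not taken from any particular product; a real system would learn such associations from data rather than hard-code them.

```python
# Hypothetical, hand-built lexicon: the polarity of a word depends on
# the noun it modifies. The entries are illustrative assumptions only.
CONTEXT_POLARITY = {
    "cheap": {
        "airfare": "positive",    # same service at a lower price
        "appliance": "negative",  # suggests quality issues
    },
}

def polarity_in_context(word, context_noun, default="neutral"):
    """Return the polarity of `word` given the noun it modifies.

    Falls back to `default` when the pairing is not in the lexicon.
    """
    by_context = CONTEXT_POLARITY.get(word.lower(), {})
    return by_context.get(context_noun.lower(), default)

print(polarity_in_context("cheap", "airfare"))
print(polarity_in_context("cheap", "appliance"))
print(polarity_in_context("cheap", "hat"))
```

The same word yields different answers depending on what it describes, and the fallback to “neutral” marks the cases where the lexicon simply has no context to go on.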
Clues would be great but aren’t easily available. 13 Little-Known Punctuation Marks We Should Be Using, listed by Adrienne Crezo in the mental_floss blog, would help, but frankly, new conventions that are widely and quickly accepted (think of emoticons and Twitter hashtags) are rare. Fortunately, people are good at “reading between the lines,” at discerning and decoding obscured meaning in everyday speech and writing. Machines have to work smart to match human capabilities. Idiom is a contextual indicator, and so are co-occurrence of words and terms and a variety of other algorithmically discernible clues. Software, in the form of computational linguistics, is catching on.
Teaching Computers to Understand
Can we teach computers to reliably and accurately understand human communications? My career for the last ten years has been premised on the belief we can, via text analytics, sentiment analysis, and statistically powered search for patterns. I’m encouraged by innovations in machine learning, particularly supervised methods that build predictive models from training sets. Leading-edge solution providers such as Converseon, a “social-media agency,” have invested significant resources in building training sets for a variety of business domains (e.g., hospitality, consumer electronics, and financial services), in their case, via crowd-sourcing, which helps extend human analyses to Web scale. (Disclosure: I’ve done a small amount of paid consulting for Converseon, which is a sponsor of my next Sentiment Analysis Symposium, October 30 in San Francisco.) Active learning is gaining traction: machine learning extended with human correction of machine-classified results. It’s a recently announced addition, for instance, to SAS Text Analytics capabilities.
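One common way to put a human in the loop is uncertainty sampling: the machine asks a person to label the example it is least sure about, then retrains. The sketch below assumes that variant and uses a deliberately crude word-count scorer. The function names and the toy data are hypothetical, and this is not a description of how Converseon or SAS implement the idea.

```python
from collections import Counter

def train(labeled):
    """Count how often each word appears in 'pos' vs. 'neg' examples."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts

def score(model, text):
    """Above zero leans positive, below zero leans negative, zero is unsure."""
    words = text.lower().split()
    return sum(model["pos"][w] - model["neg"][w] for w in words)

def most_uncertain(model, unlabeled):
    """Uncertainty sampling: pick the text whose score is closest to zero."""
    return min(unlabeled, key=lambda t: abs(score(model, t)))

def active_learning_round(labeled, unlabeled, ask_human):
    """Train, route the least certain text to a human, and grow the training set."""
    model = train(labeled)
    pick = most_uncertain(model, unlabeled)
    new_labeled = labeled + [(pick, ask_human(pick))]
    new_unlabeled = [t for t in unlabeled if t != pick]
    return new_labeled, new_unlabeled, pick

# Toy data (assumed for illustration).
labeled = [("great service", "pos"), ("terrible service", "neg")]
unlabeled = ["great great hotel", "cheap room", "terrible terrible food"]

# Stand-in for a human annotator; a real loop would prompt a person.
def ask_human(text):
    return "neg"

labeled, unlabeled, asked = active_learning_round(labeled, unlabeled, ask_human)
print(asked)
```

In this toy run the model has never seen “cheap” or “room,” so that is the text it sends to the human, which is exactly the kind of ambiguous case where human judgment adds the most.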
I learn a lot from experts, so I recruited social psychologist and social-business pioneer Kate Niederhoffer as one of the keynotes at the October 30 sentiment symposium. Kate’s talk is titled “Sentiment Driven Behaviors; Sentiment Driven Decisions.” The idea is that effective descriptive and predictive methods, drawing from social and enterprise-feedback sources, applied in business settings, will meld language analysis with psychological profiling and data drawn from the variety of available sources. My friend Tom Anderson of Anderson Analytics has been applying a “triangulation” strategy for years, matching text-extracted insights to profiles and numerically coded survey response data.
Many researchers and practitioners share a fascination with language (and with data), with text (and speech) that communicates not only facts and opinions but also clues regarding motivations. Consider the example of Yoda-speak, as described by Andrew McAfee in When Did Yoda Start Writing CEO Speeches?:
“Instead of saying ‘Our costs are rising,’ [business folk will] say ‘Things are not great right now, from a cost perspective.’ What’s going on here, I suspect, is that they know the overall sentiment they want to convey. In this case, it’s not a good one; costs are rising. So on the fly they construct a sentence that leads with the sentiment (things are not great) and backloads with the reason why (from a cost perspective).”
I don’t completely agree with McAfee’s interpretation of his example. Many readers would see “our costs are rising” as neutral rather than negative. Expressions, and not just words, may be ambiguous. A quick poll I ran back in June showed this effect: What is the sentiment of “I purchased a Honda yesterday”? Twelve of 22 responses, 55%, rated that statement positive while the rest saw it as neutral. You need context, or information that’s more explicitly conveyed, to fully understand meaning. The Yoda-speak phrasing of McAfee’s example doesn’t provide context, but it does reinforce the negativity of “not great right now.”
From Sentiment to Intent
Context is king, as is interpretation in light of business goals. Here arises the matter of intent or, more formally, intentional analysis. We wish to get at nuance, to distinguish feelings from hopes from plans. From what you say, we want to know what you plan to do. This is a topic I blogged about last winter, in an article on sports and political odds-making via the SentiBet system. Start-up Aiaioo Labs scores intent signals that include purchase, wish, inquire, direct, complain, sell, compare, and quit. As-a-service provider OpenAmplify similarly seeks to get at intent signals within broader sentiment expressions.
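To show what tagging intent signals might look like at its very simplest, here is a sketch that matches cue phrases against a handful of the intent categories named above. The cue phrases and the `detect_intents` function are invented for illustration. I have no inside knowledge of how Aiaioo Labs or OpenAmplify actually score intent, and production systems almost certainly rely on trained models rather than keyword lists.

```python
# Hypothetical cue phrases for a few intent categories. These are
# illustrative assumptions, not any vendor's actual rules.
INTENT_CUES = {
    "purchase": ["i will buy", "going to buy", "ordering"],
    "wish": ["i wish", "i hope"],
    "inquire": ["how do i", "does anyone know"],
    "complain": ["disappointed", "never works"],
    "compare": ["better than", "versus"],
    "quit": ["cancel my", "switching from"],
}

def detect_intents(text):
    """Return the sorted list of intents whose cue phrases occur in `text`."""
    lowered = text.lower()
    return sorted(
        intent
        for intent, cues in INTENT_CUES.items()
        if any(cue in lowered for cue in cues)
    )

print(detect_intents("I'm disappointed, so I'm going to cancel my plan."))
print(detect_intents("I purchased a Honda yesterday"))
```

The first sentence carries both a complaint and a signal to quit, while the Honda statement matches nothing: it reports a past fact, not a plan. That gap between what was said and what will be done is what intent analysis tries to close.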
Maybe you share my fascination with language. Maybe you’re simply interested in business use, in better business decision making. It doesn’t quite go without saying that there is immense potential business value in sentiment and intent signals in social, online, and enterprise data. That said, I hope I have made clear that software solutions (and crowd-sourcing!) are poised to help you discover that business value, to automate understanding of meaning, sentiment, and intent.