Big Data Analytics: Facts and Feelings

Old-school BI and analytics are about crunching numbers, and only about numbers. The old school produces indicator values, abstracted away from context, nonetheless applied to justify context-sensitive decisions. Models tend to be opaque, with little explanatory power. Root causes are teased out only via a laborious slice-and-dice search for patterns that correlate, but do not definitively link, measurements and business outcomes. And for all our talk about “closed loop” decision-making, the old school over-relies on dashboards and visualizations that illustrate issues, without explanation or any action recommendations.

While old-school approaches still dominate, contrast their limitations with the possibilities afforded by modern Big Data analytics. With Big Data, we embrace the variety of database-captured facts, streamed machine/sensor/device data, and media-sourced feelings. Big Data velocity is about data still in context, still situationally relevant, analyzed in-flight to enable us to react appropriately to conditions as they unfold. And volume is especially evident when you consider the amount of media (text, speech, images, and video) being produced and consumed online and on social platforms.

As we unify (even if still too infrequently) the elements of this complex mix of numbers and media-derived information, we gain a more complete, timely, and sensitive view of our customers, prospects, and markets. We create descriptive and predictive models that encompass behaviors mined from clickstreams and geographic tracking linked to actions. Models can extend to psychological and demographic profiles. They weigh transactions and interactions.

In this rich data mix, the ability to discern and exploit feelings – mood, attitudes, opinions, and emotion – has become more critical than ever. It is through automated sentiment analysis that this ability scales. Until recent years, you had to have trained analysts read each comment, review, e-mail message, and article you sought to understand for business purposes. Nowadays, trained software will do the job, handling the variety, velocity, and volume of Big Data sources with the analytical uniformity that only automation can deliver.

Automation here covers both analytical algorithms and human analyses, the latter via crowd-sourcing where the machine parcels out tasks to raters and performs verification tasks to ensure reliability. But most potential users – in market research and marketing, financial services and capital markets, customer service and customer experience, clinical medicine, and the spectrum of research disciplines – do associate the term “sentiment analysis” with machine analyses.
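To make the crowd-sourcing verification step concrete, here is a minimal sketch of one way a machine might reconcile several raters' labels for a single item: take the majority label and accept it only if enough raters agree. The function name `aggregate_labels` and the 0.6 agreement threshold are illustrative assumptions, not a description of any particular crowd-sourcing platform.

```python
from collections import Counter


def aggregate_labels(ratings, min_agreement=0.6):
    """Combine several raters' labels for one item (hypothetical helper).

    Returns (label, agreement). The label is None when the share of raters
    backing the most common label falls below min_agreement, which signals
    that the item should go back out for more ratings.
    """
    if not ratings:
        raise ValueError("ratings must contain at least one label")
    counts = Counter(ratings)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(ratings)
    if agreement < min_agreement:
        return None, agreement
    return label, agreement


# Two of three raters agree, so the majority label is accepted.
accepted = aggregate_labels(["positive", "positive", "negative"])
# An even split falls below the threshold, so no label is returned.
rejected = aggregate_labels(["positive", "negative"])
```

Real systems often go further, for example by weighting raters by their track record or by seeding tasks with items whose answers are already known.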

Some methods apply machine learning, typically but not always supervised (that is, starting from a training set), sometimes utilizing active learning to incorporate human feedback that can boost model accuracy. Other methods use linguistic artifacts and techniques – lexicons, linguistic rules, taxonomies, word nets, and deep parsing – sometimes generated by analysts, sometimes enhanced via machine learning.
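As a rough illustration of the lexicon-and-rules family of methods, the sketch below scores a piece of text by summing word polarities from a tiny lexicon and flipping the polarity of a word that directly follows a negator. The lexicon entries, their weights, and the single negation rule are toy assumptions chosen for illustration; production lexicons and rule sets are far larger and handle much more linguistic nuance.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight (assumed values).
LEXICON = {
    "great": 1.0,
    "love": 1.0,
    "good": 0.5,
    "slow": -0.5,
    "bad": -1.0,
    "terrible": -1.0,
}

# Words that flip the polarity of the token immediately after them.
NEGATORS = {"not", "never", "no"}


def score_sentiment(text):
    """Return a signed score: positive above zero, negative below zero."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    negate = False
    for token in tokens:
        polarity = LEXICON.get(token, 0.0)
        total += -polarity if negate else polarity
        # The negation rule only reaches the very next token.
        negate = token in NEGATORS
    return total


print(score_sentiment("The service was great"))
print(score_sentiment("The checkout was not good"))
```

A supervised machine-learning approach would instead learn such weights from a labeled training set rather than relying on a hand-built lexicon.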

The aim is to transform text and other media into data sources, to extract sentiment along with the entities (names of people, places, companies, etc.) and the topics, themes, and concepts that the extracted mood, opinions, and emotion apply to. These forms of sentiment explain the why behind the what that we derive from raw numbers. They illuminate the root causes behind transactional and behavioral patterns. The goal is to add feelings to facts as fuel for a new generation of Big Data Analytics.
