Text Analytics 2015

Read on for my annual year-past/look-ahead report, on text analytics technology and market developments, 2015 edition.

A refreshed definition: Text analytics refers to technology and business processes that apply algorithmic approaches to process and extract information from text and generate insights.

Text analytics hasn't been around quite this long: Summary account of silver for the governor written in Sumerian Cuneiform on a clay tablet. From Shuruppak, Iraq, circa 2500 BCE. British Museum, London.


Text analytics relies on natural-language processing, based on statistical, linguistic, and/or machine-learning methods. Yet no single technical approach prevails, and no single solution provider dominates the market. There are myriad open source options and an array of via-API cloud services. Academic and industry research is thriving, realized in systems that scale up to go big, fast, and deep, and that scale out, going wide — distributed and accommodating source diversity.

Last year, 2014, was a good year for text analytics, and 2015 will be even better, with particular focus on streams, graphs, and cognitive computing and on an already extensive array of applications. As in 2014, these applications will not always be recognized as “text analytics” (or as natural language processing, for that matter). My 2014 observation remains true, “text-analytics technology is increasingly delivered embedded in applications and solutions, for customer experience, market research, investigative analysis, social listening, and many, many other business needs. These solutions do not bear the text-analytics label.”

So in this article:

  • Market Perceptions
  • Tech Developments
  • Investments
  • Community

Market Perceptions

I released a market study, Text Analytics: User Perspectives on Solutions and Providers.


From Alta Plana’s 2014 text analytics market study: Do you currently extract or analyze…

The report is available for free download (registration required) via altaplana.com/TA2014, thanks to its sponsors. The key finding is More (and synonyms): Progressively more organizations are running analyses on more information sources, increasingly moving beyond the major European and Asian languages, seeking to extract more, diverse types of information, applying insights in new ways and for new purposes.

Text analytics user satisfaction continues to lag

Technical and solution innovation is constant and robust, yet user satisfaction continues to lag, particularly related to accuracy and ease of use. One more chart from my study, at right…

Text Technology Developments

In the text technology world, cloud-based as-a-service (API) offerings remain big, and deep learning is, even more than last year, the alluring must-adopt method. Deep learning is alluring because it has proven effective at discerning features at multiple levels in both natural language and other forms of “unstructured” content, images in particular. I touch on these topics in my March 5 article, IBM Watson, AlchemyAPI, and a World of Cognitive Computing, covering IBM’s acquisition (terms undisclosed) of a small but leading-edge cloud/API text and image analysis provider.

I don’t have much to say right now on the cognitive computing topic. Really the term is agglomerative: It represents an assemblage of methods and tools. (As a writer, I live for the opportunity to use words such as “agglomerative” and “assemblage.” Enjoy.) Otherwise, I’ll just observe that beyond IBM, the only significant text-analytics vendor that has embraced the term is Digital Reasoning. Still — Judith Hurwitz and associates have an interesting looking book just out on the topic, Cognitive Computing and Big Data Analytics, although I haven’t read it. Also I’ve recruited analyst and consultant Sue Feldman of Synthexis to present a cognitive-computing workshop at the 2015 Sentiment Analysis Symposium in July.

Let’s not swoon over unsupervised machine learning and discount tried-and-true methods — language rules, taxonomies, lexical and semantic networks, word stats, and supervised (trained) and non-hierarchical learning methods (e.g., for topic discovery) — in assessing market movements. I do see market evidence that software that over-relies on language engineering (rules and language resources) can be hard to maintain and adapt to new domains and information sources and languages, and difficult to keep current with rapidly emerging slang, topics, and trends. The response is two-fold:

  • The situation remains that a strong majority of needs are met without reliance on as-yet-exotic methods.
  • Hybrid approaches — ensemble methods — rule, and I mean hybrids that include humans in the initial and ongoing training processes, via supervised and active learning for generation and extension of linguistic assets as well as (other) classification models.

I wrote up above that 2015 would feature a particular focus on streams and graphs. The graphs part, I’ve been saying for a while. I believe I’ve been right for a while too, including when I not-so-famously wrote “2010 is the Year of the Graph.” Fact is, graph data structures naturally model the syntax and semantics of language and, in the form of taxonomies, facilitate classification (see my eContext-sponsored paper, Text Classification Advantage Via Taxonomy). They provide for conveniently-queryable knowledge management, whether delivered via products such as Ontotext’s GraphDB or platform-captured, for instance in the Facebook Open Graph.
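To make the taxonomy point concrete, here is a minimal sketch of classification against a taxonomy stored as a graph. Everything in it is invented for illustration: the category names, the term-to-node assignments, and the simple count-the-matches scoring are assumptions, not the method of any product named above.

```python
# Toy taxonomy as a graph: each node maps to its parent (None marks the root).
# Node names are hypothetical.
PARENT = {
    "Products": None,
    "Electronics": "Products",
    "Phones": "Electronics",
    "Laptops": "Electronics",
    "Apparel": "Products",
    "Shoes": "Apparel",
}

# Terms attached to taxonomy nodes (also invented for the example).
TERMS = {
    "smartphone": "Phones",
    "android": "Phones",
    "notebook": "Laptops",
    "sneakers": "Shoes",
}


def path_to_root(node):
    """Walk parent links and return the node's full path, root first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return list(reversed(path))


def classify(text):
    """Return the taxonomy path of the node with the most matching terms, or None."""
    counts = {}
    for token in text.lower().split():
        node = TERMS.get(token.strip(".,!?"))
        if node is not None:
            counts[node] = counts.get(node, 0) + 1
    if not counts:
        return None
    best = max(counts, key=counts.get)
    return path_to_root(best)


print(classify("My new Android smartphone beats my old notebook."))
# ['Products', 'Electronics', 'Phones']
```

Because the match resolves to a node in a graph, the classification comes with its full ancestry for free, so the same document can be reported at the "Phones" level or rolled up to "Electronics."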

I did poll a few industry contacts, asking their thoughts on the state of the market and prospects for the year ahead. Ontotext CTO Marin Dimitrov was one of them. His take agrees with mine, regarding “a more prominent role for knowledge graphs.” His own company will “continue delivering solutions based on our traditional approach of merging structured and unstructured data analytics, using graph databases, and utilizing open knowledge graphs for text analytics.”

Marin also called out “stronger support for multi-lingual analytics, with support for 3+ languages being the de-facto standard across the industry.” Marin’s company is based in Bulgaria, and he observed, “In the European Union in particular, the European Commission (EC) has been strongly pushing a multi-lingual digital market agenda for several years already, and support for multiple languages (especially ‘under-represented’ European languages) is nowadays a mandatory requirement for any kind of EC research funding in the area of content analytics.”

José Carlos González, CEO of Madrid-based text analytics provider Daedalus, commented on the “‘breadth vs depth’ dilemma. The challenge of developing, marketing and selling vertical solutions for specific industries has led some companies to focus on niche markets quite successfully.” Regarding one, functional (rather than industry-vertical) piece of the market, González believes “Voice of the Customer analytics — and in general all of the movement around customer experience — will continue being the most important driver for the text analytics market.”

One of Marin Dimitrov’s predictions was more emerging text analytics as-a-service providers, with a clearer differentiation between the different offers. Along these lines, Shahbaz Anwar, CEO of analytics provider PolyVista, sees the linking of software and professional services as a differentiator. Anwar says, “We’re seeing demand for text analytics solutions — bundling business expertise with technology — delivered as a service, so that’s where PolyVista has been focusing its energy.”

Further —

Streams are kind of exciting. Analysis of “data-in-flight” has been around for years, for structured data, formerly primarily known as part of complex event processing (CEP) and applied in fields such as telecom and financial markets. Check out Julian Hyde’s 2010 Data In Flight. For streaming (and non-streaming) text, I would call out Apache Storm and Spark. For Storm, I’ll point you to a technical-implementation study posted by Hortonworks, Natural Language Processing and Sentiment Analysis for Retailers using HDP and ITC Infotech Radar, as an example. For Spark, Pivotal published a similar and even-more-detailed study, 3 Key Capabilities Necessary for Text Analytics & Natural Language Processing in the Era of Big Data. Note all the software inter-operation going on. Long gone are the days of monolithic codebases.

But in the end, money talks, so now on to part 3 —

Follow the Money: Investments

Investment activity is a forward-looking indicator, suggesting optimism about a company’s growth potential and likely profitability and more particularly, the viability of the company’s technology and business model and the talent of its staff.

I’ll run through 2014 funding news and mergers & acquisitions activity, although first I’ll again note one 2015 acquisition in the space, IBM’s purchase of AlchemyAPI, and NetBase’s $24 million Series E round.

In 2014, chronologically:

  1. “Verint’s $514m purchase of KANA opens lines beyond the call center.” I’ve linked to a 451 story; here’s Verint’s press release. Verint is a global player in customer interaction analytics and other fields; target KANA had itself bought voice of the customer (VOC)/text analytics provider Overtone back in 2011. Big question: Has Verint (by now) replaced its OEM-licensed Clarabridge text analytics engine with the Overtone tech? (January 6, 2014)
  2. “HootSuite Acquires uberVU To Bolster Analytics Offering,” acquiring a social intelligence company whose tech stack includes text analytics. (January 22, 2014)
  3. “Confirmit Acquires Social Intelligence and Text Analytics Innovator Integrasco.” A couple of Norwegian firms get hitched: a voice of the customer (VOC)/enterprise feedback vendor and a text analytics/social intelligence vendor. (January 22, 2014)
  4. Sentisis, which focuses on Spanish-language analysis, collected €200 in seed funding, per Crunchbase. Now, a year later, Sentisis has done a $1.3 million Series A round. (February 14, 2014 and March 18, 2015)
  5. A French social listening company: Synthesio Secures $20 Million Investment from Idinvest Partners. Here’s an interview I did with Synthesio’s text analytics lead, Pedro Cardoso, last November: “From Social Sources to Customer Value: Synthesio’s Approach.” (March 14, 2014)
  6. Gavagai, a Swedish start-up, pulled in 21 million kronor, which translates to about US$3.2 million. (The name “gavagai” is a sort-of inside joke and was used once before by an NLP company!) (March 24, 2014)
  7. Text analytics via Hadoop played a part in FICO’s April acquisition of Karmasphere. (April 16, 2014)
  8. “Pegasystems buys Bangalore analytics startup MeshLabs,” reported the Times of India. (May 7, 2014)
  9. “Attensity Closes $90 Million in Financing.” Attensity was one of the first commercial text-analytics providers, going beyond entities to “exhaustive extraction” of relations and other information. I put out a two-part appraisal last summer, “Attensity Doubles Down: Finances and Management” and “Attensity, NLP, and ‘Data Contextualization’.” (May 14, 2014)
  10. “Inbenta, company from Barcelona specialized in Intelligent Customer Support software with Artificial Intelligence and Natural Language Processing, raises $2 million from Telefónica.” (May 14, 2014)
  11. Brandwatch pulled in $22 million in new funding. There’s a press release from the social analytics vendor, which has basic text analytics capabilities — former CTO Taras Zagibalov presented at my 2011 sentiment symposium (slides, video) — although it appears they’re a bit stale. (May 22, 2014)
  12. “NetBase Completes $15.2 Million Round of Expansion Funding.” NetBase has strong multi-lingual text analytics and appears to be on a not-as-fast-as-they-would-hope path to an IPO: The company just took in another $24 million, in Series E funding, on March 13, 2015. Taking on more fuel before IPO take-off, I assume. (July 15, 2014)
  13. “Synapsify bags $850K.” Congratulations (again) to Stephen Candelmo! (July 24, 2014)
  14. “Innodata Announces Acquisition of MediaMiser Ltd.,” which you can learn about from the target’s perspective as well, in “Well, this happened: We’ve been acquired by Innodata!” (July 28, 2014)
  15. “Digital Reasoning Raises $24 Million in Series C Round Led by Goldman Sachs & Credit Suisse Next Investors.” Cognitive computing! (October 9, 2014)
  16. “Maritz Research buys Allegiance, forms MaritzCX.” This is an interesting take-over, by a research firm — Maritz is/was a customer of Clarabridge’s, and maybe of other text-analytics providers — of a customer-experience firm that in turn licensed Attensity’s and Clarabridge’s technology, although Clarabridge’s seemingly on a capability-limited basis. (November 5, 2014)
  17. “Brand a Trend, a Cloud-based Text Analytics Company based out of Heidelberg, Germany, announced a $4.5 million round of funding that it will use to push into the U.S. and the booming digital market” — that’s the SUMMICS product — following on a $600 thousand 2013 founding investment and a February 2014 €800 thousand seed investment. (November 11, 2014)
  18. Natural language generation: “Narrative Science pulls in $10M to analyze corporate data and turn it into text-based reports.” Rival Arria did an IPO, as NLG.L, in December 2013. (November 28, 2014)

Reports and Community

Let’s finish with opportunities to learn more, starting with conferences because there is still no substitute for in-person learning and networking (not that I dislike MOOCs, videos, and tutorials). Here’s a selection:

  • Text Analytics World, March 31-April 1 in San Francisco, co-located with Predictive Analytics World.
  • Text by the Bay, “a new NLP conference bringing together researchers and practitioners, using computational linguistics and text mining to build new companies through understanding and meaning.” Dates are April 24-25, in San Francisco.
  • The Text Analytics Summit (a conference I chaired from its 2005 founding through 2013’s summit) will take place June 15-16 in New York, the same dates as…
  • The North American instance of IIeX, Greenbook’s Insight Innovation Exchange, slated for June 15-17 in Atlanta. I’m organizing a text analytics segment; send me a note if you’d like to present.
  • My own Sentiment Analysis Symposium, which includes significant text-analysis coverage, is scheduled for July 15-16 in New York, this year featuring a Workshops track in parallel with the usual Presentations track. In case you’re interested: I have videos and presentations from six of the seven other symposiums to date, from 2010 to 2014, posted for free viewing. New this year: A half-day workshop segment devoted to sentiment analysis for financial markets.

The 2014 LT-Accelerate conference in Brussels.

If you’re in Europe or fancy a trip there, attend:

On the vendor side,

Moving to non-commercial, research-focused and academic conferences… I don’t know whether the annual Text as Data conference will repeat in 2015, but I have heard from the organizers that NIST’s annual Text Analysis Conference will be scheduled for two days the week of November 16, 2015.

The 9th instance of the International Conference on Weblogs and Social Media (ICWSM) takes place May 26-29 in Oxford, UK. And the annual meeting of the Association for Computational Linguistics, an academic conference, moves to Beijing this year, July 26-31.


I’ve already cited my own Text Analytics: User Perspectives on Solutions and Providers.

Butler Analytics’ Text Analytics: A Business Guide, issued in February 2014, provides a good, high-level business overview.

And I’m exploring a report/e-book project, topic (working title) “Natural Language Ecosystems: A Survey of Insight Solutions.”

If you know of other market activity, conferences, or resources I should include here, please let me know and I’ll consider those items for an update. In any case…

Thanks for reading!

Disclosures

I have mentioned many companies in this article. I consult to some of them. Some sponsored my 2014 text-analytics market study or an article or a paper. (This article is not sponsored.) Some have sponsored my conferences and will sponsor my July 2015 symposium and/or November 2015 conference. I have taken money in the last year, for one or more of these activities, from: AlchemyAPI, Clarabridge, Daedalus, Digital Reasoning, eContext, Gnip, IBM, Lexalytics, Luminoso, SAS, and Teradata. Not included here are companies that have merely bought a ticket to attend one of my conferences.

If your own company is a text analytics (or sentiment analysis, semantics/synthesis, or other data analysis and visualization) provider, or a solution provider that would like to add text analytics to your tech stack, or a current or potential user, I’d welcome helping you with competitive product and market strategy on a consulting basis. Or simply follow me on Twitter at @SethGrimes or read my Breakthrough Analysis blog for technology and market updates and opinions.

Finally, I welcome the opportunity to learn about new and evolving technologies and applications, so if you’d like to discuss any of the points I’ve covered, as they relate to your own work, please do get in touch.

A Sentiment Analysis Insight Agenda

Just announced: The agenda for the 2015 Sentiment Analysis Symposium, July 15-16 in New York.

The 2015 symposium will offer the same great content and networking as always — you’ll hear from agency and brand leaders, insights professionals, researchers, and technologists — this year expanded into two simultaneous tracks:

  • Presentations — an array of keynotes, presentations, and panels that explore business value in opinion, emotion, behavior, and connection — in online, social, and enterprise sources, in both text and other big data formats.
  • Workshops — longer-form deep dives into technologies and techniques — including metrics & measurement, cognitive computing, machine learning.

Mix and match program segments — whatever suits your background and interests.

New this year: A half-day segment (within the Workshop track) on Sentiment Analysis for Financial Markets. Our 40th-floor conference venue at the New York Academy of Sciences overlooks Wall Street, so this segment is a well-placed addition to the symposium’s coverage of consumer, public, and social sentiment for market research, customer experience, healthcare, government, and other application areas.

Who’ll be presenting? Here’s a thought-leader sampling:

  • Industry analysts Dave Schubmehl from IDC and Anjali Lai from Forrester.
  • Agency folks Francesco D’Orazio from Face, Brook Mille from MotiveQuest, and Karla Wachter from Waggener Edstrom.
  • Technical innovators including Bethany Bengtson from Bottlenose, Moritz Sudhof from Kanjoya, Karo Moilanen from TheySay, and CrowdFlower CEO Lukas Biewald.
  • Speakers on sentiment analysis in healthcare from the Advisory Board Company, Westat, and DSE Analytics.
  • A set of forward-looking talks on wearables, speech analytics, emotional response via facial recognition, and virtual assistants.

Presenting the longer-form workshops, running in parallel with the presentations, we’ll have:

  • Prof. Bing Liu, presenting a half-day Sentiment Analysis tutorial.
  • Dr. Robert Dale on Natural Language Generation.
  • Sue Feldman, Synthexis, on Cognitive Computing.
  • Research leader Steve Rappaport on Metrics and Measurement.
  • … and more that we’ll share soon.

As usual, we’ll have a set of lightning talks — quick takes on technologies, solutions, and services — and lots of networking opportunities.

Please join us.

Visit the symposium agenda page for the full program. Register now at sentimentsymposium.com/registration.html. Discounted Early registration runs until May 15; register by March 31 for an Extra $100 off. Save even more with our Team registration, and we also have a 50% government/academic discount and a special low rate for full-time students.

Do join us to stay on the leading edge of sentiment technologies and applications!

IBM Watson, AlchemyAPI, and a World of Cognitive Computing

In the news: IBM has bought text- and image-analysis innovator AlchemyAPI, for inclusion in the Watson cognitive computing platform.

AlchemyAPI sells text analysis and computer vision capabilities that can be integrated into applications, services, and data systems via a SaaS API. But I don’t believe you’ll find the words “cognitive computing” on AlchemyAPI’s Web site. So where’s the fit? What gap was IBM seeking to fill?

For an IBM description of Watson cognitive computing, in business-accessible terms, see the video embedded in this article.

IBM explains Watson cognitive computing

My definition: Cognitive computing both mimics human capabilities — perception, synthesis, and reasoning — and applies human-like methods such as supervised learning, trained from established examples, to discern, assess, and exploit patterns in everyday data. Successful cognitive computing is also superhuman, with an ability to apply statistical methods to discover interesting features in big, fast, and diverse data. Cognitive platforms are scalable and extensible, able to assimilate new data and methods without restructuring.

AlchemyAPI fits this definition. The automated text understanding capabilities offered by AlchemyAPI and competitors — they include TheySay, Semantria, Ontotext, Pingar, MeaningCloud, Luminoso, Expert System, Datumbox, ConveyAPI, Bitext, Aylien, and others, each with its own strengths — add value to any social or enterprise solution that deals with large volumes of text. I haven’t even listed text analysis companies that don’t offer an on-demand, as-a-service option!

AlchemyAPI uses a hybrid technical approach that combines statistical, machine learning, and taxonomy-based methods, adapted for diverse information sources and business needs. But what sets AlchemyAPI apart is the company’s foray into deep learning, the application of a hierarchy of neural networks to identifying both broad-stroke and detailed language and image features.

So AlchemyAPI isn’t unique in the natural-language processing (NLP) domain, but the company does have lasting power. The success is measurable. AlchemyAPI, founded in 2005, was relatively early to market with an on-demand text analysis service and has won an extensive developer following although I’ll bet you $1 that the widely circulated 40,000 developer figure counts API-key registrations, not active users. The company is continually rolling out new features, which range from language detection and basic entity extraction to some of the most fine-grained sentiment analysis capabilities on the market. By contrast, development of the most notable early market entrant, OpenCalais from Thomson Reuters, stalled long ago.

Agility surely plays a role in AlchemyAPI’s success, as does the management foresight that led the company to jump into computer vision. CEO Elliot Turner described the opportunity in an April 2014 interview:

“Going beyond text, other areas for big progress are in the mining of audio, speech, images and video. These are interesting because of their incredible growth. For example, we will soon see over 1 billion photos/day taken and shared from the world’s camera phones. Companies with roots in unsupervised deep-learning techniques should be able to leverage their approaches to dramatically improve our ability to correctly identify the content contained in image data.”

Yet there’s competition in image analysis as well. Given work in sentiment analysis, most of the companies I follow apply the technology for emotion analytics — they include Affectiva, Emotient, Eyeris, and RealEyes — but consider that Google couldn’t build a self-driving car without technology that “sees.” The potential impact of computer vision and automated image analysis seems limitless, with plenty of opportunity to go around.

Why did IBM, a behemoth with immense research capabilities, need to go outside by acquiring AlchemyAPI? I’d speculate that IBM’s challenge is one shared by many super-large companies: Inability to effectively commercialize in-house innovation. Regardless, the prospect of bringing onto the Bluemix cloud platform all those NLP-interested developers, whether 40,000 or some lesser active number, was surely attractive. The AlchemyAPI technology will surely plug right in: Modern platforms accommodate novelty. As I wrote above, they’re able to assimilate new data and methods without restructuring.

And Watson? It’s built on the IBM-created Apache UIMA (Unstructured Information Management Architecture) framework, designed for functional extensibility. AlchemyAPI already fits in, via a set of “wrappers” that I expect will be updated and upgraded soon. But truth is, it seems to me that given Watson’s broad and proven capabilities, these added capabilities provide only a relatively small technical boost, in two directions. First, AlchemyAPI will provide market-proven unsupervised learning technology to the Watson stack, technology that can be applied to diverse language-understanding problems. Second, as I wrote, AlchemyAPI offers some of the most fine-grained sentiment analysis capabilities on the market, providing certain information-extraction capabilities not currently closely linked to Watson. What IBM will do with AlchemyAPI’s image-understanding capabilities, I can’t say.

Beyond these technical points, I’m guessing that the bottom-line attractions were talent and opportunity. IBM’s acquisition press release quotes AlchemyAPI CEO Elliot Turner: “We founded AlchemyAPI with the mission of democratizing deep learning artificial intelligence for real-time analysis of unstructured data and giving the world’s developers access to these capabilities to innovate. As part of IBM’s Watson unit, we have an infinite opportunity to further that goal.” It’s hard to beat infinite opportunity or, for a company like IBM, a chance to build on a combination of agility, talent, enthusiasm, market-sense, and foresight that is hard to find in house or in the commercial marketplace.

Disclosure: I have mentioned numerous companies in this article. AlchemyAPI, IBM, Converseon (ConveyAPI), Daedalus (MeaningCloud), Lexalytics (Semantria), Luminoso, Ontotext, and TheySay have paid to sponsor my Sentiment Analysis Symposium conference and/or my Text Analytics 2014 market study and/or the Brussels LT-Accelerate conference, which I co-own.

An extra: Video of a talk, Deep Learning for Natural Language Processing, with Stephen Pulman of the University of Oxford and text-analysis solution provider TheySay, offered at the 2014 Sentiment Analysis Symposium. Deep learning techniques are central to AlchemyAPI’s text and image analysis capabilities as well.

Counting Bugs at LinkedIn

LinkedIn has a bug problem, in two senses. There are long-standing, unresolved errors, and there are agitators like me (or is it only me?) who keep finding more and say so.

This article is my latest “bug LinkedIn” entry. My latest finds center on counting. They’re very visible. I’ll show you two instances, and toss in a screenshot of a special slip-up.

(See also My Search for Relevance on LinkedIn, on search-relevance deficiencies, posted in March 2014; my April 2014 via-Twitter reporting of incorrect rendering of HTML character entities, here and here, since fixed although a very similar error in LinkedIn e-mail notifications remains unresolved; a February 2013 Twitter thread about the lack of needed LinkedIn profile spell-checking — do a LinkedIn search on “analtics” (missing Y) and you’ll see what I mean; and my July 2012 LinkedIn, Please Take on Group Spammers, still very much an issue. And the flawed algorithm I reported in my July 2014 article, LinkedIn Misconnects Show That Automated Matching is Hard, remains uncorrected.)

So what’s new?

How Many Moderation Items?


LinkedIn groups are a great feature. I belong to quite a few, and I moderate several.

Check out the moderation screen to the right. The Discussions tab is displayed, indicating 35 items in the Submissions Queue pending approval — the same number is shown next to the Manage menu-bar item — except we don’t see any pending discussion submissions, do we? We do, however, see a tick box next to… nothing.

Counting bug #1.

Actually, I can explain this error. LinkedIn provides group moderators a Block & Delete option under the Change Permissions drop down, as shown in the right-middle of my shot. A diligent moderator will use it to ban group members who repeatedly submit off-topic content. I’ve used Block & Delete. Each time I use it, while the submitted items disappear, they’re still being counted. My guess is that they’re still in LinkedIn’s database, but now flagged with a “blocked” status.

So we have a bad counting query that can be easily fixed. All LinkedIn Engineering has to do is add a condition — if in SQL, in the WHERE clause — so that only non-blocked moderation entries are counted.
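Here is a minimal sketch of that fix, using Python’s built-in sqlite3 module. The table and column names (`moderation_queue`, `group_id`, `status`) and the status values are made up for illustration; they are an assumption, not LinkedIn’s actual schema or code.

```python
import sqlite3

# Hypothetical schema: one row per submitted item, with a moderation status.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moderation_queue ("
    "id INTEGER PRIMARY KEY, group_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO moderation_queue (group_id, status) VALUES (?, ?)",
    [(1, "pending"), (1, "blocked"), (1, "blocked"), (1, "pending"), (2, "pending")],
)


def count_all(conn, group_id):
    """The suspected buggy count: every queued row, blocked or not."""
    row = conn.execute(
        "SELECT COUNT(*) FROM moderation_queue WHERE group_id = ?",
        (group_id,),
    ).fetchone()
    return row[0]


def count_pending(conn, group_id):
    """The corrected count: the WHERE clause excludes blocked entries."""
    row = conn.execute(
        "SELECT COUNT(*) FROM moderation_queue "
        "WHERE group_id = ? AND status <> 'blocked'",
        (group_id,),
    ).fetchone()
    return row[0]


print(count_all(conn, 1))      # 4
print(count_pending(conn, 1))  # 2
```

With two of group 1’s four rows flagged as blocked, the unfiltered query reports 4 while the filtered one reports the 2 items a moderator would actually see.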

How Many Connection Requests?

Counting error #2 involves connection requests. It’s a two-fer — two errors, actually. I can’t explain them, but I can show them and describe them.

First, check out the Inbox image, which shows a connection invitation that I’ve already accepted. Note the “1st” next to the person’s name. The second image confirms that he and I are connected. Look closely at his profile and you will see “Connected 1 day ago.”

The second image also shows the drop-down under “Add Connections,” which, again erroneously, shows a pending connection invitation from the person I’m already connected to.

But that’s not all! Did you notice the little plus-person icon in the upper-right of each of those screens? Did you notice the number displayed in a red background? It’s 3. Now, how many connection requests do you see under Invitations in the drop-down of the second image? I see 2.

Counting error #2.

LinkedIn ad placeholder


And finally, a lagniappe, an extra for those who have read this far. Check out the item under “Ads You May Be Interested In” in the image to the right.


A Loyal, Paying User

Finally, let me reassert that I am a loyal, paying LinkedIn user. Did you notice the word “premium” next to the “in” logo in the screenshots I posted?

There’s always room for improvement, and of course, LinkedIn capabilities have advanced light years since I wrote, in InformationWeek in 2004, “LinkedIn is the only social-networking system I looked at that currently deserves enterprise consideration.” Myself, I may be a more astute industry analyst and better writer now, in 2015, than I was then. Here’s to progress!! … and also to getting even the little things right.

Call for Speakers: Sentiment Analysis Symposium + Workshops

The Sentiment Analysis Symposium is the first, biggest, and best conference to tackle the business value of sentiment, mood, opinion, and emotion — linked to the spectrum of big data sources and exploited for the range of consumer, business, research, media, and social applications.

The key to a great conference is great speakers. This year’s symposium — running two full days, July 15-16, 2015 in New York — will feature a presentations track and a workshop track. Whether you’re a business visionary, experienced user, technologist, or consultant, please consider proposing a presentation. Choose from among the suggested topics or surprise us. Help us build on our track record of bringing attendees — over 175 total at the 2014 symposium — useful, informative technical and business content. Please submit your proposal by January 23, 2015.

While the symposium is not a scientific conference, we welcome technical and research presentations so long as they are linked to practice. In case you’re not familiar with the symposium: Check out videos from the March, 2014 New York symposium and from prior symposiums.

As in past years, we’re inviting talks on customer experience, brand strategy, market research, media & publishing, social and interaction analytics, clinical medicine, and other topics. On the tech side, show off what you know about natural language processing, machine learning, speech and emotion analytics, and intent modeling. New this year, we’re planning a special segment on sentiment applications for financial markets.

Please help us create another great symposium! Submit your proposal online by January 23.


P.S. If you’re not up for speaking but would like to register, register today to benefit from Super Early discount rates. And if you represent a solution provider or consultancy that would like to sponsor the symposium, it would be great to have your support, so just send me a note.

Roland Fiege, IPG Mediabrands

Quick Q&A: On the Earned Media Value of a Brand’s Social Activities

Earned, paid, and owned media are distinct species. If you haven’t laid out cash for a mention of your brand, product, or personnel in a media outlet, whether online or social, you’re deemed to have earned the coverage. (Take “earned” with a grain of salt. You may have laid out big bucks for a publicist or efforts to build your brand’s visibility.) If you’ve bought the coverage — advertising, for instance — that’s paid. And if it’s your outlet, then that media is owned.

Whether media is earned, paid, or owned, you want to measure the extent of attention and the effectiveness of your message. The effort can get quite involved when multiple channels and multiple exposures are in the mix. To get a precise picture, you have to engage in attribution modeling. When social platforms come into play, the effort can be substantial.

General social business challenges, and technical responses, are central topics at LT-Accelerate, a unique European conference, taking place December 4-5, 2014 in Brussels. We’ll have Roland Fiege of IPG Mediabrands speaking, on methodologies and tools for measuring the earned value of brand social-media activity. If this topic interests you as well, you’ll want to learn more. A quick Q&A I recently conducted with Roland is a start, then I hope you’ll join us in Brussels. First a brief bio —

Roland Fiege, IPG Mediabrands

Roland Fiege is head of social strategy at Mediabrands Social, home of Performly. In his spare time, he is working on a PhD project researching methodologies for measuring the value add of marketing on Facebook and Twitter. And next, our —

Q&A with Roland Fiege of IPG Mediabrands

Q1: The topic of this Q&A is social media analytics. What’s your personal SMA background and your current work role?

Roland Fiege: My personal SMA background started with consulting projects evaluating social media listening systems back in 2009. In 2010-11, I was part of an international team at US technology company MicroStrategy that developed a solution that analyzed the social graphs of Facebook users to help brands to understand the interests and affinities of their “fans” better.

In my current work role, we analyze user interactions responding to brand messages on social media channels and have developed a model that attributes a monetary “earned media value” to these interactions. This allows brands to quantify and valuate the outcome of their social media investments.

Q2: What are key technical and business goals of the analyses you’re involved in?

Roland Fiege: The technical challenges are to keep the solution up to date with ongoing API changes by the most popular social networks and to loop “real time” bidding price benchmarks back into our systems (vs. a static benchmark). Another challenge is to meet the EU data privacy standards that enterprises, German ones especially, try to comply with.

Business-wise, the challenge is to establish a common understanding of how to attribute and valuate user interactions.
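To make the valuation idea concrete, here is a minimal Python sketch in the spirit of what Roland describes: each interaction type is multiplied by a benchmark price and the results are summed. The price table, the `PostInteractions` class, and the function names are illustrative assumptions, not the actual IPG Mediabrands model. A production system would presumably refresh the benchmarks from live bidding prices rather than rely on a static table.

```python
from dataclasses import dataclass

# Hypothetical benchmark prices per interaction, in euros. Real values would
# come from paid-media price benchmarks; the interview does not give any.
STATIC_BENCHMARKS = {"like": 0.10, "comment": 0.50, "share": 1.00}


@dataclass
class PostInteractions:
    post_id: str
    counts: dict  # interaction type -> number of interactions


def earned_media_value(post, benchmarks):
    """Sum the interaction counts, each weighted by its benchmark price.

    Interaction types without a benchmark contribute nothing.
    """
    return sum(benchmarks.get(kind, 0.0) * n for kind, n in post.counts.items())


def total_emv(posts, benchmarks=STATIC_BENCHMARKS):
    """Total earned media value across a collection of brand posts."""
    return sum(earned_media_value(p, benchmarks) for p in posts)


posts = [
    PostInteractions("p1", {"like": 200, "comment": 10, "share": 5}),
    PostInteractions("p2", {"like": 50, "share": 20}),
]
print(round(total_emv(posts), 2))
```

Passing a different `benchmarks` dictionary to `total_emv` is one simple way to swap the static table for prices fetched more recently.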

Q3: And what particular analytics approaches or technologies do you favor, whether for text, network, geospatial, behavioral, or other analyses?

Roland Fiege: We basically gave up on automated text analysis when it comes to sentiment. It never worked in Europe with all the different languages, dialects, irony etc. There was too much manual work involved that clients were not willing to pay for.

Currently we concentrate on the quantification for user engagement and its financial valuation.

Q4: To what extent do you get into sentiment and subjective information?

Roland Fiege: Our experience is that if users like, share, and comment on brand content, it mostly is positive or neutral sentiment involved. Contrary to this, most user posts on brand channels are negative and in correlation with negative customer experiences. Since we measure the monetary value of brand communication, we only measure fans/follower interactions on brand content.

Q5: How do you recommend dealing with high-volume, high-velocity, diverse social postings — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Roland Fiege: We rely not only on the APIs that Twitter, Facebook and YouTube (Google) provide but also use other (fire hose) data providers to get the most complete picture/dataset, also for retrospective analysis.

Q6: Could you provide an example (or two) that illustrates really well what you’ve been able to accomplish via SMA, that demonstrate strong ROI?

Roland Fiege: What we accomplish: Clients manage to optimize their content strategies in near real time, can compare the performance of their content (agencies) in different regions and countries, and can identify savings potential in the millions. It is the first time brands can calculate the total cost of ownership of their social media channels and have a clear Input vs. Outcome result all condensed into one KPI: Money.

Q7: I’m glad you’ll be speaking at LT-Accelerate. Please tell me about your presentation, briefly: What attendees will learn.

Roland Fiege: In this talk you will learn about the latest methodologies and tools to measure the Earned Media value of a brand’s activities on Facebook, Twitter and YouTube in hard currency.

Q8: Finally, do you have recommendations to share, regarding choice of data sources, metrics, analytical methods, and visualizations, in order to best align with desired business outcome?

Roland Fiege: I will share those in my presentation in as much detail as possible.

Thank you, Roland, for your responses. I’m looking forward to hearing more, at LT-Accelerate in Brussels.

Lipika Dey, Tata Consultancy Services

The Analytics of Digital Transformation, per Tata Consultancy Services

Next month’s LT-Accelerate conference will be the third occasion I’ve invited Lipika Dey to speak at a conference I’ve organized. She’s that interesting a speaker. One talk was on Goal-driven Sentiment Analysis, a second on Fusing Sentiment and BI to Obtain Customer/Retail Insight. (You’ll find video of the latter talk embedded at the end of this article.) Next month, at LT-Accelerate in Brussels, she’ll be speaking on a particular topic that’s actually of quite broad concern, E-mail Analytics for Customer Support Centres.

As part of the conference lead-up, I interviewed Lipika regarding consumer and market analytics, and — given her research and consulting background — techniques that best extract practical, usable insights from text and social data. What follows are a brief bio and then the full text of our exchange.

Dr. Lipika Dey, Tata Consultancy Services

Dr. Lipika Dey, senior consultant and principal scientist at Tata Consultancy Services

Dr. Lipika Dey is a senior consultant and principal scientist at Tata Consultancy Services (TCS), India with over 20 years of experience in academic and industrial R&D. Her research interests are in content analytics from social media and news, social network analytics, predictive modeling, sentiment analysis and opinion mining, and semantic search of enterprise content. She is keenly interested in developing analytical frameworks for integrated analysis of unstructured and structured data.

Lipika was formerly a faculty member in the Department of Mathematics at the Indian Institute of Technology, Delhi, from 1995 to 2006. She has published in international journals and refereed conference proceedings. Lipika has a Ph.D. in Computer Science and Engineering, M.Tech in Computer Science and Data Processing, and 5 Year Integrated M.Sc in Mathematics from IIT Kharagpur.

Our interview with Lipika Dey –

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Lipika Dey: I head the research sub-area of Web Intelligence and Text Mining at Innovation Labs, Delhi of Tata Consultancy Services. Throughout my academic and research career, I have worked in the areas of data mining, text mining and information retrieval. My current interests are focused towards seamless integration of business intelligence and multi-structured predictive analytics that can reliably and gainfully use information from a multitude of sources for business insights and strategic planning.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Lipika Dey: The role of text in insight analytics can hardly be over-emphasized.

Digital transformation has shifted control of the consumer world to consumers from providers. Consumers — both actual and potential — are demanding, buying, reviewing, criticising, influencing others, and thereby controlling the market. The decreasing cost of smart gadgets is ensuring that all this is not just for the elite and tech-savvy. Ease of communicating in local languages on these gadgets is also a contributing factor to the increased user base and increased content generation.

News channels and other traditional information sources have also adopted social media for information dissemination, thereby paving the way for study of people’s reactions to policies and regulations.

With so much expressed and exchanged all over the world, it is hard to ignore content and interaction data to gather insights.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Lipika Dey: My personal favourites for text analytics are statistical methods and imprecise reasoning techniques used in conjunction with domain and business ontologies for interpretation and insight generation. Statistical methods are language agnostic and ideal for handling noisy text. Text inherently is not amenable to be used within a crisp reasoning framework. Hence use of imprecise representation and reasoning methodologies based on fuzzy sets or rough sets is ideal for reasoning with text inputs.

The most crucial aspect for text analytics based applications is interpretation of results and insight generation. I strongly believe in interactive analytics platforms that can aid a human analyst comprehend and validate the results. Ability to create and modify business ontology with ease and view the content or results from different perspectives is also crucial for successful adoption of a text analytics based application. Business intelligence is far too entrenched in dashboard-driven analytics at the moment. It is difficult to switch the mind-set of a whole group at once. Thus text analytics at this moment is simply used as a way to structure the content to generate numbers for some pre-defined parameters. A large volume of information which could be potentially used is therefore ignored. One possible way to practically enrich business intelligence with information gathered from text is to choose “analytics as a service” rather than look for a tool.

As a researcher I find this the most exciting phase in the history of text analytics. I see a lot of potential in the yet unused aspects of text for insight generation. We are at the confluence where surface level analytics has seen a fair degree of success. The challenge now is to dive below the surface and understand intentions, attitudes, influences, etc. from stand-alone or communications text. Dealing with ever-evolving language patterns that are also in turn influenced by the underlying gadgets through which content is generated just adds to the complexity.

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Lipika Dey: Yes, we see many business organizations actively moving towards unified customer experience. Omni-channel analytics is catching up. But truly speaking I think at this point of time it is an aspirational capability. A lot of information is being pushed. Some of it is contextual. But I am not sure whether the industry is still in a position to measure its effectiveness or for that matter use it to its full potential.

It is true that a multitude of sources help in generating a more comprehensive view of a consumer, both as an individual as well as a social being. Interestingly, as data is growing bigger and bigger, technology is enabling organizations to focus on smaller and smaller groups, almost to the point of catering to individuals.

As a researcher I see exciting possibilities to work in new directions. My personal view is that the success of omni-channel analytics will depend on the capability of data scientists to amalgamate domain knowledge and business knowledge with loads and loads of information gathered about socio-cultural, demographic, psychological and behavioural factors of target customers. Traditional mining tools and technologies will play a big role, but I envisage an even greater role for reasoning platforms which will help analysts play around with information in a predictive environment, pick and choose conditional variables, perform what-if analysis, juggle around with possible alternatives and come up with actionable insights. The possibilities are endless.

Q5: To what extent does your work involve sentiment and subjective information?

Lipika Dey: My work is to guide unstructured text analytics research for insight generation. Sentiments are a part of the insights generated.

The focus of our research is to develop methods for analysing different types of text, mostly consumer generated, to not only understand customer delights and pain-points but also to discover the underlying process lacunae and bottlenecks that are responsible for the pain-points. These are crucial insights for an enterprise. Most often the root cause analysis involves overlaying the text analytics results with other types of information available in the form of business rules, enterprise resource directory, information exchange network etc. for generating actionable insights. Finally it also includes strategizing to involve business teams to evaluate insights and convert the insights into business actions with appropriate computation of ROI.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Lipika Dey: Tata Consultancy Services has conducted several surveys across industry over the last two years to understand organizational big data requirements. The findings are published in several reports available online. (See the Tata Consultancy Services Web site, under the Digital Enterprise theme.) One of the key findings from these surveys was that many business leaders saw the impending digital transformation as siloed components affecting only certain parts of the organization. We believe that this is a critical error.

The digital revolution that is responsible for high volumes of diverse data arriving at high velocity does not impact only a few parts of business — it affects almost every aspect. Thus our primary recommendation is to harness a holistic view of the enterprise that encompasses both technology and culture. Our focus is to help organizations achieve total digital transformation through an integrated approach that spans sales, customer service, marketing, and human resources, affecting the entire universe of business operations. The message is this: Business processes need to be rethought. The task at hand is to predict and prioritize the most likely and extreme areas of impact.

Q7: So what are the elements of that rethinking and that prioritization?

Lipika Dey: We urge our clients to consider the four major technology shifters under one umbrella. Big data initiatives should operate in tandem with social-media strategy, mobility plans, and cloud computing initiatives. I’ve talked about big data. The others —

Social media has tremendous potential for changing both business-to-business and business-to-consumer engagement. It is also a powerful way to build “crowdsourcing” solutions among partners in an ecosystem. Moving beyond traditional sales and services, social media also has a tremendous role in supply-chain and human resource management.

Mobile apps are here to transform the way business has operated for ages. They are also all set to change the way employees use organizational resources. Thus there is pressure to rethink business rules and processes.

There will also soon be a need for complete infrastructure revision to ward off the strains imposed in meeting data needs. While cloud computing initiatives are on the rise, we still see them signed up by departments rather than enterprises. The fact that cloud offerings are typically paid for by subscription makes them economical when signed up by enterprises.

Having said that we also believe there is no “one size fits all” strategy. Enterprises may need to redesign their workplaces where business will work closely with IT to redesign its products and services, mechanisms for communicating with customers, partners, vendors and employees, business models and business processes.

Q8: Could you say more about data and analytical challenges?

Lipika Dey: The greatest challenge while dealing with unstructured data analytics for an enterprise is to measure accuracy, especially in the absence of ground truths, and also the effectiveness of measures taken. To check effectiveness of actionable insights, one possibility is to use the A/B testing approach. It is a great way to understand the target audience and evaluate different options. We also feel it is always better to start with internal data, something that is assumed to be intuitively understood. If results match known results, well and good: your faith in the chosen methods increases. If they don’t match, explore, validate and then try out other alternatives, if not satisfied.
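As a rough illustration of the A/B testing approach Lipika mentions, the Python sketch below compares the success rates of two variants with a two-proportion z-test. The choice of test, the function name, and the sample counts are assumptions made for illustration; the interview does not say which statistical test TCS uses.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two success rates.

    Returns (z, p_value). Assumes both groups are non-empty and the pooled
    rate is strictly between 0 and 1, so the standard error is non-zero.
    """
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: customers who responded positively under the current
# process (A) and under a process changed in response to an insight (B).
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be chance alone, which is one way to check whether an action taken on an insight had an effect.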

Q9: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics, that demonstrate strong ROI?

Lipika Dey: I will describe two case studies. In the first one, one of our clients wanted to analyze calls received over a particular at their toll-free call-center. These calls were of unusually high duration. The aim was to reduce operational cost for running the call center without compromising on customer satisfaction. The calls were transcribed into text. Analysis of the calls revealed several insights that could be immediately transformed into actionable insights. The different types of analyses carried out and insights revealed were broadly categorized into different buckets as follows:

(a) Content-based analysis identified that these calls contained queries pertaining to existing customer accounts, queries about new products or services, status updates about transactions, and eventually requests for documents.

(b) Structural analysis revealed that each call requested multiple services and for different clients, which eventually led to several context switches for search of information, thereby leading to high duration. It also revealed that calls often landed at wrong points and had to be redirected several times before they could be answered.

Based on the above findings, a restructuring of the underlying processes and call-center operations was suggested, with an estimated ROI based on the projected reduction in the number of calls requesting status updates, documents to be dispatched, etc., calculated from available statistics.

In the second case study, analysis of customer communications for the call-center of an international financial institution, done periodically over an extended period, revealed several interesting insights about how customer satisfaction could be increased from their current levels. The bank wished to obtain aggregated customer sentiments around a fixed set of attributes related to their products, staff, operating environment, etc. We provided those, and the analysis also revealed several dissatisfaction root causes that were not captured in the fixed set of parameters. Several of these issues were not even within the bank’s control since those were obtained as external services. We correlated sentiment trends for different attributes with changes in customer satisfaction index to verify correctness of actions taken.

In this case, strict monetary returns were not computed. Unlike in retail, computing ROI for financial organizations requires long-term vision, strategizing, investment and monitoring of text analytics activities.
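The correlation step in the second case study can be sketched in a few lines of Python. This is a minimal example that assumes a plain Pearson correlation between one attribute’s sentiment trend and the customer satisfaction index. The interview does not say which correlation measure was used, and the monthly figures below are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values: average sentiment for one attribute
# (for example, staff) and the customer satisfaction index.
sentiment = [0.10, 0.15, 0.22, 0.30, 0.28, 0.35]
csi = [71, 72, 74, 77, 76, 79]
print(f"correlation = {pearson(sentiment, csi):.3f}")
```

A value close to 1 would indicate that sentiment for the attribute and overall satisfaction move together, which is the kind of evidence used to check whether the actions taken were the right ones.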

Q10: I’m glad you’ll be speaking at LT-Accelerate. Your talk is titled “E-mail Analytics for Customer Support Centres — Gathering Insights about Support Activities, Bottlenecks and Remedies.” That’s a pretty descriptive title, but is there anything you’d like to add by way of a preview?

Lipika Dey: A support centre is the face of an organization to its customers and emails remain the life-line of support centres for many organizations. Hence organizations spend a lot of money on running these centres efficiently and effectively. But unlike other log-based complaint resolution systems, when all communication within the organization and with the customers occurs through emails, analytics becomes difficult. That’s because a lot of relevant information about the type of problems logged, the resolution times, the compliance factors, the resolution process, etc. remains embedded within the messages and that too not in a straightforward way.

In this presentation we shall highlight some of the key analytical features that can generate interesting performance indicators for a support centre. These indicators can in turn be used to measure compliance factors and also characterize group-wise problem resolution process, inherent process complexities and activity patterns leading to bottlenecks — thereby allowing support centers to reorganize their mechanisms. It also supports a predictive model to incorporate early warnings and outage prevention.

Thanks Lipika, for sharing insights in this interview and in advance for your December presentation.

The Voice of the Customer × 650 Million/Year at Sony Mobile

We understand that customer feedback can make or break a consumer-facing business. That feedback — whether unsolicited, social-posted opinions, or gained during support interactions, or collected via surveys — captures valuable information about product and service quality issues. Automated analysis is essential. Given data volume and velocity, and the diversity of feedback sources and languages that a global enterprise must deal with, there is no other way to effectively produce insights.

Olle Hagelin, Sony Mobile

Consumer and market analytics — and supporting social, text, speech, and sentiment analysis techniques — are subject matter for the LT-Accelerate conference, taking place December 4-5, 2014 in Brussels. We’re very happy that we were able to recruit Olle Hagelin from Sony Mobile as a speaker.

Olle started in the mobile phone business in 1993 as a production engineer. He has held many roles as a project and quality manager. He was responsible for the Ericsson Mobile development process and for quality at a company level. Olle is currently quality manager in the Quality & Customer Service organization at Sony Mobile Corporation. Olle is responsible for handling feedback from the field.

Our interview with Olle Hagelin –

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Olle Hagelin: My responsibility is to look into all customer interactions to determine Sony Mobile’s biggest issues from the customer’s point of view. We handle around 650 million interactions per year.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Olle Hagelin: I think text and social analyses can replace most of what is done today.

Everyone’s customer will sooner or later express what they want on the Net. And opinions won’t be colored by your questions. You just put your ear to the ground and listen. You probably want to ask questions too but that will be to get details, to fine tune — not to understand the picture, only to understand what particular shade of green the customer is seeing out of 3,500 shades of green.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Olle Hagelin: You will always prefer the tool you use and know. For our purposes what we get from Confirmit and the tool Genius is perfect. But again it is to find issues, to mine text to find issues and understand sentiment of issues. If you are a marketing person it may be that other tools are better.

Business-outcome alignment is a big statement and I don’t try to achieve that. If it comes, nice, but my aim is only to understand customer issues and to ensure that they are fixed as soon as possible. And I suppose the in-the-end result is business-outcome alignment?

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Olle Hagelin: Yes. Do it. I do. Sorry. Politically correct: Sony Mobile does and has since 2010. All repairs, all contact center interactions, and as much social as possible. As said above, we handle around 650 million interactions per year.

Q5: To what extent does your work involve sentiment and subjective information?

Olle Hagelin: A lot although it could be more. Especially to determine which issues hurt the customer most. Identifying the biggest, most costly issues etc. is easy, but to add on pain-point discovery would be good.

Sentiment/subjective analyses are used frequently to look into specific areas but not as part of the standard daily deliverable. Hopefully everyday will be in place in a year or two.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Olle Hagelin: This can be discussed for days. But in short: Look at what you have and start from that. Build up piece-by-piece. Don’t attempt a big do-it-all system because it will never work and always be outdated. If you know only one part well — say handling either structured data or unstructured data — don’t try yourself to take a big bite of the other part, the part you don’t know well. Instead, buy help and learn slowly.

Sony Mobile works to split the data up into structured and unstructured parts. We work with them separately to identify issues first and then compare. We know structured data well and got very good support and help with the unstructured part. After four years we can do a lot ourselves, but without support from Confirmit with the hard unchewed mass of unstructured data (Confirmit handles text in the language it is written in, with no translations) we wouldn’t be able to manage.

The end result is to make it quick and easy to get to the point.

After working with this data for many years, we now have a good understanding of which issues will be seen in all systems and which will not.

Q7: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics, that demonstrate strong ROI?

Olle Hagelin: Two cases that we fixed quickly recently —

First is an issue when answering a call. The call always went to speaker mode. We identified the problem and it was fixed by Google within two weeks; it was an issue in Chrome.

Another one was several years ago: A discussion about a small and in-principle invisible crack in the front of a phone stopped sales in Germany. After we issued a statement that the problem is covered by warranty and will be fixed within warranty coverage, sales started again. It turned out almost no one wanted a fix! As I said, you had to look for the crack to see it.

I have many more examples, but I think for daily work, the possibility of quick-checking social to see whether an issue has spread or not has been the most valuable contributor. And that ability keeps head count down.

Q8: You’ll be presenting at LT-Accelerate. What will you be covering?

Olle Hagelin: I’ll show how Sony Mobile uses social and also text mining of CRM data to quickly identify issues, and how we get an understanding of how big they are with complementing structured data.

Added to this, the verbatim from customers can be used as feedback to engineers so they can reproduce issues in order to fix them.

My thanks to Olle. Please consider joining us at LT-Accelerate in Brussels to hear more!

The Right Investments for the Social Analytics Journey: Dell’s View

It’s common to talk of the “customer journey,” of the path an individual takes from needs awareness, via research and evaluation, to purchase and, in the case of a happy customer, loyalty and a lasting relationship. The customer journey may involve multiple channels and touchpoints.

Social touchpoints are among the most important, at every stage of the customer journey. We explore them in this interview with Shree Dandekar, general manager for social analytics at Dell and a speaker at the LT-Accelerate conference, December 4-5, 2014 in Brussels.

The ability to understand, measure, and shape social influence and advocacy is hugely important. You need software to do the job right, software that automates collection, filtering, and analysis of social and online text in conjunction with network and market analytics. Techniques are rapidly evolving, making social media analytics innovation a topic of great interest for brands and agencies across industry.

The social business challenge and technical responses are central topics at LT-Accelerate. I’m very much looking forward to Shree’s presentation, about tools and techniques for social ROI. If this topic interests you as well, you’ll want to learn more. An interview I recently conducted with Shree is a start, then I hope you’ll join us in Brussels.

Shree Dandekar, Dell

Shree has been at Dell for 14 years, in roles covering software design, product development, enterprise marketing and technology strategy. He is responsible for developing and driving the strategy for Dell’s predictive analytics and BI solutions.

Q1: The topic of this Q&A is social media analytics. What’s your personal SMA background and your current work role?

Shree Dandekar: I am the GM for our social analytics offering and have been responsible for taking Dell products in this space to market.

Q2: What are key technical and business goals of the analyses you’re involved in?

Shree Dandekar: Given that we are in the business of offering social analytics to our customers, our technical and business goals are tailored around that. Specifically, our technical goals are focused on making our social analytics product robust enough to support our customers’ needs. This includes making sure we capture the right sentiment, glean the right insights, and prep the data so that both business and social context information can be surfaced efficiently. Our business goals are to make sure our customers can realize their “social nirvana” by locating themselves on the social analytics journey and making the right investments to move to the next level.

Q3: And what particular analytics approaches or technologies do you favor, whether for text, network, geospatial, behavioral, or other analyses? [You don’t have to cover all these analysis types.]

Shree Dandekar: We use predictive analytics algorithms to derive insights from social media data. Dell has invested significant IP in building its text and natural language processing (NLP) capabilities, and our social media analytics offering is built directly on top of that foundation. Dell also recently acquired a leading predictive analytics player: Statistica. Statistica Text Miner is an extension of Statistica Data Miner, ideal for translating unstructured text data into meaningful, valuable clusters of decision-making “gold.” As most users familiar with text mining already know, real-world data comes in a variety of forms, not always organized or ready to analyze. Text mining digs for the underlying information not readily apparent in traditional structured data. These data sources can be extremely large as well. Statistica Text Miner is optimized for working with such data and has recently been further enhanced for it.

Q4: To what extent do you get into sentiment and subjective information?

Shree Dandekar: Dell joined forces with a leading text analytics provider to leverage sentiment and text analytics. Their patented NLP engine uses a mix of rules and dictionaries to break down and analyze customer feedback text, and to score it on an 11-point sentiment scale for added granularity and measurement. The sentiment and text analytics solution enables Dell to make sense of the vast amount of customer feedback data available. In order to make the insights relevant to Dell’s business and understand brand health through the voice of the customer, the social analytics team developed a proprietary metric, the SNA metric. This metric is an indicator of purchase intent, giving Dell a clear view into customer advocacy of the Dell brand.  Once the social media data is collected, analyzed, and scored for sentiment, it is then scored against Dell’s SNA scale.

Q5: How do you recommend dealing with high-volume, high-velocity, diverse social postings — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Shree Dandekar: Dell is using this patent-pending software (SNA) and integrating it into all aspects of the business, from product development, marketing, and Net Promoter Score (NPS) diagnosis to customer support/service, sales, and M&A. Measuring more than 1.5 million conversations annually, the system provides the ability to drill down to very granular parts of the business in real time. It serves as a uniform source for distributing and assimilating customer feedback across multiple business functions. This enhances Dell’s avowed policy of customer centricity and direct feedback. And, since it updates in real time, SNA accelerates customer feedback on important topics, enabling shorter response cycles without negatively affecting brand health.

Q6: Could you provide an example (or two) that illustrates really well what you’ve been able to accomplish via SMA, that demonstrate strong ROI?

Shree Dandekar: The Dell social media analytics portfolio includes the patent-pending Social Net Advocacy (SNA) metric. SNA is designed to measure the net advocacy of a brand or topic, calculated from the sentiment and context of social media conversations (see figure). Dell uses SNA internally to help the company deliver an enhanced experience to its customers. SNA is integrated within the Dell Social Media Command Center, which enables the company to monitor and react to online conversations in real time.

Dell measures SNA at the brand level and also extends this measurement to more than 150 topics representing various aspects of the business. Online conversations are analyzed for topics including products, services, marketing, customer support, packaging and even community outreach efforts. Each of these conversations influences brand perception and therefore affects the overall advocacy or health of the brand. SNA enables organizations to understand, quantify and contextualize online feedback, leading to informed business decisions that help improve the overall customer experience. Organizations can integrate customer feedback in near-real time for short response cycles — meaning that an organization can quickly connect with a customer and discuss relevant solutions.

The customer feedback derived from the SNA program is delivered across the entire organization, from departments such as customer care and quality control to marketing and product development. The real-time analysis and measurement of social data has allowed Dell to proactively quell public concerns before they grow into potentially larger issues. Moreover, Dell is able to add context to the sentiment and SNA scores, such as understanding whether the customer is a brand advocate or not.

For example, within hours after the launch of a specific Dell product, the social analytics team saw a declining trend in SNA (a decrease of more than 50%). When the analyst team looked further into the issue, they found a significant number of social media conversations expressing anger over the pricing for the new product. They turned to Dell’s chief blogger, who quickly wrote a post explaining the situation and addressing the price concerns. Within one day, Dell was able to return to original sentiment levels. Moreover, the general manager didn’t even need to be brought into the issue: employees are empowered to make quick and informed decisions.
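The SNA formula is proprietary and is not described in this interview. Purely to illustrate the general idea of a "net" score and of spotting a drop like the one described above, here is a minimal Python sketch. It assumes each conversation already carries a sentiment label, computes a net score as the percentage of positive conversations minus the percentage of negative ones, and flags a relative decline of more than 50% against a baseline. The function names, labels, and threshold are hypothetical and are not Dell's.

```python
from collections import Counter


def net_score(labels):
    """Net score in [-100, 100]: % positive minus % negative.

    Hypothetical stand-in for a net-advocacy metric; not Dell's SNA formula.
    """
    if not labels:
        raise ValueError("need at least one labeled conversation")
    counts = Counter(labels)
    return 100.0 * (counts["positive"] - counts["negative"]) / len(labels)


def dropped(baseline, current, threshold=0.5):
    """True when the score fell by more than `threshold` relative to the baseline.

    For a non-positive baseline a relative drop is ill-defined, so any
    decrease at all is flagged (an assumption made for this sketch).
    """
    if baseline <= 0:
        return current < baseline
    return (baseline - current) / baseline > threshold


# Illustrative labels before and after a product launch (made-up data).
before = ["positive"] * 6 + ["neutral"] * 3 + ["negative"] * 1
after = ["positive"] * 3 + ["neutral"] * 3 + ["negative"] * 4

baseline = net_score(before)
current = net_score(after)
print(baseline, current, dropped(baseline, current))
```

A production metric would presumably also weight conversations by context and topic, as the interview suggests; this sketch covers only the counting and alerting steps.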

Q7: Finally, do you have recommendations to share, regarding choice of data sources, metrics, analytical methods, and visualizations, in order to best align with desired business outcome?

Shree Dandekar: With the explosive growth of social media, customers are increasingly taking their conversations to online platforms such as Twitter, Facebook, community forums, wikis and blogs. Because social media has the power to influence brand reputation, daily engagement with people who are discussing an organization’s brand has become a critical step for understanding the market — and in some cases, converting detractors into brand advocates.

Through social media analytics, organizations can determine who is doing the talking: Are they customers, influencers or others? They can find out when specific events caused positive or negative conversations and also measure general brand sentiment on a daily, weekly and monthly basis. This rich data enables enterprises to obtain real-time customer insights that can help solve complex business challenges.

The development of a social media analytics strategy can be thought of as a journey that begins by listening to online conversations. The next steps are to collect, record and analyze the data, and then monitor trends. Finally, heuristics and business algorithms are applied to the data to derive actionable insights. This journey from an ad hoc approach to a highly optimized solution does not happen overnight but in increments, as an enterprise develops analytics maturity. To achieve this maturity, business leaders need to make the right investments in technology, and then invest in training people and creating a social media analytics culture within the organization.

Thanks, Shree. Readers, if you’re intrigued by Shree’s take on social media analytics, please check out the LT-Accelerate program and consider joining us in Brussels!

From Social Sources to Customer Value: Synthesio’s Approach

Text analytics is an enabling technology for deep social media understanding. We apply natural language processing (NLP) and data analysis and visualization techniques in an effort to make sense of the diversity of social postings. The social intelligence that results advances customer engagement and informs efforts to meet marketing, customer experience, product management, and reputation management needs.

I interviewed Pedro Cardoso of social intelligence leader Synthesio as part of preparation for December’s LT-Accelerate conference. Pedro will be speaking on language morphology (forms) in sentiment analysis. That’s a fairly technical topic, reflecting Pedro’s role as text analytics director at Synthesio, but one that will help business attendees understand the ins-and-outs of attitudes, opinions, and emotions in social and other text sources.

Pedro Cardoso, Synthesio

Pedro’s background: He earned an engineering degree in electronics and control systems and a masters in speech processing. His career path started in Portugal, as a research engineer, followed by 4 years in Japan and 5 years in France. For the majority of this time, he worked on speech processing, mostly relying on machine learning for acoustic and language modeling. For the last 2 years, Pedro has been working on natural language processing at Synthesio in Paris.

Our Q&A:

Q1: The topic of this Q&A is social media analytics. What’s your personal SMA background and your current work role?

Pedro Cardoso> My background is in machine learning applied to language technology. I started in development of speech recognition systems, working on language and acoustic statistical models. The focus was not on social media analysis (SMA), even if over the years I did some call-center development, including tests on sentiment analysis in voice. Over the last two and a half years, ever since I joined Synthesio, I have been working full-time on SMA.

Currently I am responsible for NLP and text analytics development at Synthesio. Our objective is to create algorithms that help process and analyse social data collected by Synthesio, so that it can be easily understood and exploited by our customers. This work includes data visualisation, document topic classification, and sentiment analysis.

Q2: What are key technical and business goals of the analyses you’re involved in?

Pedro Cardoso> Business drives technology, and customers’ needs drive business.

As mentioned above, our objective in the text analytics group is to find ways to structure and present information from social media sources in a simple way that customers can understand and get value from. Our focus is on text. We classify and summarize it with the goal of obtaining meaningful key performance indicators (KPIs) from large quantities of data, which would be impossible without technology.

We also develop methods for detecting key influencers and deriving demographic information. This allows our customers to focus their searches on particular groups of social media users.

Q3: And what particular analytics approaches or technologies do you favor, whether for text, network, geospatial, behavioral, or other analyses?

Pedro Cardoso> If we focus on my work, I favor text and also the study of network connections between online users. But if the question is what I believe to be the best technologies for SMA, that would have to be text also. Text is the medium; it is what customers use for communication. Network, geospatial, and other analytics are important, but mainly to focus our listening on a specific group. In the end, it is text, what SM users say, that counts.

Recently there has been interest in image analysis. People share more and more pictures. Sharing the picture of a brand logo or a product carries a strong brand loyalty message. Still, we need better image processing techniques, and we need to learn how to best use information from images, in particular how it combines with text in the case of comments.

Social media allows us to focus on particular customers and groups; it allows us to have more personalized communications. In these cases, technologies such as demographic analysis and group detection gain favor, but discussing them further would take us off-topic.

Q4: To what extent do you get into sentiment and subjective information?

Pedro Cardoso> Automatic sentiment analysis is a large part of what I do as text analytics director. Our team is responsible for the development of automatic sentiment analysis at Synthesio, and has developed in-house the current support for 15 languages offered as part of the product.

Subjectivity is a very complicated subject, and one that I believe no one has managed to solve. To understand subjectivity, you first need to understand well both the user and the context in which a message was written. After all, the real meaning is in the person’s mind. We are still not there, and it might take a long while to get there.

Q5: How do you recommend dealing with high-volume, high-velocity, diverse social postings — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Pedro Cardoso> We have developed data crawlers that ensure we can capture, enrich, and standardize data from different sources worldwide, whether they come from the largest social networks (Twitter, Facebook, Sina Weibo, VKontakte, etc.), mainstream media sources, or blogs and forums (thanks to a dedicated sourcing team of 5 people). This approach allows us to deal with several million social mentions each day and to provide for each of them a sentiment assessment, a global influence ranking (proprietary algorithm), and potential reach (another proprietary algorithm), on an ongoing basis and in near real time. It takes less than 2 minutes for a mention to be crawled, parsed, enriched, and pushed into client interfaces. Once the data is structured with both metadata and enriched data, our clients can access it in their dashboards. They can work on global data volumes for main KPI tracking and trend analysis, on focused subsamples for deeper human qualitative analysis, or both.
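The crawl, parse, enrich, and push flow described above can be pictured as a chain of small stages. The following Python sketch is only an illustration under stated assumptions: the `Mention` type, the stage functions, and the keyword-based sentiment placeholder are all hypothetical, and Synthesio's actual crawlers and proprietary ranking and reach algorithms are not public and are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """One social mention moving through the pipeline (hypothetical type)."""
    source: str
    raw: str
    text: str = ""
    enrichments: dict = field(default_factory=dict)


def parse(mention):
    """Normalize whitespace; a real parser would also strip markup."""
    mention.text = " ".join(mention.raw.split())
    return mention


def enrich(mention):
    """Attach placeholder enrichments using a toy keyword sentiment rule."""
    words = set(mention.text.lower().split())
    score = len(words & {"great", "love"}) - len(words & {"bad", "broken"})
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    mention.enrichments["sentiment"] = label
    mention.enrichments["length"] = len(mention.text)
    return mention


def run_pipeline(raw_items, sink):
    """Parse and enrich each (source, raw_text) pair, then push it to `sink`."""
    for source, raw in raw_items:
        sink.append(enrich(parse(Mention(source=source, raw=raw))))
    return sink


# The list stands in for a client-facing dashboard store.
dashboard = run_pipeline(
    [("forum", "  Love   the new laptop "), ("blog", "Screen arrived broken")],
    [],
)
for item in dashboard:
    print(item.source, item.text, item.enrichments)
```

Keeping each stage a plain function that takes and returns a mention makes it easy to add further enrichments, such as influence or reach scores, without touching the other stages.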

Q6: Could you provide an example (or two) that illustrates really well what Synthesio’s customers have been able to accomplish via SMA, that demonstrate strong ROI?

Pedro Cardoso> Sure. One of our clients in the automotive industry identified, through deep analysis of first-customer feedback in European forums, the key barriers to acquiring an electric car. Based on those lessons, they were able to create a far more efficient digital and social media campaign. ROI came before the campaign, through reduced costs for both message crafting and media planning. ROI also came after the campaign, which drove far more traffic to the Web site, and to dealerships for test drives, than previous efforts.

Another example we can give is a telco company that uses Synthesio for both listening and engaging directly with its clients on social networks, regarding client questions and complaints. By defining a precise listening scope and by clustering, combined with precise workflows for answer validation and publication, the client was able to measure ROI based on average answer time for any given question. By socializing answers to the most frequent topics, they also built up a customer-to-customer (C2C) advice platform, which allows top users to directly address other customers’ questions. ROI is also achieved via fewer inbound calls to the call center.

Q7: Do you have recommendations to share, regarding choice of data sources, metrics, analytical methods, and visualizations, in order to best align with desired business outcome?

Pedro Cardoso> At Synthesio we hold two key principles when it comes to social data and metrics.

  1. We believe social analytics and intelligence have to be global. We have sources covering more than 200 countries, networks crawled natively in more than 50 languages, etc.
  2. And they have to be simple. We built business-oriented metrics, comparable KPIs, and customizable interfaces to make sure that every single client within a company (from PR to marketing, from CRM to sales) can access the right data at the right moment.

Furthermore, we know that social analytics can’t be envisaged as another data silo. That’s why we pay so much attention to openness and interconnections with other digital marketing tools (such as consumer review platforms like Bazaarvoice, owned-community platforms like Lithium, social marketing platforms like Spredfast, etc.), CRM tools (Salesforce.com, Microsoft Dynamics, etc.), and BI tools (IBM, etc.) used by our clients. Our open API helps them both push data to such tools and integrate data from other sources to get a 360° view of customer feedback, for instance.

The last recommendation we would like to share is “Don’t get too focused on data: Next step is people.” To better measure ROI, our clients have to go back to where it all began: Business is conducted by people and not by a data set. Being customer-centric for better targeting, better personalization of messages, and better understanding of the brand relationship is what guides all of our present and future developments. Even though our roadmap is our best-kept secret, be prepared to see more demographic profiling, audience targeting tools, and sales-oriented measurement and anticipation metrics.

Q8: I’m glad you’ll be speaking at LT-Accelerate. Your topic is fairly technical — exploiting languages’ morphology for automatic sentiment analysis — noting that we do have a range of presentations on the program. Would you please tell me about your presentation, briefly: What attendees will learn.

Pedro Cardoso> The first thing we need to understand is the definition of morphology. Morphology of a word defines its structure: the root, part-of-speech, gender, conjugation, etc. And this is the first giveaway of the presentation.

Continuing, I will show how the use of morphological information about words helped us at Synthesio in building sentiment analysis, in particular for less represented languages, those that offer less labeled [training] data. It is also an important part of the system for agglutinative languages, whose vocabulary is theoretically close to infinite.
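To make the idea concrete, here is a minimal Python sketch of one way morphology can help when labeled data is scarce: polarity is stored once per root, and inflected forms are mapped back to a root by stripping known suffixes. This is a toy illustration, not Synthesio's method; the lexicon, the suffix list, and the function names are all hypothetical, and English is used only for readability.

```python
# Toy root lexicon: polarity is stored once per root, not per inflected form.
ROOT_POLARITY = {"love": 1, "hate": -1, "disappoint": -1, "enjoy": 1}

# Hypothetical suffix list, longest first so "ments" is tried before "s".
SUFFIXES = ["ments", "ment", "ing", "ed", "es", "s", "d"]


def candidate_roots(word):
    """Yield the word itself, then forms with one known suffix removed."""
    yield word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            stem = word[: -len(suffix)]
            yield stem
            yield stem + "e"  # restore a dropped final "e": "loving" -> "love"


def word_polarity(word):
    """Polarity of the first candidate root found in the lexicon, else 0."""
    for root in candidate_roots(word.lower()):
        if root in ROOT_POLARITY:
            return ROOT_POLARITY[root]
    return 0


def sentence_polarity(text):
    """Sum of word polarities over whitespace-separated tokens."""
    return sum(word_polarity(token.strip(".,!?")) for token in text.split())


print(word_polarity("loving"), word_polarity("disappointments"))
print(sentence_polarity("I loved the screen but hated the disappointing battery."))
```

With four roots the lexicon covers forms such as "loved", "hates", and "disappointments" without listing them individually. A real system would rely on a proper morphological analyzer rather than a fixed suffix list, especially for agglutinative languages where one root can carry long chains of affixes.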

That wraps up this interview. I’m looking forward to Pedro Cardoso’s LT-Accelerate presentation. If you’re intrigued by what you read here, please do visit the conference Web site to learn more. And I hope you’ll join us 4-5 December 2014 in Brussels.