Category: text analytics

The Analytics of Digital Transformation, per Tata Consultancy Services

Next month’s LT-Accelerate conference will be the third occasion on which I’ve invited Lipika Dey to speak at a conference I’ve organized. She’s that interesting a speaker. One talk was on Goal-driven Sentiment Analysis, a second on Fusing Sentiment and BI to Obtain Customer/Retail Insight. (You’ll find video of the latter talk embedded at the end of this article.) Next month, at LT-Accelerate in Brussels, she’ll be speaking on a topic of quite broad concern: E-mail Analytics for Customer Support Centres.

As part of the conference lead-up, I interviewed Lipika regarding consumer and market analytics, and — given her research and consulting background — techniques that best extract practical, usable insights from text and social data. What follows are a brief bio and then the full text of our exchange.

Dr. Lipika Dey, senior consultant and principal scientist at Tata Consultancy Services

Dr. Lipika Dey is a senior consultant and principal scientist at Tata Consultancy Services (TCS), India with over 20 years of experience in academic and industrial R&D. Her research interests are in content analytics from social media and news, social network analytics, predictive modeling, sentiment analysis and opinion mining, and semantic search of enterprise content. She is keenly interested in developing analytical frameworks for integrated analysis of unstructured and structured data.

Lipika was formerly a faculty member in the Department of Mathematics at the Indian Institute of Technology, Delhi, from 1995 to 2006. She has published in international journals and refereed conference proceedings. Lipika holds a Ph.D. in Computer Science and Engineering, an M.Tech in Computer Science and Data Processing, and a five-year integrated M.Sc in Mathematics, all from IIT Kharagpur.

Our interview with Lipika Dey –

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Lipika Dey: I head the research sub-area of Web Intelligence and Text Mining at Innovation Labs, Delhi, of Tata Consultancy Services. Throughout my academic and research career, I have worked in the areas of data mining, text mining, and information retrieval. My current interests are focused on seamless integration of business intelligence and multi-structured predictive analytics that can reliably and gainfully use information from a multitude of sources for business insights and strategic planning.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Lipika Dey: The role of text in insight analytics can hardly be over-emphasized.

Digital transformation has shifted control of the consumer world to consumers from providers. Consumers — both actual and potential — are demanding, buying, reviewing, criticising, influencing others, and thereby controlling the market. The decreasing cost of smart gadgets is ensuring that all this is not just for the elite and tech-savvy. Ease of communicating in local languages on these gadgets is also a contributing factor to the increased user base and increased content generation.

News channels and other traditional information sources have also adopted social media for information dissemination, thereby paving the way for study of people’s reactions to policies and regulations.

With so much expressed and exchanged all over the world, it is hard to ignore content and interaction data to gather insights.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Lipika Dey: My personal favourites for text analytics are statistical methods and imprecise reasoning techniques used in conjunction with domain and business ontologies for interpretation and insight generation. Statistical methods are language agnostic and ideal for handling noisy text. Text is inherently not amenable to use within a crisp reasoning framework. Hence the use of imprecise representation and reasoning methodologies based on fuzzy sets or rough sets is ideal for reasoning with text inputs.
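
To make the fuzzy-set idea concrete, here is a minimal sketch in Python; the lexicon and membership degrees are invented for illustration and are not TCS’s implementation:

```python
# A minimal sketch of fuzzy-set reasoning over text. The lexicon and
# membership degrees below are invented for illustration; a production
# system would derive them from domain and business ontologies.

FUZZY_LEXICON = {
    # term: (membership in POSITIVE, membership in NEGATIVE), each in [0, 1]
    "excellent": (0.9, 0.0),
    "good":      (0.7, 0.0),
    "okay":      (0.4, 0.2),
    "slow":      (0.0, 0.6),
    "terrible":  (0.0, 0.9),
}

def fuzzy_sentiment(text):
    """Aggregate memberships with the max operator (a standard fuzzy
    union) instead of forcing a crisp positive/negative label."""
    pos, neg = 0.0, 0.0
    for token in text.lower().split():
        p, n = FUZZY_LEXICON.get(token, (0.0, 0.0))
        pos, neg = max(pos, p), max(neg, n)
    return {"positive": pos, "negative": neg}

print(fuzzy_sentiment("good phone but slow support"))
# {'positive': 0.7, 'negative': 0.6}: the text belongs to both fuzzy
# sets, to different degrees, which a crisp classifier cannot express.
```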

The most crucial aspect of text analytics based applications is interpretation of results and insight generation. I strongly believe in interactive analytics platforms that help a human analyst comprehend and validate the results. The ability to create and modify business ontologies with ease, and to view the content or results from different perspectives, is also crucial for successful adoption of a text analytics based application. Business intelligence is far too entrenched in dashboard-driven analytics at the moment. It is difficult to switch the mind-set of a whole group at once. Thus text analytics at this moment is simply used as a way to structure content to generate numbers for some pre-defined parameters. A large volume of potentially useful information is therefore ignored. One practical way to enrich business intelligence with information gathered from text is to choose “analytics as a service” rather than look for a tool.

As a researcher I find this the most exciting phase in the history of text analytics. I see a lot of potential in the as-yet unused aspects of text for insight generation. We are at a confluence where surface-level analytics has seen a fair degree of success. The challenge now is to dive below the surface and understand intentions, attitudes, influences, etc. from stand-alone texts or communications. Dealing with ever-evolving language patterns, themselves influenced by the gadgets through which content is generated, just adds to the complexity.

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Lipika Dey: Yes, we see many business organizations actively moving towards unified customer experience. Omni-channel analytics is catching on. But truthfully, I think at this point in time it is an aspirational capability. A lot of information is being pushed. Some of it is contextual. But I am not sure whether the industry is yet in a position to measure its effectiveness or, for that matter, to use it to its full potential.

It is true that a multitude of sources help in generating a more comprehensive view of a consumer, both as an individual as well as a social being. Interestingly, as data is growing bigger and bigger, technology is enabling organizations to focus on smaller and smaller groups, almost to the point of catering to individuals.

As a researcher I see exciting possibilities to work in new directions. My personal view is that the success of omni-channel analytics will depend on the capability of data scientists to amalgamate domain knowledge and business knowledge with loads and loads of information gathered about socio-cultural, demographic, psychological and behavioural factors of target customers. Traditional mining tools and technologies will play a big role, but I envisage an even greater role for reasoning platforms which will help analysts play around with information in a predictive environment, pick and choose conditional variables, perform what-if analysis, juggle around with possible alternatives and come up with actionable insights. The possibilities are endless.

Q5: To what extent does your work involve sentiment and subjective information?

Lipika Dey: My work is to guide unstructured text analytics research for insight generation. Sentiments are a part of the insights generated.

The focus of our research is to develop methods for analysing different types of text, mostly consumer generated, not only to understand customer delights and pain-points but also to discover the underlying process lacunae and bottlenecks that are responsible for the pain-points. These are crucial insights for an enterprise. Most often, root cause analysis involves overlaying the text analytics results with other types of information available in the form of business rules, enterprise resource directories, information exchange networks, etc. to generate actionable insights. Finally, it also includes strategizing to involve business teams to evaluate insights and convert them into business actions, with appropriate computation of ROI.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Lipika Dey: Tata Consultancy Services has conducted several surveys across industry over the last two years to understand organizational big data requirements. The findings are published in several reports available online. (See the Tata Consultancy Services Web site, under the Digital Enterprise theme.) One of the key findings from these surveys was that many business leaders saw the impending digital transformation as siloed components affecting only certain parts of the organization. We believe that this is a critical error.

The digital revolution that is responsible for high volumes of diverse data arriving at high velocity does not impact only a few parts of business — it affects almost every aspect. Thus our primary recommendation is to harness a holistic view of the enterprise that encompasses both technology and culture. Our focus is to help organizations achieve total digital transformation through an integrated approach that spans sales, customer service, marketing, and human resources, affecting the entire universe of business operations. The message is this: Business processes need to be rethought. The task at hand is to predict and prioritize the most likely and extreme areas of impact.

Q7: So what are the elements of that rethinking and that prioritization?

Lipika Dey: We urge our clients to consider the four major technology shifters under one umbrella. Big data initiatives should operate in tandem with social-media strategy, mobility plans, and cloud computing initiatives. I’ve talked about big data. The others –

Social media has tremendous potential for changing both business-to-business and business-to-consumer engagement. It is also a powerful way to build “crowdsourcing” solutions among partners in an ecosystem. Moving beyond traditional sales and services, social media also has a tremendous role in supply-chain and human resource management.

Mobile apps are transforming the way business has operated for ages. They are also set to change the way employees use organizational resources. Thus there is pressure to rethink business rules and processes.

There will also soon be a need for complete infrastructure revision to ward off the strains of meeting data needs. While cloud computing initiatives are on the rise, we still see them signed up for by departments rather than enterprises. Because cloud offerings are typically paid for by subscription, they become truly economical when adopted at enterprise level.

Having said that, we also believe there is no “one size fits all” strategy. Enterprises may need to redesign their workplaces so that business works closely with IT to redesign products and services; mechanisms for communicating with customers, partners, vendors, and employees; business models; and business processes.

Q8: Could you say more about data and analytical challenges?

Lipika Dey: The greatest challenges in unstructured data analytics for an enterprise are measuring accuracy, especially in the absence of ground truth, and measuring the effectiveness of actions taken. To check the effectiveness of actionable insights, one possibility is the A/B testing approach. It is a great way to understand the target audience and evaluate different options. We also feel it is always better to start with internal data — something that is assumed to be intuitively understood. If results match known results, well and good — your faith in the chosen methods increases. If they don’t match — explore, validate, and then try other alternatives if not satisfied.
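
As a concrete illustration of the A/B approach Lipika mentions, here is a minimal sketch of a two-proportion significance test; the conversion counts are invented:

```python
# A two-proportion z-test for an A/B check of an insight-driven change.
# The counts are invented: group A keeps the old process, group B gets
# the change suggested by the text analytics.
from math import sqrt
from statistics import NormalDist

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_z(120, 1000, 170, 1000)  # 12% vs 17% conversion
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 3.18, p < 0.01: a real effect
```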

Q9: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics, and that demonstrates strong ROI?

Lipika Dey: I will describe two case studies. In the first one, one of our clients wanted to analyze calls received over a particular period at their toll-free call center. These calls were of unusually high duration. The aim was to reduce the operational cost of running the call center without compromising customer satisfaction. The calls were transcribed into text. Analysis revealed several insights that could be acted on immediately. The analyses carried out and the insights revealed fell broadly into the following buckets:

(a) Content-based analysis identified that these calls contained queries pertaining to existing customer accounts, queries about new products or services, status updates about transactions, and requests for documents.

(b) Structural analysis revealed that each call requested multiple services, often for different clients, which led to several context switches to search for information and hence to long call durations. It also revealed that calls often landed at the wrong points and had to be redirected several times before they could be answered.

Based on these findings, a restructuring of the underlying processes and call-center operations was suggested, with ROI estimated, from available statistics, on the projected reduction in calls requesting status updates, document dispatch, and the like.

In the second case study, analysis of customer communications for the call center of an international financial institution, done periodically over an extended period, revealed several interesting insights about how customer satisfaction could be raised from current levels. The bank wished to obtain aggregated customer sentiment around a fixed set of attributes related to its products, staff, operating environment, etc. We provided those, and the analysis also revealed several dissatisfaction root causes that were not captured by the fixed set of parameters. Several of these issues were not even within the bank’s control, since they arose from externally obtained services. We correlated sentiment trends for different attributes with changes in the customer satisfaction index to verify the correctness of actions taken.

In this case, strict monetary returns were not computed. Unlike in retail, computing ROI for financial organizations requires long-term vision, strategizing, investment, and monitoring of text analytics activities.

Q10: I’m glad you’ll be speaking at LT-Accelerate. Your talk is titled “E-mail Analytics for Customer Support Centres — Gathering Insights about Support Activities, Bottlenecks and Remedies.” That’s a pretty descriptive title, but is there anything you’d like to add by way of a preview?

Lipika Dey: A support centre is the face of an organization to its customers, and emails remain the life-line of support centres for many organizations. Hence organizations spend a lot of money on running these centres efficiently and effectively. But unlike with other log-based complaint resolution systems, when all communication within the organization and with customers occurs through emails, analytics becomes difficult. That’s because a lot of relevant information about the types of problems logged, the resolution times, the compliance factors, the resolution process, etc. remains embedded within the messages, and not in a straightforward way.

In this presentation we shall highlight some of the key analytical features that can generate interesting performance indicators for a support centre. These indicators can in turn be used to measure compliance and to characterize group-wise problem resolution processes, inherent process complexities, and the activity patterns leading to bottlenecks — thereby allowing support centres to reorganize their mechanisms. The approach also supports a predictive model that incorporates early warnings and outage prevention.
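
To give a flavor of such indicators, here is a minimal sketch that derives one of them (first-response time against an SLA) from raw email metadata; the field names, timestamps, and four-hour SLA are all invented, not taken from the talk:

```python
# Deriving one support-centre indicator, first-response time against an
# SLA, from raw email metadata. Field names, timestamps, and the
# four-hour SLA are invented for illustration.
from datetime import datetime, timedelta
from collections import defaultdict

emails = [  # hypothetical export: (thread_id, timestamp, sender_role)
    ("T1", datetime(2014, 11, 3, 9, 5),   "customer"),
    ("T1", datetime(2014, 11, 3, 11, 40), "agent"),
    ("T2", datetime(2014, 11, 3, 10, 0),  "customer"),
    ("T2", datetime(2014, 11, 4, 16, 30), "agent"),
]

threads = defaultdict(list)
for thread_id, ts, role in emails:
    threads[thread_id].append((ts, role))

SLA = timedelta(hours=4)
for thread_id, msgs in sorted(threads.items()):
    msgs.sort()
    opened = msgs[0][0]                                    # first message
    answered = next(ts for ts, role in msgs if role == "agent")
    elapsed = answered - opened
    print(thread_id, elapsed, "SLA breach" if elapsed > SLA else "ok")
# T1 2:35:00 ok
# T2 1 day, 6:30:00 SLA breach
```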

Thanks, Lipika, for sharing insights in this interview, and thanks in advance for your December presentation.


The Voice of the Customer × 650 Million/Year at Sony Mobile

We understand that customer feedback can make or break a consumer-facing business. That feedback — whether unsolicited opinions posted to social media, gathered during support interactions, or collected via surveys — captures valuable information about product and service quality issues. Automated analysis is essential: given data volume and velocity, and the diversity of feedback sources and languages that a global enterprise must deal with, there is no other way to effectively produce insights.

Olle Hagelin, Sony Mobile

Consumer and market analytics — and supporting social, text, speech, and sentiment analysis techniques — are subject matter for the LT-Accelerate conference, taking place December 4-5, 2014 in Brussels. We’re very happy that we were able to recruit Olle Hagelin from Sony Mobile as a speaker.

Olle started in the mobile phone business in 1993 as a production engineer. He has held many roles as a project and quality manager, and was responsible for the Ericsson Mobile development process and for quality at a company level. Olle is currently a quality manager in the Quality & Customer Service organization at Sony Mobile Communications, responsible for handling feedback from the field.

Our interview with Olle Hagelin –

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Olle Hagelin: My responsibility is to look into all customer interactions to determine Sony Mobile’s biggest issues from the customer’s point of view. We handle around 650 million interactions per year.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Olle Hagelin: I think text and social analyses can replace most of what is done today.

Everyone’s customer will sooner or later express what they want on the Net. And opinions won’t be colored by your questions. You just put your ear to the ground and listen. You probably want to ask questions too but that will be to get details, to fine tune — not to understand the picture, only to understand what particular shade of green the customer is seeing out of 3,500 shades of green.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Olle Hagelin: You will always prefer the tool you know how to use. For our purposes, what we get from Confirmit and its Genius tool is perfect. But again, the job is to find issues: to mine text to find issues and understand the sentiment around them. If you are a marketing person, other tools may be better.

Business-outcome alignment is a big statement and I don’t try to achieve that. If it comes, nice, but my aim is only to understand customer issues and to ensure that they are fixed as soon as possible. And I suppose the end result is business-outcome alignment?

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Olle Hagelin: Yes. Do it. I do. Sorry. Politically correct: Sony Mobile does and has since 2010. All repairs, all contact center interactions, and as much social as possible. As said above, we handle around 650 million interactions per year.

Q5: To what extent does your work involve sentiment and subjective information?

Olle Hagelin: A lot, although it could be more. Especially to determine which issues hurt the customer most. Identifying the biggest, most costly issues etc. is easy, but adding pain-point discovery would be good.

Sentiment/subjective analyses are used frequently to look into specific areas, but not yet as part of the standard daily deliverable. Hopefully everyday use will be in place in a year or two.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Olle Hagelin: This can be discussed for days. But in short: look at what you have and start from that. Build up piece by piece. Don’t attempt a big do-it-all system, because it will never work and will always be outdated. If you know only one part well — say, handling either structured data or unstructured data — don’t try to take a big bite of the other part, the part you don’t know well, yourself. Instead, buy help and learn slowly.

Sony Mobile works to split the data into structured and unstructured parts. We work with them separately to identify issues first, and then compare. We know structured data well, and we got very good support and help with the unstructured part. After four years we can do a lot ourselves, but without support from Confirmit with the hard, unchewed mass of unstructured data — Confirmit handles text in the language it is written in (no translations) — we wouldn’t be able to manage.

The end result is to make it quick and easy to get to the point.

After working with this data for many years, we now have a good understanding of which issues will be seen in all systems and which will not.

Q7: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics, and that demonstrates strong ROI?

Olle Hagelin: Two cases that we fixed quickly recently –

First is an issue when answering a call: the call always went to speaker mode. We identified the problem and it was fixed by Google within two weeks — it was an issue in Chrome.

Another one was several years ago: a discussion about a small, in-principle invisible crack in the front of a phone stopped sales in Germany. After we issued a statement that the problem was covered by warranty and would be fixed under it, sales started again. It turned out almost no one wanted a fix! As I said, you had to look for the crack to see it.

I have many more examples, but I think for daily work, the possibility of quick-checking social to see whether an issue has spread or not has been the most valuable contributor. And that ability keeps head count down.

Q8: You’ll be presenting at LT-Accelerate. What will you be covering?

Olle Hagelin: I’ll show how Sony Mobile uses social media, and also text mining of CRM data, to quickly identify issues, and how we gauge how big those issues are with complementary structured data.

Added to this, customer verbatims can be used as feedback to engineers so they can reproduce issues in order to fix them.

My thanks to Olle. Please consider joining us at LT-Accelerate in Brussels to hear more!

The Right Investments for the Social Analytics Journey: Dell’s View

It’s common to talk of the “customer journey,” of the path an individual takes from needs awareness, via research and evaluation, to purchase and, in the case of a happy customer, loyalty and a lasting relationship. The customer journey may involve multiple channels and touchpoints.

Social touchpoints are among the most important, at every stage of the customer journey. We explore them in this interview with Shree Dandekar, general manager for social analytics at Dell and a speaker at the LT-Accelerate conference, December 4-5, 2014 in Brussels.

The ability to understand, measure, and shape social influence and advocacy is hugely important. You need software to do the job right, software that automates collection, filtering, and analysis of social and online text in conjunction with network and market analytics. Techniques are rapidly evolving, making social media analytics innovation a topic of great interest for brands and agencies across industry.

The social business challenge and technical responses are central topics at LT-Accelerate. I’m very much looking forward to Shree’s presentation, about tools and techniques for social ROI. If this topic interests you as well, you’ll want to learn more. An interview I recently conducted with Shree is a start, then I hope you’ll join us in Brussels.

Shree Dandekar, Dell

Shree has been at Dell for 14 years, in roles covering software design, product development, enterprise marketing and technology strategy. He is responsible for developing and driving the strategy for Dell’s predictive analytics and BI solutions.

Q1: The topic of this Q&A is social media analytics. What’s your personal SMA background and your current work role?

Shree Dandekar: I am the GM for our social analytics offering and have been responsible for taking Dell products in this space to market.

Q2: What are key technical and business goals of the analyses you’re involved in?

Shree Dandekar: Given that we are in the business of offering social analytics to our customers, our technical and business goals are tailored around that. Specifically, technical goals are focused on making our social analytics product robust enough to support our customers’ needs. This includes making sure we capture the right sentiment, glean the right insights, and prep the data to ensure both business and social context can be surfaced efficiently. Our business goals are to make sure our customers can realize their “social nirvana” by identifying where they are on the social analytics journey and making the right investments to move to the next level.

Q3: And what particular analytics approaches or technologies do you favor, whether for text, network, geospatial, behavioral, or other analyses? [You don’t have to cover all these analysis types.]

Shree Dandekar: We use predictive analytics algorithms to derive insights from social media data. Dell has invested significant IP in building its text and natural language processing (NLP) capabilities, and our social media analytics offering is built directly on that foundation. Dell also recently acquired a leading predictive analytics player, StatSoft, maker of Statistica. Statistica Text Miner is an extension of Statistica Data Miner, ideal for translating unstructured text data into meaningful, valuable clusters of decision-making “gold.” As most users familiar with text mining already know, real-world data comes in a variety of forms, not always organized or easily ready to analyze, and these data sources can be extremely large as well. Text mining digs for the underlying information not readily apparent in traditional structured data. Statistica Text Miner is optimized, and has recently been further enhanced, for working with such data.

Q4: To what extent do you get into sentiment and subjective information?

Shree Dandekar: Dell joined forces with a leading text analytics provider to leverage sentiment and text analytics. Their patented NLP engine uses a mix of rules and dictionaries to break down and analyze customer feedback text, and to score it on an 11-point sentiment scale for added granularity and measurement. The sentiment and text analytics solution enables Dell to make sense of the vast amount of customer feedback data available. In order to make the insights relevant to Dell’s business and understand brand health through the voice of the customer, the social analytics team developed a proprietary metric, Social Net Advocacy (SNA). This metric is an indicator of purchase intent, giving Dell a clear view into customer advocacy of the Dell brand. Once the social media data is collected, analyzed, and scored for sentiment, it is then scored against Dell’s SNA scale.
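
For readers curious what an 11-point scale looks like mechanically, here is a minimal sketch; the mapping from a continuous polarity score to the scale is an assumption for illustration, not the provider’s patented method:

```python
# One way to bin a continuous polarity score in [-1.0, 1.0] onto an
# 11-point scale (-5 .. +5). This mapping is an assumption made for
# illustration, not the vendor's patented method.
def to_eleven_point(polarity):
    polarity = max(-1.0, min(1.0, polarity))  # clamp to the valid range
    return round(polarity * 5)

for p in (-0.93, -0.2, 0.0, 0.48, 1.0):
    print(p, "->", to_eleven_point(p))
# -0.93 -> -5, -0.2 -> -1, 0.0 -> 0, 0.48 -> 2, 1.0 -> 5
```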

Q5: How do you recommend dealing with high-volume, high-velocity, diverse social postings — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Shree Dandekar: Dell is using this patent-pending software (SNA) and integrating it into all aspects of the business, from product development, marketing, Net Promoter Score (NPS) diagnosis, customer support/service, and sales to M&A. Measuring more than 1.5 million conversations annually, the system provides the ability to drill down to very granular parts of the business in real time. It serves as a source of uniform distribution and assimilation of customer feedback for multiple business functions. This enhances Dell’s avowed policy of customer centricity and direct feedback. And, since it updates in real time, SNA accelerates customer feedback on important topics, enabling shorter response cycles without negatively affecting brand health.

Q6: Could you provide an example (or two) that illustrates really well what you’ve been able to accomplish via SMA, and that demonstrates strong ROI?

Shree Dandekar: The Dell social media analytics portfolio includes the patent-pending Social Net Advocacy (SNA) metric. SNA is designed to measure the net advocacy of a brand or topic, calculated from the sentiment and context of social media conversations. Dell uses SNA internally to help the company deliver an enhanced experience to its customers. SNA is integrated within the Dell Social Media Command Center, which enables the company to monitor and react to online conversations in real time.

Dell measures SNA at the brand level and also extends this measurement to more than 150 topics representing various aspects of the business. Online conversations are analyzed for topics including products, services, marketing, customer support, packaging and even community outreach efforts. Each of these conversations influences brand perception and therefore affects the overall advocacy or health of the brand. SNA enables organizations to understand, quantify and contextualize online feedback, leading to informed business decisions that help improve the overall customer experience. Organizations can integrate customer feedback in near-real time for short response cycles — meaning that an organization can quickly connect with a customer and discuss relevant solutions.

The customer feedback derived from the SNA program is delivered across the entire organization, from departments such as customer care and quality control to marketing and product development. The real-time analysis and measurement of social data has allowed Dell to proactively quell public concerns before they grow into potentially larger issues. Moreover, Dell is able to add context to the sentiment and SNA scores, such as understanding whether or not the customer is a brand advocate.

For example, within hours after the launch of a specific Dell product, the social analytics team saw a declining trend in SNA (a drop of more than 50%). When the analyst team looked into the issue, they found a significant number of social media conversations expressing anger over the pricing of the new product. They turned to Dell’s chief blogger, who quickly wrote a post explaining the situation and addressing the price concerns. Within one day, Dell was able to return to original sentiment levels. Moreover, the general manager didn’t even need to be brought into the issue: employees are empowered to make quick, informed decisions.

Q7: Finally, do you have recommendations to share, regarding choice of data sources, metrics, analytical methods, and visualizations, in order to best align with desired business outcome?

Shree Dandekar: With the explosive growth of social media, customers are increasingly taking their conversations to online platforms such as Twitter, Facebook, community forums, wikis and blogs. Because social media has the power to influence brand reputation, daily engagement with people who are discussing an organization’s brand has become a critical step for understanding the market — and in some cases, converting detractors into brand advocates.

Through social media analytics, organizations can determine who is doing the talking: Are they customers, influencers or others? They can find out when specific events caused positive or negative conversations and also measure general brand sentiment on a daily, weekly and monthly basis. This rich data enables enterprises to obtain real-time customer insights that can help solve complex business challenges.

The development of a social media analytics strategy can be thought of as a journey that begins by listening to online conversations. The next steps are to collect, record and analyze the data, and then monitor trends. Finally, heuristics and business algorithms are applied to the data to derive actionable insights. This journey from an ad hoc approach to a highly optimized solution does not happen overnight but in increments, as an enterprise develops analytics maturity. To achieve this maturity, business leaders need to make the right investments in technology, and then invest in training people and creating a social media analytics culture within the organization.

Thanks, Shree. Readers, if you’re intrigued by Shree’s take on social media analytics, please check out the LT-Accelerate program and consider joining us in Brussels!

From Social Sources to Customer Value: Synthesio’s Approach

Text analytics is an enabling technology for deep social media understanding. We apply natural language processing (NLP) and data analysis and visualization techniques in an effort to make sense of the diversity of social postings. The social intelligence that results advances customer engagement and informs efforts to meet marketing, customer experience, product management, and reputation management needs.

I interviewed Pedro Cardoso of social intelligence leader Synthesio as part of preparation for December’s LT-Accelerate conference. Pedro will be speaking on language morphology (forms) in sentiment analysis. That’s a fairly technical topic, reflecting Pedro’s role as text analytics director at Synthesio, but one that will help business attendees understand the ins-and-outs of attitudes, opinions, and emotions in social and other text sources.

Pedro Cardoso, Synthesio

Pedro’s background: he earned an engineering degree in electronics and control systems and a master’s in speech processing. His career path started in Portugal, as a research engineer, followed by 4 years in Japan and 5 years in France. For the majority of this time, he worked on speech processing, mostly relying on machine learning for acoustic and language modeling. For the last 2 years, Pedro has been working on natural language processing at Synthesio in Paris.

Our Q&A:

Q1: The topic of this Q&A is social media analytics. What’s your personal SMA background and your current work role?

Pedro Cardoso> My background is in machine learning applied to language technology. I started in development of speech recognition systems — language and acoustic statistical models. The focus was not on social media analysis (SMA), though over the years I did some call-center development, including tests on sentiment analysis in voice. Over the last two and a half years, since I joined Synthesio, I have been working full-time on SMA.

Currently I am responsible for NLP and text analytics development at Synthesio. Our objective is to create algorithms that help process and analyse social data collected by Synthesio, so that it can be easily understood and exploited by our customers. This work includes data visualisation, document topic classification, and sentiment analysis.

Q2: What are key technical and business goals of the analyses you’re involved in?

Pedro Cardoso> Business drives technology, and customers’ needs drive business.

As mentioned above, our objective in the text analytics group is to find ways to structure and present information from social media sources in a simple way that customers can understand and get value from. Our focus is on text. We classify and summarize it with the goal of obtaining meaningful key performance indicators (KPIs) from large quantities of data, which would be impossible without technology.

We also develop methods for detecting key influencers and deriving demographic information. This allows our customers to focus their searches on particular groups of social media users.

Q3: And what particular analytics approaches or technologies do you favor, whether for text, network, geospatial, behavioral, or other analyses?

Pedro Cardoso> If we focus on my work, I favor text and also study of network connections between online users. But if the question is what I believe to be the best technologies for SMA, that would have to be text also. Text is the medium, it is what customers use for communication. Network, geospatial, and other analytics are important, but mainly to focus our listening on a specific group. In the end, it is text, what SM users say, that counts.

Recently there has been interest in image analysis. People share more and more pictures, and sharing a picture of a brand logo or a product carries a strong brand-loyalty message. Still, we need better image processing techniques, and we need to learn how best to use information from images, in particular how they combine with text in the case of comments.

Social media allows us to focus on particular customers and groups; it allows us to have more personalized communications. In these cases, technologies such as demographic analysis and group detection gain favor, but discussing further we would be getting off-topic.

Q4: To what extent do you get into sentiment and subjective information?

Pedro Cardoso> Automatic sentiment analysis is a great part of what I do as text analytics director. Our team is responsible for the development of automatic sentiment analysis at Synthesio, and has built in-house the support for 15 languages currently offered as part of the product.

Subjectivity is a very complicated subject, and one that I believe no one has managed to solve. To understand subjectivity, you need first to understand well the user and the context in which a message was written. After all, the real meaning is in the person’s mind. We are still not there, and it might take a long while to get there.

Q5: How do you recommend dealing with high-volume, high-velocity, diverse social postings — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Pedro Cardoso> We have developed data crawlers that ensure we can capture, enrich, and standardize data from different sources worldwide, whether they come from the largest social networks (Twitter, Facebook, Sina Weibo, VKontakte, etc.), mainstream media sources, or blogs and forums (thanks to a dedicated sourcing team of 5 people). This approach allows us to deal with several million social mentions each day and to provide for each of them a sentiment assessment, a global influence ranking (a proprietary algorithm), and potential reach (another proprietary algorithm), on an ongoing basis and in near real time. It takes less than 2 minutes for an item to be crawled, parsed, enriched, and pushed into client interfaces. Once the data is structured, with both metadata and enrichments, clients can access their dashboards. They can either work on global data volumes for main KPI tracking and trend analysis, and/or on focused subsamples for deeper human qualitative analysis.
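
In outline, such an enrichment pipeline might look like the following minimal sketch; the Mention structure and toy sentiment lexicon are invented, and Synthesio’s proprietary influence and reach algorithms are not reproduced here:

```python
# Shape of a crawl-parse-enrich-push pipeline. The Mention structure and
# toy sentiment lexicon are invented; the influence and reach scores are
# placeholders for Synthesio's proprietary algorithms.
from dataclasses import dataclass, field

@dataclass
class Mention:
    source: str    # e.g. "twitter", "forum"
    text: str
    enrichments: dict = field(default_factory=dict)

def guess_sentiment(text):
    # Toy lexicon standing in for the real multilingual classifier.
    lowered = text.lower()
    if any(w in lowered for w in ("love", "great")):
        return "positive"
    if any(w in lowered for w in ("hate", "broken")):
        return "negative"
    return "neutral"

def enrich(mention):
    mention.enrichments["sentiment"] = guess_sentiment(mention.text)
    mention.enrichments["influence"] = 0.0  # proprietary score goes here
    mention.enrichments["reach"] = 0        # proprietary score goes here
    return mention

stream = [Mention("twitter", "Love the new update!"),
          Mention("forum", "Camera is broken after the patch.")]
for m in stream:
    print(m.source, enrich(m).enrichments)
```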

Q6: Could you provide an example (or two) that illustrates really well what Synthesio’s customers have been able to accomplish via SMA, and that demonstrates strong ROI?

Pedro Cardoso> Sure. One of our clients in the automotive industry, through deep analysis of first-customer feedback in European forums, identified the key barriers to acquiring an electric car. Based on those lessons, they were able to create a far more efficient digital and social media campaign. There was ROI before the campaign, in reduced message-crafting and media-planning costs, and ROI after the campaign, which drove far more traffic to the Web site, and to dealerships for test drives, than previous efforts.

Another example is a telco that uses Synthesio both for listening and for engaging directly with its clients on social networks, around client questions and complaints. By defining a precise listening scope and clustering, combined with precise workflows for answer validation and publication, the client was able to measure ROI based on the average answer time for any given question. By socializing answers to the most frequent topics, they also built up a C-to-C advice platform, which allows top users to directly address other customers’ questions. ROI is also achieved via fewer inbound calls to the call center.

Q7: Do you have recommendations to share, regarding choice of data sources, metrics, analytical methods, and visualizations, in order to best align with desired business outcome?

Pedro Cardoso> At Synthesio we hold two key principles when it comes to social data and metrics.

  1. We believe social analytics and intelligence have to be global. We have sources covering more than 200 countries, networks crawled natively in more than 50 languages, etc.
  2. And they have to be simple. We built business-oriented metrics, comparable KPIs, and customizable interfaces to make sure that every single client within a company (from PR to marketing, from CRM to sales) can access the right data at the right moment.

Furthermore, we know that social analytics can’t be envisaged as another data silo. That’s why we pay so much attention to openness and interconnection with the other digital marketing tools (consumer review platforms like Bazaarvoice, owned-community platforms like Lithium, social marketing platforms like Spredfast, etc.), CRM tools (Salesforce.com, Microsoft Dynamics, etc.), and BI tools (IBM, etc.) used by our clients. Our open API helps them both push data to such tools and integrate data from other sources, to get a 360° view of customer feedback, for instance.

The last recommendation we would like to share is: don’t get too focused on data; the next step is people. To better measure ROI, our clients have to go back to where it all began: business is conducted by people, not by a data set. Being customer centric, for better targeting, better personalization of messages, and better understanding of the brand relationship, is what guides all of our present and future developments. Even though our roadmap is our best-kept secret, be prepared to see more demographic profiling, audience targeting tools, and sales-oriented measurement and anticipation metrics.

Q8: I’m glad you’ll be speaking at LT-Accelerate. Your topic is fairly technical — exploiting languages’ morphology for automatic sentiment analysis — though we do have a range of presentations on the program. Would you please tell me briefly about your presentation: what will attendees learn?

Pedro Cardoso> The first thing we need to understand is the definition of morphology. Morphology of a word defines its structure: the root, part-of-speech, gender, conjugation, etc. And this is the first giveaway of the presentation.

Continuing, I will show how the use of morphological information helped us at Synthesio in building sentiment analysis, in particular for less-represented languages, those that offer less labeled [training] data. It is also an important part of the system for agglutinative languages, whose vocabularies are theoretically close to infinite.
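
Pending the talk itself, here is a minimal sketch of the underlying idea: back off from surface forms to morphological roots so that scarce sentiment labels generalize across inflections. The suffix stripper and lexicon are toy stand-ins, not Synthesio’s system:

```python
# Back off from surface forms to morphological roots so that scarce
# sentiment labels generalize across inflections. The suffix stripper
# and lexicon are toy stand-ins, not Synthesio's system.
SUFFIXES = ("ing", "ed", "ly", "s")  # crude English-like stripping

def stem(token):
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

ROOT_POLARITY = {"disappoint": -1, "delight": +1}  # labels live on roots

def score(text):
    return sum(ROOT_POLARITY.get(stem(t), 0) for t in text.lower().split())

# "disappointed", "disappointing", "disappoints" all reach one root
# entry, so a single labeled example covers the whole paradigm. This
# matters even more for agglutinative languages with huge vocabularies.
print(score("customers were disappointed"), score("the team delighted us"))
# -1 1
```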

That wraps up this interview. I’m looking forward to Pedro Cardoso’s LT-Accelerate presentation. If you’re intrigued by what you read here, please do visit the conference Web site to learn more. And I hope you’ll join us 4-5 December 2014 in Brussels.

Consumer & Market Analytics: Q&A with Lauren Azulay, Confirmit

Our thesis: Language technologies — text, speech, and social analytics — natural language processing and semantic analysis — are the key to understanding consumer, market, and public voices. Apply them to extract the full measure of business value from social and online media, customer interactions and other enterprise data, scientific and financial information, and a spectrum of other sources. The insight you’ll gain means competitive edge, whatever your organization’s mission.

Insight, via business (and research and government) application of language technologies, is the central topic for LT-Accelerate, a new conference that takes place December 4-5 in Brussels.

Lauren Azulay, Confirmit

I recently interviewed a number of LT-Accelerate speakers. My questions broadly cover the topics they’ll be addressing in their conference presentations. This article relays my Q&A with speaker Lauren Azulay of Confirmit, a customer, employee, and market insights solution provider.

Let’s get right into it!

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Lauren Azulay: My current role is Senior Product Manager for Text and Social Analytics at Confirmit, where consumer and market insight is at the heart of what we do. Text analytics gives insight into the voice of the customer, and social analytics provides insight into the voice of the market. Previously, I was the Product Manager of the channel and brand insights platform for the YouTube multi-channel network Base79, where we uncovered meaningful patterns in data for brands and channels wanting to increase their exposure on YouTube. Before that I spent 7 years as Head of Product, and then Head of Internationalisation and Billing, on a social networking product, where consumer and market insight drove our new product development and launches in new territories.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Lauren Azulay: The voice of the market includes your competitors, independent analysts and commentators, and consumers who may or may not be your customers. Understanding the buzz and sentiment around key topics or issues across all these voices can help the marketing, sales, service, and product functions within a business. Social analysis is also good for early issue detection, which can protect you from brand damage, reduce service costs, and improve customer satisfaction.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Lauren Azulay: Extracting social insights requires capturing not just the text but all of the metadata associated with each post or comment, such as the conversation data, author, etc. Because of the data volumes involved, statistical techniques are better than rules-based approaches to categorization and sentiment analysis; they can process very large volumes of text much faster.
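
To illustrate why the statistical route scales, here is a minimal sketch of a learned categorizer; the training posts and labels are invented, scikit-learn is assumed, and this is not Confirmit’s implementation:

```python
# A learned categorizer in a few lines. The training posts and labels
# are invented and scikit-learn is assumed; this is not Confirmit's
# implementation, just the general statistical approach.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

posts = ["battery dies after an hour", "screen cracked on day one",
         "love the camera quality", "support resolved it quickly"]
labels = ["battery", "build", "camera", "support"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(posts, labels)

print(model.predict(["battery drains overnight"]))  # -> ['battery']
```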

The right visualisations are important for bringing the data to life, and the ability to correlate the social insights with other business and customer data has the potential to offer significant value.

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Lauren Azulay: A centralised customer hub is essential. In the end, the solution will most likely be a hybrid combination of different technologies, as storing and managing social data is a different technical problem to solve than storing transactional data. So it’s important for any hub solution to easily integrate with other customer or data hubs within an organisation, and each one can be targeted at the problem it is best suited to solve. This is the only way businesses will be able to achieve the holy grail of a truly unified customer experience across all channels, including social.

Q5: To what extent does your work involve sentiment and subjective information?

Lauren Azulay: We have many customers that have deployed sentiment analysis for text from both voice-of-the-customer data and social media data. For example, Sony Mobile has been using social media to detect consumer issues as they come up, in order to improve its products and services, protect its brand, and avoid costs.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Lauren Azulay: The key to analysis is to understand the data sources and how they all fit together for your business and your market. In order to deal with large volumes of diverse data, such as social data, you need to know your objectives and be very focused on the project goal. This drives the data sources you analyse and the categorisation model you apply. The data structure is also important, as mentioned before, so that all relevant metadata is stored along with the text. Categorisation of all texts can be performed very quickly using Boolean search techniques. Knowing the volume of interactions by category gives you the buzz, which can then be tracked over time and can quickly highlight trending topics.
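
A minimal sketch of that Boolean-search categorization, with invented category queries:

```python
# Boolean-query categorization for buzz counting. The category queries
# are invented for illustration.
import re
from collections import Counter

CATEGORIES = {  # category: terms OR'd together (any match assigns it)
    "battery": r"\b(battery|charge|charging)\b",
    "display": r"\b(screen|display)\b",
    "audio":   r"\b(speaker|sound|audio)\b",
}

def categorize(text):
    lowered = text.lower()
    return [c for c, pattern in CATEGORIES.items() if re.search(pattern, lowered)]

mentions = ["Battery drains in hours", "Screen and speaker both died",
            "Great phone overall"]
buzz = Counter(cat for m in mentions for cat in categorize(m))
print(buzz)  # Counter({'battery': 1, 'display': 1, 'audio': 1})
```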

However, performing sentiment analysis on all the data captured is not practical, even with a high-performing, statistically based sentiment algorithm. Statistical sampling, performed correctly, can ensure accurate sentiment estimates, which can be obtained by category, even if there are thousands or millions of text strings within each category. Research we have conducted shows that a collection of a million documents can be analysed in a very short time using a sample of 20,000 documents, with a high degree of accuracy.
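
A back-of-envelope calculation supports the sampling claim, assuming simple random sampling of a sentiment proportion (worst case p = 0.5):

```python
# Back-of-envelope support for the sampling claim, assuming simple
# random sampling of a sentiment proportion (worst case p = 0.5).
# Margin of error = z * sqrt(p * (1 - p) / n) at 95% confidence.
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    return z * sqrt(p * (1 - p) / n)

print(f"{margin_of_error(20_000):.4f}")
# 0.0069, i.e. under +/- 0.7 percentage points, regardless of whether
# the category holds twenty thousand documents or a million.
```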

In addition, with the right analytical tools and the right visualisations, humans can interpret the results and explore the root causes for specific topics. Human analysis is very important in pulling it all together and drawing the overall conclusions.

Q7: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics, and that demonstrates strong ROI?

Lauren Azulay: Sony Mobile, through their social media ‘listening’ service, uncovered more than 15,000 unique issue reports in a year. This gave them the ability to prioritise the 3 main issue categories and focus on a process of remediation. This has saved the company tens of thousands of dollars and improved customer satisfaction, which has been measured through social media.

Q8: You’ll be presenting a lightning talk at LT-Accelerate. What will you be covering?

Lauren Azulay: Social analytics in action!

Thanks Lauren. Readers, if what you’ve read here sounds interesting, please do visit the LT-Accelerate Web site to learn more about the conference. We’ve designed the conference as a venue for learning, networking, and opportunity, for technologists and business users alike.

How Havas Media Views Consumer & Market Analytics

I recently interviewed a number of LT-Accelerate speakers. My questions broadly cover the topics they’ll be addressing in their conference presentations. This article relays my Q&A with speaker Inés Campanella of Havas Media Group and her colleague Óscar Muñoz-García. I’ll provide a bit of background and short bios and then we’ll get directly to the questions and responses.

Inés Campanella, Havas Media

Inés Campanella Casas is in charge of the social research elements of Havas Media Group R&D projects, with a focus in recent years on social media analysis and innovation. Her background is in the sociology of communication and the information society; she holds an M.A. in Research in Sociology from the University of Barcelona. Inés is currently studying the use of digital methods to better understand and predict consumers’ and citizens’ opinions.

Óscar Muñoz-García, Havas Media

Inés’s colleague Óscar Muñoz-García holds an MSc in Artificial Intelligence and a degree in Computer Science from UPM (Spain) and is completing a PhD. At Havas Media, Óscar is an innovation projects manager, focusing on marketing technology and big data social media applications. He is involved in the development of an opinion-mining platform for online brand reputation monitoring that applies natural language processing, graph and time series analysis, and other analytical techniques.

Now, on to the Q&A –

Q1: The topic of this Q&A is consumer and market insight. What’s your personal background and your current work role, as they relate to these domains?

Inés Campanella: I hold a B.A. in Sociology and an M.A. in Research Methods. Through my work and studies, I specialized in the field of the sociology of communication and the information society. Given my background, I’m very keen on new communication models and behavior patterns mediated by new media (i.e., social networks and other social media) and on how we can profit from this amazing stream of behavioral data to increase our knowledge of social behavior.

At Havas Media I work as a researcher within the Global Corporate Development Team. My role involves integrating scholarly research and social theory into market research, designing a conceptual framework for insights into online consumer behavior, with a special emphasis on buzz monitoring. One of my main responsibilities is helping to build qualitative, business-savvy content classifications for use in the development of novel content analytics tools. My day-to-day tasks also involve practical, hands-on online market research analysis. This twofold approach allows me, and the team I work in, to come up with an innovative yet pragmatic view of what technology we are able to develop and what technical features we need to improve to meet our clients’ real needs.

Q2: What roles do you see for text and social analyses, as part of comprehensive insight analytics, in understanding and aggregating market voices?

Inés Campanella: Regularly listening to consumers is a task that marketers must undertake in order to know their audience, detect how people feel about them, and cater to their needs and desires. That said, the new shopping scenario, with people massively sharing their thoughts online and performing regular online research about products and brands, calls for a reassessment of the techniques traditionally used in market research. There are many advantages to this. In comparison with traditional quantitative techniques such as questionnaires, collecting opinions from social media sources means less intrusion, since it gathers consumers’ spontaneous perceptions without introducing any apparent bias. In addition, the possibility of doing this in real time is a clear advantage over techniques based on retrospective data. Overall, this allows for more efficient and sophisticated business decision making.

So text analysis is, and will increasingly be, key to market research. Nevertheless, issues such as online privacy, anonymization, and the degree of representativeness and objectivity this data holds in comparison with other methods must be taken into account. We are only beginning to understand how we can combine these approaches in a solid, law-abiding methodological toolbox.

Q3: Are there particular tools or methods you favor? How do you ensure business-outcome alignment?

Oscar Muñoz: There are many tools for measuring consumer insights in online paid and owned media that are reaching a significant degree of maturity, for instance programmatic advertising and Web analytics platforms for paid and owned media respectively. However, for tools that perform consumer analytics in earned media, a long road still lies ahead to offer results that can be easily activated in communication strategies.

Content classification is needed to enable meaningful KPIs (key performance indicators). At Havas Media, we are working on sentiment KPIs that go beyond polarity identification (e.g., classification of emotions expressed by users towards brands and products), on consumer-community research studies via big graph analysis techniques, and on measuring the influence of ad campaigns on word-of-mouth through analysis of correlations between advertising spend, spots’ audience, and social buzz.
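To make the last item concrete, here is a minimal sketch of that kind of correlation analysis in pandas, assuming hypothetical daily series for ad spend, spot audience, and brand mentions (all figures and column names are invented for illustration, not Havas Media’s data):

```python
import pandas as pd

# Hypothetical daily series: advertising spend, spot audience, and social-buzz
# brand mentions. All figures and column names are invented for illustration.
df = pd.DataFrame({
    "ad_spend":       [12000, 15000, 9000, 22000, 18000, 0, 0],
    "audience":       [1.2e6, 1.5e6, 0.9e6, 2.4e6, 1.9e6, 1e5, 1e5],
    "brand_mentions": [3400, 4100, 2800, 5600, 5200, 1900, 1700],
})

# Pearson correlations among the three series.
print(df.corr())

# Word-of-mouth often lags the campaign: correlate buzz against
# the previous day's spend.
print(df["brand_mentions"].corr(df["ad_spend"].shift(1)))
```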

Q4: A number of industry analysts and solution providers talk about omni-channel analytics and unified customer experience. Do you have any thoughts to share on working across the variety of interaction channels?

Inés Campanella: We live in a digitalized world, and this means that we no longer find consumers in one location, environment, or channel but, rather, in an ever-increasing variety of them. Traditional customer journeys are no longer valid and, thus, new strategies to engage with consumers, and to avoid appearing redundant in their eyes, are very much needed.

We have witnessed that, while companies struggle to connect their marketing strategies, they often lack a tool-supported holistic approach that ensures effective multi-channel and multi-device media strategies. Our ultimate goal at Havas Media is to integrate all data sources in order to track consumers across online and offline touch points, gathering information about them with the aim of automating communication processes in real time. An example: serving personalized, timely online ads, push messages, and e-mail recommendations. This is indispensable if we wish to address consumers effectively.

Q5: To what extent does your work involve sentiment and subjective information?

Inés Campanella: To a very large extent. Whether I’m directly dealing with data from social media listening projects or we’re devising new business coding frames, we’re always trying to find ways to leverage this source of subjective information and make it actionable.

On the other hand, carrying out accurate [sentiment] polarity analysis is essential, but we believe it is equally important to achieve good classification and detection of recurrent conversation topics among users (e.g., regarding product features). Our deployment of content analytics tries to answer all of these questions. Otherwise, we would be missing half the story.

Q6: How do you recommend dealing with high-volume, high-velocity, diverse data — to ensure that analyses draw on the most complete and relevant data available and deliver the most accurate results possible?

Óscar Muñoz: We deal with volume and velocity by leveraging big data processing platforms like Hadoop and the related ecosystem (e.g., HBase, Hive, Spark, etc.). To tackle diverse data, we spend a significant part of our computing resources on ETL (extract, transform, load) processes that normalize, integrate, aggregate, and summarize data from multiple social media channels (Twitter, Facebook, blogs, forums, etc.) according to a unique schema of linked data about content, users, and related metadata.
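As a rough illustration of what such an ETL normalization step can look like, here is a minimal PySpark sketch that maps two hypothetical raw feeds onto one shared schema; the paths, field names, and table names are my assumptions, not Havas Media’s actual pipeline:

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("social-etl").getOrCreate()

# Hypothetical raw feeds; paths and field names are illustrative only.
tweets = spark.read.json("hdfs:///raw/twitter/")
posts = spark.read.json("hdfs:///raw/facebook/")

# Normalize each source to one shared schema: author, text, timestamp, channel.
tweets_n = tweets.select(
    F.col("user.screen_name").alias("author"),
    F.col("text"),
    F.to_timestamp("created_at").alias("ts"),
    F.lit("twitter").alias("channel"),
)
posts_n = posts.select(
    F.col("from.name").alias("author"),
    F.col("message").alias("text"),
    F.to_timestamp("created_time").alias("ts"),
    F.lit("facebook").alias("channel"),
)

# Union into a single table, then aggregate mentions per channel per day.
unified = tweets_n.unionByName(posts_n)
daily = unified.groupBy(F.to_date("ts").alias("day"), "channel").count()
daily.write.mode("overwrite").saveAsTable("social.daily_mentions")
```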

Regarding accuracy, our goal is to develop natural language processing (NLP) algorithms that are as precise as possible, to obtain confidence levels similar to those of other techniques such as opinion polls. Unfortunately, this goal cannot be achieved easily. We combine machine learning and deep linguistic analysis techniques in order to strike a fair balance between precision and recall, but there is still a lot of work to be done.
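For readers less familiar with the precision/recall trade-off Muñoz mentions, a toy computation over invented sentiment labels shows how the two measures, and their F1 combination, are derived:

```python
# Toy illustration of precision and recall. The gold and predicted labels
# below are invented sentiment tags for ten documents.
gold      = ["pos", "neg", "pos", "neu", "pos", "neg", "neu", "pos", "neg", "neu"]
predicted = ["pos", "neg", "neg", "neu", "pos", "pos", "neu", "pos", "neg", "pos"]

def precision_recall(label):
    tp = sum(1 for g, p in zip(gold, predicted) if g == p == label)
    fp = sum(1 for g, p in zip(gold, predicted) if p == label and g != label)
    fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for label in ("pos", "neg", "neu"):
    p, r = precision_recall(label)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```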

Q7: Could you provide an example (or two) that illustrates really well what your organization and clients have been able to accomplish via analytics that demonstrate strong ROI?

Inés Campanella: We’ve developed business indicators that allow us to better code and interpret social media listening data, namely marketing mix indicators and consumer decision journey stages. Ultimately, this has a very positive impact on ROI. Let me explain this in greater detail.

To monitor in real time and accordingly react to the experiences that customers are sharing, our clients must know the purchase stages in which those consumers are gained and lost, in order to refine touch points, impact consumers at the right time, and achieve the desired result (that is, a transaction). Also, uncovering the exact content of the dialogues that customers are having lets marketers and advertisers keep better track of consumers’ mindsets. We’ve found that the combination of these two categories provides very valuable information for a better positioning of the brand or organization in the market and, therefore, for an improved return on advertising efforts.

Q8: I’m glad you’ll be speaking at LT-Accelerate. Your talk is titled “Understand Consumers: Mindset, Intentions, and Needs.” Would you please describe your presentation briefly: What will attendees learn?

Inés Campanella: I’m also glad I’ll take part in LT-Accelerate. I will introduce the audience to Havas Media Group’s current needs, challenges, and practices regarding content analytics. Specifically, I’ll comment on the research we have carried out on innovative classification of user-generated content (UGC) to improve social media buzz monitoring. In short, I’ll explain the business need for these kinds of classifiers and how we can leverage and combine them with other market research techniques and insights to improve our understanding of consumers’ mindsets and habits.

Business value in text, speech, digital & social analytics, at LT-Accelerate in Brussels

Do you know about LT-Accelerate, a premier European customer, market, and media insights conference, December 4-5 in Brussels?

Attend LT-Accelerate to keep up with developments in text, speech, digital, and social analytics. You’ll hear from and network with –

ESOMAR president Dan Foreman — Shree Dandekar, Dell Software’s information management strategy director — Elsevier content and innovation VP Michelle Gregory — Prof. Stephen Pulman of the University of Oxford — Tony Russell-Rose, founder and director of UXLabs, a research and design consultancy — Lipika Dey, principal scientist at Tata Consultancy Services Innovation Labs… to name a few speakers.

The full agenda is online at lt-accelerate.com/programme. The conference is being co-produced by LT-Innovate, Europe’s language-technology association.

Whom else will you meet? Brand and agency speakers from Havas Media, IPG Mediabrands, Sony Mobile, Telefonica, and TOTAL. Leading technologists from IBM Research, Synthesio, and the Universities of Antwerp and Sheffield. Innovative solution providers including Basis Technology, Confirmit, CrossLang, Daedalus, Ontotext, and TheySay.

LT-Accelerate is about the application of language technologies to create business value for the insights industry, customer experience, media & publishing, social analytics, marketing, and public administration, and about deal-making. LT-Accelerate is about opportunity.

An extra: Benefit from a special 10% early-registration discount through October 16. Use the code “EarlyBird” when you register online at lt-accelerate.com/registration.

I hope you and colleagues will join us in Brussels in December!

Attensity, NLP, and ‘Data Contextualization’

(Part 2 of an Attensity/text analytics update. Click to read part 1, Attensity Doubles Down: Finances and Management.)

Attensity ex-CEO Kirsten Bay’s LinkedIn profile states her Attensity objective as “develop go-to-market strategy to reorient corporate focus from social media, text analytics to corporate intelligence.” A shift of this sort — a step up the value ladder, from technology to solutions — seems sensible, yet Attensity has gone in a different direction. Since Bay’s December 2013 departure, the company has instead doubled down on a technology-centered pitch, placing its positioning bet on algorithms rather than on business benefit.

We read on Attensity’s Web site — the first sentence under About Attensity (as of August 14, 2014) — “Our text analytics technologies use patented, state-of-the-art semantic approaches to extract and recall information into valuable insights.” Other main-page tag-lines: “Using semantic analysis to extract textual insights” and “Enabling customer insights for social and non-social data.” There are variations on this sort of language throughout Attensity’s marketing collateral.

A tech-centered pitch is great if you’re marketing to developers, to a market segment that knows it needs “semantic approaches.” A tech pitch may also appeal to insights professionals, to market researchers and consultants. But for a business exec who’s looking to boost customer satisfaction, engagement, and loyalty, for competitive advantage? Perhaps not so compelling.

In an article that preceded this one, I characterized Attensity’s May 2014, $90 million financing announcement as a doubling down, in both investment and technical positioning. The earlier article covered Attensity’s financial and management picture. This second one focuses on positioning and prospects, with a few words on Attensity Q, a new “easy-to-use real-time intelligence” solution for marketers.

Go Your Own Way

In my earlier article, I offered the impression that Attensity’s business performance, measured in financial and competitive terms, has been stagnant in recent years. Attensity’s Aeris Capital owners evidently agreed: ex-CEO Kirsten Bay, in describing her Attensity assignment in her LinkedIn profile (as of the moment I’m writing this article), uses “restructure” and “recapitalize” twice each. Recapitalize? Done, although it’s unclear where the $90 million is going, or went. Restructure? I don’t know what steps have been taken beyond the hiring of a new marketing head, Angela Ausman, and the reversion to tech-centric market messaging. The company has declined to discuss its product roadmap.

Attensity’s message is certainly different from its nearest competitors’, who have had greater market and corporate success. Among business solution providers –

Medallia has invested heavily in text analytics, maintaining, however, a pitch built around business benefit: “understand and improve the customer experience.”

Clarabridge repositioned several years ago as a customer experience management (CEM) solution provider; you won’t find “text analytics” on the main page of Clarabridge’s Web site.

Kana (owned by interaction-analytics leader Verint) sells “customer service software solutions.”

Confirmit bought text analytics provider Integrasco earlier this year but eschews the “text analytics” label in favor of a functional description of the capability provided: “Discover insights in free-form content.”

Pegasystems focuses on improving customer-critical business processes, with text analysis capabilities enhanced via the May 2014 MeshLabs acquisition but still playing a supporting role.

Attensity could do likewise: Contextualize its text analytics technology by repositioning as a business solutions provider. Attensity certainly does understand the importance of context, because “context” (along with “insights”) is the part of the company’s pitch that best bridges tech and business benefit.

Context and Sense

In some of its more-recent material, Attensity has termed itself “the leading provider of corporate insight solutions based on proprietary data contextualization.” See, for example, the Attensity Q announcement.

I asked Attensity for its definition of “data contextualization,” but again, the company declined to take up my questions. So I’ll give you my definition: the notion that accurate data analysis accounts for provenance (identity, demographic and behavioral profile, and reliability of the source), channel (e.g., social media, surveys, online reviews, contact center), and circumstances (location, time, and activity before, during, and after a data point), among other factors. There’s a word that describes data context: metadata. What’s different here is a dedication to making better use of it in analyses.
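To make that definition concrete, here is a minimal sketch of a data record that carries provenance, channel, and circumstances as first-class metadata; the field names are my own illustration, not Attensity’s schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# One way to make "data context" concrete: attach provenance, channel, and
# circumstances to every data point. Field names are my own illustration,
# not Attensity's schema.
@dataclass
class ContextualizedDataPoint:
    text: str
    source_id: str              # provenance: who said it
    source_reliability: float   # provenance: how much to trust the source
    channel: str                # e.g., "social", "survey", "review", "contact_center"
    location: Optional[str] = None                         # circumstances: where
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))  # circumstances: when

point = ContextualizedDataPoint(
    text="Battery life is terrible after the update.",
    source_id="user_8841",
    source_reliability=0.7,
    channel="social",
    location="Brussels",
)
print(point.channel, point.source_reliability)
```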

Nathan Shedroff: From data to wisdom via context.

Authorities such as IBM’s Jeff Jonas have written (virtual) reams about context. See, for instance, Jonas’s G2 | Sensemaking – Two Years Old Today. Other vendors have made the case for context. One pitch: “Digital Reasoning uses machine learning techniques to accumulate context and fill in the understanding gaps.” I’ll also offer an illustration that dates back two decades, pulled from Nathan Shedroff’s 1994 Information Interaction Design: A Unified Field Theory of Design.

Gary Klein, Brian Moon, and Robert R. Hoffman wrote in 2006 about sensemaking embodied in intelligent systems that “process meaning in contextually relative ways,” that is, relative to the data consumer’s situation. “Data contextualization,” as I understand it, makes explicit an extension of Shedroff-type models into the data producers’ realm, to better power those sought-after intelligent systems. The concept isn’t new (per IBM, Digital Reasoning, SAS’s Contextual Analysis categorization application, and other examples), even if the Attensity messaging/focus is.

How has Attensity implemented “data contextualization”? I don’t know details, but I do know that the foundation is natural language processing (NLP).

Attensity’s NLP-Based Q

“The strongest and most accurate NLP technology” is Attensity advantage #1, according to marketing head Angela Ausman, quoted in a June Agency Post article, Attensity Q Uses NLP and Visualization to Surface Social Intelligence From the Noise. Attensity advantage points #2 and #3 are development experience and “deep breadth of knowledge in social and engagement analytical solutions,” setting the stage for the introduction of Attensity Q. Ausman cites unique capabilities that include real-time results; alerting; “quotable metrics for volatility, sentiment, mentions, followers, and trend score”; and term-completion suggestions, via a visualization dashboard.

Attensity Q comes across as designed for ease of use and sophistication (not necessarily mutually exclusive qualities) beyond those of the established Attensity Analyze tool. Attensity Pipeline’s real-time social media data feed remains a differentiator, as do the NLP engine’s exhaustive-extraction and voice-tagging capabilities and the underlying parallelized Data Grid technology.

But none of this, except Q itself, is new. Is any of it, Q included, enough to support a successful Attensity relaunch? The question requires context, which I’ve aimed to supply. The answer depends on execution, and that is CEO Howard Lau’s and colleagues’ responsibility. I wish them success.


Disclosure: Attensity sponsored 3 instances of a conference I organize, the Sentiment Analysis Symposium, most recently in the fall of 2012, and the company was a 2011 sponsor of my text analytics market study. A new version of that study is out, Text Analytics 2014, available for free download, sponsored by Digital Reasoning among others. And IBM’s jStart innovation team sponsored my 2014 sentiment symposium.

Attensity Doubles Down: Finances and Management

Attensity, founded in 2000, was early to market with text analytics solutions. The company’s “exhaustive extraction” capabilities, referring to the use of computational linguistics to identify entities and relationships in “unstructured” text, set a standard for commercial natural language processing (NLP). Rival Clarabridge, as a start-up, even licensed Attensity’s technology. Yet Attensity has struggled in recent years, reaching a possible low point with 2013’s less-than-a-year tenure of Kirsten Bay as CEO. And now, under Howard Lau, chairman (since early 2013) and CEO (with Bay’s December 2013 departure), and with $90 million in equity financing?

I would characterize the May 14, 2014 financing announcement as a doubling down, in both investment and technical positioning.

Attensity is worth a re-look, hence this article with a financial focus and another on positioning points that I’ll post soon. [Now posted, August 14, 2014, Attensity, NLP, and ‘Data Contextualization’.] I hope to follow them in the fall with a look at innovations and the product roadmap.

A Doubling Down?

All Attensity will say about the $90 million transaction is that “Financing was provided by an international private equity fund and financial advisor company. The new capital secured will be used to accelerate product innovation; fuel market growth; and expand the sales, marketing, and engineering teams to meet the growing need for engagement and analytic applications using patented natural language processing (NLP) technology.”

I tried and failed to learn more. Marketing head Angela Ausman, who joined the company in April, declined to comment on questions I posed to her and CEO Howard Lau, regarding market positioning and growth markets, the competitive landscape, alliances, and the product roadmap. Lau has been unavailable for discussions.

“The company says it previously raised about $58 million,” according to May reporting in the San Francisco Business Times, and Attensity-acquired company Biz360 had itself raised about $50 million. I’m guessing the $58 million figure includes only investment in Attensity in its current form, that it discounts funding prior to 2009, the year Aeris Capital bought Attensity. Attensity no longer lists investors or much history on the company Web site. Early investors, surely since bought out, include In-Q-Tel, which led a 2002 $6 million round. I’d speculate that early owners did not fully recoup their investments.

Turnarounds are tough.

I don’t know whether former CEO Kirsten Bay, who held the post from January to December 2013, chose to leave or was pushed out. Regardless, she may not have failed so much as not sufficiently succeeded, by her own or Attensity’s standards. When a company is losing customers, talent, and money (in order of importance), only radical restructuring or an asset sale will save the day.

My take is that Attensity’s troubles started under the longtime executive managers who preceded Bay. An industry-analyst friend offers the comment, “By the time they took Aeris money Attensity was already a dead enterprise. The money and support gave them a Death 2.0 runway.”

Attensity used the Aeris money to attempt to go big but was unable to make a go of the 2009 roll-up of Attensity and two German companies, Empolis and Living-e. Buying Biz360 in 2010 in order to get into social intelligence was a mistake. Social intelligence was at the time, and largely still is, a low-value, small(ish)-deal proposition that doesn’t pay unless you’re set up to do mass marketing, which Attensity wasn’t and isn’t. Attensity Respond goes beyond social intelligence to provide engagement capabilities.

I wonder whether the 2010 deal for real-time access to Twitter’s “firehose” has paid off, or will ever.

Also, Attensity had a failed foray into advanced analytics (data science), which probably wasted attention and opportunity more than it wasted money, but it was still a loss the company could ill afford. (Clarabridge pursued a similar predictive analytics initiative around the same time, in 2010 or so, working with open-source RapidMiner software, but didn’t invest much in the effort.)

Attensity Analyze

So Attensity unrolled the 2009 roll-up, in particular shedding European professional services, but has maintained social listening (Pipeline) and engagement (Respond) capabilities, complemented by Attensity Analyze for multi-channel customer analytics. (I plan to write in another article about a new product, Attensity Q, and perhaps also about product technical underpinnings.)

Get On Up

Attensity’s business performance has seemingly been stagnant in recent years. The company has lost customers to rivals, with no recent new wins that I know of. Attensity has shed most senior staff; everyone I knew personally at the company left within the last couple of years. Beth Beld, the chief marketing officer whom Kirsten Bay brought on in May 2013, stayed less than five months.

I do know that Bay met with potential Attensity acquirers, including two of my consulting clients, but none of them, evidently, saw promise sufficient to justify terms. If Dow Jones reporter Deborah Gage is correct that the new funding comes from Aeris Capital AG, already Attensity’s majority owner, then the $90 million transaction truly does represent a doubling down, in a game with no other players.

The May funding announcement was a plus — call it a takeover from within — and I did receive a positive comment from a consultant friend: “We’re working with one of the Top 10 banks in the U.S. and they are migrating new lines of business from other tools over to Attensity.” Good news, but for Attensity to revive, we’re going to have to hear a whole lot more.


Click here for part 2, posted August 14, 2014, Attensity, NLP, and ‘Data Contextualization’.


Disclosure: Attensity sponsored 3 instances of a conference I organize, the Sentiment Analysis Symposium, most recently in the fall of 2012, and the company was a 2011 sponsor of my text analytics market study. A new version of that study is out, by the way, Text Analytics 2014, available for free download.

A Cheap Way to Discover Brand-Topic Affinities on Twitter

… or, Whose Twitter Followers Are Really Into Text Analytics?

Sometimes interesting things appear when you’re not even looking. And some lessons taught are applicable far beyond immediate challenges.

Case in point: The realization that Twitter advertising statistics can reveal brand-topic affinities.

Ad stats help you assess how well you’ve targeted promoted tweets — that’s their purpose — but you can use them for much more. You can use them to study competitive positioning and identify influencers around particular topics of interest. The trick is to craft tweets that don’t (only) promote a product or service, but also/instead help you evaluate the topic-engagement link. The insights revealed aren’t especially useful for me — I’m well-positioned in my text and sentiment analysis consulting specialization — but if your business depends on precision online targeting, you may find ads data to be a new, unique, inexpensive source of social intelligence.

Insights from Twitter Engagement

I’ll save you a long read: I ran two Twitter promoted-tweet campaigns. One targeted a set of keywords. For the second, I entered a set of @handles to target people similar to those accounts’ followers. I promoted a single tweet, one associated with a well-defined technology topic.

The targeted @handles: Each represents a brand, whether an organization, product, or person. IBM is a brand, and so are @IBMWatson, IBM Watson evangelist Fredrik @Tunvall, and Gartner analyst @Doug_Laney, whose coverage extends to Watson.

What I advertised isn’t important beyond that the ad content was single-topic and brand-neutral. Brand-neutrality reduces response bias, whether toward or against a brand. The single-topic focus eliminates ambiguity; it makes clear what prompted the response. The net is that we can associate engagement — retweets, replies, follows, and other clicks — with one particular topic. Ad stats break out and rank engagement by targeted @handle and by keyword, giving us a neat way to study affinities.

My @handle-targeted campaign achieved a 5.96% engagement rate, which I consider pretty good. Twelve of the @handles I targeted had over 10% engagement, out of 69 with at least 100 ad impressions, and seven were below 4%.
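For the record, engagement rate is simply engagements divided by impressions. Here is a minimal sketch of the computation, using three (impressions, engagements) pairs taken from the results table at the end of this article:

```python
# Recomputing engagement rates from raw ad stats: engagements / impressions.
# The (impressions, engagements) pairs come from the results table below.
stats = {
    "@confirmit": (160, 25),
    "@GateAcUk": (229, 32),
    "@The_ARF": (1585, 9),
}

for handle, (impressions, engagements) in sorted(
        stats.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True):
    print(f"{handle}: {engagements / impressions:.2%}")
# @confirmit: 15.62%, @GateAcUk: 13.97%, @The_ARF: 0.57%
```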

We learn from the variation, from the spread of response rates. We learn which brands are associated with a topic and which aren’t. As for the @handles for individuals: high engagement rates reveal or confirm influencers for the tweet’s topic. The uses of company and product @handle-topic associations are close to self-evident, so I won’t elaborate on them.

Get a complete set of insights by running a parametric study, a series of ads with topics whose associations you wish to explore, for a fixed set of target @handles. You may find surprises. I did. In the end, you’ll gain solid, valuable social intelligence.
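Here is a sketch of how such a parametric study could be tabulated: a topic-by-handle matrix of engagement rates, readable in both directions. All numbers are invented for illustration:

```python
import pandas as pd

# Hypothetical parametric study: one promoted tweet per topic, the same
# target @handles, and an engagement rate per (handle, topic) cell.
rates = pd.DataFrame(
    {
        "text analytics": [0.156, 0.105, 0.006],
        "survey research": [0.171, 0.042, 0.031],
    },
    index=["@confirmit", "@SASsoftware", "@The_ARF"],
)

print(rates.idxmax(axis=1))         # strongest topic affinity per @handle
print(rates.rank(ascending=False))  # brand ranking within each topic
```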

Finally: Cheap. Twitter Ads per-engagement costs are very, very reasonable, and because you pay by the engagement rather than by the impression, you’re not penalized for poorly composed or targeted advertising. (But please, let’s not waste anyone’s time.) I won’t tell you what I spent on my Twitter advertising, just that the amount was modest, with excellent return on investment.

More Detail

My promoted tweet advertised a free report I recently published, delivering findings from my Text Analytics 2014 market study. The term “text analytics” describes a collection of technologies and processes that extract information from social, online, and enterprise text. My advertising aim was click-throughs to the download page, and secondarily, retweets and other forms of Twitter engagement.

(Twitter does offer additional advertising options, for instance lead generation cards and conversion tracking, useful for ad optimization but not for the affinity study I’m describing in this article.)

I chose to target 77 Twitter @handles, of solution providers that sell text analytics products or services, of industry analysts who cover text analytics or application areas, and of associations. Text analytics is commonly applied in customer experience management, market research, social intelligence, financial services, media and publishing, and public policy, so I included certain companies, analysts, and consultants who work in those domains.

(An ideal way to learn more is to check out a conference I’m organizing, LT-Accelerate, slated for December 4-5, 2014 in Brussels.)

As I’ve mentioned, results — ad engagement — varied widely.

Top scorer was @Confirmit, a survey research/insights firm, at 15.62%. About two in every thirteen promoted-tweet impressions led to a click, favorite, or retweet. I think that’s pretty good.

In the cellar, @The_ARF (the Advertising Research Foundation) at 0.57%.

The easy conclusion is that Confirmit and the other top scorers — @GateAcUk (GATE open-source text analytics) and @SAPAnalytics — enjoy strong audience interest in text analytics, while only a small portion of the ARF’s audience has a text analytics affinity. SAP and Confirmit will want to play to the first point, while, frankly, I may put less personal effort into working with the ARF.

I’ll paste in my full set of results below.

Complications

Finally, I’d be remiss if I didn’t discuss complications.

Secondary data use — analysis of data that was collected and reported for purposes other than your current ones — is rarely straightforward. The available data may not fit your preferred categories or characteristics — for instance, you might want hourly data, but daily is the best you can get — or you might not have access to detailed metadata that fully describes the data and collection conditions.

There is a lot of follower overlap among the @handles I targeted. While I could cross-check follower lists, combinatorics suggest an intractable attribution task. If you need to account for ad engagements across a set of @handles (or ad-targeting keywords), I suggest running simultaneous, separate ad campaigns, one for each @handle, or choosing yet another option, the one I chose: don’t overthink your experiment, because you most likely don’t need highly precise results.
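For completeness, here is a sketch of the cross-check I decided against: pairwise Jaccard overlap between follower sets. The follower IDs are invented; real lists would come from the Twitter API:

```python
from itertools import combinations

# Pairwise Jaccard overlap between follower sets. IDs below are invented.
followers = {
    "@confirmit": {1, 2, 3, 4, 5},
    "@SASsoftware": {4, 5, 6, 7},
    "@The_ARF": {7, 8, 9},
}

for a, b in combinations(followers, 2):
    jaccard = len(followers[a] & followers[b]) / len(followers[a] | followers[b])
    print(f"{a} vs {b}: Jaccard overlap = {jaccard:.2f}")
```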


Results

The following are my promoted-tweet campaign engagement results, for @handles with at least 75 impressions:

@handle Impressions Engagements Rate
Campaign totals 21,335 1,271 5.96%
@confirmit 160 25 15.62%
@GateAcUk 229 32 13.97%
@SAPAnalytics 115 16 13.91%
@LTInnovate 90 12 13.33%
@metabrown312 256 29 11.33%
@allegiancetweet 135 15 11.11%
@texifter 310 34 10.97%
@DataSift 105 11 10.48%
@SASsoftware 621 65 10.47%
@havasi 374 39 10.43%
@Verint 204 21 10.29%
@dreasoning 605 62 10.25%
@jasonbaldridge 130 13 10.00%
@Lexalytics 196 18 9.18%
@Clarabridge 1,265 116 9.17%
@lousylinguist 178 16 8.99%
@bobehayes 314 28 8.92%
@DeloitteBA 901 80 8.88%
@sinequa 102 9 8.82%
@LuminosoInsight 616 54 8.77%
@eMarketer 165 14 8.48%
@digimindci 489 41 8.38%
@Gartner_inc 621 52 8.37%
@basistechnology 798 66 8.27%
@IDC 289 23 7.96%
@Medallia 384 30 7.81%
@RapidMiner 1,598 123 7.70%
@NetBase 326 25 7.67%
@Expert_System 1,125 86 7.64%
@stuartrobinson 119 9 7.56%
@NICE_Systems 292 22 7.53%
@dtunkelang 859 64 7.45%
@nik 997 74 7.42%
@Doug_Laney 178 13 7.30%
@ClickZ 207 15 7.25%
@kdnuggets 318 23 7.23%
@IBMWatson 83 6 7.23%
@40deuce 629 45 7.15%
@TEMIS_Group 261 18 6.90%
@btemkin 588 40 6.80%
@strataconf 240 16 6.67%
@forrester 967 62 6.41%
@adage 360 23 6.39%
@Brandwatch 454 29 6.39%
@stanfordnlp 580 37 6.38%
@Bazaarvoice 795 49 6.16%
@rwang0 440 27 6.14%
@attensity 718 44 6.13%
@LoveStats 631 38 6.02%
@CXPA_Assoc 389 23 5.91%
@Smartlogic_com 871 51 5.86%
@LithiumTech 1,321 75 5.68%
@visible 881 49 5.56%
@crimsonhexagon 1,415 78 5.51%
@Synthesio 758 40 5.28%
@Econsultancy 438 23 5.25%
@pgreenbe 305 16 5.25%
@HPAutonomy 422 21 4.98%
@RecordedFuture 487 22 4.52%
@Gnip 45 2 4.44%
@IBMAnalytics 91 4 4.40%
@coveo 989 43 4.35%
@JeanneBliss 625 27 4.32%
@TomHCAnderson 510 21 4.12%
@digimind_FR 258 9 3.49%
@ekolsky 378 12 3.17%
@KISSmetrics 2,869 86 3.00%
@etuma360 139 4 2.88%
@comScore 2,972 80 2.69%
@converseon 271 7 2.58%
@The_ARF 1,585 9 0.57%