Innovation in Big Data and Analytics: Tweetchat Q&A

I’m an admirer of IBM’s, and not just because — I’ll get the quite-germane disclosure out of the way — the company’s jStart Emerging Technologies team is a sponsor of my up-coming (next week!) conference, the Sentiment Analysis Symposium. I’m a fan because the company’s technology is a not-as-common-as-it-should-be combination of enterprise-ready and innovative, in areas I follow that include text and social analytics and information synthesis, that is, sense-making.

Innovation is a lumpy process. Sometimes you leap forward; sometimes progress is steady even if undramatic; sometimes you seem stuck in place and can manage only incremental advances. These modes could be described as disruption, extension, and improvement, and I discuss them in a recent article, Sentiment Analysis Innovation, in which I catalog industry innovation in my current-favorite technology/solution domain. Clearly the innovation topic has been on my mind, which is why I suggested it to the @IBMbigdata folks who approached me about participating in a #BigDataMgmt tweetchat, focusing in particular on Big Data and analytics innovation.

The IBMers and I jointly worked out the topics for the tweetchat, which took place on February 26, 2014. While we had a couple of dozen active participants (I haven’t counted them), I have pulled out only key thoughts, my own and, selectively, other participants’, to share now. Many of my responses were prepared, frankly, but many were on-the-fly responses to others’ tweets. I have reformatted tweets and edited for readability. This form is an experiment. My own tweets are unquoted and mixed in with a bit (although not too much) of text that creates a narrative.

So here we go, a series of questions and short answers on Innovation in Big Data and Analytics:

1) How do you define innovation, as related to Big Data & Analytics of course?

I see 3 innovation aspects: 1) Improve existing; 2) Extend to new; 3) Disrupt. But further, defining innovation involves defining the questions you can ask and the elements that contribute to answers.

Now, Big Data itself embodies an innovative attitude, that you can do more with more, via analytics of course. But those analytics approaches to Big Data have to be different, because conventional methods may not scale, and new methods, tailored to Big Data, may be able to extract insights that conventional analytics methods miss.

Participant Tracey Wallace had the astute observation, “Big data and analytics need to be actionable — do that and it’s innovative. I don’t want to have to use a data scientist.” My observation was that a data scientist may not always be needed, but judgment is always required and too often in short supply, prompting further tweets from Tim Crawford, “Title aside, it’s important to have someone that understands the business to provide context for data,” and Tracey Wallace, “That’s the innovation. Make big data easy to understand so journalists, etc. can make those judgements.”

(I’m pulling one thread, here, from a tangled tweetchat conversation, and I’ll do the same in my write-ups for the rest of the questions.)

My summation: Defining innovation involves defining the questions you can ask and the elements that contribute to answers.

2) What recent analytics innovations have high potential to be game changers?

I cite two:

  • Machine learning, in particular semi-supervised and unsupervised learning and also deep learning.
  • Also data integration/fusion (à la Watson, certainly!), of personal (profile), geospatial, transactional, attitudes.

We did get into IBM products. For instance, David Pittman brought up that “stream computing like InfoSphere Streams does is innovating many uses, from online gaming to fraud detection.” I do believe, and tweeted, that I don’t see stream computing as so recent. Complex Event Processing (CEP) — analytical computing on streams — has been around for quite a while. Of course, it’s continually improving.

Let’s be real, however: Watson is still mostly potential. Stuff like Wolfram Alpha (a “computational knowledge engine”) is more here-and-now. There’s other IBM tech such as I2 Analyst Workbench, which does synthesis and visualization, that is also more here-and-now. With Watson, the real innovations are the assembly of disparate technologies, operation at scale, and question-answering capability.

3) How are businesses applying and commercializing these analytics innovations?

The best candidate areas relate to pattern recognition, across locations, time, populations. But that’s VERY broad. Better to be more specific…

Language understanding and image recognition are 2 great examples of tech functions that can be applied for many purposes. But really any area where traditional methods don’t scale or adapt well to new data may be a candidate.

Yes, Watson is a great example of incipient commercialization of Big Data analytical and semantic technology. The concepts aren’t so new. I’ll cite an analytics/semantic solution that predated Watson by 10-12 years: MedTAKMI, text mining for life sciences, out of IBM Research Tokyo.

Bob Hayes asked/suggested customer surveys as an example, regarding application of innovative analytics, including “us[e] of sentiment analysis for loyalty measurement in surveys instead of NPS (rated questions of loyalty).” (NPS is the Net Promoter Score.) I do see how surveys could be a Big Data challenge, if survey responses are cross-correlated with transactional and social data.
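
For readers who haven’t met NPS, the arithmetic is simple enough to sketch. What follows is a minimal Python illustration, assuming the usual 0-to-10 “how likely are you to recommend us?” question, with 9s and 10s counted as promoters and 0 through 6 as detractors. The function name and the sample ratings are made up for illustration; nothing here comes from the tweetchat itself.

```python
def net_promoter_score(ratings):
    """Return NPS on a -100..100 scale from 0-10 'likely to recommend' ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    # Passives (7 or 8) count only toward the total number of responses.
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, for illustration only.
sample_ratings = [10, 9, 8, 7, 6, 3, 10, 9]
print(net_promoter_score(sample_ratings))
```

The point of Bob’s suggestion is that a single rated question like this compresses attitude into one number, whereas sentiment extracted from free-text responses may carry more of the why.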

Tracey Wallace’s question/suggestion was that “transactional and social data seem more reliable than any survey, no?” I do agree but observe that transactional data is factual rather than attitudinal, so less ambiguous but not indicative of root causes, and social data is truer in a sense, since attitudes expressed are unsolicited, but it’s incredibly noisy.

Bob Hayes offered the thought that you find sentiment’s meaning by correlating sentiment with other measures. I add only that we compute indicators from measures in order to map collected data to outcomes.
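
As a rough illustration of that correlating step, here is a small Python sketch that computes a Pearson correlation between a sentiment score and some other measure. I am assuming Pearson correlation as the simplest stand-in; the tweetchat did not specify a method, and the function name and the numbers below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-customer data: sentiment score and repeat purchases.
sentiment = [-0.8, -0.2, 0.1, 0.5, 0.9]
repeat_purchases = [0, 1, 1, 3, 4]
print(round(pearson(sentiment, repeat_purchases), 3))
```

A strong correlation would not by itself establish that sentiment drives the outcome, but it is one way to check whether a sentiment indicator maps to anything the business cares about.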

4) Where, and how, do organizations begin to address opportunities, and what risks are involved, what potential down side?

I’m big on experimenting & exploring, with different data sets, analytical methods, visualizations, etc. Certain technologies such as R and Python (popular among data science types) lend themselves to exploratory analysis. I don’t see huge risk, until you put yourself in a bet-the-store position. The real risk is in standing still.

The @IBMBigData tweeted response was, “We always encourage experimentation too. That’s where innovation sprouts. Try, try, try again.” And Bob Hayes offered, regarding risks and potential downsides, “With so much new data involved, it could take you down a rabbit hole.”

5) How about some examples of areas — business challenges — ripe for innovation responses, even for disruption?

Anything expensive/slow. I hate to say this, but areas often handled by people, whom automation could make redundant. Think customer service, certain logistics functions. Are truckers looking forward to self-driving rigs?

Steve Massi called out “better use of real-time traffic and routing” and Mark Salke, “customer service,” feeling the need to add, “I’m not kidding.”

6) What investment is involved, in people, software/services, in-house R&D, or other elements, in harnessing Big Data & Analytics innovation?

Talent is the biggest and most difficult investment, and I don’t mean just hiring data scientist types. Need starts with ability to decide where to invest effort & resources and how to operationalize, productize & monetize insights. You can outsource, or buy as-a-service, software, platform, and R&D elements that aren’t core to your business.

In response to this question, Tracey Wallace offered “I truly believe the investment needs to be in the SaaS product (a smart data stack) that is intuitive for users,” and Bob Hayes’s thought was “Need to consider organizational change agents that instill importance of Big Data and analytics in company culture,” both good points. Natasha Bishop offered, “culture must be in equation.”

Tim Crawford had a similar thought, that “Tech innovation is only part of the equation. Process, organization, and new paradigms are also needed to truly evoke innovation,” echoed in Marko Pitkanen’s tweet, “In many cases cultural change is needed if an organization wants to be data driven. Decisions, people & tech need to be lined up.”

Marie Wallace’s view was, “As well as data scientists we need biz folks that can understand and internalize analytics and integrate them into how they work.”

Bob Hayes added, “education in the research methodology would be good. Big Data does not speak for itself.”

7) Given the past few years Big Data & Analytics innovation experience: Where are we heading, and how can organizations stay nimble?

I don’t have a crystal ball, so I’ll fall back on the old (Bayesian?) standard of projecting from past experience…

We’re heading toward more data and better retrieval & analysis, so faster & more effective & more pervasive automation. How do you stay nimble? I’ll go back to my response to an earlier question: Experiment, explore. Stay aware and stay open.

I admit, that answer felt kind of obvious, platitudinous.

Tracey Wallace’s thought was, “Big Data is a necessity now. The guys who use it right will prevail. To stay nimble, you need Big Data and people who can use it.” Steve Massi contributed, “Nimbleness [is] driven by [an] open knowledge base. Narrow base and lack of access to data points leads to rigid decision making.”

These are good points, and a good conclusion to the February 26, 2014 #BigDataMgmt tweetchat.
