Can Machines Beat The Pollsters?

Essencient Co-Founder Rob Lancashire talks to the BBC’s Technology Correspondent, Rory Cellan-Jones, on Tech Tent, his popular program for the BBC World Service. They discuss why the company thinks it has cracked extracting accurate feelings from social media using its Natural Language Processing and Noise Reduction technology, and how it is using this to look for trends in feelings of support towards the main parties during campaigning for the forthcoming UK General Election.

To read his article and to listen live or download the podcast click: HERE



Essencient AI Tech Outdoes Pollsters, Predicting Trump Support Well Before the Polls Close

Just like the London Mayoral Campaign and Brexit, Twitter Voices Signaled the Result

London, UK & West Palm Beach, FL, US November 18, 2016:  As it did with the London Mayoral Race and the EU Referendum, start-up Essencient’s patented AI technology could sense the momentum on Twitter in the US Presidential Election well before the pundits knew what was happening. Before the polls closed, their tech identified a huge groundswell of engagement on Twitter and it was all for Trump. Essencient say they have once again successfully applied their AI technology to measure the “engagement current” on social media and have demonstrated its potential to predict outcomes.

Whilst Essencient acknowledge they were not the only company to demonstrate the prediction opportunities AI presents when combined with social media, they believe they are the only company doing it without significant cost and time overhead when compared to other approaches.

“Our technology requires no training to produce results, so it is incredibly flexible and allows all sorts of scenario planning very easily and quickly – the ‘what-if’ options allow us to test multiple hypotheses really rapidly,” explained Rob Lancashire, Co-founder of Essencient. “For example, in the US election we examined the data in multiple ways, and every view we took showed us the same thing. Twitter users overwhelmingly engaged actively and positively for Trump over Clinton in the last days. We reported that fact to our shareholders a full day ahead of the election, and then once we concluded that the momentum for Trump was unmistakable, to the public via LinkedIn on the 8th November at 6.58pm EST (23.58 GMT). That was at least an hour before the first polls closed and before anyone was even considering a win for Trump. It was like Brexit all over again, and in truth some of the team doubted our own predictions.” Lancashire added, on a light note, “Some of our shareholders who had absolute faith in the technology actually backed our predictions with their cash at the bookies, and are riding high. Even without relying on Essencient for betting tips I am now confident that won’t be the last cash pile we help them make!”

While the Essencient engine uses hugely sophisticated AI, it is in practice really simple to use.  The team applied a simple Boolean query to gather some 17 million tweets relevant to the election. They then fed these into the engine to filter out those that demonstrated no meaning or feeling of any kind, leaving just those that had “real engagement” towards the candidates.   Multiple factors, many of which are unique to the Essencient engine, were taken into account in filtering the results to segment tweets expressing, for example, strongly held opinions, real intent to vote and active advocacy.

“An effort of this scale relies heavily on the ability to automate the process of separating ‘noise’, the uninformative conversations, from the ones that actually mean something – right up front. With other systems, this first level of filtering is typically done by humans every time, which is just not feasible in terms of time or cost in a commercial scenario”, explained Mike Petit, Co-founder of Essencient. “We ran two distinct processes. The first, our Noise Floor technology, reduced the 17 million tweets to 9 million. With the ‘noise’ eliminated, the second stage of Essencient’s technology – Engagement Metrics – was applied. The Engagement Metrics, which use Essencient’s patented Natural Language Processing capabilities, allowed us to further filter the truly meaningful tweets based on the strength of engagement. Engagement Metrics reduced the 9 million tweets to 4 million highly qualified ones. Although complicated to explain, the Essencient technology can produce this high value data in seconds. Having identified the high value tweets, conventional analysis techniques were then applied to identify the Twitter users’ attitudes and intentions over specified periods.”
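As a rough check on the figures quoted above, the two-stage reduction can be expressed as a simple funnel. The sketch below uses only the tweet counts given in the release; the function name and report layout are illustrative assumptions and say nothing about how the Essencient engine itself works.

```python
# Tweet counts as quoted in the release, in processing order.
stages = [
    ("Boolean query", 17_000_000),
    ("Noise Floor", 9_000_000),
    ("Engagement Metrics", 4_000_000),
]


def funnel_report(stages):
    """Return (name, count, share_of_previous, share_of_initial) per stage."""
    initial = stages[0][1]
    previous = initial
    report = []
    for name, count in stages:
        report.append((name, count, count / previous, count / initial))
        previous = count
    return report


for name, count, of_previous, of_initial in funnel_report(stages):
    print(f"{name:20s} {count:>12,d}  "
          f"{of_previous:6.1%} of previous  {of_initial:6.1%} of initial")
```

On these numbers, roughly half of the gathered tweets survive the first stage and a little under a quarter survive both.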

Lancashire added, “Interestingly, for months prior to Election Day the tweets reflected the traditional ‘scientific’ polling, but several days before the polling stations opened, a clear swing towards Trump began to emerge despite the traditional polls saying otherwise”. He continued, “It was puzzling at first, but then we began to understand. It wasn’t so much that the tweets were greater in number for one candidate or another. It was, rather, that Trump supporters were significantly more expressive of their feelings than Clinton supporters. They spoke in stronger terms, expressed much more certainty about taking action, spoke more negatively of their candidate’s detractors, and offered more reasons why others should support their candidate. In short, the level of passion was much higher. That trend intensified right up to the end, which we believe may well have influenced those voters who were undecided up to the point of voting. Although doubtlessly some of the traffic was generated by bots, I believe no candidate on the planet, even with the current state of the art in bots, could artificially create the level of characteristically human feeling and engagement we were seeing.”

Petit added, “Although we’re certainly not claiming to have definitively predicted the outcome of the election using this limited data set, we did feel we could say with some confidence that engagement was higher for Trump, which allowed us to predict in the period leading up to the polls finally closing that an unmistakable level of support for Trump was building.”

Did that enthusiasm reflect itself in voter turnout? “We believe it did”, responded Petit, “Clinton tweeters simply did not demonstrate the same level of passion and advocacy on Twitter. Although quantifying the impact of that engagement gap would require far more analysis, the fact of that gap is now obvious to all; therefore it makes a compelling case for this method of analysis.”

Essencient is on to something big. Prior analyses of the London Mayoral campaign and the Brexit battle demonstrated the existence of this “engagement current” on social media in general, and its correlation with the outcome. In the case of Brexit, the Remain and Leave percentages Essencient calculated  closely mirrored the final results and – as with Trump – ran counter to what the polls were predicting.

“We shared these US Election predictions only with our shareholders and the LinkedIn network. We can make this kind of social intelligence available to anyone through our website, whether they be someone looking for intelligence on political futures or a brand looking for social intelligence on their products”, says Lancashire. He adds, with a chuckle, “I am confident in saying we will have others joining us for a celebratory drink in the future when they win on the back of our predictions.”

Social Media Lead Mining Company Essencient Proves Its NLP Tech By Predicting EU Referendum To Within 0.3%

Essencient today announced that its Social Media Lead Mining technology, which uses patented Natural Language Processing (NLP) technology to remove noise around a chosen subject or brand, had proven accurate enough at mining meaningful posts on Twitter to be able to predict the UK EU referendum to within 0.3%.  The company said that as far as it knew it was the only commercially available platform to get this close. Essencient’s results validate the potential of its new and innovative approach to accurately mining actionable data, such as sales leads and customer support requests, from social media.

The company processed posts from Twitter on the EU Referendum and analyzed sentiment for the period of 7 days up to midnight of the day of the vote, and compared these findings against the actual results. The results were -0.3% off the actual referendum result, with their analysis predicting that ‘Leave’ would be the winning result. Additionally, they plotted data during the run up and sentiment was indicating a ‘Leave’ win for most of the two weeks leading up to the vote.  They also performed a comparable analysis for Twitter data from the London Mayoral Election that took place in May, and their results for sentiment were +2.7% off the actual second round voting result. When they analyzed both campaigns for ‘intention to choose’ the variances were -1.5% and -0.6% respectively, again both clearly indicating the actual outcome.

The Essencient Essence Mining™ platform takes a novel approach, using its NLP technology to analyze texts from sources such as Twitter and identifying the ones that have no meaningful linguistic markers in relation to a chosen subject. These are then removed, to leave a stream of high quality, meaningful conversations with markers such as sentiment, intent, seeking/giving guidance or flamboyancy of writing style, all indicators of engagement. These meaningful conversations can then be reviewed for action or analyzed for insights.

Essencient COO Mike Petit said, ‘Over the sixteen years I have been involved with NLP and social media, the amount of noise has become the most serious hurdle to effective engagement. By noise we mean the number of posts where a target such as a brand might be mentioned, but nothing meaningful is said about it. Because we eliminate the noise, humans don’t have to wade through huge amounts of raw posts to find the good stuff, and the analyses and decisions based on cleaner data are much higher in quality. Think of it as reversing the process of looking for needles in a haystack. The hay is much easier to see, so why not find that and get rid of it to leave just the needles? Since the noise can be as much as 95% of the data, it saves a lot of effort and delivers a lot more value.’

When asked why no one else was doing this, Petit responded, ‘Our patented NLP tech is unique in what it can identify. For example, no one else is able to identify if an author is seeking or giving guidance like we can. We can also find references to a topic across sentences when pronouns like “he”, “she” or “it” are used to tie them together, a process called co-referencing that is on the cutting edge of computational linguistics. These and other advances permit our tech to deliver levels of accuracy that are unmatched.’

Essencient CEO Rob Lancashire added, ‘The analysis of the EU Referendum and London Mayor Elections was an exercise to demonstrate the accuracy of the system, and whilst it validates the market leading ability of the system, predictive analytics is just one aspect of the real value our platform delivers, which comes from helping brands to increase engagement with customers on social media. Consider that 48% of messages towards brands are not replied to, and that the average response time when a brand does respond is 9 hours. Add to that the fact that only 3% of posts are directed to the brand using its handle or tag, and it’s clear a lot more can be done. But it’s not easy. Social media etiquette typically requires a brand to intervene in a conversation only when addressed, and not to intrude when only mentioned in passing. The accuracy of our platform covers all these points for brands by delivering only the posts that are meaningful, so a brand can efficiently and quickly make engagement decisions case by case. This could be a sales opportunity or a customer support issue, for example, but either way when you consider customers are reported to spend between 20% and 40% more if they engage on social media, there is a big opportunity out there waiting.’

Lead Generation from Social Media – getting rid of the Noise within the noise to leave the needles

I read a really good article recently about how important the construction of the search query is when pulling data out of social media platforms such as Twitter. It’s definitely also our experience at Essencient that if you don’t get this right, you will pull back a large amount of noise that is just not relevant to what you are looking for: you would still need to go through this possibly huge haystack to find the needles. This is particularly important when you are using specific keywords to try and find sales leads. A whole load of people out there use the same words for many things that are not remotely related to your target, but you could still get their posts in your data set…the noise! For example, let’s assume I was looking for people who had engaged on social media with subjects that might indicate interest in fitness products. Using a keyword such as ‘run’ (which by one count has 179 different meanings!) in my search query would return a lot of irrelevant data from people who are not actually of interest to me. Such a query is not constructed well: too generic, too broad, too noisy. Focusing on searching for ‘handles’ or hashtags might seem to be an answer but, as Gartner reports, only 3% of brand mentions are tagged, so using that approach severely limits what you get back.

So getting the query right is key to reducing noise but, and it’s an important but, this is only part of the lead identification challenge because there is noise within the noise. This is in the form of posts that are relevant to your target, and are legitimately returned by the search query, but just don’t say anything actionable you can use as a lead. We have found that this noise can make up between 80% and 90% of the results of even a well-constructed query.

An example related to fitness might be, “Time for another run in the pouring rain.” I am afraid that, as humans, we are quite good at speaking whilst not saying anything that someone can then do anything about, particularly on social media. However you still have to wade through all the non-actionable stuff in the haystack to find the actionable needles you really want. If you consider that Social Media Lead Generation Heaven is being presented with just the posts that you can actually do something with, then being able to identify and engage with those posts quickly and easily has to be your ambrosia.

Most traditional social media tools provide functionality that lets you search your data set using multiple keywords. This works to some extent, but can be quite blunt and typically requires training and practice to overcome the limitations of the search protocols. More importantly, most words have many synonyms and related concepts, which means you probably still get quite a lot of noise.  It’s not then surprising that according to Hootsuite only 15% of marketing executives have been able to demonstrate social media’s quantitative effect on their business. I imagine that the other 85% got fed up with wading through the noise and gave up!

We have found that by taking an alternative approach, using natural language processing technology to surface and combine a number of linguistic markers in the text, we have been able to identify posts with no meaning relative to a target and remove them: the hay! You only have to look at the ones that get sifted out to see that typically there is nothing you can do with them or learn from them. The ones you are left with contain markers for actionability: intent to do something about fitness; some sentiment towards running; questions or opinions about fitness; and more. The presence of such markers can provide a nice stream of posts ready for a team to engage with and sell to. More importantly, they all merit attention, so you don’t end up with a hacked off sales force who have written social media leads off as junk. I know from my days as a sales foot warrior, when only cold calling or door knocking was out there, that I would have thought I had gone to heaven if I had this source of quality leads.
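To make the idea of "removing the hay" concrete, here is a deliberately crude sketch. It is not Essencient's NLP: it swaps in a handful of hand-written regular expressions (all of them hypothetical, invented for this illustration) to stand in for linguistic markers, and treats any post with no marker at all as hay.

```python
import re

# Hypothetical marker patterns, invented for illustration only.
# Real linguistic analysis would be far richer than keyword matching.
MARKERS = {
    "intent": re.compile(r"\b(going to|want to|need to|planning to|looking for)\b", re.I),
    "question": re.compile(r"\?"),
    "sentiment": re.compile(r"\b(love|hate|great|terrible|awful|amazing)\b", re.I),
    "guidance": re.compile(r"\b(you should|recommend|any advice)\b", re.I),
}


def find_markers(post):
    """Return the sorted names of all markers found in a post."""
    return sorted(name for name, pattern in MARKERS.items() if pattern.search(post))


def split_hay_and_needles(posts):
    """Posts with at least one marker are needles; the rest are hay."""
    needles, hay = [], []
    for post in posts:
        (needles if find_markers(post) else hay).append(post)
    return needles, hay


posts = [
    "Time for another run in the pouring rain.",
    "I need to buy new running shoes before the marathon, any advice?",
    "I love my new running watch.",
]
needles, hay = split_hay_and_needles(posts)
print("Hay:", hay)
print("Needles:", needles)
```

With these made-up posts, the "pouring rain" example ends up in the hay, while the two posts showing intent, a question or sentiment are kept for a person to look at.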

At the end of the day, who wouldn’t want that steady stream of good quality leads that social media has been promising since it took off? And with Hootsuite reporting that 60% of social media managers cite ‘measuring ROI’ as one of the top three most challenging aspects of a social media campaign, what CMO wouldn’t want to be able to link it directly to sales ROI, especially as Bain & Co. found that customers who engaged with brands over social media end up spending anywhere between 20% and 40% more money?

Who Are My Advocates?

Identifying social media brand advocates has long been among a marketer’s aspirations. We all know that nothing carries cred like a happy, independent believer. And if that believer happens to have, say, a few hundred thousand social media followers, the boost to the brand can be enormous.

The corollary, sadly, is also true: make an enemy out of a highly-influential blogger, and you have a major PR issue on your hands.

Identifying both of them follows about the same course, so let’s stick with advocates. How can you figure out who is advocating on your behalf?

The answer lies in the combination of two “signals” that we identify in a text (in fact, the addition of one more can be a real refinement; more on that later).

The first, and obvious, signal is polarity (a.k.a. sentiment). It’s hard to consider someone an advocate if they don’t like you. So step one is to isolate posts where a positive, direct reference to your brand has been made. Essencient provides that information directly and with an industry-high level of accuracy.

Positive sentiment is not, on its own, a definite marker for advocacy. This is because advocates have more skin in the game than just having an opinion: they want you to have their opinion, too. So, they offer advice about what they advocate: they tell you to buy it, or consider it, or take some other (presumably constructive) action about it.

That’s as distinct from them just intending to do something themselves. “I like that new Beamer and I’m going to buy it” is a good thing, but not as strong an advocacy as “I like that new Beamer and you should buy it”. The offering of advice is a clear marker, with positive sentiment, of advocacy. Essencient, and only Essencient, provides the so-called Guidance signal that can identify advocacy.

There’s a good refinement available: Flamboyancy, another signal that Essencient (alone) provides. Flamboyancy measures how “flowery” the language is in a post. “I took a long, exciting test drive in that beautiful new Beamer.  You should buy it immediately!” is a far stronger endorsement and call to action than the last example above. Factoring Flamboyancy into the identification of advocates can only improve your results.

As I mentioned, the corollary to advocacy is detraction. Knowing who vocally dislikes your brand is pretty important, too. And, of course, knowing who your competitor’s detractors are is a powerful insight as well. Identifying detractors differs from identifying advocates only in the sentiment being expressed (obviously, it will be negative for detractors).
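The combination of signals described above can be sketched in a few lines. This is only a toy: it assumes the polarity, Guidance and Flamboyancy signals have already been extracted for each post, and the field names, the 0-to-1 Flamboyancy scale and the scoring formula are all hypothetical choices made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class PostSignals:
    polarity: int         # +1 positive, 0 neutral, -1 negative toward the brand
    gives_guidance: bool  # the author advises others to act
    flamboyancy: float    # 0.0 (plain) to 1.0 (very flowery); hypothetical scale


def classify(signals):
    """Guidance plus positive sentiment = advocate; plus negative = detractor."""
    if not signals.gives_guidance or signals.polarity == 0:
        return "neither"
    return "advocate" if signals.polarity > 0 else "detractor"


def strength(signals):
    """Hypothetical ranking score: more flamboyant posts rank higher."""
    if classify(signals) == "neither":
        return 0.0
    return 1.0 + signals.flamboyancy


# "I like that new Beamer and I'm going to buy it"
buyer = PostSignals(polarity=1, gives_guidance=False, flamboyancy=0.1)
# "I like that new Beamer and you should buy it"
advocate = PostSignals(polarity=1, gives_guidance=True, flamboyancy=0.1)
# "I took a long, exciting test drive ... You should buy it immediately!"
flowery = PostSignals(polarity=1, gives_guidance=True, flamboyancy=0.8)

for post in (buyer, advocate, flowery):
    print(classify(post), strength(post))
```

Flipping the polarity to negative while keeping the guidance flag turns an advocate into a detractor, which mirrors the point that the two differ only in sentiment.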

So there it is: advocates and detractors identified. If you layer on reach or Klout scores, you can pinpoint who your greatest reward and risk bearers are and, through your engagement team, influence them appropriately.

Chasing 80%: Peripheral Sentiment

I recall that academic research performed in the past has demonstrated that determination of polarity in text (what we have come, perhaps incorrectly, to call “sentiment”) has an accuracy ceiling of about 80%. That is, given a body of statements, a team of human beings will not agree on the sentiment in those statements more than 80% of the time. That’s because, well, people have opinions.

So, the best we can hope to achieve in automated sentiment analysis would be that 80%. A lofty goal, indeed.

In an earlier post, I discussed the importance of World Knowledge in making good decisions: that knowledge we’ve gained from living in our world that biases, hopefully correctly, the calls we make. Computers are not good at world knowledge, although we’re always trying to get better.

At Essencient, we have recently developed a concept that we call Peripheral Sentiment. Since our objective is not to make decisions for you, but to make sure that the texts you see are worthy of human evaluation, PS is a very useful thing.

PS says that for a given topic, if there is no directly measurable sentiment for or against it, but there is sentiment of some kind at the level of the entire text, some of that latter sentiment might be ascribed to the topic. Therefore, the text is probably important enough for human evaluation.

Here’s an example: “I stayed at Hotel Roger last night. The food was terrible.” World knowledge tells us that a hotel can be judged by its food. Therefore, this text could be considered negative about Hotel Roger, although nothing directly critical of that topic was said. Peripheral sentiment score: negative. Human, please take a look.

It’s not perfect. Consider this one: “Drove the Benz to dinner last night. The food was terrible.” You get it, of course: this says nothing bad about the car, since food and cars have no world knowledge relationship. In this case, we’d put this in front of a human, who would probably consider it unimportant (about Benz, that is).
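The rule behind PS is simple enough to write down. The sketch below assumes that a direct (topic-level) sentiment and a text-level sentiment have already been computed by some other means; the function names and the use of plain strings for sentiment labels are illustrative assumptions, not a description of our implementation.

```python
from typing import Optional


def peripheral_sentiment(direct: Optional[str], text_level: Optional[str]) -> Optional[str]:
    """Ascribe text-level sentiment to the topic only when no direct sentiment exists."""
    if direct is None and text_level is not None:
        return text_level
    return None


def needs_human_review(direct: Optional[str], text_level: Optional[str]) -> bool:
    """A text is surfaced if it carries direct or peripheral sentiment about the topic."""
    return direct is not None or peripheral_sentiment(direct, text_level) is not None


# "I stayed at Hotel Roger last night. The food was terrible."
# Nothing direct about the hotel, but the text as a whole is negative.
print(peripheral_sentiment(None, "negative"))  # topic: Hotel Roger

# "Drove the Benz to dinner last night. The food was terrible."
# Same inputs, so the rule flags it too; a person then dismisses it.
print(needs_human_review(None, "negative"))    # topic: Benz

# A text with no sentiment anywhere is not surfaced.
print(needs_human_review(None, None))
```

Note that the rule cannot tell the hotel case from the car case, which is exactly the imperfection described above: that distinction is left to the human reviewer.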

Still, even with imperfections, it does make it more likely that texts with only indirect sentiment towards your brand will at least end up getting reviewed. That’s why PS takes us a step closer to the 80% mark. Expect to see more about PS going forward.

Your Team is Smarter Than Our Tech

One of the cornerstone beliefs reflected in Essencient product design is the notion of your team’s competence: you invest in their training and the systems that support them, because you know that all they need to excel is the best conditions to do so.

Put another way, you’re not asking us to decide whether a social media post is a lead or a case, or anything else other than significant enough to warrant attention. Removing the hay so the needles are visible is the best way we can empower your team to make the best decisions. So we concentrate upon doing that.

One of the main reasons we emphasize hay removal is that your people are smarter than our tech. That will always be true. And one of the main reasons for that is what we call “world knowledge”. Sometimes people call that “context”. World knowledge is what we employ to make quality decisions about everything, and why we are capable of making decisions based upon more than just the case that’s in front of us.

A great example of the need for world knowledge is sarcasm. If I were to say, “Wow, isn’t it great that taxes are going up again?”, the odds are overwhelmingly good that I’m being sarcastic. Why? Because very few people actually like to pay more tax (I don’t know any, that’s for sure).

One of the greatest challenges to NLP tech like ours is a computer’s lack of world knowledge. Sure, we can try to develop constructs and methods to approximate world knowledge (and we do try…), but they will always be error-prone.

Your people, though, are far more likely to make the right call. So, our logic goes, why ask you to depend upon a substitute for something people do better? Precisely because we can dependably eliminate the noisy, useless conversations at low cost, we can deliver the much smaller number of highly useful conversations to your team for high-value decisions.  And that, we believe, is what you really want us to do.

Do you agree? We’d appreciate knowing your opinion.

Can Social Leads Actually Pay Off?

I’ve had many discussions in the past with customers about the viability of “social leads”, that is, leads mined organically from social media. The consensus has been that it’s an edgy proposition, and for many good reasons. In this post, I’d like to advance the thesis that social leads can be very viable if, and only if, we stop trying to take the qualification and go-no-go decisions out of the hands of those best equipped to make them.

Many social media processing offerings based upon Natural Language Processing set expectations that cannot be met. They promise a continuous stream of viable sales leads generated auto-magically in real time and injected seamlessly into your sales workflow. The sheer volume of social media, we are told, means that leads will simply pour into the funnel. Experience has shown, of course, that it doesn’t work like that.

First off, there’s nothing static or universal about what constitutes a lead, much less a qualified one. I remember a college friend reminding me that, “at two in the morning, any dance is a good dance.” Put another way, when the phones aren’t ringing, any lead is worth following up. The corollary is also true: when the team is under pressure, “I love my friend’s new phone” is probably not going to generate an intervention.

Not only does the definition of a good lead change according to circumstances, the decision to reach out to the social media user is always a matter of risk/reward: blatant intervention violates the privacy expectations of many users, and can really backfire on the brand. Only a trained team member can make that call.

Furthermore, the automatic injection of large quantities of probably-noisy social leads into the sales workflow can really tank the team’s KPIs. Metrics like percentage of opportunities closed, or even just percentage of opportunities addressed, can go right through the floor if the queues are flooded with noise masquerading as leads. Given a low conversion rate on social media interventions, many managers simply don’t want to risk degrading otherwise solid team performance.

So, what to do? Notwithstanding all these obstacles, the social lead generation channel is growing and maturing, and cannot simply be ignored. There’s gold somewhere in all that dirt.

We believe that the application of Essence Mining™ techniques to sales workflow can really boost sales performance. The key is to let your team decide which conversations they should act upon. They can do that, if they’re not overwhelmed by the volume of the lead funnel. Since most posts are noise, Essence Mining can surface the 5% or so of conversations that might actually lead to a sale. Once those conversations, and only those conversations, enter the workflow, your team can qualify them; action them if appropriate; quickly reduce the queue while maintaining or boosting KPIs; and, we may hope, actually generate significant profit and brand benefits from social media.

Essencient technology doesn’t try to decide what’s a lead and what isn’t for you, but just makes sure that what your team sees is worth seeing so you can treat social media exactly the same way you treat other, less noisy channels.

That’s how we see it, but get in touch to let us know what you think.

Mining The Undisputed Automotive Gold Standard

The continuous onslaught of (artsy, racy, glitzy…) luxury automotive advertising must be effective. Why else would such vast sums be spent on influencing a choice that most of us make only every few years? Clearly, the average buyer is not going to race out and spend north of $60,000 for a luxury sedan on the strength of a 30-second spot, no matter how compelling. It’s about who we think of when we start thinking, “time for a new ride.”

So, who do we think of?

Essence Mining™ (EM) is the perfect methodology for answering that question. To do so, we set up Essencient profiles for each of the major luxury auto manufacturers and turned the analysis loose for two weeks. About six million tweets came in, and the EM engine determined that about 92.7% of them were just noise, and cleared them away. That left us with 438,000 important tweets that we could analyze using the Essencient for iPad app. What we learned was an eye opener.

Essencient for iPad’s “Competition” view answers the question, “For a tweet focused on a brand, what other brands are most often mentioned by Twitter users in the same tweet?” Although one can drill down from there, just the answer to that basic (but hard to answer) question reveals the gold standard.
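The question the “Competition” view answers boils down to counting co-mentions. Here is a minimal sketch of that counting, assuming simple word matching against a short brand list; the brand list, the sample tweets and the function names are all made up for illustration, and the real app works on tweets that have already been through the EM engine.

```python
import re
from collections import Counter

# Illustrative brand list; matching on single lowercase words is an assumption.
BRANDS = {"mercedes", "bmw", "lexus", "audi", "cadillac"}


def brands_in(tweet):
    """Return the set of known brands mentioned as whole words in a tweet."""
    words = set(re.findall(r"[a-z]+", tweet.lower()))
    return words & BRANDS


def competition_view(tweets, focus):
    """Count which other brands appear in tweets that mention the focus brand."""
    counts = Counter()
    for tweet in tweets:
        mentioned = brands_in(tweet)
        if focus in mentioned:
            counts.update(mentioned - {focus})
    return counts.most_common()


# Made-up sample tweets, not real data.
tweets = [
    "Test drove the BMW today, still prefer the Mercedes",
    "BMW or Audi? Can't decide",
    "My BMW rides better than any Mercedes",
    "Lexus service was great",
]
print(competition_view(tweets, "bmw"))
```

On this tiny invented sample, the brand most often mentioned alongside BMW is Mercedes, followed by Audi.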

We found that just about every luxury auto brand’s main competitor was the same brand. The gap between that brand and all others was significant, sometimes two to one. Twitter users with real intent and real opinions regarding luxury autos compared just about every other brand to this one.

And that brand is…

Mercedes Benz.

Duh, you may say. But would BMW or Lexus have been surprising? The competition is fierce, and costly: the real question is, why did Benz win?

They won by sheer brute force: according to Statista, in 2013 Benz topped the nearest US luxury competitor ad spend (Cadillac) by 14% and outspent BMW by almost three to one. The cost of being the gold standard? 323 million dollars.

By marshaling Twitter traffic for a mere two weeks, and mining the essence of that traffic, we were able to see that Mercedes’ awareness strategy is working. We could also see what criteria were applied to the comparisons, and what issues drove intent to purchase (or not to purchase). We could even see how each brand was described across the tweets, which goes a long way towards explaining why that brand leadership stays sticky.

Does your brand have that kind of business intelligence?

By the way, people who mentioned Mercedes Benz tended to distribute their comparisons across brands like Range Rover, Jeep, Lexus, Audi and even Ferrari about equally, 10% or so against each. So the impact of that ad spend was pretty even regardless of competition.

Stay tuned for competitive analyses in other important market categories, including mobile devices and…beer.