Sentiment Analysis


November 02, 2017 - by

Sentiment Analysis

Sentiment Analysis in Excel! There is a free add-in from Microsoft Labs that will let you do sentiment analysis in Excel. What if you have to wade through hundreds of survey comments to see what people think of your company? Excel can assign a probability showing how positive or negative each comment is.


Watch Video

  • It is easy to quantify survey data when it is multiple choice
  • You can use a pivot table to figure out what percentage each answer has
  • But what about free-form text answers? These are hard to process if you have hundreds or thousands of them.
  • Sentiment Analysis is a machine-based method for predicting if an answer is positive or negative.
  • Microsoft offers a tool that does Sentiment Analysis in Excel - Azure Machine Learning.
  • Traditional sentiment analysis requires a human to analyze and categorize 5% of the statements.
  • Traditional sentiment analysis is not flexible - you will rebuild the dictionary for each industry.
  • Excel uses MPQA Subjectivity Lexicon (read about that at http://bit. ly/1SRNevt)
  • This generic dictionary includes 5,097 negative and 2,533 positive words
  • Each word is assigned a strong or weak polarity
  • This works great for short sentences, such as Tweets or Facebook posts
  • It can get fooled by double-negatives
  • To install, go to Insert, Excel Store, search for Azure Machine Learning
  • Specify an input range and two blank columns for the output range.
  • The heading for the input range has to match the schema: tweet_text
  • Companion article at: http://sfmagazine.com/post-entry/may-2016-excel-sentiment-analysis/

Video Transcript

Learn Excel from MrExcel Podcast, Episode 2062: Sentiment analysis in Excel

Oh hey, it was a Thanksgiving night and we were sitting around the pumpkin pie and Jes, friend of ours, started talking about doing sentiment analysis on Twitter data.  And I said, “Hey, you know that Excel has a way to do sentiment analysis.”  And I realized I didn't have a good video on this or any video on this, so this video is about doing sentiment analysis in Excel.

Now the first question is, what the heck is sentiment analysis?  And if you do a survey of your customers and they have a multiple choice selection where they can choose from 1 to 5, well, that's really, really easy to analyze.  You can just create a little pivot table: Insert pivot table, Existing Worksheet right here, click OK.  We want to know the question there or the answer to the question, and then how many answers there were for each one, and that gives us the absolute number.  You can even come in here and change this from Field Settings to Show values as a % of the Column Total, like that. 

Alright, so you can see for each answer what percentage of the people get an answer.  Alright, but sentiment analysis is for when you have a really long answer where you say, “Hey, alright, well you know, tell us why you gave us that answer?”  And they, you know, use sentences or paragraphs.  Well, if you have hundreds or thousands of these, it's very hard for someone to go through it and read them all and figure out what's going on, alright? 

So there's two different kinds of sentiment analysis.  Typically in the past you'd use a human supervised learning algorithm.  So if you had 5,000 answers, go through, you know, 200 of those and choose the positive and negative words and phrases.  You're essentially building a dictionary of the positive and negative words; but, you know, this was very limiting.  If you did this for a place that did car repair and then had a different customer, you know, who did carpet cleaning, those two dictionaries are completely different.  You have to do the machine learning or the human supervised learning over and over and over again.  So, Excel uses this thing called the MPQA Subjectivity Lexicon and you can go Google this.  It has the info about it - 5,097 negative words, 2533 positive words.  And so, it works great for short sentences or Tweets or Facebook posts.  But one thing I've noticed is that if someone is writing in double negatives, I cannot say that I do not hate this feature, well, the machine learning will fail there.  And heck, I fail.  I can't tell if they're happy or not. 

Alright, so here's what we do.  In Excel 2013 or Excel 2016, go out to the Insert tab, go to the Store, when the search box comes up search for Azure Machine and you get Azure Machine Learning right there.  We click Add.  Alright, and two different tools out here: the Titanic Survivor Predictor, which is fun; and, the Text Sentiment analysis Excel Add-in.  Let’s use that one.  Alright, here's a couple of things that will trip you up.  Your heading: Take a paragraph to explain your answer.  It needs to match the Schema and the Schema says that the heading has to say tweet_text.  So, up here: tweet_text, of course, case sensitive matters, alright.  And then close the Schema, and then Predict, Input: A1 to 100, My data has headers, Output: DataB1, Include the headers.  They're going to give us 2 columns.  Make sure that you have 2 blank columns there; otherwise, it's going to override the data.  You have 2 choices: A few rows at a time or As a batch.  This is just a hundred, so it really doesn't matter.  I will choose Predict and BAM!  Just that fast. 

Alright now, we get 2 columns: we get a Sentiment and a Score, alright.  So, let's represent the scores here as percentages with a bunch of decimal places.  Alright, so 47.496, this goes from 0 to 100%.  Close to 100 is extremely positive, close to 0 is extremely negative, alright?  So here, we have one where there's a minor problem, drives me crazy.  Can't find the solution, so you can see why that's being rated as extremely negative.  Let's look at one that comes up extremely positive.  Alright, so you know, so we have some happy words here: please and thank you, exclamation points and so on.  That might be contributing to the high score.  Alright, so is it perfect?  No, but it'll give you a quick, quick way to tell you, you know, how many people are extremely happy or extremely negative about those answers. 

And of course, again, here we can do this with a pivot table: Insert, Pivot Table, go to an Existing Worksheet right here, click OK, and we're interested in the Sentiment, and then maybe with the average Score is for each of those.  So we'll change this under Field Settings to be an Average, click OK.  And so, or maybe even a Count.  I guess we'd want to know the Count, how many people.  So we'll take some other field, and so, we know how many people were negative.  Ooh, how many people were neutral, how many people were positive and what the average score of each of those was. 

Alright, so if you have survey data and it's a multiple choice, easy to use a pivot table to figure out what percentage each answer has.  But for free-form text answers, it's hard to process.  If you have hundreds or thousands of them, sentiment analysis is a machine- based method for predicting if an answer is positive or negative.  Microsoft offers a free tool for this.  Works in Excel 2013 or Excel 2016, called Azure Machine Learning.  Usually have to go through and categorize 5% of the statements manually by hand.  It's not flexible, you have to re-categorize for each new data set, but Excel is using this MPQA Subjectivity Lexicon.  It's a generic dictionary.  It's going to work for short sentences, Tweets, Facebook posts.  I can get fooled by double-negatives.  So just go to the Excel Store, search for Azure Machine Learning.  Specify an input and a two columns for an output range.  Don't forget to change the heading to match the Schema, tweet_text, in this particular case.

Alright, so there you go.  Next time you have a large amount of data to analyze, check out using Azure Machine Learning, the free Add-in for Excel 2013.  Thanks for stopping by, we'll see you next time for another netcast from MrExcel.

Download File

Download the sample file here: Podcast2062.xlsm

Title Photo: PDPics / Pixabay