Learn Excel - Understanding FiveThirtyEight - Podcast 2057

If you like this content, please consider visiting the video on YouTube and subscribe to the MrExcel Channel to show your support!
This video has been published on Oct 29, 2016.
Ever since I saw Nate "Moneyball" Silver present at a Microsoft conference, I've been a junkie for his election statistics at FiveThirtyEight. But Nate has baseball predictions, and as a long-suffering Cleveland Indians fan, I live and die by Nate's predictions for Cleveland winning the world series. But his model swings wildly after each game, first saying the Cubs would win, then the Tribe, then the Cubs, and now the Tribe. How can this be? Today, I use an Excel spreadsheet to play the World Series 150,000 times and come incredibly close to Nate Silver's conclusion. Thanks to Excel MVP Jon Peltier for his awesome chart utilities at: Peltier Tech Charts for Excel 3.0
maxresdefault.jpg


Transcript of the video:
Learn Excel from MrExcel Podcast, Episode 2057: Understanding 538 Well, I'm a Cleveland Indians fan.
If you follow baseball, American baseball, which means that in my entire lifetime, when I seen the Indians in the World Series two times before this year, three times this year, and the Indians are ahead 2 to 1 in a best of 7 series.
They're fighting against the Chicago Cubs who have had an even longer drought, either team winning the World Series will be a great story.
And I spend a lot of time at my favorite website, Nate Silver’s FiveThirtyEight.
A lot of it for election coverage but Nate is a baseball guy first, and he has these awesome statistics and I live and breathe on these statistics.
So right now, Nate is saying the Indians’ chance of winning the World Series is 63%, and the Cubs’ chance of winning the World Series is 37%.
But these numbers fluctuate wildly after every game and so using the Elo statistic and really advanced modeling, Nate predicts the outcome of each of the games.
And look at this, the Indians aren't predicted to win.
They're not predicted to win.
They're not predicted to win.
Finally, in Game 7 they're going to predict a win.
They have to win.
For Nate, how could it be that with your calling 3 games against the Indians, they're going to win?
Alright, and so I needed to wrap my head around this and of course, to wrap my head around this.
I came in to Microsoft Excel.
So, I took these statistics in.
These are rounded off, I'm sure that Nate has this out far greater percentages.
I took the 49, the 34, the 49 and the 54%.
Put them up here, those are the four games.
And then a simple, simple little formula here that says: =RAN , RAN gives us a random number between 0 and 1.
If that number is less than or equal to the 49%, then I put a 1 for the Indians winning.
So every time I calculate here - See we have a different outcome to this series.
Over here in IF statement, if the Indians win more than 1 game, in other words that they win 2 of the remaining 4 games, they've won the series.
And then here, let's go on where they actually, a very inefficient formula to figure out whether they won in 6 games or 7 games.
Alright, let's be fair.
Calculate if the Cubs win and how many games they can win.
And of course, the Cubs can't win in 5 now, they have to win in either 6 or 7, alright.
So I'm going to model this thing: 10,000 times, 10,000 times right there.
So that whole big thing - Every row here is a different possible outcome and you see, there are a lot of possibilities where Cleveland wins because even though it's more likely the Cubs will win the next game, there's still 49 times out of a hundred that the Indians do win the game.
And, there's enough where the Indians win 2 to win the series.
Alright, so out here on the right-hand side, I have some statistics.
These statistics are just adding up the number of times that the Indians win, divided by 10,000 because there's 10,000 rows, alright?
So, there's 6,420 times are they win.
In 5 games, 1675, 6 games 2478, 7 games 2275, alright.
And then, some more statistics out here, whether Cleveland wins in 5, 6 or 7 or Chicago wins in 6 or 7.
Now, this is a result of 10,000 runs- 10,000 runs and you can see here, the 538 is saying a 63% chance of winning the World Series and right here in 10,000 runs, I have a 64% chance.
And this- Well a couple things.
First off, Nate's model is far more superior than my simple little thing that threw together here in Excel.
Because in his model, once the winner of this game is known, he actually changes the percentages for the remaining items, alright.
And mine doesn't do that, but still order of magnitude, close.
But let's not just run it for 10,000.
Let's use that great trick that I learned from Professor Simon Banega where we create a data table from a blank cell, usually the top-left corner cell of a data table in Excel is the formula, but in this case, I'm going to leave that blank.
We're going to go to Data, What-if Analysis, Data Table and all we do is specify Column Input Cell of any blank cell, and because we have ran back there, it will do those 10,000 calculations now 15 times.
So here's the result, 150,000 runs of the World Series, so: Cleveland in 5, Cleveland in 6, Cleveland in 7, Chicago in 6 and Chicago in 7.
So right now, sure enough, there's about a 63% chance after running in a model 150,000 times.
The Cleveland's going to win the series and tonight Saturday, October 29th , the Cleveland ace pitcher, Corey Kluber, is going to be our starting pitcher, on 3 days rest.
And how important is it for the Indians to win tonight.
Alright, well there's a really easy way to find this out.
Now that we have this model setup, so I copied that data and Paste Special as Values, paste Special Formats, and then said – Alright, if the Indians win tonight, so I'm basically going to change this to 100 which means that every single game in Game 4 is a winner.
And then we'll come out here and you'll see that there's now an 85% chance of the Indians winning the World Series and only a 15% chance of the Cubs winning the World Series.
We’ll paste this down here, paste Special Values, paste Special Formats.
But, let's go the other way.
If the Cubs win tonight, - They're playing at home, first chance to ever win a World Series at Wrigley Field in 100 years.
If the Cubs win tonight, it dramatically changes the playoff picture.
Alright, if the Cubs win tonight then there's no chance for Cleveland to win at 5, all of those have gone away.
And we're down to a 43% chance of the Indians winning the World Series.
So we go from a 63% chance to a 43% chance, if in fact the Indians do not win on the road in Wrigley tonight.
And hey, for all you Cubs fans, all you got to do is pull one out tonight and all of a sudden everything shifts back in your favor and you now have a 56% chance of winning the series.
Now, I tried to use the brand-new treemap view here for illustrating this.
And the treemap, well it’s new.
It doesn't have a lot of variability.
And I chose some bad colors; I tried to use the two colors for the Cleveland Indians.
They’re red and blue in the Chicago Cubs, they’re red and blue - It was just all too confusing because what happens here is, by default, the treemap sorts in order: 24%, 22%, 19%, 17%, 17%.
And when I put this out on Facebook, Jon Peltier, who of course has that awesome Marimekko Chart Utility, said, “You know, it'd be a lot better if we reorganized it.” So, Jon took my data – Thanks, Jon.
And created these charts that are a lot better than mine, and a plug for his awesome chart utilities that lets you create charts that you can't normally create in Excel.
So here, this shows right now before Game 4, 63% chance of the Indians winning.
If the Indians win Game 4 tonight, Wow ! 84% chance but if they lose, oh, we have less than a 50% chance and the Cubs are likely to go on and win it all possibly in Game 6, is the most likely outcome there.
So tip that out to Jon Peltier for his awesome chart utility.
I'll put a link to that down in the YouTube comments.
You should check it out.
Go tribe.
 

Forum statistics

Threads
1,223,634
Messages
6,173,477
Members
452,516
Latest member
archcalx

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top