First, this is as much of a math question as an Excel question, so my apologies for that up front.
I am designing a "power ranking" system that will take the results of a list of games between many dozens of players and rank the players from best to worst. Because each player will have played a different schedule, it is not as easy as simply sorting by wins and losses (if only). Instead, I need the system to account for both a player's record of wins and losses as well as the quality of their opponents. I believe this will require some sort of iterative method (to look at the player's record, and his opponents' records, and their opponents' records, etc.) I am asking for advice regarding implementing a system that I think should work (see below), but if anybody has suggestions for throwing out my system and doing something completely different (including just adapting some existing system) I am all ears.
My method, that I think should work, is modeled after a college football ranking system contained in the "Real Data, No Point Spread" tab of this http://iowahawk.typepad.com/college_fb_ratings/College Football rating.xls spreadsheet. (The games in my tournament will have no "point spread," just a winner and a loser.) My version is available at https://drive.google.com/file/d/0B6_fhgz-r6mbSlowOHVRRnpwSGM/edit?usp=sharing
The method is this:
1- Goal is to calculate "PR." This is the chance of winning a game against an average opponent.
2- Set all players PR equal to .5.
3- On each row, list two opponents, the result of their game, and VLOOKUP their current PR.
4- Calculate the chances of the actual winner winning the game using this formula ( Log5 - Wikipedia, the free encyclopedia ), based on their respective PRs. So, if a .7 beats a .6, the odds of that happening are (.7-(.7*.6))/(.7+.6-(2*.7*.6)) or 61%.
5- Calculate the "error" as the actual result (100% chance of the .7 winning) minus the predicted result (61% chance of the .7 winning), or in this example, 39%. So at this point, we have dozens or hundreds of rows, each of which reflects one game, and a column that gives the "error" for each row/game.
6- In another column, square the errors.
7- In a cell, sum all the squared errors.
8- Use the solver tool to change the PRs until the squared error is minimized.
Sounds great, and it works in that college football spreadsheet I swiped it from above, but my implementation doesn't work for some reason. At first I was running into a divide-by-zero error whenever the two PRs for two opponents were identical, but I avoided that by manually forcing a "50% win chance" result when two equal PRs faced off. But still, I'm getting nonsense results.
I REALIZE THIS IS WAY TOO MURKY TO BE APPEALING FOR ANYONE TO HELP WITH! But nevertheless, I would be most grateful for any advice or thoughts that anyone can offer. If there's a better way to share my spreadsheet so you can see what I'm rambling about, let me know that too.
Thank you!!
I am designing a "power ranking" system that will take the results of a list of games between many dozens of players and rank the players from best to worst. Because each player will have played a different schedule, it is not as easy as simply sorting by wins and losses (if only). Instead, I need the system to account for both a player's record of wins and losses as well as the quality of their opponents. I believe this will require some sort of iterative method (to look at the player's record, and his opponents' records, and their opponents' records, etc.) I am asking for advice regarding implementing a system that I think should work (see below), but if anybody has suggestions for throwing out my system and doing something completely different (including just adapting some existing system) I am all ears.
My method, that I think should work, is modeled after a college football ranking system contained in the "Real Data, No Point Spread" tab of this http://iowahawk.typepad.com/college_fb_ratings/College Football rating.xls spreadsheet. (The games in my tournament will have no "point spread," just a winner and a loser.) My version is available at https://drive.google.com/file/d/0B6_fhgz-r6mbSlowOHVRRnpwSGM/edit?usp=sharing
The method is this:
1- Goal is to calculate "PR." This is the chance of winning a game against an average opponent.
2- Set all players PR equal to .5.
3- On each row, list two opponents, the result of their game, and VLOOKUP their current PR.
4- Calculate the chances of the actual winner winning the game using this formula ( Log5 - Wikipedia, the free encyclopedia ), based on their respective PRs. So, if a .7 beats a .6, the odds of that happening are (.7-(.7*.6))/(.7+.6-(2*.7*.6)) or 61%.
5- Calculate the "error" as the actual result (100% chance of the .7 winning) minus the predicted result (61% chance of the .7 winning), or in this example, 39%. So at this point, we have dozens or hundreds of rows, each of which reflects one game, and a column that gives the "error" for each row/game.
6- In another column, square the errors.
7- In a cell, sum all the squared errors.
8- Use the solver tool to change the PRs until the squared error is minimized.
Sounds great, and it works in that college football spreadsheet I swiped it from above, but my implementation doesn't work for some reason. At first I was running into a divide-by-zero error whenever the two PRs for two opponents were identical, but I avoided that by manually forcing a "50% win chance" result when two equal PRs faced off. But still, I'm getting nonsense results.
I REALIZE THIS IS WAY TOO MURKY TO BE APPEALING FOR ANYONE TO HELP WITH! But nevertheless, I would be most grateful for any advice or thoughts that anyone can offer. If there's a better way to share my spreadsheet so you can see what I'm rambling about, let me know that too.
Thank you!!