JenniferMurphy
Well-known Member
- Joined: Jul 23, 2011
- Messages: 2,687
- Office Version: 365
- Platform: Windows
Suppose I am looking to buy a new toaster. I go on Amazon and find several that look good to me. I see that their 5-star ratings vary considerably (from around 4.1 to 4.9), as does the number of reviews (from around a dozen to almost 50,000). I would like to enter these two parameters into a table and then figure out a way to adjust the ratings to take the number of reviews into account.
The table below is my first crack at solving this. It uses confidence intervals as suggested when I asked this question over on the Talk Stats forum.
How to compare similar ratings?
Most of the products on Amazon have a 5-star rating (1.0 - 5.0). Few products have ratings below 4.0 and almost none below 3.0. The majority are clustered around 4.2 to 4.7 or so. The ratings include the number of reviews from which those ratings were calculated. Those numbers can vary from just...
www.talkstats.com
The last post over there suggested that I am doing something wrong, but did not provide any details or suggestions as to how to correct it.
Either I am doing something wrong or this is the wrong method, because the results do not seem right. A 4.7 rating with 9,000 reviews (Product D) ought to be worth more than a 4.8 rating based on just 2 reviews (Product C). Even more clearly, a 4.5 rating with 10 million reviews (Product K) ought to be worth more than a 4.6 rating with just 2 reviews (Product J).
Can anyone either help me fix whatever is wrong with my confidence interval implementation or suggest a different algorithm? Thanks
Cell Formulas

Range | Cell | Formula |
---|---|---|
E5:E18 | E5 | =RANK.EQ([@Rtg],[Rtg]) + COUNTIFS([Rtg],[@Rtg],['# of Reviews],">" & [@['# of Reviews]]) |
F5:F18 | F5 | =[@Rtg]*[@['# of Reviews]] |
G5:G18 | G5 | =[@['# of Reviews]] * ([@Rtg]-Mean)^2 |
H5:H18 | H5 | =CONFIDENCE.NORM(Alpha,StdDev,[@['# of Reviews]]) |
I5:I18 | I5 | =[@Rtg]-[@[Conf Int Norm]] |
J5:J18 | J5 | =[@[Adj Rtg 1]]-[@Rtg] |
K5:K18 | K5 | =RANK.EQ([@[Adj Rtg 1]],[Adj Rtg 1]) |
L5:L18 | L5 | =[@[Adj Rtg Rank]]-[@[Rtg Rank]] |
P6 | P6 | =getformula(TblConf[@[Rtg × '#Revs]]) |
P7 | P7 | =getformula(G5) |
P8 | P8 | =getformula(H5) |
P9 | P9 | =getformula(I5) |
P10 | P10 | =getformula(J5) |
P11 | P11 | =getformula(K5) |
P12 | P12 | =getformula(L5) |
O14 | O14 | =TblConf[[#Totals],['# of Reviews]] |
P14:P17 | P14 | =getformula(O14) |
O15 | O15 | =TblConf[[#Totals],[Rtg × '#Revs]]/TblConf[[#Totals],['# of Reviews]] |
O16 | O16 | =TblConf[[#Totals],['#Rev × Squares]]/TblConf[[#Totals],['# of Reviews]] |
O17 | O17 | =SQRT(Variance) |
C19 | C19 | =SUBTOTAL(103,[Rtg]) |
D19 | D19 | =SUBTOTAL(109,['# of Reviews]) |
F19 | F19 | =SUBTOTAL(109,[Rtg × '#Revs]) |
G19 | G19 | =SUBTOTAL(109,['#Rev × Squares]) |
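
For anyone who wants to check the arithmetic outside Excel, here is a minimal Python sketch of what the formulas above appear to compute (columns F to I plus the O14:O17 summary cells). Several things are assumptions rather than facts from the workbook: `ALPHA = 0.05` is a guess because the value in O18 is not shown, the data are only the four products named in the post (so Mean and StdDev will differ from the full table), and the function names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# (rating, number of reviews) for the four products named in the post.
products = {
    "C": (4.8, 2),
    "D": (4.7, 9_000),
    "J": (4.6, 2),
    "K": (4.5, 10_000_000),
}

ALPHA = 0.05  # assumption: the post does not show the value stored in O18


def confidence_norm(alpha, std_dev, size):
    """Same quantity as Excel's CONFIDENCE.NORM: z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * std_dev / sqrt(size)


def adjusted_ratings(products, alpha=ALPHA):
    """Mirror columns H and I: rating minus the confidence half-width."""
    total = sum(n for _, n in products.values())                # O14
    mean = sum(r * n for r, n in products.values()) / total     # O15
    variance = (
        sum(n * (r - mean) ** 2 for r, n in products.values()) / total
    )                                                           # O16
    std_dev = sqrt(variance)                                    # O17
    return {
        name: r - confidence_norm(alpha, std_dev, n)
        for name, (r, n) in products.items()
    }


def rank_eq(values):
    """Descending rank where ties share the best rank, like RANK.EQ."""
    return {
        key: 1 + sum(other > value for other in values.values())
        for key, value in values.items()
    }


adjusted = adjusted_ratings(products)
ranks = rank_eq(adjusted)
for name in sorted(adjusted, key=ranks.get):
    print(name, round(adjusted[name], 4), ranks[name])
```

With these four rows the deduction for a 2-review product comes out well under 0.1 stars, so C still lands above D and J above K, which matches the behaviour described in the post. One thing the sketch makes visible: StdDev here measures how far the products' average ratings sit from the overall weighted mean, not how much individual star votes vary within one product, and that may be part of why the adjustment is so small.
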
Named Ranges

Name | Refers To | Cells |
---|---|---|
'Confidence Interval Simple'!Alpha | ='Confidence Interval Simple'!$O$18 | H5:H18 |
'Confidence Interval Simple'!Mean | ='Confidence Interval Simple'!$O$15 | P15, G5:G18 |
'Confidence Interval Simple'!StdDev | ='Confidence Interval Simple'!$O$17 | P17, H5:H18 |
'Confidence Interval Simple'!Total | ='Confidence Interval Simple'!$O$14 | P14 |
'Confidence Interval Simple'!Variance | ='Confidence Interval Simple'!$O$16 | P16, O17 |
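
Written as equations, the named ranges together with columns H and I appear to amount to the following. The symbols are my own notation, not something from the workbook: $r_i$ is a product's rating, $n_i$ its number of reviews, and $z_{1-\alpha/2}$ the standard normal quantile that `CONFIDENCE.NORM` uses internally.

```latex
\begin{aligned}
\text{Total} &= N = \sum_i n_i \\
\text{Mean} &= \bar{r} = \frac{1}{N} \sum_i n_i \, r_i \\
\text{Variance} &= \sigma^2 = \frac{1}{N} \sum_i n_i \, (r_i - \bar{r})^2 \\
\text{StdDev} &= \sigma = \sqrt{\sigma^2} \\
\text{Conf Int Norm}_i &= z_{1-\alpha/2} \, \frac{\sigma}{\sqrt{n_i}} \\
\text{Adj Rtg 1}_i &= r_i - \text{Conf Int Norm}_i
\end{aligned}
```

Note that $\sigma$ is a single value shared by every row, so the only thing that changes the size of the deduction from one product to the next is $\sqrt{n_i}$.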