Please critique this little UDF to average ratings

JenniferMurphy

Well-known Member
Joined
Jul 23, 2011
Messages
2,691
Office Version
  1. 365
Platform
  1. Windows
I am in the process of researching password managers. I found at least 8 websites with reviews and "ratings". The ratings are all on different scales, so it's difficult to make an apples-to-apples comparison. So I wrote this little UDF. It takes each rating and rescales it to [0,1]. The resulting ratings can then be compared or averaged. I would appreciate any feedback plus answers to a few questions at the end.

Here's the code:
VBA Code:
Public Function ScaleEm(pValue As Variant, _
                        pMin As Double, pMax As Double, _
                        Optional pMinMax As Boolean = True _
                        ) As Variant
Dim TrueMin As Double
Dim TrueMax As Double

'Check for missing ratings and return null
If IsEmpty(pValue) Or Not IsNumeric(pValue) Then
  ScaleEm = ""
  Exit Function
End If

If pMinMax Then   'If min is min and max is max, preserve order
  TrueMin = pMin
  TrueMax = pMax
Else              'If they are reversed, reverse the order
  TrueMin = pMax
  TrueMax = pMin
End If

ScaleEm = (pValue - TrueMin) / (TrueMax - TrueMin)

End Function
The pMinMax switch tells the UDF if the Min and Max values are backwards as is the case for rankings. 1 = best, 2 = next best.

And here is some sample data. All of the data except for row 21 and columns M & W are from actual review websites. The table on the left is the raw (input) data. The table on the right is the results of calls to my UDF. In a future version, I'll pass the entire table on the left and have the UDF (now Sub) do all of the calculations and fill in column D.

20200323 1313 PW Manager ratings, detailed.jpg


Questions:
  1. I wanted to make it return a Double value, but then I didn't know how to handle missing ratings. Not every product is rated by every reviewer. There is no way to return a null double,m so it ends up being a zero, which then screws of the averages. So I changed it to Variant and returned a null value for missing ratings.
  2. Is my method of testing for missing ratings correct or is there a better way?
  3. I'd be interested in any comments on the algorithm. One problem is highlighted by the Test column. The mythical Test reviewer only reviewed one product, Test. So I gave it a "1" rating on a 1-10 scale. That gave that product a perfect 1.00 overall rating because none of the other products had a rating. I thought about assigning unrated products an average rating or a rating just below the lowest one they did rate, but this seemed arbitrary and may well penalize unfairly products that the reviewer just didn't get to.
Any other comments or suggestions are welcome.
 

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes
JenniferMurphy

The idea of using a Variant to deal with nulls seems fine to me.

As far as the algorithim goes, it seems to produce a meaningful result but personally I would have used a z-score formula (review score - average score) / standard deviation

And for the Test reviewer, you could put in additional smarts (meaning more helper tables) to exclude reviewers or products that have less than a minimum number of reviews.
 
Upvote 0
Solution
JenniferMurphy

The idea of using a Variant to deal with nulls seems fine to me.

As far as the algorithim goes, it seems to produce a meaningful result but personally I would have used a z-score formula (review score - average score) / standard deviation

And for the Test reviewer, you could put in additional smarts (meaning more helper tables) to exclude reviewers or products that have less than a minimum number of reviews.
Those are two great suggestions. I will look into both of them. Thanks.
 
Upvote 0
JenniferMurphy

The idea of using a Variant to deal with nulls seems fine to me.

As far as the algorithim goes, it seems to produce a meaningful result but personally I would have used a z-score formula (review score - average score) / standard deviation

And for the Test reviewer, you could put in additional smarts (meaning more helper tables) to exclude reviewers or products that have less than a minimum number of reviews.

Well, I got distracted a few times and just got back to this project.

Your Z score suggestion looks to be excellent. Thanks for that.

I created a sheet to compare Z scores vs "normalized" scores and "raw" rankings, two of the methods I was considering. Here's a mini-sheet. It shows 6 items (products) in rows 9-14 rated by 4 different sources in columns D-G. The first source uses a 5-star system, like Amazon. The second uses an inverted 1-9 system with 1 the highest and 9 the lowest. The third uses a 100-0 system. The last uses a 10 to -10 system. I wanted to test it against any kind of rating system.

The Z scores are in columns P-S. They are summed and ranked in H-I and averaged and ranked in J-K. I use a UDF to calculate the Z scores so it can handle inverted ratings. I'll include that code below.

The "normalized" ratings are in T-W and then summed and ranked in L-M. These also use a UDF for similar reasons. I'll include that code below, too.

The ranked ratings are in X-AA and then summed and ranked in N-O.

Weighted Ratings.xlsx
CDEFGHIJKLMNOPQRSTUVWXYZAA
5Max5110010
6Min190-10
7Weights1111
8ItemRtg 1Rtg 2Rtg 3Rtg 4Z SumZSum RankZ AvgZAvg RankNorm SumNorm RankRank AvgRank RankZ 1Z 2Z 3Z 4Norm 1Norm 2Norm 3Norm 4Rank 1Rank 2Rank 3Rank 4
9A4.73938+1.93441+0.483613.5112.01+0.25+0.28+0.53+0.880.930.750.930.902.02.02.02.0
10C4.61884+0.82012+0.205023.4823.43-0.25+1.93-0.23-0.630.901.000.880.704.01.04.04.5
11E4.648810+0.60003+0.150033.4133.43-0.25-0.55-0.23+1.630.900.630.881.004.04.54.01.0
12B5.04884+0.36854+0.092143.2153.55+1.77-0.55-0.23-0.631.000.630.880.701.04.54.04.5
13D4.641004+0.15415+0.038553.2343.55-0.25-0.55+1.58-0.630.900.631.000.704.04.51.04.5
14F4.44804-3.87726-0.969362.9865.36-1.27-0.55-1.43-0.630.850.630.800.706.04.56.04.5
15Std Dev0.201.216.632.66
16Mean4.653.3389.505.67
Compare Methods
Cell Formulas
RangeFormula
H9:H14H9=SUMPRODUCT(TblCompare[@[Z 1]:[Z 4]],Weights)
I9:I14I9=RANK.EQ([@[Z Sum]],[Z Sum])
J9:J14J9=AVERAGE(TblCompare[@[Z 1]:[Z 4]])
K9:K14K9=RANK.EQ([@[Z Avg]],[Z Avg])
L9:L14L9=SUMPRODUCT(TblCompare[@[Norm 1]:[Norm 4]],Weights)
M9:M14M9=RANK.AVG([@[Norm Sum]],[Norm Sum])
N9:N14N9=AVERAGE(TblCompare[@[Rank 1]:[Rank 4]]*Weights)
O9:O14O9=RANK.AVG([@[Rank Avg]],[Rank Avg],1)
P9:P13P9=ZScore([@[Rtg 1]], Rtg1Mean, TblCompare[[#Totals],[Rtg 1]], Rtg1Min, Rtg1Max)
Q9:Q14Q9=ZScore([@[Rtg 2]], Rtg2Mean, TblCompare[[#Totals],[Rtg 2]], Rtg2Min, Rtg2Max)
R9:R14R9=ZScore([@[Rtg 3]], Rtg3Mean, TblCompare[[#Totals],[Rtg 3]], Rtg3Min, Rtg3Max)
S9:S14S9=ZScore([@[Rtg 4]], Rtg4Mean, TblCompare[[#Totals],[Rtg 4]], Rtg4Min, Rtg4Max)
T9:T14T9=Normalize([@[Rtg 1]],Rtg1Min, Rtg1Max)
U9:U14U9=Normalize([@[Rtg 2]],Rtg2Min, Rtg2Max)
V9:V14V9=Normalize([@[Rtg 3]],Rtg3Min, Rtg3Max)
W9:W14W9=Normalize([@[Rtg 4]],Rtg4Min, Rtg4Max)
X9:X14X9=RANK.AVG([@[Rtg 1]],[Rtg 1])
Y9:Y14Y9=RANK.AVG([@[Rtg 2]],[Rtg 2],1)
Z9:Z14Z9=RANK.AVG([@[Rtg 3]],[Rtg 3])
AA9:AA14AA9=RANK.AVG([@[Rtg 4]],[Rtg 4])
P14P14=ZScore([@[Rtg 1]], D$16, TblCompare[[#Totals],[Rtg 1]], D$6, D$5)
D15D15=SUBTOTAL(107,[Rtg 1])
E15E15=SUBTOTAL(107,[Rtg 2])
F15F15=SUBTOTAL(107,[Rtg 3])
G15G15=SUBTOTAL(107,[Rtg 4])
D16D16=AVERAGE(TblCompare[Rtg 1])
E16E16=AVERAGE(TblCompare[Rtg 2])
F16F16=AVERAGE(TblCompare[Rtg 3])
G16G16=AVERAGE(TblCompare[Rtg 4])
Named Ranges
NameRefers ToCells
'Compare Methods'!Rtg1Max='Compare Methods'!$D$5P9:P14, T9:T14
'Compare Methods'!Rtg1Mean='Compare Methods'!$D$16P9:P14
'Compare Methods'!Rtg1Min='Compare Methods'!$D$6P9:P14, T9:T14
'Compare Methods'!Rtg2Max='Compare Methods'!$E$5Q9:Q14, U9:U14
'Compare Methods'!Rtg2Mean='Compare Methods'!$E$16Q9:Q14
'Compare Methods'!Rtg2Min='Compare Methods'!$E$6Q9:Q14, U9:U14
'Compare Methods'!Rtg3Max='Compare Methods'!$F$5R9:R14, V9:V14
'Compare Methods'!Rtg3Mean='Compare Methods'!$F$16R9:R14
'Compare Methods'!Rtg3Min='Compare Methods'!$F$6R9:R14, V9:V14
'Compare Methods'!Rtg4Max='Compare Methods'!$G$5S9:S14, W9:W14
'Compare Methods'!Rtg4Mean='Compare Methods'!$G$16S9:S14
'Compare Methods'!Rtg4Min='Compare Methods'!$G$6S9:S14, W9:W14
'Compare Methods'!Weights='Compare Methods'!$D$7:$G$7H9:H14, L9:L14, N9:N14


Here's the Z score code:
VBA Code:
Function ZScore(pValue As Double, pMean As Double, pStdDev As Double _
              , pMin As Double, pMax As Double) As Double

' If std dev is zero, return 0; else calculate Z Score
If pStdDev = 0 Then                   'If std dev = 0,
  ZScore = 0                            'Return 0
Else                                  'Else
  ZScore = (pValue - pMean) / pStdDev   'Calculate the ZScore
  If pMin > pMax Then ZScore = -ZScore  'If range order is reversed, revferse ZScore
End If

End Function

Here's the Normalize code:
Code:
                , Optional pWrstNew As Double = 0 _
                , Optional pBestNew As Double = 1 _
                ) As Double

Dim RangeOld As Double          'The range of the old scale
RangeOld = pBestOld - pWrstOld  'Calculate the old range

'Is it normal (Max > Min), inverted (Man < Min), or point (Max = Min)?
Select Case RangeOld
  Case Is > 0             '>0 = Normal range (Best > Worst)
    Select Case pXOld       'Where is pXOld on the old range?
      Case Is >= pBestOld     'If >= old best
        Normalize = pBestNew    'Make it the best
      Case Is <= pWrstOld     'If <= old Worst
        Normalize = pWrstNew    'Make it the Worst, scale it
      Case Else               'If it's within the old range
        Normalize = (pXOld - pWrstOld) / RangeOld  'Scale it
    End Select
  Case Is < 0             '<0 = Inverted range (Best < Worst)
    Select Case pXOld       'Where is pXOld on the old range?
      Case Is >= pWrstOld     'If >= old worst
        Normalize = pWrstNew    'Make it the worst
      Case Is <= pBestOld     'If <= old best
        Normalize = pBestNew    'Make it the best
      Case Else               'If it's within the old range
        Normalize = (pXOld - pWrstOld) / RangeOld  'Scale it
    End Select
  Case Else               '=0 = Point range (Best = Worst)
    Select Case pXOld       'Where is pXOld on the old (point) range?
      Case Is > pBestOld      'If > old point range,
        Normalize = pBestNew    'Make it the best
      Case Is < pWrstOld      'If < old point range
        Normalize = pWrstNew    'Make it the worst
      Case Else               'If = old point range,
        Normalize = (pWrstNew + pBestNew) / 2 'Make it the average (middle) value
    End Select
End Select

End Function

I'd appreciate any comments or suggestions.
 
Upvote 0
Looks good so far. One adjustment I would make is to try to eliminate outliers. For example Rating 2 for Item C has a z-score of +1.93 whereas the other z-scores for Item C are -0.25, -0.23 & -0.63. Including +1.93 in the averages and rankings skews the final result a little bit. The way to screen out unrepresentative results like this is to do a statistical analysis of the z-scores. You can calculate the average and standard deviation of an item's z-scores and then produce another table of z-scores consisting of only z-scores that fall between some range of standard deviations. For example, the average of Item C's z-scores is +0.90 and the standard deviation is -0.25. If you say that only results in the range Average +/- (Standard Deviation x 1.50) are acceptable, that excludes the +1.93 result. You then take the averages of the revised z-scores and rank those. The 1.50 factor is completely arbitrary and can be anything you like. I use this system for a motorsport scoring system I have and I have found that over the long term a factor of 2.0 satisfactorily excludes unreasonable results while retaining realistic results.

You will find that if you use a factor if 1.50 then all the Items except Item F will have one result excluded. If you use a factor of 1.75 then Item A gets to keep all its results. Using either 1.50 or 1.75 changes the z-score ranking a little bit. Instead of 1-A, 2-C, 3-E, 4-B, 5-D, 6-F you get 1-A, 2-E, 3-C, 4-B, 5-D, 6-F. Entirely up to you to decide which is the more realistic/representative outcome.
 
Upvote 0
Looks good so far. One adjustment I would make is to try to eliminate outliers. For example Rating 2 for Item C has a z-score of +1.93 whereas the other z-scores for Item C are -0.25, -0.23 & -0.63. Including +1.93 in the averages and rankings skews the final result a little bit. The way to screen out unrepresentative results like this is to do a statistical analysis of the z-scores. You can calculate the average and standard deviation of an item's z-scores and then produce another table of z-scores consisting of only z-scores that fall between some range of standard deviations. For example, the average of Item C's z-scores is +0.90 and the standard deviation is -0.25. If you say that only results in the range Average +/- (Standard Deviation x 1.50) are acceptable, that excludes the +1.93 result. You then take the averages of the revised z-scores and rank those. The 1.50 factor is completely arbitrary and can be anything you like. I use this system for a motorsport scoring system I have and I have found that over the long term a factor of 2.0 satisfactorily excludes unreasonable results while retaining realistic results.

You will find that if you use a factor if 1.50 then all the Items except Item F will have one result excluded. If you use a factor of 1.75 then Item A gets to keep all its results. Using either 1.50 or 1.75 changes the z-score ranking a little bit. Instead of 1-A, 2-C, 3-E, 4-B, 5-D, 6-F you get 1-A, 2-E, 3-C, 4-B, 5-D, 6-F. Entirely up to you to decide which is the more realistic/representative outcome.
Interesting point. My initial reaction is that I want to keep all ratings. The ratings in these examples are all made up. I included some more extreme values to see what the algorithm would do with them and to compare the different methods. Now that I've pretty much settled on the z-score method, I'll try some actual values. But thanks for the comment. If I see outliers in the real data, I'll revisit your point.
 
Upvote 0
As far as the algorithim goes, it seems to produce a meaningful result but personally I would have used a z-score formula (review score - average score) / standard deviation
The Z Score suggestion was very helpful. Thank you for that.

I have been playing around with it and have come up with some additional parameters. I have added an "Order" parameter which tells the code whether higher values are better (such as for ratings, mileage, or capacity) or lower values are (such as for price or error rate). I also added a weighting parameter which I use to give more weight to properties that matter more to me.

Here's a little demo table with 3 properties for each of 7 items. The Weight and Order parameters are across the top. I set the weights all to "1" so I could show that the weighted Z Score sums (Col M) are the same as the unweighted ones (Col L). I also show that the rankings are the same for both (Cols O & P). The Price (Col D) uses a LoHi order. The other 2 use a HiLo order (Cols F & H). In Col J, I applied a factor and a difference to the Amazon Ratings to show that items that are similarly (or identically) dispersed will have similar (or identical) Z Scores.

Weighted Ratings.xlsm
CDEFGHIJKLMNOPQ
2Weight1.001.001.001234.56
3OrderLoHiHiLoHiLo-5678HiLo
4ItemPricePrice ZScore LoHiRangeRange ZScoreAmazon RtgAmazon Rtg ZScoreAmazon Rtg2Amazon Rtg2 ZScoreZScore SumWtd ZScore SumSum ΔZScore Sum RankWtd ZScore Sum RankRank Δ
5E$42.00-0.6148180+1.32574.9+1.1965+371.34+1.1965+1.9073+1.9073=0.0011=0
6D$35.00+0.3161130-0.42524.8+0.7312+247.89+0.7312+0.6221+0.6221=0.0022=0
7G$49.99-1.6775180+1.32574.8+0.7312+247.89+0.7312+0.3794+0.3794=0.0033=0
8F$42.00-0.6148150+0.27514.7+0.2659+124.43+0.2659-0.0738-0.0738=0.0044=0
9C$32.75+0.6154125-0.60034.5-0.6647-122.48-0.6647-0.6496-0.6496=0.0055=0
10B$29.95+0.9878110-1.12564.5-0.6647-122.48-0.6647-0.8025-0.8025=0.0066=0
11A$29.95+0.9878120-0.77544.3-1.5953-369.39-1.5953-1.3829-1.3829=0.0077=0
12Mean$37.38-0.0000142.1429+0.00004.6429-0.000053.89=0.0000-0.0000-0.0000=0.00
13Std Dev$7.521.000028.55651.00000.21491.0000265.331.00001.09181.0918
Simple
Cell Formulas
RangeFormula
I5:I11I5=ZScore([@[Amazon Rtg]],TblSimple[[#Totals],[Amazon Rtg]],$H$13,$I$3)
J5:J11J5=([@[Amazon Rtg]]*$J$2)+$J$3
K5:K11K5=ZScore([@[Amazon Rtg2]],TblSimple[[#Totals],[Amazon Rtg2]],$J$13,$K$3)
L5:L11L5=[@[Price ZScore LoHi]]+[@[Range ZScore]]+[@[Amazon Rtg ZScore]]
M5:M11M5= [@[Price ZScore LoHi]]*$E$2 + [@[Range ZScore]]*$G$2 + [@[Amazon Rtg ZScore]]*$I$2
N5:N11N5=[@[Wtd ZScore Sum]]-[@[ZScore Sum]]
O5:O11O5=RANK.EQ([@[ZScore Sum]],[ZScore Sum],0)
P5:P11P5=RANK.EQ([@[Wtd ZScore Sum]],[Wtd ZScore Sum],0)
Q5:Q11Q5=[@[Wtd ZScore Sum Rank]]-[@[ZScore Sum Rank]]
E5:E11E5=ZScore([@Price],TblSimple[[#Totals],[Price]],$D$13,$E$3)
G5:G11G5=ZScore([@Range],TblSimple[[#Totals],[Range]],$F$13,$G$3)
D12D12=SUBTOTAL(101,[Price])
E12E12=SUBTOTAL(101,[Price ZScore LoHi])
F12F12=SUBTOTAL(101,[Range])
G12G12=SUBTOTAL(101,[Range ZScore])
H12H12=SUBTOTAL(101,[Amazon Rtg])
I12I12=SUBTOTAL(101,[Amazon Rtg ZScore])
J12J12=SUBTOTAL(101,[Amazon Rtg2])
K12K12=SUBTOTAL(101,[Amazon Rtg2 ZScore])
L12L12=SUBTOTAL(101,[ZScore Sum])
M12M12=SUBTOTAL(101,[Wtd ZScore Sum])
N12N12=SUBTOTAL(101,[Sum Δ])
D13D13=STDEV.S(TblSimple[Price])
E13E13=STDEV.S(TblSimple[Price ZScore LoHi])
F13F13=STDEV.S(TblSimple[Range])
G13G13=STDEV.S(TblSimple[Range ZScore])
H13H13=STDEV.S(TblSimple[Amazon Rtg])
I13I13=STDEV.S(TblSimple[Amazon Rtg ZScore])
J13J13=STDEV.S(TblSimple[Amazon Rtg2])
K13K13=STDEV.S(TblSimple[Amazon Rtg2 ZScore])
L13L13=STDEV.S(TblSimple[ZScore Sum])
M13M13=STDEV.S(TblSimple[Wtd ZScore Sum])


Here's that same table once I applied weight=3 to prices (because I'm a cheapskate) and weight=2 to range (because I don't trust Amazon ratings). I then sorted on the weighted Z Score sum. Items E & D remained in 1st and 2nd place. But Item B jumped 3 positions to 3rd place.

Weighted Ratings.xlsm
CDEFGHIJKLMNOPQ
2Weight3.002.001.001234.56
3OrderLoHiHiLoHiLo-5678HiLo
4ItemPricePrice ZScore LoHiRangeRange ZScoreAmazon RtgAmazon Rtg ZScoreAmazon Rtg2Amazon Rtg2 ZScoreZScore SumWtd ZScore SumSum ΔZScore Sum RankWtd ZScore Sum RankRank Δ
5E$42.00-0.6148180+1.32574.9+1.1965+371.34+1.1965+1.9073+2.0034+0.1011=0
6D$35.00+0.3161130-0.42524.8+0.7312+247.89+0.7312+0.6221+0.8292+0.2122=0
7B$29.95+0.9878110-1.12564.5-0.6647-122.48-0.6647-0.8025+0.0475+0.8563+3
8C$32.75+0.6154125-0.60034.5-0.6647-122.48-0.6647-0.6496-0.0192+0.6354+1
9A$29.95+0.9878120-0.77544.3-1.5953-369.39-1.5953-1.3829-0.1828+1.2075+2
10F$42.00-0.6148150+0.27514.7+0.2659+124.43+0.2659-0.0738-1.0283-0.9546-2
11G$49.99-1.6775180+1.32574.8+0.7312+247.89+0.7312+0.3794-1.6499-2.0337-4
12Mean$37.38-0.0000142.1429=0.00004.6429-0.000053.89=0.0000-0.0000-0.0000-0.00
13Std Dev$7.521.000028.55651.00000.21491.0000265.331.00001.09181.1914
Simple
Cell Formulas
RangeFormula
I5:I11I5=ZScore([@[Amazon Rtg]],TblSimple[[#Totals],[Amazon Rtg]],$H$13,$I$3)
J5:J11J5=([@[Amazon Rtg]]*$J$2)+$J$3
K5:K11K5=ZScore([@[Amazon Rtg2]],TblSimple[[#Totals],[Amazon Rtg2]],$J$13,$K$3)
L5:L11L5=[@[Price ZScore LoHi]]+[@[Range ZScore]]+[@[Amazon Rtg ZScore]]
M5:M11M5= [@[Price ZScore LoHi]]*$E$2 + [@[Range ZScore]]*$G$2 + [@[Amazon Rtg ZScore]]*$I$2
N5:N11N5=[@[Wtd ZScore Sum]]-[@[ZScore Sum]]
O5:O11O5=RANK.EQ([@[ZScore Sum]],[ZScore Sum],0)
P5:P11P5=RANK.EQ([@[Wtd ZScore Sum]],[Wtd ZScore Sum],0)
Q5:Q11Q5=[@[ZScore Sum Rank]]-[@[Wtd ZScore Sum Rank]]
E5:E11E5=ZScore([@Price],TblSimple[[#Totals],[Price]],$D$13,$E$3)
G5:G11G5=ZScore([@Range],TblSimple[[#Totals],[Range]],$F$13,$G$3)
D12D12=SUBTOTAL(101,[Price])
E12E12=SUBTOTAL(101,[Price ZScore LoHi])
F12F12=SUBTOTAL(101,[Range])
G12G12=SUBTOTAL(101,[Range ZScore])
H12H12=SUBTOTAL(101,[Amazon Rtg])
I12I12=SUBTOTAL(101,[Amazon Rtg ZScore])
J12J12=SUBTOTAL(101,[Amazon Rtg2])
K12K12=SUBTOTAL(101,[Amazon Rtg2 ZScore])
L12L12=SUBTOTAL(101,[ZScore Sum])
M12M12=SUBTOTAL(101,[Wtd ZScore Sum])
N12N12=SUBTOTAL(101,[Sum Δ])
D13D13=STDEV.S(TblSimple[Price])
E13E13=STDEV.S(TblSimple[Price ZScore LoHi])
F13F13=STDEV.S(TblSimple[Range])
G13G13=STDEV.S(TblSimple[Range ZScore])
H13H13=STDEV.S(TblSimple[Amazon Rtg])
I13I13=STDEV.S(TblSimple[Amazon Rtg ZScore])
J13J13=STDEV.S(TblSimple[Amazon Rtg2])
K13K13=STDEV.S(TblSimple[Amazon Rtg2 ZScore])
L13L13=STDEV.S(TblSimple[ZScore Sum])
M13M13=STDEV.S(TblSimple[Wtd ZScore Sum])


This is getting too complicated for Excel. I think I will move it to a database. Then I can add other paramaters such as the max/min you suggested above.

Do you (or anyone else) have any comments or suggestions?
 
Upvote 0
PS: Here's the code for the ZScore function:

VBA Code:
Function ZScore(ByVal pValue As Variant, ByVal pMean As Double, ByVal pStdDev As Double, _
                ByVal pOrder As String) As Double
'Application.Volatile

' Check that pOrder is valid
If Not (UCase(pOrder) = "HILO" Or UCase(pOrder) = "LOHI") Then
  ZScore = CVErr(xlErrValue): Exit Function: End If

' If the value is not a number, return zero,
' which is the same as setting it to the average of the other values.
If Not WorksheetFunction.IsNumber(pValue) Then  'If not a number,
  ZScore = 0: Exit Function: End If               'Return zero

' If std dev is zero, return 0; else calculate Z Score
If pStdDev = 0 Then                   'If std dev = 0,
  ZScore = 0                            'Return 0
Else                                  'Else
  ZScore = (pValue - pMean) / pStdDev   'Calculate the ZScore
  If UCase(pOrder) = "LOHI" Then      'If range order is reversed,
    ZScore = -ZScore: End If            'Reverse the ZScores
End If

End Function
 
Upvote 0
JenniferMurphy

The idea of using a Variant to deal with nulls seems fine to me.

As far as the algorithim goes, it seems to produce a meaningful result but personally I would have used a z-score formula (review score - average score) / standard deviation

And for the Test reviewer, you could put in additional smarts (meaning more helper tables) to exclude reviewers or products that have less than a minimum number of reviews.
I should have marked this as the answer long ago. Sorry about that. Z Scores are even more useful than I first thought. Thanks for that tip.
 
Upvote 0

Forum statistics

Threads
1,224,814
Messages
6,181,127
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top