Getting STDEV from sumproduct() ...

just_matt

New Member
Joined
Aug 10, 2018
Messages
7
I am struggling with standard deviation in a sumproduct() output.

I have developed a matrix that tracks stuff by hour of day and day of week over the course of a great length of time. This matrix is specific and by design.

Right now, I am able to achieve a total count, and an average count (details below). I need to figure out if I can leverage the same formulas to get standard deviation, min, and max.

I do have this answer elsewhere, however, I did it "the hard way" by creating massive tables and datasets to perform DAVERAGE() and DSTDEV()'s against.

Here's where I'm at with my "to-be"

The matrix is driven by a dataset that is about 38,000 rows.

So for my total count matrix, I have:

[TABLE="class: grid, width: 538"]
<colgroup><col span="2"><col span="6"></colgroup><tbody>[TR]
[TD][/TD]
[TD="align: right"]2 X24[/TD]
[TD="align: right"]3[/TD]
[TD="align: right"]4[/TD]
[TD="align: right"]5[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD][/TD]
[TD]M[/TD]
[TD]T[/TD]
[TD]W[/TD]
[TD]Th[/TD]
[TD]F[/TD]
[TD]S[/TD]
[TD]Su[/TD]
[/TR]
[TR]
[TD]Time[/TD]
[TD]M[/TD]
[TD]T[/TD]
[TD]W[/TD]
[TD]Th[/TD]
[TD]F[/TD]
[TD]S[/TD]
[TD]Su[/TD]
[/TR]
[TR]
[TD="align: right"]12:00 AM


W27
[/TD]
[TD="align: right"]50[/TD]
[TD="align: right"]46[/TD]
[TD="align: right"]41[/TD]
[TD="align: right"]54[/TD]
[TD="align: right"]54[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]3[/TD]
[/TR]
[TR]
[TD="align: right"]5:00 AM[/TD]
[TD="align: right"]264[/TD]
[TD="align: right"]255[/TD]
[TD="align: right"]257[/TD]
[TD="align: right"]291[/TD]
[TD="align: right"]247[/TD]
[TD="align: right"]4[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD="align: right"]6:00 AM[/TD]
[TD="align: right"]537[/TD]
[TD="align: right"]598[/TD]
[TD="align: right"]597[/TD]
[TD="align: right"]640[/TD]
[TD="align: right"]513[/TD]
[TD="align: right"]9[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD="align: right"]7:00 AM[/TD]
[TD="align: right"]726[/TD]
[TD="align: right"]807[/TD]
[TD="align: right"]783[/TD]
[TD="align: right"]689[/TD]
[TD="align: right"]744[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]9[/TD]
[/TR]
[TR]
[TD="align: right"]8:00 AM[/TD]
[TD="align: right"]785[/TD]
[TD="align: right"]865[/TD]
[TD="align: right"]795[/TD]
[TD="align: right"]818[/TD]
[TD="align: right"]742[/TD]
[TD="align: right"]10[/TD]
[TD="align: right"]7[/TD]
[/TR]
[TR]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[/TR]
</tbody>[/TABLE]


(snipped for the sake of brevity)
(apologies for the garbage/duplicate rows)

In each ## cell, the following formula exists:
=SUMPRODUCT(--(WEEKDAY(Open)=X$24),(HOUR(Open)>=HOUR($W27))*(HOUR(Open)<HOUR($W28)))

"Open" is the column within my raw data which contain a date. Lots and lots of data here to be counted, averaged, and hopefully STDEV'd!
X24 is "2" (above the M)
W27 is "12:00AM"

In order to get an average, I need to take into account the fact that my data source only records when events happen, and not when they don't. e.g. an event at 6:30AM may occur one week and be counted, but not occur the following week. If my sample size were 2, I'd need that total number to calculate my average.

Therefore, to calculate averages, I update that formula to:

=(SUMPRODUCT(--(WEEKDAY(Open)=X$24),(HOUR(Open)>=HOUR($W27))*(HOUR(Open)<HOUR($W28))))/(DAYS(MAX(Open),MIN(Open))/7)


So far, this works fine, and I have validated it against "the long way" which was mentioned above.

My Dilemma:
Where I'm getting stuck is figuring out how/where to apply the standard deviation equation in the SUMPRODUCT formula, or if this is even possible based on how I've derived the total and average counts.

If you really want to play with what's in "Open" you can do a =randbetween(42370375,43465625)/1000 for a few thousand rows to simulate what I've got.

If I may clarify what I've stated above please let me know.

Thank you in advance!
 
Excel Formula:
J358 = (SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1)*Responses)/(SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1)))
Gives me my "average" for the question from the year in that row, for the position in that column.


Better, IMHO, if you need compatibility with Excel 2003 and earlier (save to "xls" file):
Code:
=SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1), Responses) / SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1))
Note the removal of superfluous parentheses to improve readability. " " "You" "wouldn't write" "sentences like this" ", " "would you"?" " "

Caveat: That denominator counts rows in which Responses is empty or appears empty.

On the other hand, if you require Excel 2007 and later (save to "xlsx" file), simply:

=AVERAGEIFS(Responses, Year, $H358, Question, $C358, Title, E$1)

(That does not count rows in which Responses is empty or appears empty.)

-----
What would I add for the "Range-Mean" in the formula?


"Responses" is the range.

So with formula above for the average in J358, the "population" (exact) std dev equivalent to STDEVP is:
Code:
=SQRT(SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1), (Responses - $J358)^2) / COUNTIFS(Year, $H358, Question, $C358, Title, E$1))

The "sample" (approximate) std dev equivalent to STDEV is:
Code:
=SQRT(SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1), (Responses - $J358)^2) / (COUNTIFS(Year, $H358, Question, $C358, Title, E$1) - 1))

For Excel 2003 and earlier, replace COUNTIFS(Year,$H358,Question,$C358,Title,E$1) with SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1)) .
 
Last edited:
Upvote 0

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().
Code:
=SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1), Responses) / SUMPRODUCT((Year=$H358)*(Question=$C358)*(Title=E$1))
[....]
That denominator counts rows in which Responses is empty or appears empty.
Code:
=AVERAGEIFS(Responses, Year, $H358, Question, $C358, Title, E$1)
That does not count rows in which Responses is empty or appears empty.


There is a difference only in rows where (Year=$H358)*(Question=$C358)*(Title=E$1)=1 is true, but Responses appears blank.

Not a concern if that will never happen (e.g. if Responses appears blank, the entire row appears blank).
 
Upvote 0
OK this is a lot to play with. Suffice to say, "AVERAGEIFS" = Mind. Blown.
 
Upvote 0
Caveat: That denominator counts rows in which Responses is empty or appears empty.
[....]
That does not count rows in which Responses is empty or appears empty.
There is a difference only in rows where (Year=$H358)*(Question=$C358)*(Title=E$1)=1 is true, but Responses appears blank.
OK this is a lot to play with.


Sorry for any confusion. Ignore those caveats. The STDEV formulas do not permit any rows in Repsonses to be empty or appear blank. So it must be a non-issue. Just some half-baked second-thoughts as I was rushing out the door. Klunk!
 
Upvote 0
"AverageIfs" gives me a #Value error. Been messing with other things in Tableau for the time being.

Note the removal of superfluous parentheses to improve readability. " " "You" "wouldn't write" "sentences like this" ", " "would you"?" " "
Legacy of building formulas one argument at a time.

One argument.

At.

A.

Time.
 
Upvote 0

Forum statistics

Threads
1,223,905
Messages
6,175,297
Members
452,633
Latest member
DougMo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top