Calculating variance, sum, average for range of rows during a specific time interval???

yyl05

New Member
Joined
Apr 7, 2018
Messages
5
Hello all,

I have a huge csv file with a few hundred thousand rows. I would like to perform statistical calculations on it and also obtain the first row value, for each time interval.

Example for every 300seconds, I would like to calculate the variance of the total number of rows within the 1st 300seconds, 2nd 300 seconds, 3rd 300 seconds and so on.... So the total number of rows would vary with each 300 seconds interval.

I would also like to get the value of a specific column in the First Row for each 300 second interval.
Example during the 1st 300 seconds, I would like to get the value of the 1st row during the time interval, for the next 300seconds I would like to get the value of the 1st row in the time interval again.

I have a time column T, which is all in seconds, so maybe I could calculate for every T=T+300?

Any suggestions on how and steps to do it??

Thanks a million!!!
 

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
I assume you have a column of elapsed time sorted in ascending order. Does this work for you?

Book1
ABCDEFGH
1SecondsReadingIntervalFirst RdngMeanVarianceStd. Dev.
20530054.3334.5562.134
36626001114.0005.6002.366
41187900106104.6006.6402.577
51852
62423
72947
834411
940713
1046013
1152518
1258815
13633106
14700107
15751101
16796107
17858102
189171111
Sheet1
Cell Formulas
RangeFormula
E2=B2
E3=INDEX(B:B, MATCH($D2, A:A) + 1)
F2=AVERAGE($B$2:INDEX(B:B, MATCH($D2, A:A)))
F3=AVERAGE(INDEX(B:B, MATCH($D2, A:A) + 1):INDEX(B:B,MATCH($D3, A:A)))
G2=VAR.P($B$2:INDEX(B:B, MATCH($D2, A:A)))
G3=VAR.P(INDEX(B:B, MATCH($D2, A:A) + 1):INDEX(B:B,MATCH($D3, A:A)))
H2=STDEV.P($B$2:INDEX(B:B, MATCH($D2, A:A)))
H3=STDEV.P(INDEX(B:B, MATCH($D2, A:A) + 1):INDEX(B:B,MATCH($D3, A:A)))
 
Upvote 0
This is what I am looking for, thank you so much!!

Also, is it possible to do all these calculations for a specific column value?
e.g for tcp.stream there are values =0,1,2,3,.....99999. How can I perform these formulas with tcp.stream==1 and so on?
SoRjS5.jpg

I would like to perform the same calculations for the same 300seconds interval for when tcp.stream==1,2,3.....

Thank you so much for your help
 
Upvote 0
I wonder if this could be done with a pivot table. I came up with this.

The duplicated time stamps and the blank cells produced some frowns. Why can't data be tidy? I used helper columns: one for the count of observations within an interval, two to hold MATCH formulas.

I used SUMPRODUCT formulas to calculate the mean and the variance over an interval for the tcp.stream; this may make sheet recalculations slow. I could not figure out a way to make AVERAGEIF or AVERAGEIFS work to determine the mean for a particular stream within an interval.

I went to 400,000 rows for the SUMPRODUCT and the COUNTIFS formulas—tighten or loosen those ranges as you see fit.


Book1
ABCDEFGHIJKL
1Secondstcp.streamReadingInterval_endFirst RdngMeanVarianceStd. Dev.
21530055.0004.7502.179
36626001114.0005.6002.366
411817900106104.6006.6402.577
518522
624213
729417TCP to Match2
830017
93007Interval_endFirst RdngMeanVarianceStd. Dev.Countmatch.intvlmatch.TCP
1034411130022.0000.0000.000194
114072136001315.5006.2502.5002142
12460113900102102.0000.0000.0001195
13525218
14588115
15633106
167003107
177513101
18796107
198582102
209171111
Sheet1
Cell Formulas
RangeFormula
F10=IFERROR(INDEX(C2:INDEX(C:C, $K10), $L10), "")
F11=IFERROR(INDEX(INDEX(C:C, $K10 + 1):INDEX(C:C,$K11), $L11), "")
G10=IFERROR(SUMPRODUCT(($A$2:$A$400000<=$E10)*($B$2:$B$400000=$F$7)*($C$2:$C$400000)) / $J10, "")
G11=IFERROR(SUMPRODUCT(($A$2:$A$400000>$E10)*($A$2:$A$400000<=$E11)*($B$2:$B$400000=$F$7)*$C$2:$C$400000) / $J11, "")
H10=IFERROR(SUMPRODUCT(($C$2:$C$400000 - G10)^2 * ($A$2:$A$400000<=$E10) * ($B$2:$B$400000=$F$7)) / $J10, "")
H11=IFERROR(SUMPRODUCT(($C$2:$C$400000 - $G11)^2 * ($A$2:$A$400000>$E10) * ($A$2:$A$400000<=$E11) * ($B$2:$B$400000=$F$7)) / $J11, "")
I10=IFERROR(SQRT($H10), "")
I11=IFERROR(SQRT($H11), "")
J10=COUNTIFS($A$2:$A$400000, "<="&E10, $B$2:$B$400000,$F$7)
J11= COUNTIFS($A$2:$A$400000, ">" & $E10, $A$2:$A$400000, "<="&$E11, $B$2:$B$400000,$F$7)
K10=MATCH($E10, A:A)
K11=MATCH($E11, A:A)
L10=IFERROR(MATCH($F$7, B2:INDEX(B:B, $K10), 0), "")
L11=IFERROR(MATCH($F$7, INDEX(B:B, $K10 + 1):INDEX(B:B, $K11), 0), "")
 
Upvote 0
I don't know what I was thinking yesterday. Here are some better formulas for the calculations by stream:


Book1
EFGHIJKL
7TCP to Match1
8
9Interval_endFirst RdngMeanVarianceStd. Dev.Countmatch.intvlmatch.TCP
1030076.0003.0001.732493
116001113.0002.6671.6333141
12900019
Sheet1
Cell Formulas
RangeFormula
F10=IFERROR(INDEX(C2:INDEX(C:C, $K10), $L10), "")
F11=IFERROR(INDEX(INDEX(C:C, $K10 + 1):INDEX(C:C,$K11), $L11), "")
G10=IFERROR(AVERAGEIF(B2:INDEX(B:B, $K10), $F$7, C2:INDEX(C:C, $K10)), "")
G11=IFERROR(AVERAGEIF(INDEX(B:B, $K10+1):INDEX(B:B, $K11), $F$7, INDEX(C:C, $K10+1):INDEX(C:C, $K11)), "")
H10=IFERROR(SUMPRODUCT((C2:INDEX(C:C, $K10) - $G10)^2 * (B2:INDEX(B:B, $K10) = $F$7)) / $J10, "")
H11=IFERROR(SUMPRODUCT((INDEX(C:C, $K10+1):INDEX(C:C, $K11) - $G11)^2 * (INDEX(B:B, $K10+1):INDEX(B:B, $K11)=$F$7)) / $J11, "")
I10=IFERROR(SQRT($H10), "")
I11=IFERROR(SQRT($H11), "")
J10=COUNTIF(B2:INDEX(B:B,$K10), $F$7)
J11=COUNTIF(INDEX(B:B, $K10+1):INDEX(B:B, $K11), $F$7)
K10=MATCH($E10, A:A)
K11=MATCH($E11, A:A)
L10=IFERROR(MATCH($F$7, B2:INDEX(B:B, $K10), 0), "")
L11=IFERROR(MATCH($F$7, INDEX(B:B, $K10 + 1):INDEX(B:B, $K11), 0), "")
 
Upvote 0

Forum statistics

Threads
1,223,889
Messages
6,175,226
Members
452,620
Latest member
dsubash

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top