Hello guys,
I have a database of about 40 000 matches that I need to analyze. This is only an example of how the data look like (you'll get the idea from the table)
[TABLE="width: 500"]
<tbody>[TR]
[TD]round1
[/TD]
[TD]Winner (J)
[/TD]
[TD]Looser (K)
[/TD]
[TD]Ordred for Zverev[/TD]
[TD]series[/TD]
[/TR]
[TR]
[TD]round1[/TD]
[TD]Zverev[/TD]
[TD]Federer[/TD]
[TD]1[/TD]
[TD]W1[/TD]
[/TR]
[TR]
[TD]round1[/TD]
[TD]Nieminen[/TD]
[TD]Becer[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]round2[/TD]
[TD]Karlovic[/TD]
[TD]Tsonga[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]round2[/TD]
[TD]Zverev[/TD]
[TD]Berdych[/TD]
[TD]1[/TD]
[TD]W2[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Lend[/TD]
[TD]Edberg[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Borg[/TD]
[TD]Agassi[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Krajicek[/TD]
[TD]Karlovic[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Sampras[/TD]
[TD]Zverev[/TD]
[TD]-1[/TD]
[TD]L1[/TD]
[/TR]
</tbody>[/TABLE]
What I need to do is:
To caculate for each player the consecutive wins or losses and the frequency of these steaks
(dont forget, there are 40 000 items and god knows how many palyers)
For example, if I were to indiviually search the data I would find out that Zverev had the following series: L1, W4, L2, W6, L3 etc.
( L stands for loss; W for win; and the nubmer next to them signifinifies how many times in a row did they happen)
So the finale table for Zverev would be, for example:
a) 55 times W1
b) 32 times W2
c)12 times W3
d) 14 times L5
.....
with longest streak W14
and average winning streak of W5
and average losing streak of L4
How can I extract this information from such a table in the most effective and economical way?
Please help!
Michaela
P.S.
My thoughts thus far are:
1. insert a separate column where I will select for each individual player.
For example, for Zverev, I would use this formula:
=IF(J2="Zverev",1,IF(K2="Zverev",-1,IF(NOT(OR(K2="Zverev",B5="Zverev")),0)))
This would give me a column of 1s,0s, and -1 (as in the table above)
2. Clean the data of the "0s" (don't know how to do that???)
3. Convert the row of 1s and -1s into W and L values
e.g.
W1
W2
L3 etc
4. Lastly use the countif function to get the frequence for each category (i.e W1...Wn..L1..Ln)
Problems:
1. I would have to search manually for each player and change the formula to his name
2. Even if I manage to get rid of the "zeros" in the series and have only 1s and -1 (i.e. W1, W2, L3 etc. ) there's clearly a difference between an independent W1 (sandwiched between L1 and L1) and W1 which is part of a series from W1..to W4 (here you want to count the last only).
I have a database of about 40 000 matches that I need to analyze. This is only an example of how the data look like (you'll get the idea from the table)
[TABLE="width: 500"]
<tbody>[TR]
[TD]round1
[/TD]
[TD]Winner (J)
[/TD]
[TD]Looser (K)
[/TD]
[TD]Ordred for Zverev[/TD]
[TD]series[/TD]
[/TR]
[TR]
[TD]round1[/TD]
[TD]Zverev[/TD]
[TD]Federer[/TD]
[TD]1[/TD]
[TD]W1[/TD]
[/TR]
[TR]
[TD]round1[/TD]
[TD]Nieminen[/TD]
[TD]Becer[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]round2[/TD]
[TD]Karlovic[/TD]
[TD]Tsonga[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]round2[/TD]
[TD]Zverev[/TD]
[TD]Berdych[/TD]
[TD]1[/TD]
[TD]W2[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Lend[/TD]
[TD]Edberg[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Borg[/TD]
[TD]Agassi[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Krajicek[/TD]
[TD]Karlovic[/TD]
[TD]0[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]quarter[/TD]
[TD]Sampras[/TD]
[TD]Zverev[/TD]
[TD]-1[/TD]
[TD]L1[/TD]
[/TR]
</tbody>[/TABLE]
What I need to do is:
To caculate for each player the consecutive wins or losses and the frequency of these steaks
(dont forget, there are 40 000 items and god knows how many palyers)
For example, if I were to indiviually search the data I would find out that Zverev had the following series: L1, W4, L2, W6, L3 etc.
( L stands for loss; W for win; and the nubmer next to them signifinifies how many times in a row did they happen)
So the finale table for Zverev would be, for example:
a) 55 times W1
b) 32 times W2
c)12 times W3
d) 14 times L5
.....
with longest streak W14
and average winning streak of W5
and average losing streak of L4
How can I extract this information from such a table in the most effective and economical way?
Please help!
Michaela
P.S.
My thoughts thus far are:
1. insert a separate column where I will select for each individual player.
For example, for Zverev, I would use this formula:
=IF(J2="Zverev",1,IF(K2="Zverev",-1,IF(NOT(OR(K2="Zverev",B5="Zverev")),0)))
This would give me a column of 1s,0s, and -1 (as in the table above)
2. Clean the data of the "0s" (don't know how to do that???)
3. Convert the row of 1s and -1s into W and L values
e.g.
W1
W2
L3 etc
4. Lastly use the countif function to get the frequence for each category (i.e W1...Wn..L1..Ln)
Problems:
1. I would have to search manually for each player and change the formula to his name
2. Even if I manage to get rid of the "zeros" in the series and have only 1s and -1 (i.e. W1, W2, L3 etc. ) there's clearly a difference between an independent W1 (sandwiched between L1 and L1) and W1 which is part of a series from W1..to W4 (here you want to count the last only).