Prediction/Statistical Analysis

James8761

Board Regular
Joined
Apr 24, 2012
Messages
156
Office Version
  1. 365
Platform
  1. Windows
Hi,

I have been given some data to work on regarding timings. Before an event starts there is a 0% to 100% chance that something will happen. Someone predicts this.

The next day I am given the results.

Looking at the first chart below it shows that 17 times it was predicted that 60% would be the answer.

Looking at the second chart below it shows that 60% was correct 14 times. (82%)


I then have the third table that does this division and turns it in to a %. In this case 82%.


At the moment I am then just scanning through the data highlighting a range where I can say the % predicted at the start are most accurate. I have highlighted between 46% and 80% on here. There are a couple of 0%’s in there but I need to choose a range. The highlighted range I then put in a formula for each time. The highlighted range in this instance is between 46% and 80% was predicted 132 times and it actually happened 109 times, so a 82.6% hit rate.

This is all visual at present and I then enter a manual formula.

Is there a way where I could automate this to find the ‘sweet spot’ I don’t want to cherry pick and say exclude 73% and 62% but include everything between 46% and 80% as I have to find a range.

It may well be if I looked at between 49% and 83% that may be better. The only stipulation is that there has to be a minimum of 100 predicted outcomes to make the data mean anything.



Way out of my league this one, but thought I’d ask if anyone knew a way before I go through manually.


Thanks for any help!



Sweet Spot?
H FHS%
0%​
88%​
1%​
2%​
3%​
4%​
5%​
6%​
7%​
8%​
9%​
10%​
11%​
12%​
13%​
14%​
15%​
16%​
17%​
18%​
19%​
50%​
20%​
67%​
21%​
0%​
22%​
100%​
23%​
50%​
24%​
25%​
100%​
26%​
27%​
50%​
28%​
29%​
0%​
30%​
75%​
31%​
80%​
32%​
33%​
92%​
34%​
35%​
36%​
100%​
37%​
38%​
50%​
39%​
40%​
80%​
41%​
42%​
50%​
43%​
100%​
44%​
50%​
45%​
0%​
46%​
100%​
47%​
100%​
48%​
49%​
50%​
90%​
51%​
52%​
53%​
100%​
54%​
100%​
55%​
100%​
56%​
100%​
57%​
67%​
58%​
67%​
59%​
60%​
82%​
61%​
62%​
100%​
63%​
83%​
64%​
83%​
65%​
66%​
67%​
68%​
68%​
69%​
80%​
70%​
0%​
71%​
75%​
72%​
73%​
67%​
74%​
75%​
86%​
76%​
77%​
50%​
78%​
100%​
79%​
0%​
80%​
100%​


Actual Result
H FHS%
0%​
28​
1%​
0​
2%​
0​
3%​
0​
4%​
0​
5%​
0​
6%​
0​
7%​
0​
8%​
0​
9%​
0​
10%​
0​
11%​
0​
12%​
0​
13%​
0​
14%​
0​
15%​
0​
16%​
0​
17%​
0​
18%​
0​
19%​
1​
20%​
4​
21%​
0​
22%​
5​
23%​
1​
24%​
0​
25%​
6​
26%​
0​
27%​
3​
28%​
0​
29%​
0​
30%​
3​
31%​
4​
32%​
0​
33%​
11​
34%​
0​
35%​
0​
36%​
6​
37%​
0​
38%​
3​
39%​
0​
40%​
4​
41%​
0​
42%​
1​
43%​
2​
44%​
1​
45%​
0​
46%​
1​
47%​
3​
48%​
0​
49%​
0​
50%​
28​
51%​
0​
52%​
0​
53%​
1​
54%​
2​
55%​
2​
56%​
7​
57%​
2​
58%​
2​
59%​
0​
60%​
14​
61%​
0​
62%​
2​
63%​
5​
64%​
5​
65%​
0​
66%​
0​
67%​
13​
68%​
0​
69%​
4​
70%​
0​
71%​
3​
72%​
0​
73%​
2​


Prediction Before
H FHS%
0%​
32​
1%​
0​
2%​
0​
3%​
0​
4%​
0​
5%​
0​
6%​
0​
7%​
0​
8%​
0​
9%​
0​
10%​
0​
11%​
0​
12%​
0​
13%​
0​
14%​
0​
15%​
0​
16%​
0​
17%​
0​
18%​
0​
19%​
2​
20%​
6​
21%​
2​
22%​
5​
23%​
2​
24%​
0​
25%​
6​
26%​
0​
27%​
6​
28%​
0​
29%​
1​
30%​
4​
31%​
5​
32%​
0​
33%​
12​
34%​
0​
35%​
0​
36%​
6​
37%​
0​
38%​
6​
39%​
0​
40%​
5​
41%​
0​
42%​
2​
43%​
2​
44%​
2​
45%​
1​
46%​
1​
47%​
3​
48%​
0​
49%​
0​
50%​
31​
51%​
0​
52%​
0​
53%​
1​
54%​
2​
55%​
2​
56%​
7​
57%​
3​
58%​
3​
59%​
0​
60%​
17​
61%​
0​
62%​
2​
63%​
6​
64%​
6​
65%​
0​
66%​
0​
67%​
19​
68%​
0​
69%​
5​
70%​
1​
71%​
4​
72%​
0​
 

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
Ultimately this is not an Excel question but a data analysis question. I see three issues with your goal.

First, if x% is predicted 1 time, and x% actually occurs 1 time, it will score 100% on those predictions. Your method will then compare that to the case where y% is predicted 50 times, and y% occurs 48 times, for a score of 96%. Clearly the statistical significance in the second case is higher than the first case, but if you just compare percentages the first case looks better..

Second, you are asking to find "a range where I can say the % predicted at the start are most accurate" but you have not defined the width of that range, or how we should determine it.

Third, the data distribution of % correct predictions looks random (see image). I have not done a statistical test for randomness, just eyeballed it, but I am not even sure why it would be assumed that there would be a "sweet spot." More analysis might be needed to evaluate the distribution of correct predictions over ranges.

Perhaps we can help more if we understand the context. What is the objective of identifying a "sweet spot"?
 

Attachments

  • predictions.JPG
    predictions.JPG
    24.2 KB · Views: 5
Upvote 0

Forum statistics

Threads
1,223,991
Messages
6,175,820
Members
452,672
Latest member
missbanana

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top