Confidence Interval Trendline

fishme147 · Jul 28, 2015

Hi All. New to the forum, so please be gentle.

I have a bubble chart with a weighted trendline on it. The single line of best fit gives a misleading impression of how predictable the data is, so I would like to replace it with a band of possible trendlines instead (essentially an x% confidence interval, where x is unlikely to be greater than 80).

I appreciate that there is no one perfect answer to this problem, so I am simply looking for the best possible solution (if it exists!)

The first thing to master is the mathematics behind the problem. Then I will aim to code something up in VBA (my experience of which is limited, so any help here would be appreciated too!).

My first approach was to split the data points into thin vertical bands (about 30 of them) and calculate the weighted standard deviation for each band. Assuming the spread within each band is normally distributed, I calculated an upper and a lower limit for each band, then fitted one trendline to the set of upper limits and another to the set of lower limits. Where this falls over slightly is in the bands with fewer than two data points (at the extremes, where the data becomes sparse), as the variance cannot be calculated there. The result (possibly influenced by this data sparsity) was that the two curves were different shapes and almost crossed at the extreme right of the chart. I feel like the lack of data in the tails should increase the distance between the lines.
I also pondered to what extent each band's spread should be influenced by its neighbouring bands' spread, but I can't work out a way to incorporate this.
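
For anyone wanting to reproduce the banding approach described above, here is a minimal sketch in Python (used instead of VBA only because it is easy to run and check). The function name `band_limits`, the equal-width bands, the population-style weighted variance and the use of `None` for sparse bands are all assumptions made for illustration; none of them come from the original workbook.

```python
import math
from statistics import NormalDist


def band_limits(xs, ys, ws, n_bands=30, level=0.80):
    """Return (band centre, lower, upper) for equal-width vertical bands.

    Assumes positive weights and a normal spread within each band.
    Bands with fewer than two points get (centre, None, None), because
    their variance cannot be estimated.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bands or 1.0  # guard against all-equal x values

    bands = [[] for _ in range(n_bands)]
    for x, y, w in zip(xs, ys, ws):
        i = min(int((x - lo) / width), n_bands - 1)  # put x == hi in last band
        bands[i].append((y, w))

    out = []
    for i, pts in enumerate(bands):
        centre = lo + (i + 0.5) * width
        if len(pts) < 2:
            out.append((centre, None, None))
            continue
        total_w = sum(w for _, w in pts)
        mean = sum(y * w for y, w in pts) / total_w
        # Weighted variance, population form (divides by the total weight).
        var = sum(w * (y - mean) ** 2 for y, w in pts) / total_w
        half = z * math.sqrt(var)
        out.append((centre, mean - half, mean + half))
    return out
```

The `None` entries make the sparse-band problem explicit: those bands simply carry no spread information, so any curve fitted through the remaining limits is unconstrained in the tails.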

The chart that I have been testing on has around 500 data points. However, I hope to repeat this process for many different bubble charts, with varying numbers of data points.

Any thoughts?

Many thanks

Ben Nevis · Jul 30, 2015

Hi, Please see the answer I posted yesterday in the following thread:
http://www.mrexcel.com/forum/excel-...tatistical-tool-use.html?highlight=confidence
Your question would need to be approached in a different way and I think I would need to see the data/understand what you are trying to do to provide an answer.
If you think I might be able to help let me know.

Ben Nevis · Jul 31, 2015

Hi, If you're just interested in the mathematics, I hope the following is helpful.
A3:B12 hold the data: crop yields (Y) in column A and fertiliser amounts (X) in column B.
If you set up an X Y scatter plot with straight lines using A21:B25, that will give you a regression line for the data.
If you add the Lower and Upper 90% Confidence Levels in D21:D25 and E21:E25 respectively, you will find that they form curves that diverge from the regression line, as you indicated you would expect to see: "I feel like the lack of data in the tails should increase the distance between the lines."
The value of 1.86 that prefixes the formulae in C21:C25 is the two-sided 90% critical point of Student's t for n - 2 = 8 degrees of freedom.
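
Written out, the half-width calculated in C21:C25 appears to be the standard confidence interval for the mean response in simple linear regression. The symbols are my own labels for the worksheet cells: s is the regression standard error (B33), n the number of observations (B17), x-bar the mean of X (B16) and S_xx the corrected sum of squares of X (B14).

```latex
\hat{y}(x_0) \;\pm\; t_{\alpha/2,\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},
\qquad
S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}
```

The (x_0 - x-bar)^2 term is what makes the band widen towards the ends of the data: it is zero at the mean of X and grows with the square of the distance from it.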

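Excel 2007

| | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| 1 | | | | | | | | | |
| 2 | Y | X | X^2 | X*Y | | | | | |
| 3 | 12 | 2 | 4 | 24 | | | | | |
| 4 | 13 | 2 | 4 | 26 | | | | | |
| 5 | 13 | 3 | 9 | 39 | | | | | |
| 6 | 14 | 3 | 9 | 42 | | | | | |
| 7 | 15 | 4 | 16 | 60 | | | | | |
| 8 | 15 | 4 | 16 | 60 | | | | | |
| 9 | 14 | 5 | 25 | 70 | | | | | |
| 10 | 16 | 5 | 25 | 80 | | | | | |
| 11 | 17 | 6 | 36 | 102 | | | | | |
| 12 | 18 | 6 | 36 | 108 | | | | | |
| 13 | | | | | | | | | |
| 14 | Sxx | 20 | | | | | | | |
| 15 | Sxy | 23 | | | | | | | |
| 16 | Mean X | 4 | | | | | | | |
| 17 | n X | 10 | | | | | | | |
| 18 | | | | | | | | | |
| 19 | | | | | | | | | |
| 20 | X | Y | | Lower Confidence Interval | Upper Confidence Interval | | | | |
| 21 | 2 | | | | | | | | |
| 22 | 3 | | | | | | | | |
| 23 | 4 | | | | | | | | |
| 24 | 5 | | | | | | | | |
| 25 | 6 | | | | | | | | |
| 26 | | | | | | | | | |
| 27 | SUMMARY OUTPUT | | | | | | | | |
| 28 | | | | | | | | | |
| 29 | Regression Statistics | | | | | | | | |
| 30 | Multiple R | 0.907738 | | | | | | | |
| 31 | R Square | 0.823988 | | | | | | | |
| 32 | Adjusted R Square | 0.801986 | | | | | | | |
| 33 | Standard Error | 0.840387 | | | | | | | |
| 34 | Observations | 10 | | | | | | | |
| 35 | | | | | | | | | |
| 36 | ANOVA | | | | | | | | |
| 37 | | df | SS | MS | F | Significance F | | | |
| 38 | Regression | 1 | 26.45 | 26.45 | 37.4513274 | 0.000283 | | | |
| 39 | Residual | 8 | 5.65 | 0.70625 | | | | | |
| 40 | Total | 9 | 32.1 | | | | | | |
| 41 | | | | | | | | | |
| 42 | | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 90.0% | Upper 90.0% |
| 43 | Intercept | 10.1 | 0.797261 | 12.6683744 | 1.417E-06 | 8.261512 | 11.9384882 | 8.61745477 | 11.58255 |
| 44 | X | 1.15 | 0.187916 | 6.11974897 | 0.00028325 | 0.716664 | 1.58333583 | 0.80056074 | 1.499439 |

Sheet1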

Cell Formulas
Range	Formula
A1	=SUM(A3:A12)
B1	=SUM(B3:B12)
B14	=C1-B1^2/B17
B15	=D1-(B1*A1/B17)
B16	=AVERAGE(B3:B12)
B17	=COUNT(A3:A12)
B21	=B43+B44*A21
B22	=B43+B44*A22
B23	=B43+B44*A23
B24	=B43+B44*A24
B25	=B43+B44*A25
C1	=SUM(C3:C12)
C3	=B3*B3
C4	=B4*B4
C5	=B5*B5
C6	=B6*B6
C7	=B7*B7
C8	=B8*B8
C9	=B9*B9
C10	=B10*B10
C11	=B11*B11
C12	=B12*B12
C21	=1.86*B33*SQRT(1/B17+(A21-B16)^2/B14)
C22	=1.86*B33*SQRT(1/B17+(A22-B16)^2/B14)
C23	=1.86*B33*SQRT(1/B17+(A23-B16)^2/B14)
C24	=1.86*B33*SQRT(1/B17+(A24-B16)^2/B14)
C25	=1.86*B33*SQRT(1/B17+(A25-B16)^2/B14)
D1	=SUM(D3:D12)
D3	=A3*B3
D4	=A4*B4
D5	=A5*B5
D6	=A6*B6
D7	=A7*B7
D8	=A8*B8
D9	=A9*B9
D10	=A10*B10
D11	=A11*B11
D12	=A12*B12
D21	=B21-C21
D22	=B22-C22
D23	=B23-C23
D24	=B24-C24
D25	=B25-C25
E21	=B21+C21
E22	=B22+C22
E23	=B23+C23
E24	=B24+C24
E25	=B25+C25
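
As a cross-check on the worksheet, the same calculation can be sketched outside Excel. The Python below is only an illustration: the function name `confidence_band` and the variable names are my own, the data are the ten rows from A3:B12, and the critical value 1.86 is hard-coded as in the post instead of being looked up.

```python
import math

yields = [12, 13, 13, 14, 15, 15, 14, 16, 17, 18]  # column A (Y)
fert = [2, 2, 3, 3, 4, 4, 5, 5, 6, 6]              # column B (X)
T_CRIT = 1.86  # Student's t critical point for 8 df, as used in C21:C25


def confidence_band(x, y, x_new, t_crit):
    """Return (x0, fitted, lower, upper) for each x0 in x_new."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)  # same quantity as B14
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # B15
    slope = sxy / sxx                      # B44
    intercept = mean_y - slope * mean_x    # B43
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))           # regression standard error, B33

    rows = []
    for x0 in x_new:
        fit = intercept + slope * x0
        # Half-width of the interval, mirroring the formula in C21:C25.
        half = t_crit * s * math.sqrt(1 / n + (x0 - mean_x) ** 2 / sxx)
        rows.append((x0, fit, fit - half, fit + half))
    return rows


for x0, fit, lower, upper in confidence_band(fert, yields, [2, 3, 4, 5, 6], T_CRIT):
    print(f"{x0}  {fit:.3f}  {lower:.3f}  {upper:.3f}")
```

Each printed row corresponds to one of rows 21 to 25 (X, fitted Y, lower limit, upper limit). The interval is narrowest at X = 4, the mean of X, and widens symmetrically towards X = 2 and X = 6.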
