Statistics & the Chi Square function

rujedi

New Member
Joined
Aug 7, 2017
Messages
4
Hi everyone,

I'm teaching statistics and started incorporating Excel alongside R. When we got to running a Chi-Square test, I noticed something very odd that I cannot figure out. Can someone tell me why I get different results with the same formula function.
Note, I'm using Excel 365 on Windows 10.


[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD][/TD]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Observed[/TD]
[TD]past[/TD]
[TD]present[/TD]
[TD]future[/TD]
[TD]row.total[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]up[/TD]
[TD]12[/TD]
[TD]45[/TD]
[TD]78[/TD]
[TD]135[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]down[/TD]
[TD]23[/TD]
[TD]56[/TD]
[TD]89[/TD]
[TD]168[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]col.total[/TD]
[TD]35[/TD]
[TD]101[/TD]
[TD]167[/TD]
[TD]303[/TD]
[/TR]
</tbody>[/TABLE]

[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD][/TD]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]Expected[/TD]
[TD]past[/TD]
[TD]present[/TD]
[TD]future[/TD]
[TD]row.total[/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]up[/TD]
[TD]15.59[/TD]
[TD]45.00[/TD]
[TD]74.41[/TD]
[TD]135[/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]down[/TD]
[TD]19.41[/TD]
[TD]56.00[/TD]
[TD]92.59[/TD]
[TD]168[/TD]
[/TR]
[TR]
[TD]10[/TD]
[TD]col.total[/TD]
[TD]35[/TD]
[TD]101[/TD]
[TD]167[/TD]
[TD]303[/TD]
[/TR]
</tbody>[/TABLE]


When I use =CHISQ.TEST(B2:D3,B8:D9) I get a p-value of 0.4051 which is correct.
BUT.....
If I rearrange the tables to look like below, I get a different p-value.
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD][/TD]
[TD]H[/TD]
[TD]I[/TD]
[TD]J[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD][/TD]
[TD]observed[/TD]
[TD]expected[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]up past[/TD]
[TD]12[/TD]
[TD]15.59[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]up present[/TD]
[TD]45[/TD]
[TD]45.00[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]up future[/TD]
[TD]78[/TD]
[TD]74.41[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]down past[/TD]
[TD]23[/TD]
[TD]19.41[/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD]down present[/TD]
[TD]56[/TD]
[TD]56.00[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]down future[/TD]
[TD]89[/TD]
[TD]92.59[/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]total[/TD]
[TD]303[/TD]
[TD]303[/TD]
[/TR]
</tbody>[/TABLE]

When I use =CHISQ.TEST(I2:I8,J2:J8) I get a p-value of 0.8741 which is not correct.

What's going on that I am missing? Does it have something to do with how the degrees of freedom are being counted?
I didn't see an function option to specify a Goodness of Fit or Test for Independence, which would change how df is counted.

Is there a function to produce the Chi-Square statistic?
At first I thought it was =CHISQ.TEST(), because =CHISQ.DIST.RT also gives me the p-value if I provide it the Chi-Square statistic and the degrees of freedom.
When I start to type =CHI all of the options that pop-up have something to do with the probability and not providing the statistic.
I don't see Chi-Square listed in the Data Analysis Tookpak.

Thanks for the assist!
Shawn
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
You are right it has to do with the df.
df=(rows-1)(columns-1)

In your first table you have 2 rows and 3 columns = (2-1)(3-1)=2 df
Your second table has 6 rows and 2 columns = (6-1)(2-1)=5 df

If you calculate chi-square in both cases you will get 1.803
I don't know of a function in Excel that will only calculate chi-square.

Excel Workbook
ABCDEF
1Observedpastpresentfuturerow.total
2up124578135
3down235689168
4col.total35101167303
5
6
Table 1
Excel Workbook
ABCDEF
1observedexpectedchi-square
2up past1215.590.82669
3up present45450.00000
4up future7874.410.17320
5down past2319.410.66399
6down present56560.00000
7down future8992.590.13920
8total3033031.80308= chi-square
9
10
11chi-square1.803
12df5
13p value0.8756765
Table 2
 
Upvote 0
I noticed that Excel Jeanie didn't post everthing above.

Excel Workbook
ABCDEF
1Observedpastpresentfuturerow.total
2up124578135
3down235689168
4col.total35101167303
5
6
7Expectedpastpresentfuturerow.total
8up15.594574.41135
9down19.415692.59168
10col.total35101167303
11
12
13chi-square1.803082
14df=2
15p value0.405944
16
Table 1



Excel Workbook
ABCDEF
1observedexpectedchi-square
2up past1215.590.82669
3up present45450.00000
4up future7874.410.17320
5down past2319.410.66399
6down present56560.00000
7down future8992.590.13920
8total3033031.80308= chi-square
9
10
11chi-square1.803
12df5
13p value0.8756765
Table 2
 
Upvote 0
Thank you AhoyNC!
I had a feeling it was the degrees of freedom tripping up the function. Too bad I cannot specify the degrees of freedom in that function or specify if it is a Goodness of Fit or Test of Independence Chi-Square since df is different for each.

This also means that I will have to be very specific if allow my students to reshape the table to manually calculate Chi-Square (less space & easier to drag formulas) since it will change the df. They'll have to use the =CHI.DIST.RT(x, df) function and input the df since the =CHISQ.TEST(obs, exp) function doesn't have the ability to set the df.

Excel Jeanie HTML 4--Very neat trick. That will surely save me work trying to replicate my spreadsheet in a forum post.

I find it interesting, that Excel offers two ways to get the p-value:
=CHI.DIST.RT(x, df) when we already have the Chi-Square and df, and
=CHISQ.TEST(observed, expected) when we don't yet have the Chi-Square or df.

Curious though--one of my students started using =CHISQ.INV.RT(probability, df) to generate the Chi-Square statistic. Sometimes this works and seems exact, while other times it is in the ball park but not the same value.

Using the example of Excel tables, =CHISQ.INV.RT(B13, B14) results in a Chi-Square of 1.807. Pretty spot on this time.
 
Upvote 0
Your students could add a 3rd table to calculate the (obs-exp)^2/exp as in cell B14 below and the just copy the formula across and down. Then just use the SUM function to get the Chi-square.
Excel Workbook
ABCDE
1Observedpastpresentfuturerow.total
2up124578135
3down235689168
4col.total35101167303
5
6
7Expectedpastpresentfuturerow.total
8up15.594574.41135
9down19.415692.59168
10col.total35101167303
11
12
13(Obs-Exp)^2/Exppastpresentfuture
140.82670.00000.1732
150.66400.00000.1392
16
17Chi-square1.80308
Sheet
 
Upvote 0
I wholly agree AhoyNC. I actually make them do the Chi-Square step by step in Excel, to eventually sum the Chi-Square, look up the critical value on stats tables and determine the statistical significance. This was usually to help make sure they knew how to do it 'by hand' and double-check the results they get from computing the value in R (or SPSS or Stata depending which class I'm teaching).

Only recently did we start exploring Excel's ability calculate p-values and try to calculate the Chi-Square statistic, which is what led to my initial confusion.

Thanks again for the assist!
 
Upvote 0

Forum statistics

Threads
1,223,228
Messages
6,170,875
Members
452,363
Latest member
merico17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top