Accounting for Number of Observations in a Regression when using Median data

dross

New Member
Joined
Dec 19, 2010
Messages
27
I'm evaluating a data set of Median product prices against a 100-point product rating scale.

The data I have to work with are below. The Correlation of Rating to Median Price is 0.922. Since all of the products rated so far have fallen into the rating range of 87 to 95 I want to predict the median price (dependent variable) for a broader range of ratings (independent variable). I have done so with a simple linear regression which yields an R-squared of 0.85 (ignoring the number of observations).

Question: Since I know the number of observations from which the Median data has been derived, should I take account of the number of observations in my analysis, and if so how?

DATA:

[TABLE="width: 191"]
<tbody>[TR]
[TD="class: xl65, width: 71"]Observations[/TD]
[TD="class: xl65, width: 51"]Rating[/TD]
[TD="class: xl66, width: 69"]Median Price[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]4[/TD]
[TD="class: xl65, width: 51"]95[/TD]
[TD="class: xl63"]92.5[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]6[/TD]
[TD="class: xl65, width: 51"]94[/TD]
[TD="class: xl63"]82.5[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]10[/TD]
[TD="class: xl65, width: 51"]93[/TD]
[TD="class: xl63"]49[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]29[/TD]
[TD="class: xl65, width: 51"]92[/TD]
[TD="class: xl63"]46[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]31[/TD]
[TD="class: xl65, width: 51"]91[/TD]
[TD="class: xl63"]45[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]27[/TD]
[TD="class: xl65, width: 51"]90[/TD]
[TD="class: xl63"]40[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]22[/TD]
[TD="class: xl65, width: 51"]89[/TD]
[TD="class: xl63"]45[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]6[/TD]
[TD="class: xl65, width: 51"]88[/TD]
[TD="class: xl63"]23[/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"]1[/TD]
[TD="class: xl65, width: 51"]87[/TD]
[TD="class: xl63"]13[/TD]
[/TR]
</tbody>[/TABLE]
<style>table { }td { padding-top: 1px; padding-right: 1px; padding-left: 1px; color: black; font-size: 12pt; font-weight: 400; font-style: normal; text-decoration: none; font-family: Calibri, sans-serif; vertical-align: bottom; border: medium none; white-space: nowrap; }.xl63 { text-align: center; }.xl64 { vertical-align: middle; white-space: normal; }.xl65 { text-align: center; vertical-align: middle; white-space: normal; }.xl66 { text-align: center; vertical-align: middle; }</style>
[TABLE="width: 71"]
<tbody>[TR]
[TD="class: xl63, width: 71"][/TD]
[/TR]
</tbody>[/TABLE]

[TABLE="width: 191"]
<tbody>[TR]
[/TR]
[TR]
[/TR]
[TR]
[/TR]
[TR]
[/TR]
[TR]
[TD="class: xl63"][/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"][/TD]
[TD="class: xl65, width: 51"][/TD]
[/TR]
[TR]
[TD="class: xl65, width: 71"][/TD]
[TD="class: xl65, width: 51"][/TD]
[TD="class: xl63"][/TD]
[/TR]
[TR]
[/TR]
[TR]
[/TR]
[TR]
[TD="class: xl63"][/TD]
[/TR]
</tbody>[/TABLE]
<style>table { }td { padding-top: 1px; padding-right: 1px; padding-left: 1px; color: black; font-size: 12pt; font-weight: 400; font-style: normal; text-decoration: none; font-family: Calibri, sans-serif; vertical-align: bottom; border: medium none; white-space: nowrap; }.xl63 { text-align: center; }.xl64 { vertical-align: middle; white-space: normal; }.xl65 { text-align: center; vertical-align: middle; white-space: normal; }.xl66 { text-align: center; vertical-align: middle; }</style>
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
I would think you'd need the underlying observations rather than the medians of those observations.
 
Upvote 0
I do have the observations within the rating range of 87-95. However, I've been asked to estimate median prices within the broader rating range of 85-100 from the available data. So, is what I have done valid and is there something more I could do with the known number of observations, or something else?
 
Upvote 0
[Table="width:, class:grid"][tr][td="bgcolor:#C0C0C0"][/td][td="bgcolor:#C0C0C0"]
A​
[/td][td="bgcolor:#C0C0C0"]
B​
[/td][td="bgcolor:#C0C0C0"]
C​
[/td][td="bgcolor:#C0C0C0"]
D​
[/td][td="bgcolor:#C0C0C0"]
E​
[/td][/tr][tr][td="bgcolor:#C0C0C0"]
1​
[/td][td="bgcolor:#F3F3F3"]
Observations
[/td][td="bgcolor:#F3F3F3"]
Rating
[/td][td="bgcolor:#F3F3F3"]
Median Price
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
2​
[/td][td="bgcolor:#F3F3F3"]
wi
[/td][td="bgcolor:#F3F3F3"]
xi
[/td][td="bgcolor:#F3F3F3"]
yi
[/td][td][/td][td="bgcolor:#F3F3F3"]
Correl
[/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
3​
[/td][td]
4​
[/td][td]
95​
[/td][td]
92.5​
[/td][td][/td][td]
0.775544​
[/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
4​
[/td][td]
6​
[/td][td]
94​
[/td][td]
82.5​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
5​
[/td][td]
10​
[/td][td]
93​
[/td][td]
49.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
6​
[/td][td]
29​
[/td][td]
92​
[/td][td]
46.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
7​
[/td][td]
31​
[/td][td]
91​
[/td][td]
45.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
8​
[/td][td]
27​
[/td][td]
90​
[/td][td]
40.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
9​
[/td][td]
22​
[/td][td]
89​
[/td][td]
45.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
10​
[/td][td]
6​
[/td][td]
88​
[/td][td]
23.0​
[/td][td][/td][td][/td][/tr]
[tr][td="bgcolor:#C0C0C0"]
11​
[/td][td]
1​
[/td][td]
87​
[/td][td]
13.0​
[/td][td][/td][td][/td][/tr]
[/table]


The formula in E3 is

=SUMPRODUCT(wi * (xi - SUMPRODUCT(wi * xi) / SUM(wi)) * (yi - SUMPRODUCT(wi * yi) / SUM(wi))) /
SQRT((SUMPRODUCT(wi * (xi - SUMPRODUCT(wi * xi) / SUM(wi)) ^ 2) * SUMPRODUCT(wi * (yi - SUMPRODUCT(wi * yi) / SUM(wi)) ^ 2)))
 
Upvote 0
Thanks for your replies. I'm quit slow on stat work. How does this help me estimate yi values that correspond to xi values of 85, 86, and 96-100?
 
Upvote 0
You'd need to do a weighted regression to get the slope and offset.

I'm no statistician, but doing extrapolation (versus interpolation) seems very sketchy.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,897
Messages
6,175,271
Members
452,628
Latest member
dd2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top