How to Create a Trendline for Data Points of Different Weights?

dmmx3 · Mar 2, 2012

I have a scatter plot of X and Y data points. Each point is made up of some number of items. How can I weight the points so that the trendline doesn't count a point that is made up of 10 items the same as it counts a point that is made up of 100 items?

An example of my data:

X .... Y .... (Number of items making up the data point)
1 .... .459 .... (45)
2 .... .509 .... (118)
3 .... .551 .... (231)
4 .... .589 .... (218)
5 .... .609 .... (124)
6 .... .698 .... (54)

Thanks in advance for any advice.

(Also, I should note I played with doing this manually by simply copying each data point the number of times in the parentheses, and then using the much larger data set. I think this works, but I have far too much data to do this with everything. I assume there is an easier way?)

dmmx3 · Mar 3, 2012

After some more trial and error, and googling, I think what I want to do is run a logistic regression.

Anyone have any tips on doing this in Excel? I'm not sure Excel can do this?

I have variables Y1, Y2, and X (binary).

Any help would be greatly appreciated.

energman58 · Mar 4, 2012

Is this any help?

http://blog.excelmasterseries.com/2010/04/using-logistic-regression-in-excel-to.html

dmmx3 · Mar 4, 2012

energman58 said:
Is this any help?

http://blog.excelmasterseries.com/2010/04/using-logistic-regression-in-excel-to.html

Thank you for the suggestion - I actually worked through this video, spending an hour or 2. I could not get it to work. The solver kept using variables that would return bad values, and would crash.

shg · Mar 4, 2012

I have variables Y1, Y2, and X (binary).

That's not what your example shows.

dmmx3 · Mar 4, 2012

shg said:
That's not what your example shows.

Right - after more research (as I stated) I altered what I am trying to do.

shg · Mar 4, 2012

So what does your data look like now? Or do you have it all sorted out?

dmmx3 · Mar 4, 2012

shg said:
So what does your data look like now? Or do you have it all sorted out?

Thanks for your interest. I do not have this figured out. I can manipulate the data various ways. Let me describe what I am trying to do - maybe that will make this clearer.

I have about 16000 data points. Each point represents one college basketball game. Each point has the following information:

1) Spread (The line determined by the sportsbook)
2) Total (The total score determined by the sportsbook)
3) Win/Loss (Did the home team win?)

So, a sample of data would look like this:

-3 ... 143 ... 1
-1 ... 127 ... 1
+3 ... 136 ... 1
-4 ... 149 ... 0

in order, those 4 data points represent the following information:

home team favored by 3, predicted total 143, home team wins
home team favored by 1, predicted total 127, home team wins
home team underdog by 3, predicted total 136, home team wins
home team favored by 4, predicted total 149, home team loses

As I said, I have about 16K of these points. I want to run a logistic regression to determine an equation that would give me an expected winning percentage for the home team, given the inputs of the spread and the total.

Again, thanks for your interest, and thanks in advance for any help.

(By the way, I can do this in Excel using a logarithmic or polynomial regression, but I need to do a logistic one.)

energman58 · Mar 5, 2012

Solver does that a lot! It can also claim to find solutions which violate constraints so you need to examine the solution.

I suspect you have way too much data for the cheap solver that you get free with Excel. If you don't have access to a better solver maybe you could try it with a random sample of your 16,000 points? A thousand should be plenty for this so long as they were randomly selected from your population but the solver shipped with Excel may be challenged even by this as it is really only a toy/marketing tool to get you to spend a lot of money (a thousand dollars minimum) on the real thing. There is a free industrial-grade MIP solver called LPSolve which has an Excel/VBA implementation which seems to be quite good (I have used it a bit) but to use it you have to write your own LP formulation (which Solver does for you behind the scenes) which is a bit of a pain.

Alternatively you could download free trial versions of the "grown up" solvers from Frontline (WWW.solver.com) if you only want to do this once.

shg · Mar 5, 2012

So:

F(z) = 1 / (1 + exp(-z))

z = Beta0 + Beta1 * Spread + Beta2 * Total

Use Solver to minimize RMS error (E7) by changing Beta0, Beta1, and Beta2, with a constraint that Beta0 + Beta1 * Spread + Beta2 * Total < 700 to prevent overflow.

Code:

       --B--- --C-- --D-- --E--- -----------------------------F------------------------------
   2                Beta0  1.007 via Solver                                                  
   3                Beta1 65.885 via Solver                                                  
   4                Beta2  0.694 via Solver                                                  
   5                                                                                         
   6                      Error                                                              
   7                       0.000 E7: {=SQRT(AVERAGE((D9:D27-E9:E27)^2))}                     
   8   Spread Total  W/L   Fit                                                               
   9        3   115     1  1.000 E9 and down: =1 / (1 + 1 / EXP(Beta0 + Beta1*B9 + Beta2*C9))
  10        0   110     1  1.000                                                             
  11       -2   145     0  0.000                                                             
  12       -1   134     1  1.000                                                             
  13        3   146     1  1.000                                                             
  14       -4   139     0  0.000                                                             
  15        1   135     1  1.000                                                             
  16        2   127     1  1.000                                                             
  17       -4   150     0  0.000                                                             
  18        0   135     1  1.000                                                             
  19       -1   148     1  1.000                                                             
  20       -1   144     1  1.000                                                             
  21       -2   133     0  0.000                                                             
  22        3   121     1  1.000                                                             
  23        3   143     1  1.000                                                             
  24        3   134     1  1.000                                                             
  25       -1   142     1  1.000                                                             
  26       -4   144     0  0.000                                                             
  27        2   133     1  1.000

How to Create a Trendline for Data Points of Different Weights?

dmmx3

New Member

dmmx3

New Member

energman58

Well-known Member

dmmx3

New Member

shg

MrExcel MVP

dmmx3

New Member

shg

MrExcel MVP

dmmx3

New Member

energman58

Well-known Member

shg

MrExcel MVP

Similar threads

Share this page

How to Create a Trendline for Data Points of Different Weights?

New Member

New Member

Well-known Member

New Member

MrExcel MVP

New Member

MrExcel MVP

New Member

Well-known Member

MrExcel MVP

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock