Trend Lines


Posted by Robert on August 23, 2001 2:51 PM

How do I establish a trend line from a set of variables, and make sense of it? I know that the more data I have, the better the trend line will be. How does standard deviation fit into this? I am very interested, but a bit mathematically challenged!

Posted by Loren on August 24, 2001 7:26 AM

Make a chart. Right-click the charted line and choose Add Trendline.
Right-click the trendline and you get more choices, such as displaying the
equation.



Posted by Eric on August 24, 2001 12:30 PM

How they're fit

You can also go to Add-Ins and select the "Analysis ToolPak" for more powerful linear regression tools.

As to your questions regarding the basics of regression analysis, the distance from each of the raw data points to the trendline is measured vertically (if x is an independent variable). This distance is called the "residual".

Since some of the values fall above the trendline and some below, some residuals are positive and some are negative. All of the residuals are squared (thus removing negative values) and added together to give the sum of squared residuals. The fitted line is the one for which that sum is as small as possible for the given set of data. This is the "least squares" method of fitting a linear model to your data.

see
http://www.intergalact.com/seam95/linfit.html
for an example
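
If it helps to see the least squares idea as arithmetic, here is a minimal sketch in plain Python. The function names (fit_line, sum_squared_residuals) and the sample numbers are made up for illustration; this is not how Excel is implemented internally, just the textbook formulas for a straight line, which give the best slope and intercept directly.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x about its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Add up the squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, roughly following y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
# Any other line, such as one with a slightly steeper slope, does worse.
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)

print("slope:", slope, "intercept:", intercept)
print("sum of squared residuals:", best, "vs nudged line:", nudged)
```

Changing the slope or intercept in either direction only makes the sum of squared residuals larger, which is what "least squares" means.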

So increasing the number of points can help you get a line that is a more realistic estimate of the relationship among the data, because it reduces the impact of "misbehaving" data on the fit. It also makes the R-squared value more trustworthy. R-squared is the fraction of the variation in the dependent variable that is accounted for by the fitted line, so it runs from 0 (the line explains nothing) to 1 (every point sits on the line).
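
For anyone who wants to see where R-squared comes from, here is a rough sketch in plain Python. The function name r_squared and both data sets are hypothetical; the calculation is the usual one, 1 minus (sum of squared residuals divided by the total squared variation of y about its mean).

```python
def r_squared(xs, ys):
    """R-squared of the least-squares straight line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the line has been fitted.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y about its own mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs = [1, 2, 3, 4]
r2_clean = r_squared(xs, [2, 4, 6, 8])  # points exactly on a line
r2_noisy = r_squared(xs, [2, 5, 5, 8])  # same trend with some scatter

print("clean:", r2_clean, "noisy:", r2_noisy)
```

The scattered data still has a high R-squared because most of its variation follows the trend; only the leftover scatter counts against it.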

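Incidentally, "linear" does not mean "straight line" in regression models. Instead it means the equation is linear in the fitted parameters: each coefficient appears only as a plain multiplier or an added constant, never inside an exponent or another function. A polynomial trendline is a curve, for example, but it is still a linear model because it is just a sum of coefficients times powers of x. The regression trendline options in Excel are all fit as linear models in this sense, some of them after the data have been transformed.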
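
As an illustration of a curved trendline being fit with straight-line machinery, here is a minimal sketch in plain Python. It assumes an exponential model y = a * exp(b * x) and fits it by taking the log of y, which turns the model into ln(y) = ln(a) + b * x. The function names and the sample data are made up, and this is only my understanding of the general approach, not a description of Excel's internals.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by fitting a straight line to (x, ln y).

    All y values must be positive for the logarithm to exist.
    """
    log_ys = [math.log(y) for y in ys]
    b, log_a = fit_line(xs, log_ys)
    return math.exp(log_a), b


# Hypothetical data generated from y = 2 * exp(0.5 * x), with no noise.
xs = [0, 1, 2, 3]
ys = [2 * math.exp(0.5 * x) for x in xs]

a, b = fit_exponential(xs, ys)
print("a:", a, "b:", b)
```

One design caveat: minimizing squared residuals in ln(y) is not quite the same as minimizing them in y itself, so with noisy data the two approaches can give slightly different curves.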

Also, I would be careful making interpretations based on standard deviation. Imagine a set of data that spans a large range, yet is fit well by a linear model. The standard deviation will be large, reflecting the broad spread of the data across its range. It takes no account of how much of that variance is explained by the linear relationship with the independent variable (X). So you could have the same standard deviation for a data set that fit very poorly and one that fit very well in a regression analysis.
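
A quick way to convince yourself of this is the sketch below, in plain Python with made-up numbers. Both data sets contain exactly the same y values, so their standard deviations are identical, but in one the values rise steadily with x and in the other they are shuffled. For a straight-line fit, R-squared equals the squared correlation between x and y, which is what the hypothetical r_squared helper computes.

```python
import statistics


def r_squared(xs, ys):
    """R-squared of a straight-line fit, via the squared correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


xs = [1, 2, 3, 4, 5, 6]
ys_ordered = [1, 2, 3, 4, 5, 6]   # rises steadily with x
ys_shuffled = [4, 1, 6, 2, 5, 3]  # same values, scrambled order

sd_ordered = statistics.stdev(ys_ordered)
sd_shuffled = statistics.stdev(ys_shuffled)
r2_ordered = r_squared(xs, ys_ordered)
r2_shuffled = r_squared(xs, ys_shuffled)

print("standard deviations:", sd_ordered, sd_shuffled)
print("R-squared:", r2_ordered, r2_shuffled)
```

The standard deviation describes the spread of the y values alone, so it cannot tell these two cases apart; R-squared can.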

I suspect the board is beginning to drift off to sleep! If you want to talk more about this please contact me at the email above.

Hope that helps