BigOlBuick
New Member (joined Dec 22, 2016; 9 messages)
Hello, I have a beginner's question about multiple regression, so a stats question really. I've only recently learned the basics of linear regression and I still have the following nagging doubt.
I'd like to analyse some sales data for the purpose of forecasting future performance. My dependent variable (Y) is 'profit/loss', which is simply the profit or loss figure for each individual retail item. This is the variable I would like to forecast. There are certain quantifiable conditions for each attempted sale of an item (item sales rank, duration of item availability, advertising expenditure, etc.) and these are my independent variables. My question stems from the fact that the historical values I have for Y are either a non-negative number (ranging from 0 to 1000) or a fixed negative value of -100. An item may be sold for any amount of profit, but the wholesale price to the seller of each item is the same, hence the same fixed loss amount for any unsold item. A sample of the data for Y might look like this (note the fixed negative value of -100 in a few instances):
23
55
201
-100
13
-100
321
124
57
-100
33
It's my understanding that a multiple regression model here would produce varying negative (and positive) values for Y, and this is not my issue. What I'd like to know is, are there any other implications of using this sort of input in a regression model? Or can it be treated in the same way as any ratio type data? Perhaps it sounds silly but I'm wondering whether the fixed negative values might somehow distort a regression model's output. I'm not trying to replicate the fixed -100 value for the losses, only trying to get to true averages such that I may accurately predict the profitability of an item's listing for sale given the pre-sale conditions (and avoid unprofitable listings). Hope this all makes sense. Thank you.
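To make the "true averages" idea concrete, here is a minimal sketch in Python (my choice of language; the names `LOSS`, `sold` and `p_sold` are only illustrative) using just the sample values listed above. It shows that the plain average of Y is the same number you get by combining the share of items sold, the average profit of sold items, and the fixed -100 loss. It does not involve any of the independent variables, so it is only a sanity check on the sample, not a regression model.

```python
from statistics import fmean

# Sample Y values copied from the post; -100 marks an unsold item.
y = [23, 55, 201, -100, 13, -100, 321, 124, 57, -100, 33]
LOSS = -100  # fixed loss per unsold item, as described in the post

# Split the sample into sold items and compute the share that sold.
sold = [v for v in y if v != LOSS]
p_sold = len(sold) / len(y)

overall_mean = fmean(y)      # average over all listings, losses included
mean_if_sold = fmean(sold)   # average profit among sold items only

# Law of total expectation:
# overall mean = P(sold) * mean(Y | sold) + P(unsold) * LOSS
recombined = p_sold * mean_if_sold + (1 - p_sold) * LOSS

print(f"share sold:           {p_sold:.3f}")
print(f"mean profit if sold:  {mean_if_sold:.3f}")
print(f"overall mean:         {overall_mean:.3f}")
print(f"recombined mean:      {recombined:.3f}")
```

The last two printed numbers agree (about 47.9 for this sample). A least-squares fit with only an intercept would return exactly this overall mean, so the -100 values enter the fit as ordinary numbers; my question is whether that remains unproblematic once the independent variables are added.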