WestHillsWeb
New Member
- Joined
- Feb 24, 2012
- Messages
- 2
Hello.
This question ought to be fun for the real "I can do anything with Excel" type of people. Here's the situation:
I have a membership site and I'd like to figure out how to calculate attrition and retention rates from a table of sales data.
Most other discussion about attrition and retention rates deal with data sets that look different than mine. But basically, I have a bunch of sales data, one record for each sale, and there is no indication in the data of whether someone is in their first month of membership or their 2nd or their 7th.
Here's 10 months of some made up data that is similar to what I have:
I have 3 years of real data, by the way, and what I want to know is:
* What is the average length that someone remains a member, and what is that worth?
(There are other questions I'd ask of this data, but this is my most important question at the moment.)
If I simply grouped by name and counted records for each unique name, I'd end up with a skewed number. That's because the beginning of the data is not the beginning of time for this business. There were transactions in the prior year as well, but I don't have access to those records. So, with this sample data, Joe Smith could be in his first month or 10th. I just don't know. (Let's call this the "Starting Skew")
Similarly, if I just do an "average number of records for each name", the data will be skewed because the people who are still subscribers at the end may go on for many more months. (Let's call this the "Ending Skew")
So it seems to me that asking for "the average length of a membership" for all of the data at the same time is not a question that can be answered accurately.
You'd have to ask the question of a smaller set of data, right?
So, with my real 3 years worth of data, I figure I'd want to take one month, let's say June 2010, and calculate the average length of membership for people who joined in that month. That eliminates the Starting Skew.
And I would do those calculations for only a certain number of months (let's say, 12 months) so that I eliminate the Ending Skew.
Then, I could compare that average with the average length of membership for people who joined in July 2010. And then I could compare August 2010, and so on. I'd hope to find that my average length of membership would increase over time.
So, finally, MY QUESTION IS...
How do I calculate the average length of membership for someone who starts in a particular month? Is it even possible to do with excel? Or do I need to write some software (in visual basic, perl, or something else) to iterate over the data and figure this out for me the hard way?
I figure the algorithm for this calculation is:
* Find all the people whose first month was June 2010
* Delete or filter out all other data for people who had joined prior.
* Delete or filter out all other data for people who joined later.
* Delete ALL data before June 2010.
* Delete ALL data after May 2010.
Note: That leaves me with 12 months of data for ONLY those people who started in June 2010.
* Group the remaining data by member
* Subtotal (count) the number of records for each person in the data.
* Average the subtotals
Does that make sense?
The hard part here seems to me to be step 1: How do you find all the people whose FIRST records are in a particular month (without having to do it manually...I have 30,000 records!)
I think the rest is something I could handle doing manually. It's that first step that's the problem. Isn't it?
Any ideas?
Thank you VERY much for any and all help.
--Mark
p.s. I have Excel 2007. I have Access 2003 on another machine, if that's a better tool for this. But I'd rather do this in Excel if I can. I created the sample data for you to test with. THANK YOU!!!
This question ought to be fun for the real "I can do anything with Excel" type of people. Here's the situation:
I have a membership site and I'd like to figure out how to calculate attrition and retention rates from a table of sales data.
Most other discussion about attrition and retention rates deal with data sets that look different than mine. But basically, I have a bunch of sales data, one record for each sale, and there is no indication in the data of whether someone is in their first month of membership or their 2nd or their 7th.
Here's 10 months of some made up data that is similar to what I have:
Code:
Date,Name,Amount
1/10/2011,Joe Smith,$10
1/12/2011,Bill Gates,$10
2/10/2011,Joe Smith,$10
2/12/2011,Bill Gates,$10
2/20/2011,Thomas Edison,$10
2/27/2011,Steve Jobs,$10
3/10/2011,Joe Smith,$10
3/12/2011,Bill Gates,$10
3/15/2011,Henry Ford,$10
3/27/2011,Steve Jobs,$10
4/2/2011,Mark Zuckerberg,$10
4/10/2011,Joe Smith,$10
4/15/2011,Henry Ford,$10
4/27/2011,Steve Jobs,$10
5/2/2011,Mark Zuckerberg,$10
5/6/2011,Lou Gehrig,$10
5/10/2011,Joe Smith,$10
5/15/2011,Henry Ford,$10
5/27/2011,Steve Jobs,$10
5/28/2011,Babe Ruth,$10
6/2/2011,Mark Zuckerberg,$10
6/6/2011,Lou Gehrig,$10
6/10/2011,Joe Smith,$10
6/12/2011,Magic Johnson,$10
6/15/2011,Henry Ford,$10
6/27/2011,Steve Jobs,$10
6/28/2011,Babe Ruth,$10
7/2/2011,Mark Zuckerberg,$10
7/6/2011,Lou Gehrig,$10
7/7/2011,Freddie Mercury,$10
7/10/2011,Joe Smith,$10
7/12/2011,Magic Johnson,$10
7/15/2011,Henry Ford,$10
7/27/2011,Steve Jobs,$10
7/28/2011,Babe Ruth,$10
8/2/2011,Mark Zuckerberg,$10
8/6/2011,Lou Gehrig,$10
8/7/2011,Freddie Mercury,$10
8/10/2011,Joe Smith,$10
8/12/2011,Magic Johnson,$10
8/15/2011,Henry Ford,$10
8/27/2011,Steve Jobs,$10
8/28/2011,Babe Ruth,$10
9/2/2011,Mark Zuckerberg,$10
9/6/2011,Lou Gehrig,$10
9/7/2011,Freddie Mercury,$10
9/9/2011,Steven Tyler,$10
9/10/2011,Joe Smith,$10
9/15/2011,Henry Ford,$10
9/27/2011,Steve Jobs,$10
9/28/2011,Babe Ruth,$10
10/2/2011,Mick Jagger,$10
10/6/2011,Lou Gehrig,$10
10/7/2011,Freddie Mercury,$10
10/9/2011,Steven Tyler,$10
10/10/2011,Joe Smith,$10
10/15/2011,Henry Ford,$10
10/27/2011,Steve Jobs,$10
10/28/2011,Babe Ruth,$10
* What is the average length that someone remains a member, and what is that worth?
(There are other questions I'd ask of this data, but this is my most important question at the moment.)
If I simply grouped by name and counted records for each unique name, I'd end up with a skewed number. That's because the beginning of the data is not the beginning of time for this business. There were transactions in the prior year as well, but I don't have access to those records. So, with this sample data, Joe Smith could be in his first month or 10th. I just don't know. (Let's call this the "Starting Skew")
Similarly, if I just do an "average number of records for each name", the data will be skewed because the people who are still subscribers at the end may go on for many more months. (Let's call this the "Ending Skew")
So it seems to me that asking for "the average length of a membership" for all of the data at the same time is not a question that can be answered accurately.
You'd have to ask the question of a smaller set of data, right?
So, with my real 3 years worth of data, I figure I'd want to take one month, let's say June 2010, and calculate the average length of membership for people who joined in that month. That eliminates the Starting Skew.
And I would do those calculations for only a certain number of months (let's say, 12 months) so that I eliminate the Ending Skew.
Then, I could compare that average with the average length of membership for people who joined in July 2010. And then I could compare August 2010, and so on. I'd hope to find that my average length of membership would increase over time.
So, finally, MY QUESTION IS...
How do I calculate the average length of membership for someone who starts in a particular month? Is it even possible to do with excel? Or do I need to write some software (in visual basic, perl, or something else) to iterate over the data and figure this out for me the hard way?
I figure the algorithm for this calculation is:
* Find all the people whose first month was June 2010
* Delete or filter out all other data for people who had joined prior.
* Delete or filter out all other data for people who joined later.
* Delete ALL data before June 2010.
* Delete ALL data after May 2010.
Note: That leaves me with 12 months of data for ONLY those people who started in June 2010.
* Group the remaining data by member
* Subtotal (count) the number of records for each person in the data.
* Average the subtotals
Does that make sense?
The hard part here seems to me to be step 1: How do you find all the people whose FIRST records are in a particular month (without having to do it manually...I have 30,000 records!)
I think the rest is something I could handle doing manually. It's that first step that's the problem. Isn't it?
Any ideas?
Thank you VERY much for any and all help.
--Mark
p.s. I have Excel 2007. I have Access 2003 on another machine, if that's a better tool for this. But I'd rather do this in Excel if I can. I created the sample data for you to test with. THANK YOU!!!
Last edited: