Calculations on Large list of data with Duplicates

apc1968

New Member
Joined
Nov 4, 2012
Messages
14
Hi - new to this but thanks to all for help (Excel for Mac 2011). Have searched and can find lots to help but the size of my data means am crashing by usual methods of sorting - I need a formula for this rather than pivot tables.

I have a list of c. 40,000 rows. Col A is client name (largely different but some duplicates); Col B is Policy type (eg Home, Motor, Travel) and Col C is earnings on that policy (eg £60). An example would be:

NamePolicy TypeEarnings
JohnsonHome£60
KennedyHome£55
JonesMotor£45
SmithMotor£89
JonesTravel£70
KennedyTravel£60
WilliamsHome£22
WilliamsMotor£42
JonesMotorcycle£55

<colgroup><col span="2"><col></colgroup><tbody>
</tbody>


So I need to figure out (via a formula) two things:

1) How to count duplicate names (eg appearing 3x = 1 customer; appearing 2x = 2 customers; appearing 1x = 1 customer)

2) How to build a simple table to show the relationship between policies. This would have Motor, Home, Travel, Motorcycle as columns and also as rows - the data would then be how many policies and in each cross point?

I have tried just working the data using Countifs etc, but as soon as I apply it to the whole data set the machine freezes, presumably due to its size

Really appreciate anyones help here - thank you!

A
 

Excel Facts

How to change case of text in Excel?
Use =UPPER() for upper case, =LOWER() for lower case, and =PROPER() for proper case. PROPER won't capitalize second c in Mccartney
What formulas cause the freezing?

To count how many customers have three policies,
=SUMPRODUCT(--(COUNTIF(A2:A40000,A2:A40000)=3))/3


To find out how many customers have both a Home and a Motor policy, can we assume that no one has two Home (or two Motor) policies?
 
Upvote 0
Hi - new to this but thanks to all for help (Excel for Mac 2011). Have searched and can find lots to help but the size of my data means am crashing by usual methods of sorting - I need a formula for this rather than pivot tables.

I have a list of c. 40,000 rows. Col A is client name (largely different but some duplicates); Col B is Policy type (eg Home, Motor, Travel) and Col C is earnings on that policy (eg £60). An example would be:

Name
Policy Type
Earnings
Johnson
Home
£60
Kennedy
Home
£55
Jones
Motor
£45
Smith
Motor
£89
Jones
Travel
£70
Kennedy
Travel
£60
Williams
Home
£22
Williams
Motor
£42
Jones
Motorcycle
£55

<tbody>
</tbody>


So I need to figure out (via a formula) two things:

1) How to count duplicate names (eg appearing 3x = 1 customer; appearing 2x = 2 customers; appearing 1x = 1 customer)

2) How to build a simple table to show the relationship between policies. This would have Motor, Home, Travel, Motorcycle as columns and also as rows - the data would then be how many policies and in each cross point?

I have tried just working the data using Countifs etc, but as soon as I apply it to the whole data set the machine freezes, presumably due to its size

Really appreciate anyones help here - thank you!

A

It sounds as though you need to be linking your spreadsheet to a table in an Access database.

It's very simple and painless and then you'll be able to query your data until your heart's content without any freezes :)
 
Upvote 0
What formulas cause the freezing?

To count how many customers have three policies,
=SUMPRODUCT(--(COUNTIF(A2:A40000,A2:A40000)=3))/3


To find out how many customers have both a Home and a Motor policy, can we assume that no one has two Home (or two Motor) policies?

Hi - thanks for very quick reply! - I was originally trying to use Countif on each row to show how many times each name is repeated and then analysing the number of products into columns with Policy Type for the column header - slow and caused machine to crash

On the second point - some customers may have 3 Home Policies and 4 Motor for example

Really appreciate the response - thanks

A
 
Upvote 0
Hi - I get that the formula should work - but it returns a '0' - sorry, any ideas?

Thanks
 
Upvote 0
Avoid using whole column references.
Did the formula given work with your large data base for "how many customers with n policies"?
(note: duplicate policy types will each be counted)

As to how many with Home&Moter, still working...
 
Upvote 0
Hi - tried your formula on a small test example - works perfectly. If I extend it to me 40k records then it just shows a 0 and refuses to calc it....?
 
Upvote 0

Forum statistics

Threads
1,220,965
Messages
6,157,119
Members
451,398
Latest member
rjsteward

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top