Calculations on Large list of data with Duplicates

apc1968 · Nov 4, 2012

Hi - new to this but thanks to all for help (Excel for Mac 2011). Have searched and can find lots to help but the size of my data means am crashing by usual methods of sorting - I need a formula for this rather than pivot tables.

I have a list of c. 40,000 rows. Col A is client name (largely different but some duplicates); Col B is Policy type (eg Home, Motor, Travel) and Col C is earnings on that policy (eg £60). An example would be:

Name	Policy Type	Earnings

Johnson	Home	£60
Kennedy	Home	£55
Jones	Motor	£45
Smith	Motor	£89
Jones	Travel	£70
Kennedy	Travel	£60
Williams	Home	£22
Williams	Motor	£42
Jones	Motorcycle	£55

So I need to figure out (via a formula) two things:

1) How to count duplicate names (eg appearing 3x = 1 customer; appearing 2x = 2 customers; appearing 1x = 1 customer)

2) How to build a simple table to show the relationship between policies. This would have Motor, Home, Travel, Motorcycle as columns and also as rows - the data would then be how many policies and in each cross point?

I have tried just working the data using Countifs etc, but as soon as I apply it to the whole data set the machine freezes, presumably due to its size

Really appreciate anyones help here - thank you!

A

mikerickson · Nov 4, 2012

What formulas cause the freezing?

To count how many customers have three policies,
=SUMPRODUCT(--(COUNTIF(A2:A40000,A2:A40000)=3))/3

To find out how many customers have both a Home and a Motor policy, can we assume that no one has two Home (or two Motor) policies?

BengalMagic · Nov 4, 2012

apc1968 said:
Hi - new to this but thanks to all for help (Excel for Mac 2011). Have searched and can find lots to help but the size of my data means am crashing by usual methods of sorting - I need a formula for this rather than pivot tables.

I have a list of c. 40,000 rows. Col A is client name (largely different but some duplicates); Col B is Policy type (eg Home, Motor, Travel) and Col C is earnings on that policy (eg £60). An example would be:

Name
Policy Type
Earnings

Johnson
Home
£60

Kennedy
Home
£55

Jones
Motor
£45

Smith
Motor
£89

Jones
Travel
£70

Kennedy
Travel
£60

Williams
Home
£22

Williams
Motor
£42

Jones
Motorcycle
£55

<tbody>
</tbody>

So I need to figure out (via a formula) two things:

1) How to count duplicate names (eg appearing 3x = 1 customer; appearing 2x = 2 customers; appearing 1x = 1 customer)

2) How to build a simple table to show the relationship between policies. This would have Motor, Home, Travel, Motorcycle as columns and also as rows - the data would then be how many policies and in each cross point?

I have tried just working the data using Countifs etc, but as soon as I apply it to the whole data set the machine freezes, presumably due to its size

Really appreciate anyones help here - thank you!

A

It sounds as though you need to be linking your spreadsheet to a table in an Access database.

It's very simple and painless and then you'll be able to query your data until your heart's content without any freezes

apc1968 · Nov 4, 2012

mikerickson said:
What formulas cause the freezing?

To count how many customers have three policies,
=SUMPRODUCT(--(COUNTIF(A2:A40000,A2:A40000)=3))/3

To find out how many customers have both a Home and a Motor policy, can we assume that no one has two Home (or two Motor) policies?

Hi - thanks for very quick reply! - I was originally trying to use Countif on each row to show how many times each name is repeated and then analysing the number of products into columns with Policy Type for the column header - slow and caused machine to crash

On the second point - some customers may have 3 Home Policies and 4 Motor for example

Really appreciate the response - thanks

A

apc1968 · Nov 4, 2012

Hi - I get that the formula should work - but it returns a '0' - sorry, any ideas?

Thanks

mikerickson · Nov 4, 2012

Avoid using whole column references.
Did the formula given work with your large data base for "how many customers with n policies"?
(note: duplicate policy types will each be counted)

As to how many with Home&Moter, still working...

apc1968 · Nov 4, 2012

Hi - tried your formula on a small test example - works perfectly. If I extend it to me 40k records then it just shows a 0 and refuses to calc it....?

Calculations on Large list of data with Duplicates

apc1968

New Member

Excel Facts

mikerickson

MrExcel MVP

BengalMagic

Banned user

apc1968

New Member

apc1968

New Member

mikerickson

MrExcel MVP

apc1968

New Member

Similar threads

Forum statistics

Share this page

Calculations on Large list of data with Duplicates

apc1968

New Member

Excel Facts

mikerickson

MrExcel MVP

BengalMagic

Banned user

apc1968

New Member

apc1968

New Member

mikerickson

MrExcel MVP

apc1968

New Member

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock