Cumulative selective sum using sumproduct - how to speed up?

cesarez

New Member
Joined
May 1, 2012
Messages
9
Hi all,
I am trying to do the following:
- I have a list of items (say cars), one on each line, and attached there is the name of the owner and the value of the car
- I need to pick up the first n most valuable cars for each owner up to a value of x
- every car that makes me exceed the total value of x has not to be picked up

My attempt:
- List the cars from the most expensive to the cheapest
- For each line sum the value of the previous cars belonging to the same owner
- So you have a cumulative sum *for each owner*
- In a second column, I flag the lines in which the value exceeds x

Seems simple but excel experiences huge difficulties in calculating the cumulative sum, probably because I am working on more than 20k lines.
It takes ages to calculate and I never managed to paste values to save my work.

I tried to obtain this result using a sumproduct [=SUMPRODUCT((D13=D13:$D$13)*(AB13:$AB$13))] where column D is the name of the owner and column AB is the value of the car.
I tried to do the same using sumif, but seems even worse.

I am running excel 64bit on a i7 with 8gb RAM, so the HW/resources should not be the problem.
Is there a smarter way to do this without using VBA?

Hope you can help me because this issue is really blocking me....
Thanks a lot
C
 

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Xls 2010 64 bit.

I cannot paste some lines but each client is identified by a 17 character code and values are in the range of 500k - 1mln, so 6-7 characters.

There are around 20k lines and 30-40 columns. No formulas are left (i padted values to speed up calcs).

Is this helpful ?
 
Upvote 0
Xls 2010 64 bit.

I cannot paste some lines but each client is identified by a 17 character code and values are in the range of 500k - 1mln, so 6-7 characters.

There are around 20k lines and 30-40 columns. No formulas are left (i padted values to speed up calcs).

Is this helpful ?

Try to post a very small sample along with the desired results.
 
Upvote 0
Below you can see an example of the desired result: lines are ordered by value, and in the fourth column you can see the cumulative sum of the previous lines for each single customer. So for example in the 6th line you can see the sum of all the "peter lines": 1st and 6th.

The formula used to do so is: =SUMPRODUCT((B9=$B$2:B9)*($C$2:C9)) [in the last line, obviously relative references adapt in each line], i.e. given the ID number of the current line, sum all the precedent values in the column Value having the same ID.

Consider that xls took 30-40 seconds even to calculate these 8 lines, so there is probably something deeply inefficient in the formula....

I don't know how relevant it is, but the real ID is a 17 character text value, the real Value is a number ranging from 20-30k to 2mln. The lines are around 20k and the file is around 20Mb. I use .xlsb to keep the file size smaller.







<table width="406" border="0" cellpadding="0" cellspacing="0"><col style="width: 48pt;" width="64"> <col style="width: 73pt;" width="97"> <col style="width: 48pt;" width="64"> <col style="width: 136pt;" width="181"> <tbody><tr style="height: 15pt;" height="20"> <td class="xl65" style="height: 15pt; width: 48pt;" width="64" height="20">Name</td> <td class="xl65" style="border-left: medium none; width: 73pt;" width="97">Id number</td> <td class="xl65" style="border-left: medium none; width: 48pt;" width="64">Value</td> <td class="xl65" style="border-left: medium none; width: 136pt;" width="181">Cumulative value per client</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Peter</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">4543</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1100</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1100</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Tim</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">2123</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1000</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Tim</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">2123</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1000</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">2000</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">John</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1243</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">400</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">400</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Mark</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">3342</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">300</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">300</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Peter</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">4543</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">250</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1350</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">Mark</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">3342</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">200</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">500</td> </tr> <tr style="height: 15pt;" height="20"> <td class="xl66" style="height: 15pt; border-top: medium none;" height="20">John</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">1243</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">50</td> <td class="xl66" style="border-top: medium none; border-left: medium none;" align="right">450</td> </tr> </tbody></table>
 
Last edited:
Upvote 0
Below you can see an example of the desired result: lines are ordered by value, and in the fourth column you can see the cumulative sum of the previous lines for each single customer. So for example in the 6th line you can see the sum of all the "peter lines": 1st and 6th.

The formula used to do so is: =SUMPRODUCT((B9=$B$2:B9)*($C$2:C9)) [in the last line, obviously relative references adapt in each line], i.e. given the ID number of the current line, sum all the precedent values in the column Value having the same ID.

Consider that xls took 30-40 seconds even to calculate these 8 lines, so there is probably something deeply inefficient in the formula....

I don't know how relevant it is, but the real ID is a 17 character text value, the real Value is a number ranging from 20-30k to 2mln. The lines are around 20k and the file is around 20Mb. I use .xlsb to keep the file size smaller.







<TABLE border=0 cellSpacing=0 cellPadding=0 width=406><COLGROUP><COL style="WIDTH: 48pt" width=64><COL style="WIDTH: 73pt" width=97><COL style="WIDTH: 48pt" width=64><COL style="WIDTH: 136pt" width=181><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="WIDTH: 48pt; HEIGHT: 15pt" class=xl65 height=20 width=64>Name</TD><TD style="BORDER-LEFT: medium none; WIDTH: 73pt" class=xl65 width=97>Id number</TD><TD style="BORDER-LEFT: medium none; WIDTH: 48pt" class=xl65 width=64>Value</TD><TD style="BORDER-LEFT: medium none; WIDTH: 136pt" class=xl65 width=181>Cumulative value per client</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Peter</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>4543</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1100</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1100</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Tim</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>2123</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1000</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1000</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Tim</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>2123</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1000</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>2000</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>John</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1243</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>400</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>400</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Mark</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>3342</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>300</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>300</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Peter</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>4543</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>250</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1350</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>Mark</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>3342</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>200</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>500</TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="HEIGHT: 15pt; BORDER-TOP: medium none" class=xl66 height=20>John</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>1243</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>50</TD><TD style="BORDER-LEFT: medium none; BORDER-TOP: medium none" class=xl66 align=right>450</TD></TR></TBODY></TABLE>

SUMIF(($B$2:B2,B2,$C$2:C2)

will be faster than

SUMPRODUCT((B2=$B$2:B2)*($C$2:C2))

Do you also have formulas with OFFSET, INDIRECT, VLOOKUP with match-type set to 0, and so on?

Also, just curious: What purpose do the cumulative figures serve?
 
Last edited:
Upvote 0
Hi Aladin,
sumif seems to work. Actually I alway thought sumproduct to be faster than other logical operators....

There are vlookups in the model, but i don't think they are generating the issue since the problem still exist even pasting all the values and eliminating almost all teh formulas.
 
Upvote 0
Hi Aladin,
sumif seems to work. Actually I alway thought sumproduct to be faster than other logical operators....

There are vlookups in the model, but i don't think they are generating the issue since the problem still exist even pasting all the values and eliminating almost all teh formulas.

20,000 formulas are a lot, but it's strange that they alone can cause such a slowdown...
 
Upvote 0
Hi Aladin,
sumif seems to work. Actually I alway thought sumproduct to be faster than other logical operators....

There are vlookups in the model, but i don't think they are generating the issue since the problem still exist even pasting all the values and eliminating almost all teh formulas.
 
Upvote 0

Forum statistics

Threads
1,223,240
Messages
6,170,951
Members
452,368
Latest member
jayp2104

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top