Huge Dataset - Looking for a SUMIFS alternative

DungeonMaster

New Member
Joined
Oct 21, 2015
Messages
4
Hello everyone, this is my first post here.

I am using Excel 2013 on Windows 8.1.

I have posted my issues on Reddit before, without results unfortunately.

I am currently manipulating a very large dataset (500k rows approximatly) that I need to reorder before analyzing the data.

This picture shows what I want to do with my data :

Columns A, B and C contain indicators (US county number, Year and age group). What I want to do is to sum, for each county, year and age group, values present in columns D to I To new columns (N and onwards in the picture I attached). In other words, I want to sum Values in Column D if their indicators in A, B and C are identical, and then transfer the result to another column.

I already had to do the same task last year (with a way smaller datafile), and found that the SUMIFS function worked perfectly for this kind of task (by creating some "helping columns", in K and L on the image I provided, and then using SUMIFS).

The issue is that using SUMIFS to fill the entire table is a daunting task for my computer (even with a 8-core processor). It takes ages to simply fill one column when it doesn't simply crash.

I was then wondering if anyone could guide me with a workaround that would be less calculation-heavy. I absolutely don't know how to code in VBA (except for copying/pasting macros), but I figure that the macro wouldn't be that hard to write. What I need would be :

- Sum D2 to D4
- Paste result in N2
- Sum D21 to D23
- Paste result in N3
etc

Thank you very much for your help!
 
With formulas it's hard to measure correctly. Anyhow I gave it a try with full test data (so source data in columns A-I as in the example).
2 Steps with testing:
1. Copy down the formulas in J2-K2 thru row 500,000.
2. Copy down the formulas in N2-X2 thru row 166,668 (my test data had groups of 3 rows each).

Measured on my analogue watch, it took about 8+12=20 seconds, both on:
my PC, Windows7 Intel Celeron CPU 530GT 2.00 GHZ - 4 GB - Excel 2007
my laptop, Windows 8.1 Intel Core i7-4510U 2.00 GHZ 12 GB - Excel 2013
 
Upvote 0

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
With formulas it's hard to measure correctly. ...
With formulas a macro can easily be used to fill in and copy down the formulas, so as to get more accurate timing. Something like the testdata code in post#8 does.

Of course the end user needn't do it that way. Should do whatever they're happiest with.
However, thanks for your info. Now, how good are pivot tables for this job ....?
 
Upvote 0
the test dataset is equally available to everyone including yourself, so if it's very simple could not OP be shown how effective it is for this problem?

Not on a tablet...
 
Upvote 0
FWIW it took about 4-5 seconds to create the pivot table manually (since you just need to check each field in the field list in order), and it takes 1-2 seconds to refresh. Plus it can be easily rearranged.
 
Upvote 0
FWIW it took about 4-5 seconds to create the pivot table manually (since you just need to check each field in the field list in order), and it takes 1-2 seconds to refresh. Plus it can be easily rearranged.
OK. Thanks for the info.
Seems that potential users have adequate options.
 
Upvote 0
FWIW it took about 4-5 seconds to create the pivot table manually (since you just need to check each field in the field list in order), and it takes 1-2 seconds to refresh. Plus it can be easily rearranged.
But don't you have to do unique concatenation of the first 3 columns to get the result for this problem by pivot table?
It's not obvious to me that the pivot table does this. Or is there some way around it?
 
Upvote 0
You just put the first three columns as row fields. That gives you every combination of the three present in the data.
 
Upvote 0
Hey everyone. I didn't get notified of your responses, glad to see my issues are interesting!

If you are willing to see how these solutions work, I can send you my datafiles.

I finally organized my data neatly, which was awfully long (I had to use SAS, STATA and SPSS for some data extraction).

RoryA, your approach with pivot tables seems really interesting (somebody already mentionned that on Reddit). I have absolutely no knowledge of it, but I guess now is a good time to research the matter.
 
Upvote 0
....RoryA, your approach with pivot tables seems really interesting (somebody already mentionned that on Reddit). I have absolutely no knowledge of it, but I guess now is a good time to research the matter.

Pivot Tables are my Go To solution for a lot of my work, even when they don't seem the obvious solution. You will find very good ROI learning Pivot Tables.
 
Upvote 0

Forum statistics

Threads
1,223,234
Messages
6,170,891
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top