Working with large spreadsheet of Data

ShogunStealth

New Member
Joined
Nov 6, 2021
Messages
23
Office Version
  1. 2019
Platform
  1. Windows
I am working with a large amount of data, over 600K+ and up to Column K. The raw data is series of columns which looks at SharePoint websites to look at numerous metrics to determine things like;
  • For each website how what are the file types for each website and the entire spreadsheet; eg .docx, .xlss, .pdf, etc and many others up to 150 different one assuming anything after the last "." is a file type.
  • How many of each file types are there for for each website and entire sheet eg docx 7, pdf 13 etc
  • What is the capacity of each of those files types for each website and the entire spreadsheet. Docx 140 Mb, pdf 2,123 Mb etc
  • When was the last time any file in that website updated - site 1 - 2 years 5 months 16 days
  • The age of each file type across 6 date ranges 0 -14 days, 15 - 30 days, 31- 90 days, 91 days - 180 days , 181 - 365 days, more than 365 days
There are a several of others.

Largely it is a series summation mainly several complex countifs, sumifs with fill-down, fill-rights type statements and numerous other formatting features. I don't need help with any of the formulas because it all works. The issue or advise I am seeking is how to pause the computation into blocks of effort as processing power is horrendous, especially when I don't filter the list the number of websites down to reduce the total of 600K+ line items. My i7-10510U CPU @ 1.80GHz[Cores 4] ran at 100% CPU for over 2 hours with completing the processing. It was interesting that memory never went much over 50% or 8 Gb

Are there techniques or strategies to either;
  • throttle the CPU from maxing out to 100% to stop damage to my laptop hardware
  • when some of the tables are 250 (x) and 150 (y) to stage or phase the fill-down or fill-right computations
  • or apply something similar to sumifs or countifs
I am open for options.

This is a copy the heading in the raw data

FilenameList NameLocationFile ExtensionCurrent Version Size KBTotal version Size KBSite addressCreated ByLast Modified ByLast Modified DateCreated Date
 

Excel Facts

Wildcard in VLOOKUP
Use =VLOOKUP("Apple*" to find apple, Apple, or applesauce
I have found the best way to handle really large datasets is to get rid of all of the equations and substitute them with a few vba macros that you can control when they run. Obviously the way to get the macros to run fast is to load all the data into one or more variant arrays as necessary and do the calculations in memory. I use dictionaries for doing index match and the if statement and a loop for countifs and average ifs etc. Effectively this does "pause" the calculation into blocks!!
 
Upvote 0

Forum statistics

Threads
1,223,896
Messages
6,175,259
Members
452,626
Latest member
huntinghunter

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top