Power Query is killing my spreadsheet

jimrward

Well-known Member
Joined
Feb 24, 2003
Messages
1,896
Office Version
  1. 2021
  2. 2019
  3. 2016
  4. 2013
  5. 2011
  6. 2010
  7. 2007
  8. 2003 or older
Platform
  1. Windows
Excel 2021 updated to latest versions, 32gb ram
125k row csv data source with 30 columns, I add an index column
using the above as a reference I split off data in to 3 categories to give me smaller datasets to work with
I then have around 20 queries that use the 3 subsets to create my required output into excel tables
each of the queries loads the table into a buffer, then adds conditional columns to drive aggregation later in the query finishing off with a group and sort in most cases

i have reached a point twice now where excel goes into not responding mode keels over and dies, and I lose all my changes
i have even repaired the broken spreadsheet manually extracted the queries into notepad started with a blank spreadsheet and built another model manually and slowly only for it to happen again

any pointers from the experts, my next thoughts are to have a model for each of the 3 categories
could it be that with a buffer in each query it is swamping memory and it is not very good at saving itself so it takes the easy option and dies
 

Excel Facts

Excel Joke
Why can't spreadsheets drive cars? They crash too often!
could it be that with a buffer in each query it is swamping memory and it is not very good at saving itself so it takes the easy option and dies
Yes, I think this is the case. And actually, it is the only option since PQ doesn't care about the memory, so it tries to do what it is being told - to buffer data and save it in the memory in this sample. However, Excel simply dies when there is no way out (it is terminated by the system actually).

When you buffer 125k row table in 20 queries, it is really harm more than help. Especially if the query is not actually "reusing" the source table(s).
"Reusing" here is not actually "working on the same table" but utilizing the entire table as a parameter in another function repeatedly, for example, as a parameter to a function that you call for each row in another table. For example, adding a new column that checks a column value in another table or list. You wouldn't want to load that same "another" table or list for each row over an over, instead, you would buffer the table and use it from the buffer in the iteration.

So, adding columns to a table for aggregation on the same data set is not "reusing" the base table.

I would try removing the buffering to avoid unnecessary memory usage in this case.
 
Upvote 0

Forum statistics

Threads
1,223,898
Messages
6,175,272
Members
452,628
Latest member
dd2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top