Query source folder contains too much info, the query breaks before it can load. What are my options?

Clonk92

New Member
Joined
Jun 20, 2024
Messages
7
Office Version
  1. 365
A team I'm working on used to manually process data received from clients writing out their PDF documents into an ever growing range of data within Excel. We only have access to Office 365 programs and Power Query has been a real game changer, but we're all pretty new to it.

We've been implementing the use of Power Query to reduce or eliminate the manual data entry by having the query pull data from a source folder.
However, whenever new data is added to the source folder the query reruns on all the original data and then adds in the new data, the process was long but is basically lightspeed when compared to the old approach being used here.

I figured this was just the nature of things, but we've added a bunch of new PDFs yesterday and now the query won't load. After an hour of running it gives an error that states "something went wrong"

Is there an approach I could be using to address file sizes getting quite large? If I were to split the query up into several more manageable ones and then append/merge them would that be a solution?

We need to be able to review any individual record within the dataset, but we don't need to have every record displayed all the time. The data is essentially a general ledger of events and transactions.

Thanks in advance
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
What I do is create a history file and append it to the files in the active load folder.

Go back to the last working refresh. Complete the refresh
Export the entire table to a CSV using DAX studio
Archive all the files in the source folder somewhere so they won’t be loaded again
Add the new files into the source folder (should just be a few).
Load the CSV history file and append to the current query.

This approach still loads all the data, but because the CSV has already been transformed, this quite fast.

This DAX Studio article should help you if you haven’t used it
 
Upvote 0
What I do is create a history file and append it to the files in the active load folder.

Go back to the last working refresh. Complete the refresh
Export the entire table to a CSV using DAX studio
Archive all the files in the source folder somewhere so they won’t be loaded again
Add the new files into the source folder (should just be a few).
Load the CSV history file and append to the current query.

This approach still loads all the data, but because the CSV has already been transformed, this quite fast.

This DAX Studio article should help you if you haven’t used it
This looks awesome, but I'll have to see if were cleared to download Dax or not...
Also reading through parts of the article... I might need to get familiar with Power Pivot too. Thanks for the response, it's looking like this project is going to be a lot bigger than I had originally hoped lol.


Quick question... If I were to split the source data into several folders, and then make a query for each folder, saved as a connect and then I make one final query that appends those connections together... would that help? This problem seemed to only happen after we dumped all of our PDFs into the one folder. I'll definitely look into the DAX and history table you've mentioned, but as a bandaid solution... would something like what I'm describing have any benefit?
 
Upvote 0
Quick question... If I were to split the source data into several folders, and then make a query for each folder, saved as a connect and then I make one final query that appends those connections together... would that help?
I doubt it. The work is the same. You could also try the incremental refresh approach linked above
 
Upvote 0

Forum statistics

Threads
1,220,965
Messages
6,157,119
Members
451,398
Latest member
rjsteward

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top