PowerQuery - Retaining old data when refreshing

BBxcl

New Member
Joined
Sep 29, 2021
Messages
12
Office Version
  1. 365
Platform
  1. Windows
Hi,

I’ve got a folder that has got multiple excel files (50+) with each file containing a weekly summary. As it is a weekly summary, each week, a new excel file is added to the folder.

Currently, I’ve got a “master” excel file that is linked to that folder using PowerQuery and imports all the weekly files into one single table so that further analysis can be carried out.

The problem is that as the number of excel summary files are increasing each week, the process of refreshing data is becoming slower and slower as PowerQuery tries to load in all the files from scratch rather than just loading in the new weekly file.

I wanted to ask if there was a way to make PowerQuery load in just the previous week’s file whilst retaining all the old file data as well? Could the old weekly file’s data be stored in as values independent of the Query and could PowerQuery just load in the new file?

Any help would be greatly appreciated!
 

Excel Facts

Wildcard in VLOOKUP
Use =VLOOKUP("Apple*" to find apple, Apple, or applesauce
Hi,

Refer to link below, hope it will help.


Cheers!!!
 
Upvote 0
Solution
Hi,

I’ve got a folder that has got multiple excel files (50+) with each file containing a weekly summary. As it is a weekly summary, each week, a new excel file is added to the folder.

Currently, I’ve got a “master” excel file that is linked to that folder using PowerQuery and imports all the weekly files into one single table so that further analysis can be carried out.

The problem is that as the number of excel summary files are increasing each week, the process of refreshing data is becoming slower and slower as PowerQuery tries to load in all the files from scratch rather than just loading in the new weekly file.

I wanted to ask if there was a way to make PowerQuery load in just the previous week’s file whilst retaining all the old file data as well? Could the old weekly file’s data be stored in as values independent of the Query and could PowerQuery just load in the new file?

Any help would be greatly appreciated!
If your files use a logical naming scheme, you just need to have your query call on the specific file you want to pull in. Append it to your existing table. See the example below:

= Table.Combine({Excel.CurrentWorkbook(){[Name="QueryTable1"]}[Content], QueryTable1})

This will pull your old query results first via the Excel table (which is not the same as the Query table, but is a copy of it displayed in Excel), then will pull the updated version via the query refresh, and then merge the two back together in the same table. If it ever breaks, you can do another tedious long pull or make sure you have the correct if statement or error handler to deal with it.

It works because the query is dependent on the excel table before the table refresh. ;)
 
Upvote 0
Hi,

I’ve got a folder that has got multiple excel files (50+) with each file containing a weekly summary. As it is a weekly summary, each week, a new excel file is added to the folder.

Currently, I’ve got a “master” excel file that is linked to that folder using PowerQuery and imports all the weekly files into one single table so that further analysis can be carried out.

The problem is that as the number of excel summary files are increasing each week, the process of refreshing data is becoming slower and slower as PowerQuery tries to load in all the files from scratch rather than just loading in the new weekly file.

I wanted to ask if there was a way to make PowerQuery load in just the previous week’s file whilst retaining all the old file data as well? Could the old weekly file’s data be stored in as values independent of the Query and could PowerQuery just load in the new file?

Any help would be greatly appreciated!
I don't know if I was clear in my previous comment; so, I'll try again.

It sounds like you want an incremental update. IOW you want to keep the existing data already pulled, and add the latest dataset to it. My understanding is that PowerQuery doesn't currently support adding rows to existing PowerQuery-generated query tables. But you can get around this if you're using it in Excel (which I assume is what you were using to do PowerQuery). I don't know if you could do the same thing with PowerQuery in PowerBI.

PowerQuery in Excel runs queries that can be stored as a "connection only" to be referenced in other queries, etc. Or it can produce a table in Excel - typically named the same as a named range in Excel. But the resulting named range/table in Excel is not directly the query. It's a copy of the results of the query. Because of this, you can query the resulting named range/table in Excel before pulling the new results. So, the end query would be as follows:

1. Set up a query that pulls in only the table from the specific file you want from the folder. Use whatever logic you need to so that it only looks for the one file in the folder. You can even set it up as "connection only". And Call it Incremental_Query, for example.
Power Query:
//Sample:
let
    Source = Date.AddDays(DateTime.LocalNow(),-1),
    AddressSource =  Excel.Workbook(File.Contents("C:\Users\MyUserName\Documents\" & Text.From(Date.Year(Source)) & Text.From(Date.Month(Source)) & Text.From(Date.Day(Source)) & "_DAILYDOCUMENT.xlsx"), null, true)
in
    AddressSource
2. Change your old query that previously pulled all of the files' tables into your Excel table to now pull just from the resulting Excel table version of itself and combine that query with the Query you made in Step 1. For our example, your existing PowerQuery-results table would have been named Orig_PowerQuery_Results .

In pulling your query from the existing table, your results shouldn't change the table. It's essentially refreshing itself with its previously output data and overwriting the table with itself.


Your final query from Step 2 will look something like this:
Power Query:
= Table.Combine({Excel.CurrentWorkbook(){[Name="Orig_PowerQuery_Results"], Incremental_Query})
Again, this assumes that you're using PowerQuery in Excel and loading the query results to an Excel table.

Step 1 can also be tricky depending on your file-naming convention. You might have to use the Advanced Editor view to hand-create a query that changes the file name it's searching for.

But basically, pulling all of the prior results from the currently open document is way quicker than pulling from all of the documents in a folder. Then just add on your new results. It will be much faster.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,317
Messages
6,171,425
Members
452,402
Latest member
siduslevis

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top