Extract differences from different workbooks

d365b

New Member
Joined
Dec 26, 2021
Messages
17
Office Version
  1. 365
Platform
  1. Windows
Hi everyone,

I hope you are well. I have a spreadsheet that evolves every week. I used Power Query and imported the workbooks produced the past weeks, basically copying the query and simply modifying the source. Is there a way to remove alla common data (what remains unchanged basically) and output only the differences (what’s new)?

Many thanks :)
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
Maybe. It would depend on how your data is structured and how the changes manifest.
 
Upvote 0
Maybe. It would depend on how your data is structured and how the changes manifest.
Hey JGordon. Each workbook, once filtered has 3500+ rows. Most of the rows remain unaltered. The changes could manifest as a new value in a row or a new row all together.
 
Upvote 0
Power Query:
let
    tbl1 = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    tbl2 = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    rows = Table.ToRecords(tbl1),
    tbl3 = Table.SelectRows(tbl2, each not List.Contains(rows,_))
in
    tbl3

Book1
ABCDEFGHIJKLMNO
1
2Table1Table2Query Output
3
4IDColumn2Column3Column4IDColumn2Column3Column4IDColumn2Column3Column4
5121231112123116B3420
6228126228126812C16
7322213322213101718D
84311035431103511GEH
953024853024812FDB
1061534206B3420
117591475914
128124016812C16
139763897638
1410171833101718D
1511GEH
1612FDB
17
Sheet1
 
Upvote 0
Power Query:
let
    tbl1 = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    tbl2 = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    rows = Table.ToRecords(tbl1),
    tbl3 = Table.SelectRows(tbl2, each not List.Contains(rows,_))
in
    tbl3

Book1
ABCDEFGHIJKLMNO
1
2Table1Table2Query Output
3
4IDColumn2Column3Column4IDColumn2Column3Column4IDColumn2Column3Column4
5121231112123116B3420
6228126228126812C16
7322213322213101718D
84311035431103511GEH
953024853024812FDB
1061534206B3420
117591475914
128124016812C16
139763897638
1410171833101718D
1511GEH
1612FDB
17
Sheet1
Hey JGordon, thank you for your suggestion! Considering the size of such tables, I do not have them as tables on the current workbook. Can the formula you shared be modified accordingly?

Thanks
 
Upvote 0
Just change the tbl1 and tbl2 steps to retrieve your tables from their locations.
 
Upvote 0
Just change the tbl1 and tbl2 steps to retrieve your tables from their locations.
Right JGordon,

I have got this far
Power Query:
let
    lastWeek = #"2022_01",
    thisWeek = #"2022_02",
    trace = Table.Column(Table.RemoveMatchingRows(thisWeek, Table.ToRecords(lastWeek)), "ProdNr")
in
    trace

So now I have a list of Product Numbers that have changed but now I would like to combine a table looking up the ProdNr in both lastWeek and thisWeek tables. I tried this for a single table but, despite not giving me an error, it keeps processing and never returns a table
Power Query:
tbl = Table.SelectRows(lastWeek , each List.ContainsAny( {[ProdNr]}, trace) )

Any idea what am I doing wrong?
 
Upvote 0
It may just be an inefficient way of going about it (in fact I'm pretty sure it is). If you can run it on a much smaller version of your datasets (say ~50 rows) and it works in a few seconds, then it is just a very cpu intensive algorithm for large data sets. Would need to rethink how to make it more efficient.
 
Upvote 0
The code I provided in post #4 is very slow for large datasets. I tested the code below on a test sample with 3500 rows and it was almost instantaneous.

Power Query:
let
    tbl1 = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    tbl2 = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    tcn2 = Table.ColumnNames(tbl2),
    renames = List.Transform(Table.ColumnNames(tbl1), each {_, _ & "1"}),
    tbl1r = Table.RenameColumns(tbl1, renames),
    tbl3 = Table.Join(tbl1r,Table.ColumnNames(tbl1r), tbl2, tcn2, JoinKind.RightAnti ),
    Result = Table.RemoveColumns(tbl3, Table.ColumnNames(tbl1r))
in
    Result
 
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,196
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top