How to filter multiple files for latest version of each file

Judoman

New Member
Joined
Nov 7, 2014
Messages
31
Office Version
  1. 2016
Platform
  1. Windows
I am using Power Query in Excel for the first time. Got query to work fine but have too much unneeded data.
Query combines 1000+ files. Each file has ~150 rows of data. Files are named using serial number (SN) and Revision letter for each file.
There are multiple files of different revision levels for each serial number. The file name is shown to left of each row of data.

I only need the rows of data for the latest/highest revision of each serial number. That would equal ~ 33 rows for each serial number. This would give me ~11,000 rows of data instead of the unneeded 540,000+ rows.

So question 1 is can this be done in power query?

If yes then question 2 is how is it done? Need specific details on how to do it as I am new at this in PQ.

Here is an example of 2 Revs for SN 133. I only need the rows that have Rev B to be displayed.

[TABLE="width: 716"]
<tbody>[TR]
[TD]Source.Name[/TD]
[TD]OPTION[/TD]
[TD]OPTION SELECTED[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev A.xls[/TD]
[TD]Aircraft Model[/TD]
[TD]F-2000[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev A.xls[/TD]
[TD]Winglets Installed[/TD]
[TD]Yes[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev A.xls[/TD]
[TD]ABOC Installed[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev A.xls[/TD]
[TD]BASC Installed[/TD]
[TD]700[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev A.xls[/TD]
[TD]Battery Voltage Drop Suppression[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev B.xls[/TD]
[TD]Aircraft Model[/TD]
[TD]F-2000[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev B.xls[/TD]
[TD]Winglets Installed[/TD]
[TD]Yes[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev B.xls[/TD]
[TD]ABOC Installed[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev B.xls[/TD]
[TD]BASC Installed[/TD]
[TD]700[/TD]
[/TR]
[TR]
[TD]APM Form for F2X DN133 SN133 Rev B.xls[/TD]
[TD]Battery Voltage Drop Suppression[/TD]
[TD]No[/TD]
[/TR]
</tbody>[/TABLE]
 

Excel Facts

VLOOKUP to Left?
Use =VLOOKUP(A2,CHOOSE({1,2},$Z$1:$Z$99,$Y$1:$Y$99),2,False) to lookup Y values to left of Z values.
Here is one way, it assume the serial number is 3 items from the end

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
    serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
    source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
    serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
    serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
    columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
    index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
    index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
    next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
    columns.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
    next.Extpand = Table.ExpandTableColumn(columns.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
    tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
    tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
    allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
    table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
    table.Remove
 
Last edited:
Upvote 0
When your code was used I got:
Expression.SyntaxError: Token Eof expected.

In list of workbook queries I have a warning of Download did not complete.

Here is my original code that shows all revs of each line item:

let
Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type text}, {"Column2", type text}, {"Column3", type any}, {"Column4", type text}, {"Column5", type text}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", Int64.Type}, {"Column10", type text}, {"Column11", type any}, {"Column12", type any}, {"Column13", type any}, {"Column14", type any}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column3"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "OPTION"}, {"Column5", "OPTION SELECTED"}})
in
#"Renamed Columns"
 
Upvote 0
I didn't get that error, but I did get a duplicate step-name.

This version works for me but it is just one file, not a folder like yours, so it needs modifying.

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
    serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
    source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
    serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
    serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
    columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
    index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
    index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
    next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
    columns2.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
    next.Extpand = Table.ExpandTableColumn(columns2.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
    tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
    tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
    allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
    table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
    table.Remove
 
Upvote 0
Now I can see the steps under Applied Steps. But get Expression.Error: We couldn't find an Excel table named 'Table3'.Details:
Table3

What is Table 3? looks like it would be a file name. I put in name of my file but it did not work.
 
Upvote 0
As I said, my code is for one file, and that file is current workbook. I am querying a table called Table3, you will need to change it to get all of your files in first before you use the transforming code I give.
 
Upvote 0
As I'm new to power query I'm not sure how to get all my files in. I have an excel worksheet that contains all the files. Worksheet has total of 14 columns but I only need to use the 3 I made sample from. Do I enter the column (table) names? You already have the names of my 3 columns in what you sent me.
 
Upvote 0
Doesn't this part of the code that you supplied do that, grabbing all the files in the folder? I couldn't test it as I don't have any files.
Code:
let
    Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
    #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),
Append my code starting at columns.TYpe, change my first row refence to the previous step, and tweak it all.
 
Upvote 0
Sorry for not replying sooner. Been out sick a few days. Looks like I should add your code to mine. But I don't know where to add it. Nor do I know exactly what/how to change your first row to previous step - -change to what? Then how/what do I tweak it all? Below is what I currently have in Advanced Editor using your code and mine.

let
Source = Folder.Files("J:\Dept 240\APM\EASy II\Requests\Processed Requests\F2X"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from F2X", each #"Transform File from F2X"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from F2X"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from F2X", Table.ColumnNames(#"Transform File from F2X"(#"Sample File"))),






#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type text}, {"Column2", type text}, {"Column3", type any}, {"Column4", type text}, {"Column5", type text}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", Int64.Type}, {"Column10", type text}, {"Column11", type any}, {"Column12", type any}, {"Column13", type any}, {"Column14", type any}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column3"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "OPTION"}, {"Column5", "OPTION SELECTED"}})
in
#"Renamed Columns"


let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
columns.Type = Table.TransformColumnTypes(Source,{{"Source.Name", type text}, {"OPTION", type text}, {"OPTION SELECTED", type any}}),
serial.Singles = Table.Distinct(columns.Type, {"Source.Name"}),
source.Split = Table.AddColumn(serial.Singles, "Source.Split", each Text.Split([Source.Name]," ")),
serial.Exttract = Table.AddColumn(source.Split, "Serial", each [Source.Split]{List.Count([Source.Split])-3}),
serial.rev.Extract = Table.AddColumn(serial.Exttract, "SerialRev", each [Source.Split]{List.Count([Source.Split])-3}&List.Last([Source.Split])),
columns.Stripdown = Table.RemoveColumns(serial.rev.Extract,{"OPTION", "OPTION SELECTED", "Source.Split"}),
index0.Add = Table.AddIndexColumn(columns.Stripdown, "Index", 0, 1),
index1.Add = Table.AddIndexColumn(index0.Add, "Index.1", 1, 1),
next.Join = Table.NestedJoin(index1.Add,{"Index.1"},index1.Add,{"Index"},"Added Index1",JoinKind.LeftOuter),
columns2.Stripdown = Table.RemoveColumns(next.Join,{"SerialRev", "Index", "Index.1"}),
next.Extpand = Table.ExpandTableColumn(columns2.Stripdown, "Added Index1", {"Serial"}, {"Next.Serial"}),
tokeep.Test = Table.AddColumn(next.Extpand, "ToKeep", each if [Serial]<>[Next.Serial] then true else false),
tokeep.Filter = Table.SelectRows(tokeep.Test, each ([ToKeep] = true)),
allrows.Retrieve = Table.NestedJoin(columns.Type,{"Source.Name"},tokeep.Filter,{"Source.Name"},"Filtered Rows",JoinKind.Inner),
table.Remove = Table.RemoveColumns(allrows.Retrieve,{"Filtered Rows"})
in
table.Remove
 
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,194
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top