Good morning all!
I'm relatively new to Power Query and I'm having trouble removing the duplicates in one column after sorting rows. I've searched all over and found a few things that seem to indicate that Power Query uses the Source Order to determine which file to keep. Since I am importing multiple text files from a directory, I can't control the sort order and PowerQuery is bringing in the oldest data first. Because of this, regardless of how I sort, the oldest data (See: First source imported) is being kept.
Now, I have seen some references to using the Table Buffer on those sites but when I try to apply it to my table I get an error. Here is the M-code for the table I am importing:
Here are the changes to add the Table Buffer:
Could anyone shed some light on what I am doing incorrectly? All help and insight is appreciated!
I'm relatively new to Power Query and I'm having trouble removing the duplicates in one column after sorting rows. I've searched all over and found a few things that seem to indicate that Power Query uses the Source Order to determine which file to keep. Since I am importing multiple text files from a directory, I can't control the sort order and PowerQuery is bringing in the oldest data first. Because of this, regardless of how I sort, the oldest data (See: First source imported) is being kept.
Now, I have seen some references to using the Table Buffer on those sites but when I try to apply it to my table I get an error. Here is the M-code for the table I am importing:
Code:
let
Source = Folder.Files("C:\1097-1098 Applicant Tracking Tool\Version 3\datasets to load\1098Data"),
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform Binary from 1098Data", each #"Transform Binary from 1098Data"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform Binary from 1098Data"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform Binary from 1098Data", Table.ColumnNames(#"Removed Other Columns1"[#"Transform Binary from 1098Data"]{0})),
#"Split Column by Position" = Table.SplitColumn(#"Expanded Table Column1","Source.Name",Splitter.SplitTextByPositions({0, 10}, false),{"Source.Name.1", "Source.Name.2"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Split Column by Position","Source.Name.2",Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv),{"Source.Name.2.1", "Source.Name.2.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"Source.Name.2.1", "FileDate"}}),
#"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"SEPARATION_DATE"}),
#"Removed Columns" = Table.RemoveColumns(#"Removed Columns1",{"Source.Name.1", "Source.Name.2.2"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Req-App", each [REQUISITION] & " - " & [APPLICATION]),
#"Sorted Rows" = Table.Sort(#"Added Custom",{{"Req-App", Order.Ascending}, {"FileDate", Order.Descending}}),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Req-App"})
in
#"Removed Duplicates"
Here are the changes to add the Table Buffer:
Code:
let
Source = Folder.Files("C:\1097-1098 Applicant Tracking Tool\Version 3\datasets to load\1098Data"),
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform Binary from 1098Data", each #"Transform Binary from 1098Data"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform Binary from 1098Data"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform Binary from 1098Data", Table.ColumnNames(#"Removed Other Columns1"[#"Transform Binary from 1098Data"]{0})),
#"Split Column by Position" = Table.SplitColumn(#"Expanded Table Column1","Source.Name",Splitter.SplitTextByPositions({0, 10}, false),{"Source.Name.1", "Source.Name.2"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Split Column by Position","Source.Name.2",Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv),{"Source.Name.2.1", "Source.Name.2.2"}),
#"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"Source.Name.2.1", "FileDate"}}),
#"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"SEPARATION_DATE"}),
#"Removed Columns" = Table.RemoveColumns(#"Removed Columns1",{"Source.Name.1", "Source.Name.2.2"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Req-App", each [REQUISITION] & " - " & [APPLICATION]),
#"Sorted Rows" = Table.Sort(#"Added Custom",{{"Req-App", Order.Ascending}, {"FileDate", Order.Descending}}),
#"Buffered" = Table.Buffer( #”Sorted Rows”),
#"Removed Duplicates" = Table.Distinct(#"Buffered", {"Req-App"})
in
#"Removed Duplicates"
Could anyone shed some light on what I am doing incorrectly? All help and insight is appreciated!