Dynamically insert new columns in Excel Power Query and sort column pairs horizontally

nik_l

New Member
Joined
Feb 24, 2019
Messages
2
I'm stumped with a requirement and can't seem to get past it (I've tried to scour the internet, and I'm unable to fix it myself, given my lesser experience with Power Query). I could accomplish this using VBA, but the number of records, that need to be processed are shy of half a million records! :eeek:

So here's the simplified task at hand. There are two columns - "Substring" and "Main String", as shown below [Input]:

[TABLE="width: 235"]
<tbody>[TR]
[TD]Substring[/TD]
[TD]Main String[/TD]
[/TR]
[TR]
[TD]AB|CDE[/TD]
[TD]ABCDEF[/TD]
[/TR]
[TR]
[TD]ABC|DE|GH[/TD]
[TD]ABCDEFGHI[/TD]
[/TR]
[TR]
[TD]A|BC|X|YZ[/TD]
[TD]ABCDYZ[/TD]
[/TR]
</tbody>[/TABLE]
The Substring column has mini substrings separated by a delimiter (|). These Mini Substrings may or may not be found within the Main string. The output at the end should look like this [Expected Output]:

[TABLE="width: 764"]
<tbody>[TR]
[TD]Substring[/TD]
[TD]Main String[/TD]
[TD]MSS1[/TD]
[TD]ML1[/TD]
[TD]MSS2[/TD]
[TD]ML2[/TD]
[TD]MSS3[/TD]
[TD]ML3[/TD]
[TD]MSS4[/TD]
[TD]ML4[/TD]
[/TR]
[TR]
[TD]AB|CDE[/TD]
[TD]ABCDEF[/TD]
[TD]CDE[/TD]
[TD]3[/TD]
[TD]AB[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ABC|DE|GH[/TD]
[TD]ABCDEFGHI[/TD]
[TD]ABC[/TD]
[TD]3[/TD]
[TD]DE[/TD]
[TD]2[/TD]
[TD]GH[/TD]
[TD]2[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]A|BC|X|YZ[/TD]
[TD]ABCDYZ[/TD]
[TD]BC[/TD]
[TD]2[/TD]
[TD]YZ[/TD]
[TD]2[/TD]
[TD]A[/TD]
[TD]1[/TD]
[TD]X[/TD]
[TD]0[/TD]
[/TR]
</tbody>[/TABLE]
(Note: MSS and ML in the column headers stand for Mini Substring and Match Length respectively)

The Mini Substrings can vary in number, so the row that has the maximum number of Mini Substrings would define the number of MSS and ML columns. In this case, the third row has four of them, so there were four MSS and ML columns each.

Not just that, the sequence of these Mini Substrings has to be such that those with highest matching lengths should be placed on the leftmost MSS and ML pair of columns. And then those with the least matching lengths should go on the rightmost pair of MSS and ML columns. And everything in between should progress in that order.

I got until the stage where the current output looks like this [Interim Output]:

[TABLE="width: 421"]
<tbody>[TR]
[TD]Substring[/TD]
[TD]Main String[/TD]
[TD]MSS1[/TD]
[TD]MSS2[/TD]
[TD]MSS3[/TD]
[TD]MSS4[/TD]
[/TR]
[TR]
[TD]AB|CDE[/TD]
[TD]ABCDEF[/TD]
[TD]AB[/TD]
[TD]CDE[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]ABC|DE|GH[/TD]
[TD]ABCDEFGHI[/TD]
[TD]ABC[/TD]
[TD]DE[/TD]
[TD]GH[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]A|BC|X|YZ[/TD]
[TD]ABCDYZ[/TD]
[TD]A[/TD]
[TD]BC[/TD]
[TD]X[/TD]
[TD]YZ[/TD]
[/TR]
</tbody>[/TABLE]
Dynamically inserting the ML columns just after the respective MSS columns is the challenge I'm unable to get past by. And I'm sure sorting them horizontally by the rank of matching lengths (i.e. lengths of those mini substrings that found a match in the main string) would also be another challenge. Here's the Power Query that I could come up with that generated the [Interim Output].

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Sub String", type text}, {"Main String", type text}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Sub String", "For Split"),
    #"Added Custom" = Table.AddColumn(#"Duplicated Column", "Split Count", each List.Count(Text.Split([Sub String],"|"))),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Custom",{"Split Count", "Sub String", "Main String", "For Split"}),
    #"Max No Of Splittable Columns" = List.Max(#"Reordered Columns"[Split Count]),
    #"List Of MSS Columns" = List.Transform({1..#"Max No Of Splittable Columns"}, each "MSS"&Text.From(_)),
    #"Split Columns By Delimiter" = Table.SplitColumn(#"Reordered Columns","For Split",Splitter.SplitTextByDelimiter("|"), #"List Of MSS Columns"),
    #"Removed Columns" = Table.RemoveColumns(#"Split Columns By Delimiter",{"Split Count"})
in
    #"Removed Columns"

Could you please guide me out of this? You can easily tell I am new to Power Query, so I very much appreciate any help that I can get. Thank you!

Note: I've also posted this on a different forum, but awaiting a response there.
 

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
Here is the Mcode for adding columns and inserting the length of each column and then arranging their location. I am unable to determine how to sort horizontally as you desire

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Substring", type text}, {"Main String", type text}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Substring", "Substring - Copy"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Duplicated Column", "Substring - Copy", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Substring - Copy.1", "Substring - Copy.2", "Substring - Copy.3", "Substring - Copy.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Substring - Copy.1", type text}, {"Substring - Copy.2", type text}, {"Substring - Copy.3", type text}, {"Substring - Copy.4", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each Text.Length([#"Substring - Copy.1"])),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Custom",{"Substring", "Main String", "Substring - Copy.1", "Custom", "Substring - Copy.2", "Substring - Copy.3", "Substring - Copy.4"}),
    #"Added Custom1" = Table.AddColumn(#"Reordered Columns", "Custom.1", each Text.Length([#"Substring - Copy.2"])),
    #"Reordered Columns1" = Table.ReorderColumns(#"Added Custom1",{"Substring", "Main String", "Substring - Copy.1", "Custom", "Substring - Copy.2", "Custom.1", "Substring - Copy.3", "Substring - Copy.4"}),
    #"Added Custom2" = Table.AddColumn(#"Reordered Columns1", "Custom.2", each Text.Length([#"Substring - Copy.3"])),
    #"Reordered Columns2" = Table.ReorderColumns(#"Added Custom2",{"Substring", "Main String", "Substring - Copy.1", "Custom", "Substring - Copy.2", "Custom.1", "Substring - Copy.3", "Custom.2", "Substring - Copy.4"}),
    #"Added Custom3" = Table.AddColumn(#"Reordered Columns2", "Custom.3", each Text.Length([#"Substring - Copy.4"]))
in
    #"Added Custom3"

To determine the length of a column use
Insert Custom column
Type = Text.Length([Column Name])
 
Upvote 0
Hi Alan,


Thanks for your reply!


This would have been good enough had the Substring column had a maximum of four predefined mini substrings. Unfortunately, that isn't the case. They can be 10, 15, or even 20. In essence, the Substring column can be split into a varying number of columns depending on the incoming data.


For instance, one of the biggest substrings can look like this: AA|AB|AC|AD|AE|AF|AG|AH|AI|AJ


Another instance can be that the biggest substring is just this: AA|AB


Sorry if I couldn't make this clear in my original post.


How do I handle such dynamism?

Note: I've just posted a reply and I don't see it reflecting here, so I'm repeating my reply
 
Upvote 0

Forum statistics

Threads
1,225,741
Messages
6,186,761
Members
453,370
Latest member
juliewar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top