Power Query to split a column containing varying numbers of items and organize results under matching columns

KRice

Well-known Member
Joined
Dec 9, 2003
Messages
3,080
Office Version
  1. 365
Platform
  1. Windows
I have recently begun using Power Query to clean up a data table that was obtained by transforming a text file from a financial business system. Each row of the table represents a payment for goods or services. Unfortunately, key transaction details are combined under the Description column. Those transaction details may include the Vendor Name (N), Vendor Number (V#), Purchase Order Number (PO#), Invoice Number (I#), Finance Group (F#), and Requisition Number (R#). There are many inconsistencies in how the data appear, as the order of these details varies from row to row, and the presence of these details varies as well. Using abbreviations to represent the six types of details listed above, the table includes row-by-row examples that contain VPN, VP, VIFRN, VIFN, NP, VIF, VIFR, N, and NI. If I split the Description column at the pipe ( | ) delimiter, I obtain the result shown in columns C:G, which is not useful for subsequent analysis.
MonthlyMaint.xlsx
ABCDEFG
1DescriptionDesc.1Desc.2Desc.3Desc.4Desc.5
2VPNV# 013578 |PO# 021623 |TUIT RES INC.V# 013578PO# 021623TUIT RES INC.
3VPV# 009373 |PO# 175247V# 009373PO# 175247
4VIFRNV# 009243 |I# 152803 |F# SYS2 |R# 207673| COLUMBUS SALESV# 009243I# 152803F# SYS2R# 207673COLUMBUS SALES
5VIFNV# 014804 |I# IN338171 |F# SYS1|NET SYSTEMSV# 014804I# IN338171F# SYS1NET SYSTEMS
6NPData Analytics |PO# 2082Data AnalyticsPO# 2082
7VIFV# 021703 |I# 19389 |F# SYS1V# 021703I# 19389F# SYS1
8VIFRV# 008273 |I# 030678996 |F# SYS2 |R# 201878V# 008273I# 030678996F# SYS2R# 201878
9NBGE HEALTH SUPPLIESBGE HEALTH SUPPLIES
10NICalls First, Inc. |I# 69177781159Calls First, Inc.I# 69177781159
Transactions

What I would like is an output similar to this, where all of the V#, PO#, I#, F#, R#, and Vendor Name details are shown under the correct columns. The order of these columns is not important. I would very much appreciate any help.
MonthlyMaint.xlsx
IJKLMN
1V#PO#I#F#R#Name
2V# 013578PO# 021623TUIT RES INC.
3V# 009373PO# 175247
4V# 009243I# 152803F# SYS2R# 207673COLUMBUS SALES
5V# 014804I# IN338171F# SYS1NET SYSTEMS
6PO# 2082Data Analytics
7V# 021703I# 19389F# SYS1
8V# 008273I# 030678996F# SYS2R# 201878
9BGE HEALTH SUPPLIES
10I# 69177781159Calls First, Inc.
Transactions
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    AddedIndex = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
    SplitColumn = Table.ExpandListColumn(Table.TransformColumns(AddedIndex, {{"Description", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Description"),
    Trimmed = Table.TransformColumns(SplitColumn,{{"Description", Text.Trim, type text}}),
    GetHeader = Table.AddColumn(Trimmed, "Custom", each Text.Start([Description],Text.PositionOf([Description],"#")+1)),
    Pivoted = Table.Pivot(GetHeader, List.Distinct(GetHeader[Custom]), "Custom", "Description"),
    RenamedColumns = Table.RenameColumns(Pivoted,{{"", "Name"}}),
    KeepColumns = Table.SelectColumns(RenamedColumns,{"V#", "PO#", "I#", "F#", "R#", "Name"})
in
    KeepColumns

Book1
BCDEFGHI
1DescriptionV#PO#I#F#R#Name
2V# 013578 |PO# 021623 |TUIT RES INC.V# 013578PO# 021623TUIT RES INC.
3V# 009373 |PO# 175247V# 009373PO# 175247
4V# 009243 |I# 152803 |F# SYS2 |R# 207673| COLUMBUS SALESV# 009243I# 152803F# SYS2R# 207673COLUMBUS SALES
5V# 014804 |I# IN338171 |F# SYS1|NET SYSTEMSV# 014804I# IN338171F# SYS1NET SYSTEMS
6Data Analytics |PO# 2082PO# 2082Data Analytics
7V# 021703 |I# 19389 |F# SYS1V# 021703I# 19389F# SYS1
8V# 008273 |I# 030678996 |F# SYS2 |R# 201878V# 008273I# 030678996F# SYS2R# 201878
9BGE HEALTH SUPPLIESBGE HEALTH SUPPLIES
10Calls First, Inc. |I# 69177781159I# 69177781159Calls First, Inc.
Sheet1
 
Upvote 0
Solution
Another solution for your reference:
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    res = Table.Combine(
                        List.Transform(
                                        Source[Description],
                                        each Table.FromRecords(
                                                                {
                                                                    Record.Combine(
                                                                                    List.Transform(
                                                                                                    Text.Split(_,"|"),
                                                                                                    (x)=>if Text.Contains(x,"#") then Record.FromList({x},{Text.Split(x," "){0}})
                                                                                                         else [Name=x]
                                                                                                  )
                                                                                  )
                                                                }
                                                              )
                                      )
                       )
in
    res
1627749507164.png
 
Upvote 0
Change some functions:
Power Query:
let
    Source=Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    res=Table.Combine(
                        List.Transform(
                                        Source[Description],
                                        each
                                            let
                                                a=Text.Split(_,"|")
                                            in
                                                #table(
                                                        List.Transform(a,(x)=>if Text.Contains(x,"#") then Text.Split(x," "){0} else "Name"),
                                                        {a}
                                                      )
                                      )
                     )
in
    res

1627751177405.png
 
Upvote 0
@shaowu459 , Thank you for these solutions. Your offerings and the one from @Bo_Ry are much better than anything I tried before asking for help. I have several approaches to study now! I appreciate the help from both of you.
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,183
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top