Power Query with multiple header rows

Moosles

New Member
Joined
Apr 1, 2021
Messages
23
Office Version
  1. 2019
Platform
  1. Windows
I have a table of data which is in an awkward format which I'm trying to simplify using Power Query. I'm not very well versed in Power Query so I'm probably missing something obvious here, so I'm hoping you can help! The table of data is in a format like this:

ID Number 1ID Number 2StatusStart Date1
Activity 1
2
Activity 2
3
Activity 3
4
Activity 4
5
Activity 5
6
Activity 6
123451Active14-Feb-19300250
123452Active11-May-2016080130
123453Completed3-Jun-2022040
678901Active6-Sep-19175350125
678902Active11-Nov-1990
678903Cancelled6-Apr-20210
678904Active9-May-20110
678905Active22Feb-21-200
678906Completed6-Aug-21105265110

And I want to end up with a simplified table of data like this:

ID Number 1ID Number 2StatusStart DateActivity NumberActivity NameCost
123451Active14-Feb-191Activity Name 1300
123451Active14-Feb-193Activity Name 3250
123452Active11-May-202Activity Name 2160
123452Active11-May-205Activity Name 580
123452Active11-May-206Activity Name 6130
123453Completed03-Jun-201Activity Name 1220
123453Completed03-Jun-204Activity Name 440
678901Active06-Sep-191Activity Name 1175
678901Active06-Sep-193Activity Name 3350
678901Active06-Sep-196Activity Name 6125
678902Active11-Nov-194Activity Name 490
678903Completed06-Apr-202Activity Name 2210
678904Active09-May-204Activity Name 4110
678905Active22-Feb-211Activity Name 1200
678906Completed06-Aug-212Activity Name 2105
678906Completed06-Aug-213Activity Name 3265
678906Completed06-Aug-215Activity Name 5110

The issue is that the Header row in the first table is actually 2 rows, split by Activity Number and Activity Name:

ID Number 1ID Number 2StatusStart Date123456
Activity Name 1Activity Name 2Activity Name 3Activity Name 4Activity Name 5Activity Name 6

So when I end up using the Power Query (and filtering the output) the output is this:

ID Number 1ID Number 2StatusStart DateActivity NumberCost
123451Active14-Feb-191300
123451Active14-Feb-193250
123452Active11-May-202160
123452Active11-May-20580
123452Active11-May-206130
123453Completed03-Jun-201220
123453Completed03-Jun-20440
678901Active06-Sep-191175
678901Active06-Sep-193350
678901Active06-Sep-196125
678902Active11-Nov-19490
678903Completed06-Apr-202210
678904Active09-May-204110
678905Active22-Feb-211200
678906Completed06-Aug-212105
678906Completed06-Aug-213265
679806Completed6-Aug-215110

But I'd like to obtain the Activity Name as well. Once I Close & Load from the Power Query I can apply an HLOOKUP formula in the output to pull the Activity Name, but if I refresh the Power Query then this manual change gets removed. Is there a way that I can pull in the Activity Name directly from the source data table as part of the Power Query as well?
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    ActivityNames = List.Skip(Record.ToList(Source{0}),4),
    OldHeaders = List.LastN(Table.ColumnNames(Source), 6),
    RenameList = List.Generate(()=>0, (x)=> x<List.Count(OldHeaders), (x)=> x+1, (x)=> {OldHeaders{x}, OldHeaders{x} & " - " & ActivityNames{x}}),
    tbl = Table.RenameColumns(Table.Skip(Source,1),RenameList),
    tbl1 = Table.UnpivotOtherColumns(tbl, {"ID Number 1", "ID Number 2", "Status", "Start Date"}, "Activity Name", "Cost"),
    tbl2 = Table.SplitColumn(tbl1, "Activity Name", Splitter.SplitTextByDelimiter(" - ", QuoteStyle.Csv), {"Activity Number", "Activity Name"}),
    Result = Table.TransformColumnTypes(tbl2,{{"Activity Number", Int64.Type}, {"Activity Name", type text}, {"Start Date", type date}})
in
    Result

Book1
ABCDEFGHIJKLMNOPQRS
1
2ID Number 1ID Number 2StatusStart Date123456ID Number 1ID Number 2StatusStart DateActivity NumberActivity NameCost
3Activity 1Activity 2Activity 3Activity 4Activity 5Activity 6123451Active2/14/20191Activity 1300
4123451Active2/14/2019300250123451Active2/14/20193Activity 3250
5123452Active5/11/202016080130123452Active5/11/20202Activity 2160
6123453Completed6/3/202022040123452Active5/11/20205Activity 580
7678901Active9/6/2019175350125123452Active5/11/20206Activity 6130
8678902Active11/11/201990123453Completed6/3/20201Activity 1220
9678903Cancelled4/6/2020210123453Completed6/3/20204Activity 440
10678904Active5/9/2020110678901Active9/6/20191Activity 1175
11678905Active2/22/2021200678901Active9/6/20193Activity 3350
12678906Completed8/6/2021105265110678901Active9/6/20196Activity 6125
13678902Active11/11/20194Activity 490
14678903Cancelled4/6/20202Activity 2210
15678904Active5/9/20204Activity 4110
16678905Active2/22/20211Activity 1200
17678906Completed8/6/20212Activity 2105
18678906Completed8/6/20213Activity 3265
19678906Completed8/6/20215Activity 5110
20
Sheet3
 
Upvote 0
What I did was did exactly as what you did.
I then duplicated the Activity number column and added a prefix to the duplicated column of "Activity Name"

The above is on the assumption that the Activity Number is the same as numerical prefix as the Activity Name.

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID Number 1", "ID Number 2", "Status", "Start Date"}, "Attribute", "Value"),
    #"Filtered Rows" = Table.SelectRows(#"Unpivoted Other Columns", each ([Start Date] <> null)),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Attribute", "Activity Number"}, {"Value", "Cost"}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Renamed Columns", "Activity Number", "Activity Number - Copy"),
    #"Reordered Columns" = Table.ReorderColumns(#"Duplicated Column",{"ID Number 1", "ID Number 2", "Status", "Start Date", "Activity Number", "Activity Number - Copy", "Cost"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Reordered Columns",{{"Activity Number - Copy", "Activity Name"}}),
    #"Added Prefix" = Table.TransformColumns(#"Renamed Columns1", {{"Activity Name", each "Activity Name " & _, type text}})
in
    #"Added Prefix"
 
Upvote 0
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    ActivityNames = List.Skip(Record.ToList(Source{0}),4),
    OldHeaders = List.LastN(Table.ColumnNames(Source), 6),
    RenameList = List.Generate(()=>0, (x)=> x<List.Count(OldHeaders), (x)=> x+1, (x)=> {OldHeaders{x}, OldHeaders{x} & " - " & ActivityNames{x}}),
    tbl = Table.RenameColumns(Table.Skip(Source,1),RenameList),
    tbl1 = Table.UnpivotOtherColumns(tbl, {"ID Number 1", "ID Number 2", "Status", "Start Date"}, "Activity Name", "Cost"),
    tbl2 = Table.SplitColumn(tbl1, "Activity Name", Splitter.SplitTextByDelimiter(" - ", QuoteStyle.Csv), {"Activity Number", "Activity Name"}),
    Result = Table.TransformColumnTypes(tbl2,{{"Activity Number", Int64.Type}, {"Activity Name", type text}, {"Start Date", type date}})
in
    Result

Book1
ABCDEFGHIJKLMNOPQRS
1
2ID Number 1ID Number 2StatusStart Date123456ID Number 1ID Number 2StatusStart DateActivity NumberActivity NameCost
3Activity 1Activity 2Activity 3Activity 4Activity 5Activity 6123451Active2/14/20191Activity 1300
4123451Active2/14/2019300250123451Active2/14/20193Activity 3250
5123452Active5/11/202016080130123452Active5/11/20202Activity 2160
6123453Completed6/3/202022040123452Active5/11/20205Activity 580
7678901Active9/6/2019175350125123452Active5/11/20206Activity 6130
8678902Active11/11/201990123453Completed6/3/20201Activity 1220
9678903Cancelled4/6/2020210123453Completed6/3/20204Activity 440
10678904Active5/9/2020110678901Active9/6/20191Activity 1175
11678905Active2/22/2021200678901Active9/6/20193Activity 3350
12678906Completed8/6/2021105265110678901Active9/6/20196Activity 6125
13678902Active11/11/20194Activity 490
14678903Cancelled4/6/20202Activity 2210
15678904Active5/9/20204Activity 4110
16678905Active2/22/20211Activity 1200
17678906Completed8/6/20212Activity 2105
18678906Completed8/6/20213Activity 3265
19678906Completed8/6/20215Activity 5110
20
Sheet3

I would go with Gordan approach.

Hope we have helped.
 
Upvote 0
If you are using @JGordon11 code and in the future adding more columns to the data set for example "Activity Number: 7" and "Activity Name 7" amend the below line of code:


Power Query:
    OldHeaders = List.LastN(Table.ColumnNames(Source), 6),

to

Power Query:
  OldHeaders = List.LastN(Table.ColumnNames(Source), 7),
 
Upvote 0
Or if you're always going to have 4 columns before the Activities then the step below will accomodate any number of activities:

Power Query:
OldHeaders = List.Skip(Table.ColumnNames(Source), 4),
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,178
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top