Flattening a hierarchy from 5 columns into a unique one

tomoonot

New Member
Joined
May 6, 2023
Messages
5
Office Version
  1. 365
Platform
  1. Windows
I'm trying to flatten a hierarchy of 5 columns into one singular column with only unique values from an original file that is copypasted into the InputTable The output column is not too important, so I am trying to create this hierarchy by ommitting it.

It is kinda working, but it's not optimal, since I cannot seem to be able to figure out, how to lookup a value and retrieve its header.
So I have to circle back a using a hardcoded sequence of 5 repeating values and then applying unique to those and then filtering them.
Also not sure if it's possible to have a table of dynamic size, since I am copy pasting into it, I would like to have it adjust automatically how many rows are not empty under the header, but no idea how to do so.

And since there are 4 Columns of Rows, in the 2nd table, where I try to flatten it, I had to pre-allocate 4-5x rows for the output of the flattenning.
In case where I had a 1000 rows, I had to pre-allocate 5000, which seemed to be quite slow.
I'm pretty sure it could be done more elegantly, dynamically and in a way that is not so resource intensive,

I'm imagining that if I could just get the header row from a formula without using the hard coded workaround and 2 helper tables, it should be possible to do the whole flattenning, removing duplicates and filtering out the output in one go, but I am struggling to find out how.

Also watched some vdeos on dynamic arrays using NamedRange# with the has/pound sign in the end, but haven't really gotten to trying it out as I am mostly getting by by using index and 1 result at a time by ROW()-1, but perhaps arrays could be better? Tried something similar a few times before but inside a named table they'd spill, not sure if I had used the correct notation at the time though.

Input table: Source data
Dymmy_Sheet - Copy.xlsx
ABCDEF
1StageLevel1Level2Level3Level4Output
2Stage 1A100: Level1 AB200: Level2 ABC300: Level3 ABCXYZ10: Name 1Output 1
3Stage 1A100: Level1 AB200: Level2 ABC300: Level3 ABCXYZ10: Name 1Output 2
4Stage 1A100: Level1 AB200: Level2 ABC300: Level3 ABCXYZ15: Name 2Output 1
5Stage 1A100: Level1 AB200: Level2 ABC300: Level3 ABCXYZ15: Name 2Output 2
6Stage 1A100: Level1 AB200: Level2 ABC300: Level3 ABCXYZ15: Name 2Output 3
7Stage 1A100: Level1 AB200: Level2 ABC350: Level3 ABCXYZ20: Name 1Output 4
8Stage 1A100: Level1 AB200: Level2 ABC350: Level3 ABCXYZ20: Name 1Output 5
9Stage 1A100: Level1 AB200: Level2 ABC350: Level3 ABCXYZ25: Name 2Output 6
10Stage 1A100: Level1 AB250: Level2 ABC400: Level3 ABCXYZ30: Name 3Output 7
11Stage 2A150: Level1 AB300: Level2 ABC450: Level4 ABCSDT20: Name 1Output 11
12Stage 2A150: Level1 AB300: Level2 ABC450: Level4 ABCSDT20: Name 1Output 12
13Stage 2A150: Level1 AB300: Level2 ABC500: Level4 ABCSDT25: Name 1Output 20
14Stage 3A200: Level1 AB350: Level2 ABC550: Level4 ABCSDT30: Name 1Output 25
Transform
Cells with Conditional Formatting
CellConditionCell FormatStop If True
A2:F14Cell ValueduplicatestextNO
E:ECell ValueduplicatestextNO
F:FCell ValueduplicatestextNO


AllHierarchyTable: Flattened table
Dymmy_Sheet - Copy.xlsx
HIJK
1allHierarchyTypeTest1Test2
2A100: Level1 ALevel1#N/A#VALUE!
3B200: Level2 ABLevel2#N/A#VALUE!
4C300: Level3 ABCLevel3#N/A#VALUE!
5XYZ10: Name 1Level4#N/A#VALUE!
6Output 1Output#N/A#VALUE!
7A100: Level1 ALevel1#N/A#VALUE!
8B200: Level2 ABLevel2#N/A#VALUE!
9C300: Level3 ABCLevel3#N/A#VALUE!
10XYZ10: Name 1Level4#N/A#VALUE!
11Output 2Output#N/A#VALUE!
12A100: Level1 ALevel1#N/A#VALUE!
13B200: Level2 ABLevel2#N/A#VALUE!
14C300: Level3 ABCLevel3#N/A#VALUE!
15XYZ15: Name 2Level4#N/A#VALUE!
16Output 1Output#N/A#VALUE!
17A100: Level1 ALevel1#N/A#VALUE!
18B200: Level2 ABLevel2#N/A#VALUE!
19C300: Level3 ABCLevel3#N/A#VALUE!
20XYZ15: Name 2Level4#N/A#VALUE!
21Output 2Output#N/A#VALUE!
22A100: Level1 ALevel1#N/A#VALUE!
23B200: Level2 ABLevel2#N/A#VALUE!
24C300: Level3 ABCLevel3#N/A#VALUE!
25XYZ15: Name 2Level4#N/A#VALUE!
26Output 3Output#N/A#VALUE!
27A100: Level1 ALevel1#N/A#VALUE!
28B200: Level2 ABLevel2#N/A#VALUE!
29C350: Level3 ABCLevel3#N/A#VALUE!
30XYZ20: Name 1Level4#N/A#VALUE!
31Output 4Output#N/A#VALUE!
32A100: Level1 ALevel1#N/A#VALUE!
33B200: Level2 ABLevel2#N/A#VALUE!
34C350: Level3 ABCLevel3#N/A#VALUE!
35XYZ20: Name 1Level4#N/A#VALUE!
36Output 5Output#N/A#VALUE!
37A100: Level1 ALevel1#N/A#VALUE!
38B200: Level2 ABLevel2#N/A#VALUE!
39C350: Level3 ABCLevel3#N/A#VALUE!
40XYZ25: Name 2Level4#N/A#VALUE!
41Output 6Output#N/A#VALUE!
42A100: Level1 ALevel1#N/A#VALUE!
43B250: Level2 ABLevel2#N/A#VALUE!
44C400: Level3 ABCLevel3#N/A#VALUE!
45XYZ30: Name 3Level4#N/A#VALUE!
46Output 7Output#N/A#VALUE!
47A150: Level1 ALevel1#N/A#VALUE!
48B300: Level2 ABLevel2#N/A#VALUE!
49C450: Level4 ABCLevel3#N/A#VALUE!
50SDT20: Name 1Level4#N/A#VALUE!
51Output 11Output#N/A#VALUE!
Transform
Cell Formulas
RangeFormula
H2:H51H2=IFERROR(INDEX(InputTable[[Level1]:[Output]], INT((ROW()-2)/COLUMNS(InputTable[[Level1]:[Output]]))+1, MOD(ROW()-2, COLUMNS(InputTable[[Level1]:[Output]]))+1), "")
I2:I51I2=IF(allHierarchy<>"", CHOOSE(MOD(ROW()-2,5)+1, "Level1", "Level2", "Level3", "Level4", "Output"), "")
J2:J51J2=INDEX(InputTable[#Headers], MATCH([@allHierarchy], INDEX(InputTable, 0, MATCH([@allHierarchy],InputTable, 0)), 0))
K2:K51K2=IF([@allHierarchy]<>"",XLOOKUP([@allHierarchy],InputTable,InputTable[#Headers],,0,1),"")
Named Ranges
NameRefers ToCells
allHierarchy=FlattenTable[allHierarchy]J2:K2, I2:I51
Level1=InputTable[Level1]H2:H51, J2:K51
Level2=InputTable[Level2]H2:H51, J2:K51
Level3=InputTable[Level3]H2:H51, J2:K51
Level4=InputTable[Level4]H2:H51, J2:K51
Output=InputTable[Output]H2:H51, J2:K51
Stage=InputTable[Stage]J2:K51


UniqueTable: Unique values from the flattened table
Cell Formulas
RangeFormula
M2:M24M2=IFERROR(IF(allHierarchy<>"",CHOOSE(MATCH([@Type2], {"Level1","Level2","Level3","Level4","Output"}, 0), "Alpha", "Beta", "Gamma", "Delta", "Epsilon"), ""),"")
N2:N24N2=IFERROR(INDEX(UNIQUE(FILTER(FlattenTable[allHierarchy], (FlattenTable[allHierarchy]<>"")*(FlattenTable[allHierarchy]<>0))), ROW()-1), "")
O2:O24O2=IF([@Uniques]<>"",XLOOKUP([@Uniques],FlattenTable[allHierarchy],FlattenTable[Type],,0), "")
Named Ranges
NameRefers ToCells
allHierarchy=FlattenTable[allHierarchy]M2:O24


Result table: Final result, however with a lot of intermediary steps and 2 helper tables
Cell Formulas
RangeFormula
Q2:Q16Q2=IF([@[Name/ID]]<>"", INDEX(UniqueTable[Kind], MATCH([@[Name/ID]], UniqueTable[Uniques], 0)), "")
R2:R16R2=IFERROR( INDEX(FILTER(UniqueTable[[Kind]:[Uniques]], ( UniqueTable[Kind]="Alpha")+( UniqueTable[Kind]="Beta")+( UniqueTable[Kind]="Gamma")+( UniqueTable[Kind]="Delta") ), ROW()-1,2), "")
S2:S16S2=IF([@Kind]="Alpha",XLOOKUP([@[Name/ID]],Level1,Stage,,),"")
Named Ranges
NameRefers ToCells
Level1=InputTable[Level1]S2:S16
Stage=InputTable[Stage]S2:S16
 

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
Using power query
Book6
ABCD
1StageAttributeValueKind
2Stage 1Level1A100: Level1 AAlpha
3Stage 1Level2B200: Level2 ABBeta
4Stage 1Level3C300: Level3 ABCGamma
5Stage 1Level4XYZ10: Name 1Delta
6Stage 1Level4XYZ15: Name 2Delta
7Stage 1Level3C350: Level3 ABCGamma
8Stage 1Level4XYZ20: Name 1Delta
9Stage 1Level4XYZ25: Name 2Delta
10Stage 1Level2B250: Level2 ABBeta
11Stage 1Level3C400: Level3 ABCGamma
12Stage 1Level4XYZ30: Name 3Delta
13Stage 2Level1A150: Level1 AAlpha
14Stage 2Level2B300: Level2 ABBeta
15Stage 2Level3C450: Level4 ABCGamma
16Stage 2Level4SDT20: Name 1Delta
17Stage 2Level3C500: Level4 ABCGamma
18Stage 2Level4SDT25: Name 1Delta
19Stage 3Level1A200: Level1 AAlpha
20Stage 3Level2B350: Level2 ABBeta
21Stage 3Level3C550: Level4 ABCGamma
22Stage 3Level4SDT30: Name 1Delta
Table1


Code in PQ
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Stage", type text}, {"Level1", type text}, {"Level2", type text}, {"Level3", type text}, {"Level4", type text}, {"Output", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Stage"}, "Attribute", "Value"),
    #"Removed Duplicates" = Table.Distinct(#"Unpivoted Columns", {"Attribute", "Value"}),
    #"Added Custom" = Table.AddColumn(#"Removed Duplicates", "Kind", each if [Attribute] = "Level1" then "Alpha" else if [Attribute] = "Level2" then "Beta" else if [Attribute] = "Level3" then "Gamma" else if [Attribute] = "Level4" then "Delta" else "Epsilon"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Attribute] <> "Output"))
in
    #"Filtered Rows"
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,190
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top