Power Query to clean up statement data from multiple to single rows

kcmuppet

Active Member
Joined
Nov 2, 2005
Messages
444
Office Version
  1. 365
Platform
  1. Windows
Hi,

In the example below I have a text description field that is split/spilled over 3 rows, and also a 4 columns of numbers that need to be on the same row.

What's the best way to transform the table so that the rows are merged? Should it be some sort of next unpivot?

Tax Calcs to 5 Apr2024.xlsx
ABCDEFG
1From this:
2Date of SaleSale QuantityStock, Matching DetailsProceedsCostNet LossNet Gain
314/11/202335ADVANCED MICRO DEVICES INC
4COM STK USD0.013341.21
508 Pool of 352508.59832.62
614/11/2023134APPLE INC
7COM USD0.0000119997.59
808 Pool of 1342423.1617574.43
921/02/20242500ATOME PLC
10ORD GBP0.002 (AIM)1252.55
1108 Pool of 2,5002000747.45
1221/07/202359BEYOND MEAT
13USD0.0001679
1408 Pool of 595044.674365.67
15
16
17To this:
18Date of SaleSale QuantityStock, Matching DetailsProceedsCostNet LossNet Gain
1914/11/202335ADVANCED MICRO DEVICES INC COM STK USD0.01 08 Pool of 353341.212508.59832.62
2014/11/2023134APPLE INC COM USD0.00001 08 Pool of 13419997.592423.1617574.43
2121/02/20242500ATOME PLC ORD GBP0.002 (AIM) 08 Pool of 2,5001252.552000747.45
2221/07/202359BEYOND MEAT USD0.0001 08 Pool of 596795044.674365.67
Table008 (Page 16) (2)


Power Query:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZFNa4QwEIb/yuCpBclmJhOjR1dDka4frHahiNeeCv7/WzOmthQqhiG8QzIzT97Mc4J8QbyQJpOkibFhK+tH2VW+hrap7j3U/tFUfoSmq8Lhbyzp/C2rvoVxeoW3sdZKo/QxjIrwv8s6h2FdP2H9gG1aCLI6V7aISW5IZbQV/EFDw8I2DDd/ghIxwpL5WBSF+2l+iBK7CwuTUZjFBJ11rNhsNYQXTULDEVkLztS3HobbEU5/r+HlOggOwVPZtM/SlSwpa0+IKI0jZJjelGOn2O4FguN2c7b3Xf1734Vf8+V0gLM7I8Zk7syT3TSrmVXmgmKT2aiSZfkC", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Date of Sale" = _t, #"Sale Quantity" = _t, #"Stock, Matching Details" = _t, Proceeds = _t, Cost = _t, #"Net Loss" = _t, #"Net Gain" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date of Sale", type date}, {"Sale Quantity", Int64.Type}, {"Stock, Matching Details", type text}, {"Proceeds", type number}, {"Cost", type number}, {"Net Loss", type number}, {"Net Gain", type number}})
in
    #"Changed Type"
 

Excel Facts

Do you hate GETPIVOTDATA?
Prevent GETPIVOTDATA. Select inside a PivotTable. In the Analyze tab of the ribbon, open the dropown next to Options and turn it off
Here's one way:

Power Query:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZFNa4QwEIb/yuCpBclmJhOjR1dDka4frHahiNeeCv7/WzOmthQqhiG8QzIzT97Mc4J8QbyQJpOkibFhK+tH2VW+hrap7j3U/tFUfoSmq8Lhbyzp/C2rvoVxeoW3sdZKo/QxjIrwv8s6h2FdP2H9gG1aCLI6V7aISW5IZbQV/EFDw8I2DDd/ghIxwpL5WBSF+2l+iBK7CwuTUZjFBJ11rNhsNYQXTULDEVkLztS3HobbEU5/r+HlOggOwVPZtM/SlSwpa0+IKI0jZJjelGOn2O4FguN2c7b3Xf1734Vf8+V0gLM7I8Zk7syT3TSrmVXmgmKT2aiSZfkC", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Date of Sale" = _t, #"Sale Quantity" = _t, #"Stock, Matching Details" = _t, Proceeds = _t, Cost = _t, #"Net Loss" = _t, #"Net Gain" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date of Sale", type date}, {"Sale Quantity", Int64.Type}, {"Stock, Matching Details", type text}, {"Proceeds", type number}, {"Cost", type number}, {"Net Loss", type number}, {"Net Gain", type number}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Grouper", each if [Date of Sale] <> null then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Grouper"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
    #"Filled Down1" = Table.FillDown(#"Removed Columns",{"Date of Sale", "Sale Quantity"}),
    #"Grouped Rows" = Table.Group(#"Filled Down1", {"Date of Sale", "Sale Quantity", "Grouper"}, {{"Stock Details", each Text.Combine([#"Stock, Matching Details"], " "), type text}, {"Proceeds", each List.Sum([Proceeds]), type nullable number}, {"Cost", each List.Sum([Cost]), type nullable number}, {"Net Loss", each List.Sum([Net Loss]), type nullable number}, {"Net Gain", each List.Sum([Net Gain]), type nullable number}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Grouper"})
in
    #"Removed Columns1"
 
Upvote 0
Solution
try

Power Query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZFNa4QwEIb/yuCpBclmJhOjR1dDka4frHahiNeeCv7/WzOmthQqhiG8QzIzT97Mc4J8QbyQJpOkibFhK+tH2VW+hrap7j3U/tFUfoSmq8Lhbyzp/C2rvoVxeoW3sdZKo/QxjIrwv8s6h2FdP2H9gG1aCLI6V7aISW5IZbQV/EFDw8I2DDd/ghIxwpL5WBSF+2l+iBK7CwuTUZjFBJ11rNhsNYQXTULDEVkLztS3HobbEU5/r+HlOggOwVPZtM/SlSwpa0+IKI0jZJjelGOn2O4FguN2c7b3Xf1734Vf8+V0gLM7I8Zk7syT3TSrmVXmgmKT2aiSZfkC", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Date of Sale" = _t, #"Sale Quantity" = _t, #"Stock, Matching Details" = _t, Proceeds = _t, Cost = _t, #"Net Loss" = _t, #"Net Gain" = _t]),
_1 = (x)=> Table.FromRows({List.Transform(Table.ToColumns(x), each Text.Combine(_,""))}, Value.Type(Source)),
_2 = Table.Combine(Table.Group(Source, "Date of Sale", {"x", _1}, 0, (x,y)=> Byte.From(x <> "" and y <> ""))[x])
in _2

Regards,
 
Upvote 0
Power Query:
_1 = (x)=> Table.FromRows({List.Transform(Table.ToColumns(x), each Text.Combine(_,""))}, Value.Type(Source)),
_2 = Table.Combine(Table.Group(Source, "Date of Sale", {"x", _1}, 0, (x,y)=> Byte.From(x <> "" and y <> ""))[x])
in _2

Thanks for this solution too @alex78 . Do you have time to explain how it works?
 
Upvote 0
Power Query:
let
    source = your_table,
    data = Table.Split(source, 3),
    cols = List.Buffer(Table.ColumnNames(source)),
    tx = [Date of Sale = List.First, #"Stock, Matching Details" = (x) => Text.Combine(x, " ")],
    fx = (tbl) => List.Transform(cols, (x) => Record.FieldOrDefault(tx, x, List.Sum)(Table.Column(tbl, x))), 
    result = Table.FromList(List.Transform(data, fx), (x) => x, Value.Type(source))
in
    result
 
Upvote 0
Power Query:
let
    source = your_table,
    data = Table.Split(source, 3),
    cols = List.Buffer(Table.ColumnNames(source)),
    tx = [Date of Sale = List.First, #"Stock, Matching Details" = (x) => Text.Combine(x, " ")],
    fx = (tbl) => List.Transform(cols, (x) => Record.FieldOrDefault(tx, x, List.Sum)(Table.Column(tbl, x))),
    result = Table.FromList(List.Transform(data, fx), (x) => x, Value.Type(source))
in
    result
Thanks for your solution too @AlienSx.

I think it's splitting the table at every 3 rows then creating a list of column headers. Then after that, I can't work it out. Do you have time to explain?
 
Upvote 0
Thanks for your solution too @AlienSx.

I think it's splitting the table at every 3 rows then creating a list of column headers. Then after that, I can't work it out. Do you have time to explain?
...after that I transform a list of tables into the list of rows (each table is transformed into single row). To do that we need to get single value out of each column of our tables. For column with date we just take first item. For text column we combine text. For all other columns we just sum values in the column.
tx: record with 2 fields - date and text columns. Field values are functions - List.First and Text.Combine respectively. Default transformation will be List.Sum so we may use it for all other columns in Record.FieldOrDefault function.
fx: function that gets a table as argument and return a list of row values. It simply transforms a list of column names by calling each column by its name >> find transformation function using column name in record tx via Record.FieldOrDefault >> apply this function to the list of column values received using Table.Column function.
result: simple Table.FromList function that takes a result of List.Transform (list of rows), just do nothing and assign original table column names and types.
 
Upvote 0

Forum statistics

Threads
1,225,361
Messages
6,184,510
Members
453,237
Latest member
lordleo

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top