Power Query Split Rows Bases on a Condition

radonwilson

Board Regular
Joined
Jun 23, 2021
Messages
50
Office Version
  1. 2019
Platform
  1. Windows
  1. I want to split each row in Order Table into 2 separate rows where the quantity does not match the quantity given in Refund Table and further mark them as "Partial Order" and "Partial Return".
  2. With the help of the unitary method, I want to split their values as well. (refer to the result table).
  3. I want to add a column named Refund_Com to add values of [Referral]/1.18 column of Refund_Table with [Referral] column of Order Table where the Tag is mentioned as "Partial_Return".
  4. If there is an OrderID that is present in both the tables and where the order quantity matches, then It should be tagged as "Full_Refund". And If an order id is present only in the order's table then it should be marked as "Order".
I have been able to solve this up to Tags Step. But I got puzzled with further calculations.

Here is my M CODE:-
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Order"]}[Content],
    Tags = Table.AddColumn(Source, "Tags", each 
    
    
    let

    MyList = List.PositionOf(Refund[OrderID],[OrderID])

    in

if MyList = -1 then {"Order"}
else if Refund[Qty]{MyList} = [Qty] then {"Full Refund"}
else if Refund[Qty]{MyList} <> [Qty] then {"Partial_Order","Partial_Return"}
else null),
    Expanded_Tags = Table.ExpandListColumn(Tags, "Tags")
in
    Expanded_Tags



Orders Table
OrderIDSKUQtyBase_AmountO_TaxTCSReferralClosingI_TaxTotal
407-4389736FDR002821,158.92139.08-11.58-201.2-60-47.016978.2
407-4389778FDR005621,069.64128.36-10.7-119.8-60-32.364975.1
407-4389665FDR005652,674.10320.9-26.75-299.5-150-80.912437.8
407-4389224FDR00121592.37106.63-5.9237-69.9-30-17.98575.2
407-4389758FDR008521,069.64128.36-10.7-119.8-60-32.364975.14

Refund Table
OrderIDSKUQtyBase_AmountTaxTCSReferralClosingTotal
407-4389736FDR0028
1​
-579.46​
-69.54​
5.796.6135.4-601.2
407-4389778FDR0056
1​
-534.82​
-64.18​
5.35-41.4935.4-593.35
407-4389665FDR0056
2​
-1,069.64​
-128.36​
10.7-82.9870.8-1,199.48
407-4389224FDR0012
1​
-592.37​
-106.63​
5.92-29.61835.4-682.78

Modified/Result_Order Table
OrderIDSKUQtyBase_AmountO_TaxTCSReferralClosingI_TaxTotalRefund_ComTag
407-4389736FDR00281579.4669.54-5.79-100.6-30-23.51489.1Partial Order
407-4389736FDR00281579.4669.54-5.79-100.6-30-23.51489.1-95Partial Return
407-4389778FDR00561534.8264.18-5.35-59.9-30-16.182487.6Partial Order
407-4389778FDR00561534.8264.18-5.35-59.9-30-16.182487.6-95Partial Return
407-4389665FDR005631,604.46192.54-16.05-179.7-90-48.5461462.7Partial Order
407-4389665FDR005621,069.64128.36-10.7-119.8-60-32.364975.1-190Partial Return
407-4389224FDR00121592.37106.63-5.9237-69.9-30-17.98575.2Full Return
407-4389758FDR008521,069.64128.36-10.7-119.8-60-32.364975.14Order
 

Excel Facts

How to fill five years of quarters?
Type 1Q-2023 in a cell. Grab the fill handle and drag down or right. After 4Q-2023, Excel will jump to 1Q-2024. Dash can be any character.
Update: I somehow did it.

Any simple way of achieving the same.

Here is updated M Code

Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="Order"]}[Content],
    Tags = Table.AddColumn(Source, "Tags", each 
    
    
    let

    MyList = List.PositionOf(Refund[OrderID],[OrderID])

    in

if MyList = -1 then {"Order"}
else if Refund[Qty]{MyList} = [Qty] then {"Full Refund"}
else if Refund[Qty]{MyList} <> [Qty] then {"Partial_Order","Partial_Return"}
else null),

    Expanded_Tags = Table.ExpandListColumn(Tags, "Tags"),

    Data_Type = Table.TransformColumnTypes(Expanded_Tags,{{"OrderID", type text}, {"SKU", type text}, {"Qty", Int64.Type}, {"Base_Amount", type number}, {"O_Tax", type number}, {"TCS", type number}, {"Referral", type number}, {"Closing", type number}, {"I_Tax", type number}, {"Total", type number}, {"Tags", type text}}),

    Value_Split = Table.FromRecords(Table.TransformRows(Data_Type,(r)=>Record.TransformFields(r,
List.Transform(List.Select(Record.FieldNames(r),each _  <> "OrderID" and _ <> "SKU" and _ <> "Tags"),each
    {_,each if r[Tags]="Partial_Return" then _/r[Qty]*(Refund[Qty]{List.PositionOf(Refund[OrderID],r[OrderID])}) else if r[Tags]="Partial_Order" then _/r[Qty]*(r[Qty]-Refund[Qty]{List.PositionOf(Refund[OrderID],r[OrderID])})
  else _}
)
))),
    #"Added Custom" = Table.AddColumn(Value_Split, "Custom", each if [Tags]="Partial_Return" then Number.Round(Refund[Referral]{List.PositionOf(Refund[OrderID],[OrderID])}/1.18 +[Referral],0) else null)
in
    #"Added Custom"
 
Upvote 1
This cannot be defined as "I somehow did it". This is a beautiful solution. I don't want to talk too quickly, but this is the most elegant way to solve this problem. The Expanded_Tags step is the key, and you used it perfectly.

Although I think your solution is perfect and there is no need for an alternative one, my attempt was using Joins instead of using the List.PositionOf() function which might be a bit faster if there are thousands of rows, and I'd like to share it if you don't mind.

Power Query:
let
    Order = Excel.CurrentWorkbook(){[Name="Order"]}[Content],
    Refund = Excel.CurrentWorkbook(){[Name="Refund"]}[Content],
    FullOrder = Table.AddColumn(Table.NestedJoin(Order, {"OrderID"}, Refund, {"OrderID"}, "Refund", JoinKind.LeftAnti), "Tag", each "Order"),
    FullRefund =  Table.AddColumn(Table.NestedJoin(Order, {"OrderID", "Qty"}, Refund, {"OrderID", "Qty"}, "Refund", JoinKind.Inner), "Tag", each "Full Return"),
    CombineFull = Table.RemoveColumns(Table.Combine({FullOrder, FullRefund}), {"Refund"}),
    ExcludeFullRefund = Table.NestedJoin(Refund, {"OrderID"}, FullRefund, {"OrderID"}, "FullRefund", JoinKind.LeftAnti),
    Partial = Table.NestedJoin(Order, {"OrderID"}, ExcludeFullRefund, {"OrderID"}, "Partial", JoinKind.Inner),
    Multiplier = Table.AddColumn(Partial, "Multiplier", each 1 -  [Partial][Qty]{0} / [Qty]),
    PartialOrder = Table.AddColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each row[Multiplier] * _})))), "Tag", each "Partial Order"),
    PartialRefund = Table.AddColumn(Table.AddColumn(Table.ExpandTableColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each (1 - row[Multiplier]) * _})))), "Partial", {"Referral"}, {"Refund.Referral"}), "Refund_Com", each [Refund.Referral] / 1.18 +[Referral]), "Tag", each "Partial Return"),
    CombinePartial = Table.RemoveColumns(Table.Combine({PartialRefund, PartialOrder}), {"Partial", "Multiplier","Refund.Referral"}),
    CombineAll = Table.Combine({CombinePartial, CombineFull}),
    Result = Table.Sort(CombineAll,{{"OrderID", Order.Ascending}})
in
    Result

Thanks for posting the solution. Please don't forget to mark your own solution as the answer.
 
Upvote 0
Solution
This cannot be defined as "I somehow did it". This is a beautiful solution. I don't want to talk too quickly, but this is the most elegant way to solve this problem. The Expanded_Tags step is the key, and you used it perfectly.

Although I think your solution is perfect and there is no need for an alternative one, my attempt was using Joins instead of using the List.PositionOf() function which might be a bit faster if there are thousands of rows, and I'd like to share it if you don't mind.

Power Query:
let
    Order = Excel.CurrentWorkbook(){[Name="Order"]}[Content],
    Refund = Excel.CurrentWorkbook(){[Name="Refund"]}[Content],
    FullOrder = Table.AddColumn(Table.NestedJoin(Order, {"OrderID"}, Refund, {"OrderID"}, "Refund", JoinKind.LeftAnti), "Tag", each "Order"),
    FullRefund =  Table.AddColumn(Table.NestedJoin(Order, {"OrderID", "Qty"}, Refund, {"OrderID", "Qty"}, "Refund", JoinKind.Inner), "Tag", each "Full Return"),
    CombineFull = Table.RemoveColumns(Table.Combine({FullOrder, FullRefund}), {"Refund"}),
    ExcludeFullRefund = Table.NestedJoin(Refund, {"OrderID"}, FullRefund, {"OrderID"}, "FullRefund", JoinKind.LeftAnti),
    Partial = Table.NestedJoin(Order, {"OrderID"}, ExcludeFullRefund, {"OrderID"}, "Partial", JoinKind.Inner),
    Multiplier = Table.AddColumn(Partial, "Multiplier", each 1 -  [Partial][Qty]{0} / [Qty]),
    PartialOrder = Table.AddColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each row[Multiplier] * _})))), "Tag", each "Partial Order"),
    PartialRefund = Table.AddColumn(Table.AddColumn(Table.ExpandTableColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each (1 - row[Multiplier]) * _})))), "Partial", {"Referral"}, {"Refund.Referral"}), "Refund_Com", each [Refund.Referral] / 1.18 +[Referral]), "Tag", each "Partial Return"),
    CombinePartial = Table.RemoveColumns(Table.Combine({PartialRefund, PartialOrder}), {"Partial", "Multiplier","Refund.Referral"}),
    CombineAll = Table.Combine({CombinePartial, CombineFull}),
    Result = Table.Sort(CombineAll,{{"OrderID", Order.Ascending}})
in
    Result

Thanks for posting the solution. Please don't forget to mark your own solution as the answer.
Thanks for sharing your solution as well. I really appreciate it.

As per my requirements, I have prepared a query that is using List.Position.Of and Table.SelectRows (filtering table in each row) functions multiple times, which is making my query run slower. I have also tried using Table.NestedJoin() instead of List.Position.Of() but the issue is that My matching column values are concatenated with OrderIDs in Order_Table. I have shared a screenshot so you can understand it well.

So how could I join these 2 tables with the Nested Join function when their values are partially matching? My goal is to run my query faster.

1689073891769.png
 
Upvote 0
The preferred approach would be to have separate columns for the Order ID and the SKU values. If separating them is not possible, then we should split them before performing the join.

Is it possible to provide new sample data tables (Order and Refund) that align with the existing design, where fields Order ID and SKU are concatenated using a dash? (or should we just concatenate ID and SKU fields in the initially provided sample data? Both in Order and Refund tables?) This would allow us to work on the real data structure. I'm having some difficulty in matching the table names and structures (Order, Refund, now Item) - my bad. It would be helpful to have a sample data structure that is closer to the existing design.
 
Upvote 0
The preferred approach would be to have separate columns for the Order ID and the SKU values. If separating them is not possible, then we should split them before performing the join.
I would like to know if splitting is the only available option to merge the table in the final step.

I am loading the final transformed table to the sheet where the OrderID and SKU Column concatenation is required.
To Merge Order_Table and SKU_Table I have to first split the OID_SKU column in Order_Table to get SKUs only and then have to perform the merge operation. And after the merge is done I have to remove the SKUs-only column. I want to get rid of these 2 steps and perform the merge directly.

Now how to use this function to merge Table.NestedJoin(Order_Table, {"OID_SKU"}, SKU_Table, {"SKU"}, "SKU_Table", JoinKind.LeftOuter) ??

1689571526677.png





Order_Table
OrderIDSKUQtyBase_AmountO_TaxTCSReferralClosingI_TaxTotal
407-4389736FDR002821,158.92139.08-11.58-201.2-60-47.016978.2
407-4389778FDR005621,069.64128.36-10.7-119.8-60-32.364975.1
407-4389665FDR005652,674.10320.9-26.75-299.5-150-80.912437.8
407-4389224FDR00121592.37106.63-5.9237-69.9-30-17.98575.2
407-4389758FDR008521,069.64128.36-10.7-119.8-60-32.364975.14


Refund_Table
OrderIDSKUQtyBase_AmountTaxTCSReferralClosingTotal
407-4389736FDR00281-579.46-69.545.796.6135.4-601.2
407-4389778FDR00561-534.82-64.185.35-41.4935.4-593.35
407-4389665FDR00562-1,069.64-128.3610.7-82.9870.8-1,199.48
407-4389224FDR00121-592.37-106.635.92-29.61835.4-682.78


SKU_Table
SKUPricePackaging
FDR002855025
FDR005665030
FDR001266050
FDR008577974
 
Upvote 0
Once again, I am having trouble to match your sample data tables and description. Basically, I am looking at the Order_Table, and I can't see a field called OID_SKU. Moreover, I can't match your data tables and the screenshot you sent. In fact, I can't match Item table with anything else.
All I can see is that SKU_Table has the SKU field which is the foreign ID in the Order Table that could be used without any other effort.

I would like to know if splitting is the only available option to merge the table in the final step.
Regarding your question about splitting: Yes, you need to split the field and create a standalone column to perform the join since it requires columns to match. Sample (I am using colon as delimiter instead arrow as shown in your screenshot)

Power Query:
let
    SplitOrderTable = Table.SplitColumn(Order_Table, "OID_SKU", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"OID_SKU.ID", "OID_SKU.SKU"}),
    Source = Table.NestedJoin(SplitOrderTable, {"OID_SKU.ID"}, SKU_Table, {"SKU"}, "SKU_Table", JoinKind.LeftOuter)
in
    Source

Splitting by using the Table.SplitColumn() function won't be a significant performance issue.
 
Upvote 0
Thank you for your reply. I got it now.

My Bad, I mistakenly pasted another table.

Here is a correct table.

Order_Table
OID_SKUQtyBase_AmountO_TaxTCSReferralClosingI_TaxTotal
407-4389736|FDR002821158.92139.08-11.58-201.2-60-47.016978.2
407-4389778|FDR005621069.64128.36-10.7-119.8-60-32.364975.1
407-4389665|FDR005652674.1320.9-26.75-299.5-150-80.912437.8
407-4389224|FDR00121592.37106.63-5.9237-69.9-30-17.98575.2
407-4389758|FDR008521069.64128.36-10.7-119.8-60-32.364975.14
 
Upvote 0
Now it makes sense. All you need to do is adding an extra step to split the OrderID and SKU fields in the Order Table. This extra line to split the field value has minimal effect on the speed comparing to any other method that you would try.

To merge Order_Table and SKU_Table: (I know you already said that you do it this way, but I am just sending the entire code for future readers).
Power Query:
let
    SplitOrderTable = Table.SplitColumn(Order_Table, "OID_SKU", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"OrderID", "SKU"}),
    Source = Table.NestedJoin(SplitOrderTable, {"SKU"}, SKU_Table, {"SKU"}, "SKU_Table", JoinKind.LeftOuter)
in
    Source

And the following is the modified M code for the original question. I still think that your initial solution is much better logically but I just can't make sure about the speed. Your code could be also altered to add the split step.

Power Query:
let
    OrderTable = Excel.CurrentWorkbook(){[Name="Order_Table"]}[Content],
    Order = Table.SplitColumn(Order_Table, "OID_SKU", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"OrderID", "SKU"}),
    Refund = Excel.CurrentWorkbook(){[Name="Refund_Table"]}[Content],
    FullOrder = Table.AddColumn(Table.NestedJoin(Order, {"OrderID"}, Refund, {"OrderID"}, "Refund", JoinKind.LeftAnti), "Tag", each "Order"),
    FullRefund =  Table.AddColumn(Table.NestedJoin(Order, {"OrderID", "Qty"}, Refund, {"OrderID", "Qty"}, "Refund", JoinKind.Inner), "Tag", each "Full Return"),
    CombineFull = Table.RemoveColumns(Table.Combine({FullOrder, FullRefund}), {"Refund"}),
    ExcludeFullRefund = Table.NestedJoin(Refund, {"OrderID"}, FullRefund, {"OrderID"}, "FullRefund", JoinKind.LeftAnti),
    Partial = Table.NestedJoin(Order, {"OrderID"}, ExcludeFullRefund, {"OrderID"}, "Partial", JoinKind.Inner),
    Multiplier = Table.AddColumn(Partial, "Multiplier", each 1 -  [Partial][Qty]{0} / [Qty]),
    PartialOrder = Table.AddColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each row[Multiplier] * _})))), "Tag", each "Partial Order"),
    PartialRefund = Table.AddColumn(Table.AddColumn(Table.ExpandTableColumn(Table.FromRecords(Table.TransformRows(Multiplier, (row) => Record.TransformFields(row, List.Transform({"Qty", "Base_Amount", "O_Tax", "TCS", "Referral", "Closing", "I_Tax", "Total"}, (fld) => {fld, each (1 - row[Multiplier]) * _})))), "Partial", {"Referral"}, {"Refund.Referral"}), "Refund_Com", each [Refund.Referral] / 1.18 +[Referral]), "Tag", each "Partial Return"),
    CombinePartial = Table.RemoveColumns(Table.Combine({PartialRefund, PartialOrder}), {"Partial", "Multiplier","Refund.Referral"}),
    CombineAll = Table.Combine({CombinePartial, CombineFull}),
    Result = Table.Sort(CombineAll,{{"OrderID", Order.Ascending}})
in
    Result
 
Upvote 0

Forum statistics

Threads
1,223,933
Messages
6,175,471
Members
452,646
Latest member
tudou

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top