Combining () from different rows

MasterBash

Board Regular
Joined
Jan 22, 2022
Messages
70
Office Version
  1. 365
Platform
  1. Windows
Hi,

I am trying to open a commercial invoice pdf file in Excel and there are a lot of variations to it. I am glad to say I am almost done with this project and there are over 65 steps trying to cover all situations.

Unfortunately, I still have one situation and I am looking for a solution for. It happens when the name of a product is so long that there is a line break in the name of the product. After testing it out multiple times today, it appears line breaks are only with items that have parenthesis () in them.

What it is like :Expected results
8617779890 769467894 Item Name InAnother Language(Item in english) 0.250 PZA 1.000 1.000 PC 276.97 276.97 MXN861777989 769467894 Item Name InAnother Language(Item in english) 0.250 KG 1.000 1.000 PC 276.97 276.97 MXN
8311448890 7445269977 This_is_another_item(Item name_translated 3.478 PZA 1.000 1.000 PC 683.27 683.27 MXN8311448890 7445269977 This_is_another_item(Item name_translated to english) 3.478 KG 1.000 1.000 PC 683.27 683.27 JPY
to english)

Now, because I am doing a lot of modifications, such as doing 2 splits left with space as delimiter to remove the leading numbers (8... and 7...) and deleting those columns and just keeping the item name first, that means "to english) also gets deleted. The correct same should be : This_is_another_item(Item name_translated to english)

I am looking to combine the third and fourth rows together. Here is another issue - There is also quantity, weight, price, currency, etc in the first row. I don't know how to combine the name correctly. I also need the date to the right, so I am also doing a bunch split column with space as delimiter to seperate each values.

How can I combine the name correctly ?
 

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
The table above already contains merged columns. If I add the step early on then it may be easier to do but I will get extra data I will need to filter out somehow :

What it is like :Item nameImportant values
8617779890 769467894Item Name InAnother Language(Item in english)0.250 PZA 1.000 1.000 PC 276.97 276.97 MXN
8311448890 7445269977This_is_another_item(Item name_translated3.478 PZA 1.000 1.000 PC 683.27 683.27 MXN
A7F0to english)3.478

So I am guessing if I would be combining row 3 and 4 like it may be easier to get the correct name, but I am afraid I wlll get duplicates or extra values (I am assuming). Here is an example how I expect it to look like if I knew how to combine the rows.

What I am guessing it would look like without initially merging the columns :

What I think it will look like :Item nameImportant values
8617779890 769467894Item Name InAnother Language(Item in english)0.250 PZA 1.000 1.000 PC 276.97 276.97 MXN
8311448890 7445269977 A7F0This_is_another_item(Item name_translated to english)3.478 PZA 1.000 1.000 PC 683.27 683.27 MXN
3.478

then I am not exactly sure how I would filter out those extra values (Like A7F0 and the extra 3.478) that appeared in the third row that came from the merge with the fourth row. However, if the 3.478 appears after MXN, that will be fine because I am already using a text replace to add a delimiter and filter out everything on the right of it. However, that stills leave out A7F0.
 
Upvote 0
Sorry for bumping my thread, I am still trying to find the best solution to the problem that I am currently having.

I believe an even better solution would simply merging the rows inside a column to combine the (), but I don't know if that is possible :

What it is like :Item nameImportant values
8617779890 769467894Item Name InAnother Language(Item in english)0.250 PZA 1.000 1.000 PC 276.97 276.97 MXN
8311448890 7445269977This_is_another_item(Item name_translated3.478 PZA 1.000 1.000 PC 683.27 683.27 MXN
A7F0to english)3.478

What it is like :Item nameImportant values
8617779890 769467894Item Name InAnother Language(Item in english)0.250 PZA 1.000 1.000 PC 276.97 276.97 MXN
8311448890 7445269977This_is_another_item(Item name_translated to english)3.478 PZA 1.000 1.000 PC 683.27 683.27 MXN
A7F03.478

Then I can just filter Item name and remove empty values, so that will also remove A7F0 and 3.478.
 
Upvote 0
Power Query:
let
    count = (txt) => Text.Length(Text.Select(txt, {"(", ")"})),
    #"(_)" = (lst) => List.Generate(
        () => [i = 0, line = lst{0}, len = count(line), accum = line],
        (x) => x[i] < List.Count(lst),
        (x) => 
            [
                i = x[i] + 1, 
                line = lst{i}, 
                len = x[len] + count(line), 
                accum = if Number.IsEven(x[len]) then line else Text.Combine({x[accum], line}, " ")
            ],
        (x) => if Number.IsEven(x[len]) then x[accum] else null
    ),
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    columns = List.Buffer(Table.ToColumns(Source)),
    zip = List.Zip({columns{0}, List.RemoveNulls(#"(_)"(columns{1})), columns{2}}),
    result = Table.FromList(zip, (x) => x, Table.ColumnNames(Source))
in
    result
 
Upvote 0
Thank you, I am still trying to understand.

Count and (_) are asking for a parameter.

Source is trying to load a table from Excel, not exactly sure why because the changes should be made directly from Power Query. I do not have a source table. I am simply opening a pdf file in Power Query, but I am trying to load it properly in an excel sheet. There is nothing loaded before then.
 
Upvote 0
Thank you, I am still trying to understand.

Count and (_) are asking for a parameter.

Source is trying to load a table from Excel, not exactly sure why because the changes should be made directly from Power Query. I do not have a source table. I am simply opening a pdf file in Power Query, but I am trying to load it properly in an excel sheet. There is nothing loaded before then.
After loading pdf into PQ you have some table as you describe above. Then simply replace my Excel.Workbook(... with a reference to your table.
 
Upvote 0

Forum statistics

Threads
1,224,862
Messages
6,181,457
Members
453,042
Latest member
AbdelrahmanExcel

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top