Extracting text multiple times from a column into separate rows and filling down until a change

JamesCR0

New Member
Joined
Oct 3, 2012
Messages
5
<!--[if gte mso 9]><xml> <o:OfficeDocumentSettings> <o:AllowPNG/> </o:OfficeDocumentSettings> </xml><![endif]--> Hi Mr Excel forum members,

I haven’t posted here for a number of years, I normally find answers on Mike and Bills YouTube videos, but have not found anything, or have not correlated the examples to my situation.

I am struggling to accomplish a text extraction task and would greatly appreciate help on this.

I have exported a very large table of sales data. I need to extract 1 to 2 text strings from column F that has a minimum of 3 rows of combined text data so that I can make a useable table for pivot table analysis. I can only extract the data in the format shown below, hence the task in hand.

Excel 2010
FGH

<tbody>
[TD="align: center"]1[/TD]
[TD="bgcolor: #A6E5F4"]Verbe/Kunde/Material[/TD]
[TD="bgcolor: #A6E5F4, align: center"]Planned Sales[/TD]
[TD="bgcolor: #A6E5F4, align: center"]Actual Sales[/TD]

[TD="align: center"]2[/TD]
[TD="bgcolor: #FFFDBF"]** 50073 Sales Person A[/TD]
[TD="bgcolor: #FFFDBF, align: right"]2,144,064 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]3,582,392 PC[/TD]

[TD="align: center"]3[/TD]
[TD="bgcolor: #FFFDBF"]* 54865 Piatti Freschi Italia[/TD]
[TD="bgcolor: #FFFDBF, align: right"]297,456 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]594,912 PC[/TD]

[TD="align: center"]4[/TD]
[TD="bgcolor: #A2C3EA"] 118262 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]240,000 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]480,000 PC[/TD]

[TD="align: center"]5[/TD]
[TD="bgcolor: #A2C3EA"] 101860 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]

[TD="align: center"]6[/TD]
[TD="bgcolor: #A2C3EA"] 101861 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]

[TD="align: center"]7[/TD]
[TD="bgcolor: #FFFDBF"]* 54936 Nuova Jolli s.r.l.[/TD]
[TD="bgcolor: #FFFDBF, align: right"][/TD]
[TD="bgcolor: #FFFDBF, align: right"]285,120 PC[/TD]

[TD="align: center"]8[/TD]
[TD="bgcolor: #A2C3EA"] 105313 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]152,064 PC[/TD]

[TD="align: center"]9[/TD]
[TD="bgcolor: #A2C3EA"] 181301 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]133,056 PC[/TD]

[TD="align: center"]10[/TD]
[TD="bgcolor: #FFFDBF"]* 55373 S.A. Mamma Lucia N.V.[/TD]
[TD="bgcolor: #FFFDBF, align: right"]484,704 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]484,704 PC[/TD]

[TD="align: center"]11[/TD]
[TD="bgcolor: #A2C3EA"] 160303 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]484,704 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]484,704 PC[/TD]

[TD="align: center"]12[/TD]
[TD="bgcolor: #FFFDBF"]** 50091 Sales Person "***B***"[/TD]
[TD="bgcolor: #FFFDBF, align: right"]2,891,808 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]6,480,176 PC[/TD]

[TD="align: center"]13[/TD]
[TD="bgcolor: #FFFDBF"]* 200003 Compagnie des Fromage[/TD]
[TD="bgcolor: #FFFDBF, align: right"][/TD]
[TD="bgcolor: #FFFDBF, align: right"]754,080 PC[/TD]

[TD="align: center"]14[/TD]
[TD="bgcolor: #A2C3EA"] 156162 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]69,120 PC[/TD]

[TD="align: center"]15[/TD]
[TD="bgcolor: #A2C3EA"] 156163 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]112,320 PC[/TD]

[TD="align: center"]16[/TD]
[TD="bgcolor: #A2C3EA"] 156460 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]572,640 PC[/TD]

[TD="align: center"]17[/TD]
[TD="bgcolor: #FFFDBF"]* 200049 Gendreau[/TD]
[TD="bgcolor: #FFFDBF, align: right"]156,816 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]308,880 PC[/TD]

[TD="align: center"]18[/TD]
[TD="bgcolor: #A2C3EA"] 852914 Convina-Plus Barr[/TD]
[TD="bgcolor: #FFFFFF, align: right"]156,816 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]275,616 PC[/TD]

[TD="align: center"]19[/TD]
[TD="bgcolor: #A2C3EA"] 852917 Convina-Plus Barr[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]33,264 PC[/TD]

</tbody>


My target is to separate the text stings and have the various parts in columns adjacent to the raw sales data, which is coloured blue. The text location (thankfully) is uniform in its position within the relevant cell (although not showing blank spacing in post untill the last table). The adjacent yellow rows are unimportant (subtotals), after extraction they will be deleted along with column F once I have copied the table to another sheet as values only, so they could be left blank or have a “-“ if it makes the formula easier to create. Also note the number before the Sales Person e.g. ** 50073, does not have to be include in my table.

I am aiming to achieve a useable table for analysis as below:

Excel 2010
ABCDEFGH
Sales PersonAccount No.CustomerProduct CoderProduct Description
-----
-----
Sales Person APiatti Freschi ItaliaVitessa CPET tray
Sales Person APiatti Freschi ItaliaVitessa CPET tray
Sales Person APiatti Freschi ItaliaVitessa CPET tray
-----
Sales Person ANuova Jolli s.r.l.Vitessa CPET tray
Sales Person ANuova Jolli s.r.l.Vitessa CPET tray
-----
Sales Person AS.A. Mamma Lucia N.V.Vitessa CPET tray
-----
-----
Sales Person "***B***"Compagnie des FromageVitessa CPET tray
Sales Person "***B***"Compagnie des FromageVitessa CPET tray
Sales Person "***B***"Compagnie des FromageVitessa CPET tray
-----
Sales Person "***B***"GendreauConvina-Plus Barr
Sales Person "***B***"GendreauConvina-Plus Barr

<tbody>
[TD="align: center"]1[/TD]

[TD="bgcolor: #A6E5F4"]Verbe/Kunde/Material[/TD]
[TD="bgcolor: #A6E5F4, align: center"]Planned Sales[/TD]
[TD="bgcolor: #A6E5F4, align: center"]Actual Sales[/TD]

[TD="align: center"]2[/TD]

[TD="bgcolor: #FFFDBF"]** 50073 Sales Person A[/TD]
[TD="bgcolor: #FFFDBF, align: right"]2,144,064 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]3,582,392 PC[/TD]

[TD="align: center"]3[/TD]

[TD="bgcolor: #FFFDBF"]* 54865 Piatti Freschi Italia[/TD]
[TD="bgcolor: #FFFDBF, align: right"]297,456 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]594,912 PC[/TD]

[TD="align: center"]4[/TD]

[TD="align: right"]54865[/TD]

[TD="align: right"]118262[/TD]

[TD="bgcolor: #A2C3EA"] 118262 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]240,000 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]480,000 PC[/TD]

[TD="align: center"]5[/TD]

[TD="align: right"]54865[/TD]

[TD="align: right"]101860[/TD]

[TD="bgcolor: #A2C3EA"] 101860 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]

[TD="align: center"]6[/TD]

[TD="align: right"]54865[/TD]

[TD="align: right"]101861[/TD]

[TD="bgcolor: #A2C3EA"] 101861 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]57,456 PC[/TD]

[TD="align: center"]7[/TD]

[TD="bgcolor: #FFFDBF"]* 54936 Nuova Jolli s.r.l.[/TD]
[TD="bgcolor: #FFFDBF, align: right"][/TD]
[TD="bgcolor: #FFFDBF, align: right"]285,120 PC[/TD]

[TD="align: center"]8[/TD]

[TD="align: right"]54936[/TD]

[TD="align: right"]105313[/TD]

[TD="bgcolor: #A2C3EA"] 105313 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]152,064 PC[/TD]

[TD="align: center"]9[/TD]

[TD="align: right"]54936[/TD]

[TD="align: right"]181301[/TD]

[TD="bgcolor: #A2C3EA"] 181301 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]133,056 PC[/TD]

[TD="align: center"]10[/TD]

[TD="bgcolor: #FFFDBF"]* 55373 S.A. Mamma Lucia N.V.[/TD]
[TD="bgcolor: #FFFDBF, align: right"]484,704 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]484,704 PC[/TD]

[TD="align: center"]11[/TD]

[TD="align: right"]55373[/TD]

[TD="align: right"]160303[/TD]

[TD="bgcolor: #A2C3EA"] 160303 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"]484,704 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]484,704 PC[/TD]

[TD="align: center"]12[/TD]

[TD="bgcolor: #FFFDBF"]** 50091 Sales Person "***B***"[/TD]
[TD="bgcolor: #FFFDBF, align: right"]2,891,808 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]6,480,176 PC[/TD]

[TD="align: center"]13[/TD]

[TD="bgcolor: #FFFDBF"]* 200003 Compagnie des Fromage[/TD]
[TD="bgcolor: #FFFDBF, align: right"][/TD]
[TD="bgcolor: #FFFDBF, align: right"]754,080 PC[/TD]

[TD="align: center"]14[/TD]

[TD="align: right"]200003[/TD]

[TD="align: right"]156162[/TD]

[TD="bgcolor: #A2C3EA"] 156162 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]69,120 PC[/TD]

[TD="align: center"]15[/TD]

[TD="align: right"]200003[/TD]

[TD="align: right"]156163[/TD]

[TD="bgcolor: #A2C3EA"] 156163 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]112,320 PC[/TD]

[TD="align: center"]16[/TD]

[TD="align: right"]200003[/TD]

[TD="align: right"]156460[/TD]

[TD="bgcolor: #A2C3EA"] 156460 Vitessa CPET tray[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]572,640 PC[/TD]

[TD="align: center"]17[/TD]

[TD="bgcolor: #FFFDBF"]* 200049 Gendreau[/TD]
[TD="bgcolor: #FFFDBF, align: right"]156,816 PC[/TD]
[TD="bgcolor: #FFFDBF, align: right"]308,880 PC[/TD]

[TD="align: center"]18[/TD]

[TD="align: right"]200049[/TD]

[TD="align: right"]852914[/TD]

[TD="bgcolor: #A2C3EA"] 852914 Convina-Plus Barr[/TD]
[TD="bgcolor: #FFFFFF, align: right"]156,816 PC[/TD]
[TD="bgcolor: #FFFFFF, align: right"]275,616 PC[/TD]

[TD="align: center"]19[/TD]

[TD="align: right"]200049[/TD]

[TD="align: right"]852917[/TD]

[TD="bgcolor: #A2C3EA"] 852917 Convina-Plus Barr[/TD]
[TD="bgcolor: #FFFFFF, align: right"][/TD]
[TD="bgcolor: #FFFFFF, align: right"]33,264 PC[/TD]

</tbody>

I have created formulas to extract the various text parts but it’s the repeating the first three columns I am struggling with and the formula in the first column works up until there is a change in Sales Person. I gladly accept and corrections or formula efficiencies.


Excel 2010
ABCDEFGH
1Sales PersonAccount No.CustomerProduct CoderProduct DescriptionVerbe/Kunde/MaterialPlanned SalesActual Sales
2Sales Person A ** 50073 Sales Person A2,144,064 PC3,582,392 PC
3Sales Person A54865Piatti Freschi Italia* 54865 Piatti Freschi Italia297,456 PC594,912 PC
4Sales Person A118262Vitessa CPET tray118262 Vitessa CPET tray240,000 PC480,000 PC
Sheet1 (3)
Cell Formulas
RangeFormula
C3=IF(LEFT($F3,1)="","-",MID($F3,19,21))
A2=IF(LEFT($F2,3)="** ",MID($F2,16,40),LOOKUP(2,A2,$A$2:A2))
A3=IF(LEFT($F3,3)="** ",MID($F3,16,40),LOOKUP(2,A3,$A$2:A3))
A4=IF(LEFT($F4,3)="** ",MID($F4,16,40),LOOKUP(2,A4,$A$2:A4))
B2=IF(LEFT(F2,2)="**","",MID(F2,16,40))
B3=IF(LEFT($F3,1)="","-",MID($F3,11,6))
D4=IF(LEFT($F4,1)="","-",MID($F4,14,7))
E4=IF(LEFT($F4,1)="","-",MID($F4,23,17))


Thanks you for your time and help in advance, James.

Excel 2010, Win 7
 

Excel Facts

Quick Sum
Select a range of cells. The total appears in bottom right of Excel screen. Right-click total to add Max, Min, Count, Average.
Give this formulae a try. The formulae in A2, B2 and C2 should be entered with Ctrl - Shift - Enter.

Convert the asterisks in column F to ß before use.
Excel Workbook
ABCDEF
1Sales PersonAccount No.CustomerProduct CoderProduct DescriptionVerbe/Kunde/Material
2***** 50073 Sales Person A
3***** 54865 Piatti Freschi Italia
4Sales Person A54865Piatti Freschi Italia118262Vitessa CPET tray118262 Vitessa CPET tray
5Sales Person A54865Piatti Freschi Italia101860Vitessa CPET tray101860 Vitessa CPET tray
6Sales Person A54865Piatti Freschi Italia101861Vitessa CPET tray101861 Vitessa CPET tray
7***** 54936 Nuova Jolli s.r.l.
8Sales Person A54936Nuova Jolli s.r.l.105313Vitessa CPET tray105313 Vitessa CPET tray
9Sales Person A54936Nuova Jolli s.r.l.181301Vitessa CPET tray181301 Vitessa CPET tray
10***** 55373 S.A. Mamma Lucia N.V.
11Sales Person A55373S.A. Mamma Lucia N.V.160303Vitessa CPET tray160303 Vitessa CPET tray
12***** 50091 Sales Person "B"
13***** 200003 Compagnie des Fromage
14Sales Person "B"200003Compagnie des Fromage156162Vitessa CPET tray156162 Vitessa CPET tray
15Sales Person "B"200003Compagnie des Fromage156163Vitessa CPET tray156163 Vitessa CPET tray
16Sales Person "B"200003Compagnie des Fromage156460Vitessa CPET tray156460 Vitessa CPET tray
17Sales Person "B"200003Compagnie des Fromage200049Gendreau200049 Gendreau
18Sales Person "B"200003Compagnie des Fromage852914Convina-Plus Barr852914 Convina-Plus Barr
Sheet
 
Upvote 0
You can use these formulae with your original sheet (no need for converting asterisks in column F):


Spreadsheet Formulas
CellFormula
A2=IF(COUNTIF(F2,"~*"&"*")>0,"",MID(INDEX(F$2:F$18,LOOKUP(100,0/ISNUMBER(FIND("**",F$2:F2)),ROW($F$2:F2))-1),10,100))
B2{=IF(COUNTIF(F2,"~*"&"*")>0,"",TRIM(LEFT(SUBSTITUTE(SUBSTITUTE(INDEX(F$2:F$18,LOOKUP(100,0/ISNUMBER(FIND("* ",F$2:F2)),ROW($F$2:F2))-1),"* ","")," ",REPT(" ",20)),20)))}
C2{=IF(COUNTIF(F2,"~*"&"*")>0,"",TRIM(MID(SUBSTITUTE(SUBSTITUTE(INDEX(F$2:F$18,LOOKUP(100,0/ISNUMBER(FIND("* ",F$2:F2)),ROW($F$2:F2))-1),"* ","")," ",REPT(" ",100)),100,1000)))}
D2=IF(C2<>"",0+LEFT(F2,FIND(" ",F2)-1),"")
E2=IF(C2<>"",MID(F2,FIND(" ",F2)+1,100),"")

<tbody>
</tbody>
Formula Array:
Produce enclosing
{ } by entering
formula with CTRL+SHIFT+ENTER!

<tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4
 
Last edited:
Upvote 0
Hi István,

Thank you so much for your help. Your extra post does save time in not having to convert the “*” before I apply the formulas. I altered some of the formulas to get it to work fully in the relevant columns; the changes are more to do with where the text lies in column F4 so that I get a consistent output.

On my test data (4,000 rows), it works apart from one oddity. This is more to do with the report generation than the formula. I did believe the report generation outputted consistent spaced text; however, it does not, but only but one text space!

The Product Description column has an extra space at the front of the text e.g. “ Vitessa CPET tray” whereas further down the table it has no extra space and this alternates back and forth. The starting text point of Vitessa (or other product description) is at 22 and 21 respectively in column F4. I tried adding the TRIM function within the formula but this did not work form me. Is there a way to allow for this variance in the result in column E?

Here is what I have so far:


Excel 2010
ABCDEF
1Sales PersonAccount No.CustomerProduct CodeProduct DescriptionVerbe/Kunde/Material
2     ** 50073 Sales Person A
3* 54865 Piatti Freschi Italia
4Sales Person A54865Piatti Freschi Italia118262Vitessa CPET tray118262 Vitessa CPET tray
5Sales Person A54865Piatti Freschi Italia101860Vitessa CPET tray101860 Vitessa CPET tray
6Sales Person A54865Piatti Freschi Italia101861Vitessa CPET tray101861 Vitessa CPET tray
Test 2
Cell Formulas
RangeFormula
A2=IF(COUNTIF(F2,"~*"&"*")>0,"",MID(INDEX(F$2:F$19995,LOOKUP(100,0/ISNUMBER(FIND("**",F$2:F2)),ROW($F$2:F2))-1),16,100))
D2=IF(C2<>"",0+LEFT(F2,FIND(" ",F2)+20),"")
E2=IF(C2<>"",MID(F2,FIND(" ",F2)+21,18),"")
B2{=IF(COUNTIF(F2,"~*"&"*")>0,"",TRIM(MID(SUBSTITUTE(SUBSTITUTE(INDEX(F$2:F$19995,LOOKUP(100,0/ISNUMBER(FIND("* ",F$2:F2)),ROW($F$2:F2))-1),"* ","")," ",REPT(" ",5)),1,46)))}
C2{=IF(COUNTIF(F2,"~*"&"*")>0,"",TRIM(MID(SUBSTITUTE(SUBSTITUTE(INDEX(F$2:F$19995,LOOKUP(100,0/ISNUMBER(FIND("* ",F$2:F2)),ROW($F$2:F2))-1),"* ","")," ",REPT(" ",4)),40,40)))}
Press CTRL+SHIFT+ENTER to enter array formulas.


Thanks again, James
 
Upvote 0
If I understand well in column E in some rows there is a leading space, for example "_Vitessa_CPET_tray”.

Enter this into E2:

=IF(C2<>"",TRIM(MID(F2,FIND(" ",F2)+1,100)),"")

If this does not work correctly, please show me some cells from column F where the formula works/does not work; replace spaces with underscores for me to see how many extra spaces are there and where.
 
Upvote 0

Forum statistics

Threads
1,223,711
Messages
6,174,028
Members
452,542
Latest member
Bricklin

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top