Extract text from string (complex rules)

hanmas

New Member
Joined
Feb 19, 2012
Messages
24
Hi, can anyone please help me with this problem? It has been driving me crazy for days!

I have a column with data in the following (ideal) format:

texttextexttext/s2
texttextexttext/s2/c14
texttextexttext/s24/c2

and need a VBA solution to automatically split it across 3 columns, i.e

column1 column1 column1
texttextexttext s2
texttextexttext s2 c14
texttextexttext s24 c2

/s will always be present and will always be followed by one or more digits (no limit).
/c may or may not be present but if it is present, again it will always be followed by one or more digits (no limit)

I have been using TextToColumns with "/" as the delimiter and it works fine in most cases. However, the data frequently contains additional "/" characters which breaks everything. some examples of valid but problematic data are:

texttext3/12/2012text/s2/c2
text/textexttext/s2/c2

and even:
text/stext/ctext/s2/c2 (i.e. multiple occurrences of /s or /c)!!!

Please help!
Thanks,
Graham
 
Hello again,
I'm running this against 30,000 rows and it is taking a *very* long time to run. Could anyone please help suggest a faster method
If you still want to go with a VBA solution, I think this macro, being non-looping, should execute somewhat quickly...

Code:
Sub Test()
  Const StartRow As Long = 1
  Application.ScreenUpdating = False
  Columns("A").Copy Cells(1, "B")
  If StartRow > 1 Then Range("B1:B" & StartRow - 1).Clear
  Cells(StartRow, "C").Resize(Cells(Rows.Count, "A").End(xlUp).Row - StartRow + 1).FormulaR1C1 = "=LEFT(RC1,LEN(RC1)-LEN(TRIM(RIGHT(SUBSTITUTE(LOWER(RC1),""/s"",REPT("" "",999)),999)))-2)"
  Columns("A").Value = Columns("C").Value
  Columns("C").Clear
  Columns("A").Replace "/s", Chr(255), xlPart
  Columns("B").Replace "*/s", Chr(255) & "s", xlPart
  Columns("B").Replace "*" & Chr(255), "", xlPart
  Columns("B").TextToColumns Range("B1"), xlDelimited, , , False, False, False, False, True, "/"
  Application.ScreenUpdating = True
End Sub
If my guess at Row 1 for the start of you data was wrong, then change the assignment to the StartRow constant in the Const statement to your actual data's start row.
 
Upvote 0

Excel Facts

When they said...
When they said you are going to "Excel at life", they meant you "will be doing Excel your whole life".
Thanks very much Rick :-) This seems to work very well...I'll try it out on real data later in the week and let you know how I get on.
Graham
 
Upvote 0
Hello again,
I'm running this against 30,000 rows and it is taking a *very* long time to run. Could anyone please help suggest a faster method

Hi Graham

I just tested Mick's code with with 30,000 lines and it took just over 3 seconds. How much does it take on your pc?

This is another solution that according to my tests is faster than both Mick's and Rick's solutions:

Code:
Sub SplitS()
Dim r As Range, rC As Range
 
Application.ScreenUpdating = False
Set r = Range("A1", Range("A" & Rows.Count).End(xlUp))
With CreateObject("VBScript.RegExp")
    .Pattern = "^(.*)/(s\d+)(?:/(c\d+))?$"
    .IgnoreCase = True
    For Each rC In r
        If .Test(rC) Then rC.Resize(1, 3) = Split(.Replace(rC, "$1|$2|$3"), "|")
    Next rC
End With
Application.ScreenUpdating = True
End Sub
 
Upvote 0
Wow, thanks PCG...I'll also test your solution later in the week. I am learning so much by seeing all of these different approaches...thank you so much!
Graham
 
Upvote 0

Forum statistics

Threads
1,223,268
Messages
6,171,099
Members
452,379
Latest member
IainTru

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top