Splitting string based on case

omnivl

Board Regular
Joined
Aug 25, 2014
Messages
53
Hi
Im a bit stuck on this one, i need to split a string based on a lowercase hitting an uppercase for example:

Data
Sam JonesPaul SmithEvan Angus WalshJon Smith

Output
Sam Jones;Paul Smith;Evan Angus Walsh;Jon Smith

I have the data in column D and there will be 50000 + rows
 

Excel Facts

What is the fastest way to copy a formula?
If A2:A50000 contain data. Enter a formula in B2. Select B2. Double-click the Fill Handle and Excel will shoot the formula down to B50000.
VBA for you to test - test on a copy of some of your data

What VBA does
1 Values in column D are written to array
2 (For each array value )semi-colon inserted before upper case letter EXCEPT if previous character is a space
3 Array is written back to column D
(Amend D to another column and the values will be written there instead)

Put this in the sheet module (right-click sheet tab \ View Code \ paste in window on right )
Code:
Sub Test2()
Application.ScreenUpdating = False: Application.Calculation = xlCalculationManual
    Dim cel As Range, myStr As String, arr As Variant, a As Long, i As Integer
    arr = Range("D2", Range("D" & Rows.Count).End(xlUp)).Value          'write values to array
        For a = 1 To UBound(arr)
            myStr = arr(a, 1)
                For i = Len(arr(a, 1)) To 2 Step -1
                    If Mid(myStr, i, 1) Like "*[A-Z]*" And Mid(myStr, i - 1, 1) <> Chr(32) Then
                        myStr = Left(myStr, i - 1) & ";" & Right(myStr, Len(myStr) - i + 1)
                    End If
                Next i
            arr(a, 1) = myStr                                           'replace single value in array
            myStr = ""
        Next a
    Range("[COLOR=#ff0000]D[/COLOR]2").Resize(UBound(arr)).Value = Application.Transpose(arr)  'write array to range
    Erase arr
Application.ScreenUpdating = True: Application.Calculation = xlCalculationAutomatic
End Sub
 
Upvote 0
Thanks @Fennek - a simpler test is good :)
@omnivl
- I suggest you try the code unmodified first
- then introduce Fennek's suggestion
- the VBA be looking for a 2 character match (currently 1)
- the string is checked starting with last character
- so you may need to alter the Left&Right elements on the next line
 
Upvote 0
- I suggest you try the code unmodified first
I'm wondering if you did? ;)
When writing the result array back to the sheet, aren't you transposing a vertical array into a horizontal one but still trying to write it into a vertical range - thereby producing a column all of the same (first) value?

@omnivl
I'm also envisaging some problems for you if you have names like Kevin McManus. Your 'rule' for inserting the semicolon (& the previously suggested code) will produce Kevin Mc;Manus

My suggested code below is different in at least two ways:
- It does not require looping through the characters in the original strings, and
- It makes the assumption that a 2-character last name is less likely than the 'McManus' type issue so I have only inserted a semicolon where 2 or more lower case letters are followed by an upper case.

I have also written the results into column E. Simple change near the end of the code to over-write the column D data if you are comfortable that the code is doing what you want.

Suggest testing on a copy of your workbook anyway.

Rich (BB code):
Sub SplitCase()
  Dim a As Variant
  Dim i As Long
  
  a = Range("D2", Range("D" & Rows.Count).End(xlUp)).Value
  With CreateObject("VBScript.RegExp")
    .Global = True
    .Pattern = "([a-z]{2,})([A-Z])"
    For i = 1 To UBound(a)
      a(i, 1) = .Replace(a(i, 1), "$1;$2")
    Next i
  End With
  Range("E2").Resize(UBound(a)).Value = a
End Sub
 
Last edited:
Upvote 0
@Peter_SSs Is quite correct
The line should in fact read
Code:
Range("D2").Resize(UBound(arr)).Value = arr 'write array to range

His code will run much faster - so best not waste your time (or your computer's time) testing my code :)
 
Last edited:
Upvote 0
@Peter_SSs
A nice solution, though names like "James MacGregor" or "Nicholas VanderKamp" will still cause problems.
 
Upvote 0
.. though names like "James MacGregor" or "Nicholas VanderKamp" will still cause problems.
Sure will. Quite often impossible to get perfect text solutions in Excel. :)
 
Upvote 0
Quite often impossible to get perfect text solutions in Excel. :)
Whilst I don't resile from the above, I think that most of the 'complications' raised so far relate to last names. The corollary of that is that first names tend to be simpler in nature so I think the following version should have less 'mishaps'. The only changed line from my previous code is the .Pattern line

Rich (BB code):
Sub SplitCase_v2()
  Dim a As Variant
  Dim i As Long
  
  a = Range("D2", Range("D" & Rows.Count).End(xlUp)).Value
  With CreateObject("VBScript.RegExp")
    .Global = True
    .Pattern = "([a-z])([A-Z][a-z]+ )"
    For i = 1 To UBound(a)
      a(i, 1) = .Replace(a(i, 1), "$1;$2")
    Next i
  End With
  Range("E2").Resize(UBound(a)).Value = a
End Sub

Sample data in column D, results of code in column E

Excel Workbook
DE
1
2Sam JonesPaul SmithEvan Angus WalshJon SmithSam Jones;Paul Smith;Evan Angus Walsh;Jon Smith
3Tom JonesTom Jones
4
5Ken HongPeter HallKen Hong;Peter Hall
6Sam SmithKevin McManusJim O'ConnorSam Smith;Kevin McManus;Jim O'Connor
7Tim SmithJames MacGregorNicholas VanderKampKim NgSonia HallTim Smith;James MacGregor;Nicholas VanderKamp;Kim Ng;Sonia Hall
Split case (3)



Hopefully we will get some feedback from the OP soon. :)
 
Upvote 0

Forum statistics

Threads
1,223,911
Messages
6,175,323
Members
452,635
Latest member
laura12345

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top