Separate a text string when undercase meets uppercase / find where the case is changing

egotajcs

New Member
Joined
May 6, 2017
Messages
27
Hi guys,

I have a quite strange and unique problem. I make an "automatic parser excel file", which uses a big database (where everything is in a single cell, a long-long string) for input, and parsing the datas into separate columns. I could do a lot of things on my own, but I stucked at a point.

I need to separate the horses' names and the trainer's names, but there isn't any space between them. So, the strings look like this:


[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD="align: center"]A[/TD]
[/TR]
[TR]
[TD]ThrockleyJohn Davies[/TD]
[/TR]
[TR]
[TD]Royal RegentLucy Normile[/TD]
[/TR]
[TR]
[TD]Spes NostraIain Jardine[/TD]
[/TR]
[TR]
[TD]KomodoJedd O'Keeffe[/TD]
[/TR]
[TR]
[TD]Lucent DreamJohn C McConnell[/TD]
[/TR]
</tbody>[/TABLE]

As you can see, there could be one and only solution to separate this into two columns:
find the first position where an undercased character followed by an uppercased character

If I could get that kind of position, I could separate the horses' names and the trainer's names into two separate columns using LEN, LEFT, RIGHT and MID formulas.

But I don't know how to determine 'where is the first undercase character which is followed by an uppercased character'.

Can anybody help with this?

Thank you
 
Assuming your first data is in cell A1 , enter the below formula in cell B1 and hit Shift+Ctrl+Enter keys together, to enter the formula as an array formula.

Then, select the cell B1 and drag down to copy the formula to other cells in column B.

See if the results in column B are OK for you ...

Code:
=LEFT(A1;SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))); A1 & "ABCDEFGHIJKLMNOPQRSTUVWXYZ");2)-1)

When you enter the formula as an array formula, you will see something like this {=Formula above}
 
Upvote 0

Excel Facts

Format cells as date
Select range and press Ctrl+Shift+3 to format cells as date. (Shift 3 is the # sign which sort of looks like a small calendar).
I'm sure the OP will be very grateful Joe, but I suspect there might be one or two horses called McSomething or MacSomething. The function may have to be changed to "assume" that the horses name will be at least 4 or 5 characters in length.
Undoubtedly, that will be the case, but it appears the OP has no interest in a VBA solution so I will do nothing further. :outtahere:
 
Upvote 0
Bit of a typo there Haluk

Code:
[COLOR=#333333]=LEFT(A1,SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))), A1 & "ABCDEFGHIJKLMNOPQRSTUVWXYZ"),4)-1)[/COLOR]

Also changed your 2 to a 4 to cover the awkward horses. Nice job!!!!
 
Upvote 0
OMG, @JoeMoe, you are the absoulete best. I've never dreamt a solution like this one. It is fantastic.

Thank you! :)
 
Upvote 0
I've made the VBA function, and it is working like a real formula. I think, it is for me a sneak peek into an unknown world. The VBA. Shame, I don't understand your code.
But it's working like a charm.

I'm so happy, thanks, I can continue the parser excel file.
 
Upvote 0
Hi Haluk, thank you for your solution as well!

I hope, this second solution will help to anybody else too. This is why I tried to write this thread's name a little bit google friendly name, so if in the future, somebody search for terms like under and uppercase, etc., they can find this thread.

Thank you guys for your work!

:)
 
Upvote 0
Here's a UDF (user-defined function) you can use like a native excel function. First install the function in your workbook using the instructions below.
To install the UDF:
1. With your workbook active press Alt and F11 keys. This will open the VBE window.
2. In the project tree on the left of the VBE window, find your project and click on it.
3. On the VBE menu: Insert>Module
4. Copy the UDF from your browser window and paste it into the white space in the VBE window.
5. Close the VBE window and Save the workbook. If you are using Excel 2007 or a later version do a SaveAs and save it as a macro-enabled workbook (.xlsm file extension).

Use the UDF like this: =HorseName(A1)
Rich (BB code):
Function HorseName(S As String) As String
'horse name is everything to the left of the first lower case letter
'followed immediately by an uppercase letter
Dim i As Long
For i = 1 To Len(S)
    If Mid(S, i, 1) Like "[a-z]" And Mid(S, i + 1, 1) Like "[A-Z]" Then
        HorseName = Left(S, i)
        Exit Function
    End If
Next i
End Function
You can write the highlighted line of code a little bit simpler like this....

If Mid(S, i, 2) Like "[a-z][A-Z]" Then
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,184
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top