Separate a text string when undercase meets uppercase / find where the case is changing

egotajcs · Jul 15, 2017

Hi guys,

I have a rather strange and unusual problem. I am building an "automatic parser" Excel file that takes a big database as input (where everything is in a single cell, as one very long string) and parses the data into separate columns. I could do a lot of it on my own, but I got stuck at one point.

I need to separate the horses' names and the trainers' names, but there isn't any space between them. So, the strings look like this:

A
ThrockleyJohn Davies
Royal RegentLucy Normile
Spes NostraIain Jardine
KomodoJedd O'Keeffe
Lucent DreamJohn C McConnell

As you can see, there is only one way to separate this into two columns:

find the first position where a lowercase character is followed by an uppercase character.

If I could get that position, I could separate the horses' names and the trainers' names into two columns using the LEN, LEFT, RIGHT and MID functions.

But I don't know how to determine where the first lowercase character that is followed by an uppercase character is.

Can anybody help with this?

Thank you
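For readers following along, the position described above could be found with a small user-defined function along these lines. This is only a sketch: the name CaseChangePos is hypothetical, and it assumes the module uses VBA's default Option Compare Binary so that the Like operator is case-sensitive.

```vba
' Hypothetical helper (the name is illustrative, not from the thread).
' Returns the position of the first lowercase letter that is immediately
' followed by an uppercase letter, or 0 if the string has no such position.
' Assumes the default Option Compare Binary, so Like is case-sensitive.
Function CaseChangePos(S As String) As Long
    Dim i As Long
    For i = 1 To Len(S) - 1
        ' Test two adjacent characters at once: lowercase, then uppercase
        If Mid(S, i, 2) Like "[a-z][A-Z]" Then
            CaseChangePos = i
            Exit Function
        End If
    Next i
    ' Falls through with the default return value 0 when nothing matches
End Function
```

With "ThrockleyJohn Davies" in A1, CaseChangePos(A1) returns 9, so =LEFT(A1,CaseChangePos(A1)) gives the horse and =MID(A1,CaseChangePos(A1)+1,LEN(A1)) gives the trainer. Depending on your regional settings, the argument separator may be a semicolon instead of a comma.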

Haluk · Jul 15, 2017

Assuming your first data item is in cell A1, enter the formula below in cell B1 and press Ctrl+Shift+Enter to enter it as an array formula.

Then, select the cell B1 and drag down to copy the formula to other cells in column B.

See if the results in column B are OK for you ...

Code:

=LEFT(A1;SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))); A1 & "ABCDEFGHIJKLMNOPQRSTUVWXYZ");2)-1)

When you enter the formula as an array formula, Excel shows it wrapped in braces, something like this: {=Formula above}
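To make it easier to see what the array formula computes, here is a rough VBA equivalent of its SMALL(FIND(...);k) part. It is a sketch, not part of the original post: the name NthCapitalPos and the parameter K are hypothetical. For each letter A to Z it takes the first position of that capital in the text (with the alphabet appended so every letter is found), then returns the K-th smallest of those positions.

```vba
' Hypothetical illustration (names are not from the thread).
' Mimics SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))); S & "ABC...Z"); K):
' the K-th smallest "first occurrence" position among the capitals A to Z.
Function NthCapitalPos(S As String, K As Long) As Long
    Dim firstPos(1 To 26) As Long
    Dim padded As String
    Dim i As Long, j As Long, tmp As Long

    ' Appending the alphabet guarantees every capital is found somewhere
    padded = S & "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    For i = 1 To 26
        ' vbBinaryCompare makes the search case-sensitive, like FIND
        firstPos(i) = InStr(1, padded, Chr(64 + i), vbBinaryCompare)
    Next i

    ' Insertion sort, ascending (VBA's And does not short-circuit,
    ' so the bounds check and the comparison are kept separate)
    For i = 2 To 26
        tmp = firstPos(i)
        j = i - 1
        Do While j >= 1
            If firstPos(j) <= tmp Then Exit Do
            firstPos(j + 1) = firstPos(j)
            j = j - 1
        Loop
        firstPos(j + 1) = tmp
    Next i

    If K >= 1 And K <= 26 Then NthCapitalPos = firstPos(K)
End Function
```

NthCapitalPos("ThrockleyJohn Davies", 2) returns 10, so LEFT(...;10-1) gives "Throckley". NthCapitalPos("Spes NostraIain Jardine", 2) returns 6, the N of "Nostra", so the same formula would give "Spes " for that row. A fixed k therefore seems to fit only names with a matching number of distinct capitals, which is worth checking against your own data.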

JoeMo · Jul 15, 2017

TonyUK72 said:
I'm sure the OP will be very grateful, Joe, but I suspect there might be one or two horses called McSomething or MacSomething. The function may have to be changed to "assume" that the horse's name will be at least 4 or 5 characters in length.

Undoubtedly, that will be the case, but it appears the OP has no interest in a VBA solution so I will do nothing further.
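For anyone who does want that change, TonyUK72's suggestion might be sketched like this. The function name HorseNameMin and the optional MinLen parameter are hypothetical, and the sketch assumes the default Option Compare Binary so that Like is case-sensitive.

```vba
' Hypothetical variant (name and MinLen parameter are illustrative).
' Ignores any lowercase-to-uppercase change before character MinLen,
' so a horse called "McSomething" or "MacSomething" is not cut short.
' Assumes the default Option Compare Binary, so Like is case-sensitive.
Function HorseNameMin(S As String, Optional MinLen As Long = 4) As String
    Dim i As Long
    Dim startAt As Long

    startAt = MinLen
    If startAt < 1 Then startAt = 1

    For i = startAt To Len(S) - 1
        If Mid(S, i, 2) Like "[a-z][A-Z]" Then
            ' Keep everything up to and including the lowercase letter
            HorseNameMin = Left(S, i)
            Exit Function
        End If
    Next i
    ' Returns an empty string when no case change is found
End Function
```

With a made-up input such as "McTavishJohn Smith", =HorseNameMin(A1) returns "McTavish", and =HorseNameMin(A1,5) raises the minimum to 5 characters. The trade-off is that a horse name shorter than MinLen would no longer be split correctly.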

TonyUK72 · Jul 15, 2017

Bit of a typo there Haluk

Code:

=LEFT(A1,SMALL(FIND(CHAR(ROW(INDIRECT("65:90"))), A1 & "ABCDEFGHIJKLMNOPQRSTUVWXYZ"),4)-1)

Also changed your 2 to a 4 to cover the awkward horses. Nice job!!!!

egotajcs · Jul 17, 2017

OMG, @JoeMo, you are the absolute best. I never dreamt of a solution like this one. It is fantastic.

Thank you!

egotajcs · Jul 17, 2017

I've added the VBA function, and it works like a real formula. For me it is a sneak peek into an unknown world: VBA. It's a shame I don't understand your code, but it works like a charm.

I'm so happy, thanks. Now I can continue with the parser Excel file.

egotajcs · Jul 17, 2017

Hi Haluk, thank you for your solution as well!

I hope this second solution will help somebody else too. This is why I tried to give this thread a Google-friendly title, so that if somebody searches in the future for terms like "undercase" and "uppercase", they can find it.

Thank you guys for your work!

Rick Rothstein · Jul 17, 2017

JoeMo said:
Here's a UDF (user-defined function) you can use like a native excel function. First install the function in your workbook using the instructions below.
To install the UDF:
1. With your workbook active press Alt and F11 keys. This will open the VBE window.
2. In the project tree on the left of the VBE window, find your project and click on it.
3. On the VBE menu: Insert>Module
4. Copy the UDF from your browser window and paste it into the white space in the VBE window.
5. Close the VBE window and Save the workbook. If you are using Excel 2007 or a later version do a SaveAs and save it as a macro-enabled workbook (.xlsm file extension).

Use the UDF like this: =HorseName(A1)

Rich (BB code):

Function HorseName(S As String) As String
    'horse name is everything to the left of the first lower case letter
    'followed immediately by an uppercase letter
    Dim i As Long
    For i = 1 To Len(S)
        If Mid(S, i, 1) Like "[a-z]" And Mid(S, i + 1, 1) Like "[A-Z]" Then
            HorseName = Left(S, i)
            Exit Function
        End If
    Next i
End Function

You can write the If line of that code a little more simply, like this:

If Mid(S, i, 2) Like "[a-z][A-Z]" Then
