Manipulate a String - Vba solution

stuartgb100

Active Member
Joined
May 10, 2015
Messages
322
Office Version
  1. 2021
Platform
  1. Windows
Hi,

Trying to help a friend with this problem:

They have masses of essentially text data which is basically a specialist English to French dictionary.
A cell contains both the English word(s) and the French equivalent, and he wishes to split this into two cells.

Looking through the data it seems that the English expression is always in bold text, but not always just alphabetic characters.

So I'm looking for a way to split everything bold in the cell from everything not bold.

A couple of examples:

anti-backflow throat (fireplace)^ gorge anti-refoulante f
beam,^ prestressed concrete poutre précontrainte f; poutrelle céramique f; poutrelle béton f

where '^' represents the end of the bold section

I've also got to deal with French accents

Any ideas please ?

Thanks.
 

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
is each entry on a new line (english french)
if the ^ exists then text to column
 
Upvote 0
Looking through the data it seems that the English expression is always in bold text, ..

A couple of examples:

anti-backflow throat (fireplace)^ gorge anti-refoulante f
beam,^ prestressed concrete poutre précontrainte f; poutrelle céramique f; poutrelle béton f

where '^' represents the end of the bold section
Did you make a mistake with marking the end of the bold text in the second example as the red text looks very English to me. :)
 
Upvote 0
Assuming the translation data is in column A, starting on row 2, and the results can go in columns B:C then try this on a small set of sample data.
Code:
Sub SplitBold()
  Dim a As Variant
  Dim i As Long, j As Long, lr As Long, chrs As Long
  Dim bUnbold As Boolean
  
  lr = Range("A" & Rows.Count).End(xlUp).Row
  ReDim a(2 To lr, 1 To 2)
  For i = 2 To lr
    bUnbold = False
    With Cells(i, 1)
    chrs = Len(.Value)
      j = 0
      Do
        j = j + 1
        If .Characters(j, 1).Font.Bold = False Then bUnbold = True
      Loop Until bUnbold Or j = chrs
      If Not bUnbold Then j = j + 1
      a(i, 1) = Trim(Left(.Value, j - 1))
      a(i, 2) = Trim(Mid(.Value, j))
    End With
  Next i
  With Range("B2:C2").Resize(UBound(a) - 1)
    .Value = a
    .Columns.AutoFit
  End With
End Sub

My sample data & results:


Book1
ABC
2abcabc
3anti-backflow throat (fireplace)gorge anti-refoulantefanti-backflow throat (fireplace)gorge anti-refoulantef
4
5beam, prestressed concretepoutre prcontraintef; poutrelle cramiquef; poutrelle btonfbeam,prestressed concretepoutre prcontraintef; poutrelle cramiquef; poutrelle btonf
6fsfasfdfsfasfd
7abcabc
Eng Fr
 
Last edited:
Upvote 0
Okay, so having tried it on a larger sample of data, I've mixed results.
It may be that the problems will have to be sorted out with a variety of 'cleaning' routines being run individually on problem cells.
On the other hand, it might be that the code could be modified to run some 'cleaning' first, before the split ?

At random I have chosen 4 original examples that caused problems, together with the results after the split.
In every case after conversion, all text becomes unboldened.

[TABLE="width: 726"]
<colgroup><col></colgroup><tbody>[TR]
[TD]nail, extra large head clout clou à tête plate, ‘extra large’ m
[TABLE="width: 726"]
<colgroup><col></colgroup><tbody>[TR]
[TD]pin, dome-headed upholstery pointe décorative à tête bombée f[/TD]
[/TR]
[TR]
[TD]pine pin m; (from Landes region of France) pin des Landes m[/TD]
[/TR]
[TR]
[TD]plane, hand/jack rabot manuel m; ~ semelle metal m

the English split:[TABLE="width: 366"]
<colgroup><col></colgroup><tbody>[TR]
[TD]nail,[/TD]
[/TR]
[TR]
[TD]pin, dome-headed[/TD]
[/TR]
[TR]
[TD]pine[/TD]
[/TR]
[TR]
[TD]plane, hand/jack

the French split:
[TABLE="width: 366"]
<colgroup><col></colgroup><tbody>[TR]
[TD] extra large head clout clou à tête plate, ‘extra large’ m[/TD]
[/TR]
[TR]
[TD] upholstery pointe décorative à tête bombée f[/TD]
[/TR]
[TR]
[TD] pin m; (from Landes region of France) pin des Landes m[/TD]
[/TR]
[TR]
[TD]rabot manuel m; ~ semelle metal m

These are direct copy/pastes from the problem cell, so hopefully any errors caused by the import into Excel will be preserved in this post. If not, I could post a worksheet.

Any further help would be much appreciated.






[/TD]
[/TR]
</tbody>[/TABLE]
<strike></strike>
[/TD]
[/TR]
</tbody>[/TABLE]
<strike></strike>

[/TD]
[/TR]
</tbody>[/TABLE]
<strike></strike>
[/TD]
[/TR]
</tbody>[/TABLE]
<strike></strike>
 
Upvote 0
The reason the existing code is not doing what you want, relates to the 'messy' nature of your data.
The existing code looks from the start of the string until it finds a non-bold character (or the end of the string if no bold found) and splits at that point.

In that first sample line in your last post, the blank character between the comma and the word 'extra' is not bold, hence my code split it at that point.
There are similar situations with those other sample texts as well. In fact, you can clearly see in the third example that there is bold text then non-bold then bold then non-bold.

How would it be if we went the other way and started looking from the end of the string back towards the beginning & split as soon as we come to a bold character?

Code:
Sub SplitBold_v2()
  Dim a As Variant
  Dim i As Long, j As Long, lr As Long, chrs As Long
  Dim bBold As Boolean
  
  lr = Range("A" & Rows.Count).End(xlUp).Row
  ReDim a(2 To lr, 1 To 2)
  For i = 2 To lr
    bBold = False
    With Cells(i, 1)
    chrs = Len(.Value)
      If chrs > 0 Then
      j = chrs + 1
        Do
          j = j - 1
          If .Characters(j, 1).Font.Bold = True Then bBold = True
        Loop Until bBold Or j < 2
        If bBold Then j = j + 1
      
        a(i, 1) = Trim(Left(.Value, j - 1))
        a(i, 2) = Trim(Mid(.Value, j))
      End If
    End With
  Next i
  With Range("B2:C2").Resize(UBound(a) - 1)
    .Value = a
    .Columns.AutoFit
  End With
End Sub


Book1
ABC
2abcabc
3anti-backflow throat (fireplace)gorge anti-refoulantefanti-backflow throat (fireplace)gorge anti-refoulantef
4
5beam, prestressed concretepoutre prcontraintef; poutrelle cramiquef; poutrelle btonfbeam,prestressed concretepoutre prcontraintef; poutrelle cramiquef; poutrelle btonf
6fsfasfdfsfasfd
7abcabc
8nail,extra large head cloutclou tte plate, extra largemnail,extra large head cloutclou tte plate, extra largem
9pin, dome-headedupholsterypointe dcorative tte bombefpin, dome-headedupholsterypointe dcorative tte bombef
10pinepinm;(from Landes region of France)pin des Landesmpinepinm;(from Landes region of France)pin des Landesm
11plane, hand/jackrabot manuelm; ~ semelle metalmplane, hand/jackrabot manuelm; ~ semelle metalm
Eng Fr (2)
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,824
Messages
6,181,186
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top