Split bold from normal text

bczwy

New Member
Joined
Nov 20, 2017
Messages
13
Hi everyone,

I hope anyone can help me with this issue. I am not very expert on coding with VBA but I'm trying to search for a code that splits BOLD text from non bold text.

All my texts are looking like this:
[FONT=&quot]1: Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

I got like 500 of these sort textes and I would like to have the bold part in a column before.

Help is appreciated a lot!

Thanks in advance.[/FONT]
 
But it seems like sometimes like maybe about 30-50% of my documents that I am using the script on the non bold text is also being split in multiple columns.
It would be hard to know why that is happening without seeing it happen for ourselves. Can you post a sample workbook of original data where some of the original data includes text that is being split improperly like you describe above?

Also, while some 9 times slower than ZVI's code (4.5 seconds versus 0.25 seconds for 500 rows of data) which makes use of a feature of the Value property that I was unaware or (until now), I thought you might like to see the code I came up with which appears to work correctly with the data you posted originally...
Code:
[table="width: 500"]
[tr]
	[td]Sub SplitAtBold()
  Dim R As Long, X As Long, LastRow As Long, Cell As Range, Result As Variant
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  ReDim Result(1 To LastRow, 1 To 2)
  Range("A1:A" & LastRow).Replace Chr(160), " ", xlPart, , , , False, False
  For R = 1 To LastRow  'Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    With Cells(R, "A")
      If IsNull(.Font.Bold) Then
        For X = InStr(.Value, ".") + 1 To Len(.Value)
          If .Characters(X, 1).Text Like "[! ]" And Not .Characters(X, 1).Font.Bold Then
            Result(R, 1) = Trim(Left(.Value, X - 1))
            Result(R, 2) = Mid(.Value, X)
            Exit For
          End If
        Next
      Else
        Result(R, .Font.Bold + 2) = .Value
      End If
    End With
  Next
  Range("A1").Resize(UBound(Result), 2) = Result
End Sub[/td]
[/tr]
[/table]
 
Upvote 0

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
As it has already been marked by Rick, the result can depend on data.
For example there can be single space char with Bold font in the data.

The logic of the code was based on your testing data from the provided link and the algorithm was as follows:

1. Data of each cell splits by the end of the Bold font in its characters.

2. If a Bold font is not found in a cell then data of that cell goes to the 2nd column. It is because of your testing data – content of the row 18 without Bold font relates to the previous row 17.

3. Code provides some fixing:

3.1 Left trimming of space char in the 2nd column:
a( i, 1 ) = Replace( a( i, 1 ) , s & " ", s, 1, 1, 0

3.2 In the row 8 the dot character after word Acetylated is not Bold but obviously it should not be moved with the following space symbol to the 2nd column:
a( i, 1 ) = Replace( a( i, 1 ) , s & ".", "." & s, 1, 1, 0

3.3 In row 25 in (AHA) the last character (the ending round bracket) mistakenly is not bold:
a(i, 1) = Replace(a(i, 1), s & ")", ")" & s, 1, 1, 0)

You may publish some new problematic data for it analyzing and for adjusting the algorithm.
 
Last edited:
Upvote 0
3.1 Left trimming of space char in the 2nd column:
a( i, 1 ) = Replace( a( i, 1 ) , s & " ", s, 1, 1, 0
@ZVI,

Just to alert you... I noted that some of the spaces in the OP's data were actually non-breaking spaces (ASCII 160) which your code may have to account for.
 
Upvote 0
Thank you Rick!
Of course, it was just simple case fixing.
I'd expect some other rules from OP, not from us only :)
Regards
 
Last edited:
Upvote 0
It would be hard to know why that is happening without seeing it happen for ourselves. Can you post a sample workbook of original data where some of the original data includes text that is being split improperly like you describe above?

Also, while some 9 times slower than ZVI's code (4.5 seconds versus 0.25 seconds for 500 rows of data) which makes use of a feature of the Value property that I was unaware or (until now), I thought you might like to see the code I came up with which appears to work correctly with the data you posted originally...
Code:
[TABLE="width: 500"]
<tbody>[TR]
[TD]Sub SplitAtBold()
  Dim R As Long, X As Long, LastRow As Long, Cell As Range, Result As Variant
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  ReDim Result(1 To LastRow, 1 To 2)
  Range("A1:A" & LastRow).Replace Chr(160), " ", xlPart, , , , False, False
  For R = 1 To LastRow  'Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    With Cells(R, "A")
      If IsNull(.Font.Bold) Then
        For X = InStr(.Value, ".") + 1 To Len(.Value)
          If .Characters(X, 1).Text Like "[! ]" And Not .Characters(X, 1).Font.Bold Then
            Result(R, 1) = Trim(Left(.Value, X - 1))
            Result(R, 2) = Mid(.Value, X)
            Exit For
          End If
        Next
      Else
        Result(R, .Font.Bold + 2) = .Value
      End If
    End With
  Next
  Range("A1").Resize(UBound(Result), 2) = Result
End Sub[/TD]
[/TR]
</tbody>[/TABLE]


Hi Rick,

Apologies for the delay. I was about to post a sample data but then I tried your code and it did the work perfectly. It also didn't cause the non bold text to split into multiple columns.

Thanks Rick and ZVI!
 
Upvote 0
Hi Rick and ZVI,

I am trying to use your code on my new excel file which now is the a full data base with over 10.000 records, and the lay out also looks like bold / non bold. Except for this time there is no number in the front but a "-".

But apparently when I try to use the bold to non bold split code a lot of my non bold text becomes bold and becomes part of the split in the first column. I can't seem to find the reason for this. See print screen: https://imgur.com/a/ebCIW

Perhaps any of you can help me out? I have added a small piece of the database to an excel and made it public, but it even happens when you try the code on the small piece.

Thank you in advance for the help! It is much appreciated.
URL Link: https://drive.google.com/file/d/1aagnDe8fPTLUs0egsPEa9NA2whbitG_P/view?usp=sharing
 
Upvote 0
Hi Rick and ZVI,

I am trying to use your code on my new excel file which now is the a full data base with over 10.000 records, and the lay out also looks like bold / non bold. Except for this time there is no number in the front but a "-".

But apparently when I try to use the bold to non bold split code a lot of my non bold text becomes bold and becomes part of the split in the first column. I can't seem to find the reason for this. See print screen: https://imgur.com/a/ebCIW
Unless you tell us to the contrary (which you didn't), we assume the data you show us is the data you have to work with and, hence, we design our code accordingly. In your original request, some of your beginning text (numbers) were not bold but some of the text after it was, so we needed to find a way to skip over the possible non-bold text. I used the dot that appeared after the number. Your new data does not have that structure, so the code is misidentifying the start point because it is finding the dot later in the text. Here is the code modified for your new data (but note that it will not work correctly on your originally posted data, only on this new data).
Code:
[table="width: 500"]
[tr]
	[td]Sub SplitAtBold()
  Dim R As Long, X As Long, LastRow As Long, Cell As Range, Result As Variant
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  ReDim Result(1 To LastRow, 1 To 2)
  Range("A1:A" & LastRow).Replace Chr(160), " ", xlPart, , , , False, False
  For R = 1 To LastRow
    With Cells(R, "A")
      If IsNull(.Font.Bold) Then
        For X = 1 To Len(.Value)
          If .Characters(X, 1).Text Like "[! ]" And Not .Characters(X, 1).Font.Bold Then
            Result(R, 1) = Trim(Left(.Value, X - 1))
            Result(R, 2) = Mid(.Value, X)
            Exit For
          End If
        Next
      Else
        Result(R, .Font.Bold + 2) = .Value
      End If
    End With
  Next
  Range("A1").Resize(UBound(Result), 2) = Result
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
Here is modified version of my code. Seems that all features of your data are taken into the account.
Rich (BB code):
Sub SplitByEndOfBold_01()
' ZVI:2017-11-26 https://www.mrexcel.com/forum/excel-questions/1032185-split-bold-normal-text.html
' Splitting cells by their bold font ending
  Dim a, b(), i&, j&, s$, q$
  s = Chr$(1)
  q = Chr$(39)
  Application.ScreenUpdating = False
  With Range("A1", Cells(Rows.Count, "A").End(xlUp))
    .Value(11) = Replace(.Value(11), "<" & "/B" & ">", "<" & "/B" & ">" & s)
    a = .Value
    If Not IsArray(a) Then ReDim a(1 To 1, 1 To 1): a(1, 1) = .Value
    ReDim b(1 To UBound(a), 1 To 2)
    For i = 1 To UBound(a)
      If InStr(a(i, 1), Chr(160)) > 0 Then a(i, 1) = Replace(a(i, 1), Chr(160), " ")
      j = InStr(a(i, 1), s)
      If j = 0 Then
        b(i, 2) = a(i, 1)
      Else
        If Mid$(a(i, 1), j + 1, 1) Like "[.)]" Then j = j + 1
        b(i, 1) = Trim(Left(a(i, 1), j - 1))
        If InStr(b(i, 1), s) > 0 Then b(i, 1) = Replace(b(i, 1), s, vbNullString)
        If Left(b(i, 1), 1) Like "[-+]" Then b(i, 1) = q & b(i, 1)
        b(i, 2) = Trim(Mid$(a(i, 1), j + 1))
        If InStr(b(i, 2), s) > 0 Then b(i, 2) = Replace(b(i, 2), s, vbNullString)
        If Left(b(i, 2), 1) Like "[-+]" Then b(i, 2) = q & b(i, 2)
      End If
    Next
    .Resize(, 2).Value = b()
    .Font.Bold = False
    .Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub

Regards
 
Last edited:
Upvote 0
Taking my cue from ZVI, here is my code modified to work with both the data you posted originally and the new data you posted today...
Code:
[table="width: 500"]
[tr]
	[td]Sub SplitAtBold()
  Dim R As Long, X As Long, StartAt As Long, LastRow As Long, Cell As Range, Result As Variant
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  ReDim Result(1 To LastRow, 1 To 2)
  Range("A1:A" & LastRow).Replace Chr(160), " ", xlPart, , , , False, False
  For R = 1 To LastRow
    With Cells(R, "A")
      If IsNull(.Font.Bold) Then
        If IsNumeric(Left(.Value, 1)) Then StartAt = InStr(.Value, ".") Else StartAt = 0
        For X = 1 + StartAt To Len(.Value)
          If .Characters(X, 1).Text Like "[! ]" And Not .Characters(X, 1).Font.Bold Then
            Result(R, 1) = Trim(Left(.Value, X - 1))
            Result(R, 2) = Mid(.Value, X)
            Exit For
          End If
        Next
      Else
        Result(R, .Font.Bold + 2) = .Value
      End If
    End With
  Next
  Range("A1").Resize(UBound(Result), 2) = Result
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
Please use this code instead of mine previous one. It works correctly with both your published data.
Rich (BB code):
Sub SplitByEndOfBold_02()
' ZVI:2017-11-26 https://www.mrexcel.com/forum/excel-questions/1032185-split-bold-normal-text.html
' Splitting cells by their bold font ending
  Dim a, b(), i&, j&, s$, q$
  s = Chr$(1)
  q = Chr$(39)
  Application.ScreenUpdating = False
  With Range("A1", Cells(Rows.Count, "A").End(xlUp))
    .Value(11) = Replace(.Value(11), "<" & "/B" & ">", "<" & "/B" & ">" & s)
    a = .Value
    If Not IsArray(a) Then ReDim a(1 To 1, 1 To 1): a(1, 1) = .Value
    ReDim b(1 To UBound(a), 1 To 2)
    For i = 1 To UBound(a)
      If InStr(a(i, 1), Chr(160)) > 0 Then a(i, 1) = Replace(a(i, 1), Chr(160), " ")
      j = InStr(a(i, 1), s)
      If j = 0 Then
        b(i, 2) = a(i, 1)
      Else
        If Mid$(a(i, 1), j + 1, 1) Like "[.)]" Then
          b(i, 1) = Trim(Left(a(i, 1), j + 1))
          b(i, 1) = Replace(b(i, 1), s, vbNullString)
          j = j + 1
        Else
          b(i, 1) = Trim(Left(a(i, 1), j - 1))
        End If
        If Left(b(i, 1), 1) Like "[-+=]" Then b(i, 1) = q & b(i, 1)
        b(i, 2) = Trim(Mid$(a(i, 1), j + 1))
        If InStr(b(i, 2), s) > 0 Then b(i, 2) = Replace(b(i, 2), s, vbNullString)
        If Left(b(i, 2), 1) Like "[-+=]" Then b(i, 2) = q & b(i, 2)
      End If
    Next
    .Resize(, 2).Value = b()
    .Font.Bold = False
    .Columns.AutoFit
  End With
  Application.ScreenUpdating = True
End Sub
Regards
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,911
Messages
6,175,324
Members
452,635
Latest member
laura12345

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top