split cell at specific character count but break at a whole word

ESACAWIP

New Member
Joined
Nov 9, 2020
Messages
25
Office Version
  1. 365
Platform
  1. Windows
Hoping someone can help me out, having a lot of trouble figuring out how to do this.


I have a large amount of text in a single cell

Example: (Cell A1)
All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.

I need to break up the text into smaller chucks. Each chuck can only be 40 characters long and has to end on a whole word.

Output example:

OutputCharacter Count (len)
All investing is subject to risk, (Cell A2)33
including the possible loss of the money (Cell A3)40
you invest. (Cell A4)11




This is just an example. I actually have cells that have 40,000 character count in a single cell and I need to break it up into 10,000 character count chucks.


Please let me know if this is possible. Thank you so much!


excel question.JPG
 

Excel Facts

Square and cube roots
The =SQRT(25) is a square root. For a cube root, use =125^(1/3). For a fourth root, use =625^(1/4).
Hi,

This code should split your cell A1 up, and place the subtext in the rows below it starting row 3. NOTE: max string allowed is 32768 characters

Hope it helps - you can change the location accordingly.

Rgds
Rob

VBA Code:
Sub text_splitter()

Dim full_text, sub_text As String
Dim row_num, text_len, start_pos, end_pos As Long

full_text = Range("A1")  'note Excel allows String sizes to 32768 chars only.
text_len = Len(full_text)

row_num = 3 '(first row to add text to)
start_pos = 1

Do While start_pos < text_len

    If start_pos + 40 > text_len Then 'check for end of text
        end_pos = text_len
    Else
        end_pos = InStrRev(full_text, " ", start_pos + 40)
    End If
    sub_text = Mid(full_text, start_pos, end_pos - start_pos)
    Range("A" & row_num) = sub_text
    row_num = row_num + 1
    start_pos = end_pos + 1

Loop

End Sub
Book1
ABCDEFGHIJKLMNOPQ
1All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.
2
3All investing is subject to risk,
4including the possible loss of the money
5you invest. All investing is subject to
6risk, including the possible loss of the
7money you invest
8
9
10173
11
1233
1340
1439
1540
1616
17
Sheet1
Cell Formulas
RangeFormula
A10,A12:A16A10=LEN(A1)
 
Upvote 0
Here's how I read your problem
Rich (BB code):
Sub test()
    Dim r As Range, s$, i&
    Const n& = 40
    With CreateObject("VBScript.RegExp")
        .Global = True
        For Each r In Range("a1", Range("a" & Rows.Count).End(xlUp))
            If r <> "" Then
                .Pattern = "([!?,.])"
                s = Join(WorksheetFunction.Unique(Application.Trim(Split(.Replace(r, "$1" & Chr(2)), Chr(2))), True))
                .Pattern = "\b(.{1," & n & "}\b)"
                For i = 0 To .Execute(s).Count - 1
                    r(, i + 3) = .Execute(s)(i).submatches(0)
                Next
            End If
        Next
    End With
End Sub

Result after macro.
Book1
ABCDE
1All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.All investing is subject to risk, including the possible loss of the money you invest
Sheet1
 
Upvote 0
Here is a formula alternative. Max characters is 32766. Which is the maximum character of a cell (32767 i think).

You have to copy the formula in F till ALP.

Book5.xlsx
ABCDEFGHIJKLMN
1LengthEqual?Textjoined parts12345678910
2All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.173TRUEAll investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.All investing is subject to risk,including the possible loss of the moneyyou invest. All investing is subject torisk, including the possible loss of themoney you invest.     
Sheet8
Cell Formulas
RangeFormula
E1:ALP1E1=SEQUENCE(, 1000)
B2B2=LEN(A2)
C2C2=A2=D2
D2D2=TEXTJOIN(" ", , E2:XFD2)
E2E2=LET( t, LEFT($A2, 41), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,41, -1)), 0)), TRIM(LEFT(t, max)) )
F2:N2F2=IFERROR(LET(nt, TRIM(RIGHT($A2, LEN($A2)-SUM(LEN($E2:E2&" "))+1)), t, LEFT(nt, 41), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,41, -1)), 0)), IF(LEN(nt)>40, TRIM(LEFT(t, max)), nt) ), "")
Dynamic array formulas.


I tested this:

1722344814509.png
 
Last edited:
Upvote 0
Here's how I read your problem
Rich (BB code):
Sub test()
    Dim r As Range, s$, i&
    Const n& = 40
    With CreateObject("VBScript.RegExp")
        .Global = True
        For Each r In Range("a1", Range("a" & Rows.Count).End(xlUp))
            If r <> "" Then
                .Pattern = "([!?,.])"
                s = Join(WorksheetFunction.Unique(Application.Trim(Split(.Replace(r, "$1" & Chr(2)), Chr(2))), True))
                .Pattern = "\b(.{1," & n & "}\b)"
                For i = 0 To .Execute(s).Count - 1
                    r(, i + 3) = .Execute(s)(i).submatches(0)
                Next
            End If
        Next
    End With
End Sub

Result after macro.
Book1
ABCDE
1All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.All investing is subject to risk, including the possible loss of the money you invest
Sheet1
Here is a formula alternative. Max characters is 32766. Which is the maximum character of a cell (32767 i think).

You have to copy the formula in F till ALP.

Book5.xlsx
ABCDEFGHIJKLMN
1LengthEqual?Textjoined parts12345678910
2All investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.173TRUEAll investing is subject to risk, including the possible loss of the money you invest. All investing is subject to risk, including the possible loss of the money you invest.All investing is subject to risk,including the possible loss of the moneyyou invest. All investing is subject torisk, including the possible loss of themoney you invest.     
Sheet8
Cell Formulas
RangeFormula
E1:ALP1E1=SEQUENCE(, 1000)
B2B2=LEN(A2)
C2C2=A2=D2
D2D2=TEXTJOIN(" ", , E2:XFD2)
E2E2=LET( t, LEFT($A2, 41), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,41, -1)), 0)), TRIM(LEFT(t, max)) )
F2:N2F2=IFERROR(LET(nt, TRIM(RIGHT($A2, LEN($A2)-SUM(LEN($E2:E2&" "))+1)), t, LEFT(nt, 41), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,41, -1)), 0)), IF(LEN(nt)>40, TRIM(LEFT(t, max)), nt) ), "")
Dynamic array formulas.


I tested this:

View attachment 114715

I've tried both these solutions and they work great with smaller text, but I tried changing the vba / formulas so that instead of a split at 40 characters, the original text is 30,000 characters and I would like to split at every 10,000 characters.

However, I think i did something wrong because its not working like it does with smaller text. Maybe I edited it incorrectly?
 
Upvote 0
I've tried both these solutions and they work great with smaller text
Confusing.
My code works differently.

If you simply want to split the text by fixed length without removing duplicate sentence(s), then
Code:
Sub test()
    Dim r As Range, s$, x&, i&
    Const n& = 40
    For Each r In Range("a1", Range("a" & Rows.Count).End(xlUp))
        s = r: i = 2
        Do While Len(s) > n
            x = InStrRev(s, " ", n + 1)
            i = i + 1
            r(, i) = Left$(s, x - 1)
            s = Trim$(Mid$(s, x + 1))
        Loop
        If Len(s) Then i = i + 1: r(, i) = s
    Next
End Sub
 
Upvote 0
Here the formula to split into 10000 chars:

Cell Formulas
RangeFormula
E1:N1E1=SEQUENCE(, 10)
B2B2=LEN(A2)
C2C2=A2=D2
D2D2=TEXTJOIN(" ", , E2:XFD2)
E2E2=LET( t, LEFT($A2, 10001), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,10001, -1)), 0)), IF(LEN(t)>10000, TRIM(LEFT(t, max)), t) )
F2:N2F2=IFERROR(LET(nt, TRIM(RIGHT($A2, LEN($A2)-SUM(LEN($E2:E2&" "))+1)), t, LEFT(nt, 10001), max, MAX(IFERROR(FIND(" ", t, SEQUENCE(40,,10001, -1)), 0)), IF(LEN(nt)>10000, TRIM(LEFT(t, max)), nt) ), "")
Dynamic array formulas.


1722363403271.png
 
Upvote 0

Forum statistics

Threads
1,225,738
Messages
6,186,725
Members
453,368
Latest member
positivemind

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top