Stripping text in Excel

nevolex

New Member
Joined
Oct 10, 2019
Messages
6
Hi everyone,

Is there a chance somebody can please help with a formula which will basically remove all the text from this text block (example) pasted in excel:

In literary theory, a text is any object that can be "read", whether this object is a work of literature, TK8881122 a street sign, an arrangement of buildings on a city block, or styles of clothing.
It is a coherent set of signs that transmits some kind of informative message.[1]
This set of signs is considered in terms of the informative message's content, rather than in terms of its physical form or the medium in which it is represented.
Within the field of literary criticism, "text" also refers to the original information content of a NM12345 particular piece of writing; that is, the "text" of a work is that primal symbolic arrangement of letters as originally composed, apart from later alterations, deterioration, commentary, translations, paratext, etc.

What I am hopping to get is all the text to be remove leaving just:
TK8881122
NM12345

Thanks a lot


 

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.
Welcome to the MrExcel board!

What would be the verbal 'rule' for what you want out of any text?

Is it "any 2 upper case letters followed by 1 or more digits" or you specify your requirement.

What is the longest text that might need to be extracted? The longest of the examples is 9 characters.

Can there be more, or less, than two items to extract from a text?

Is the bold text you quoted all in a single cell?

Is a VBA solution acceptable?
 
Upvote 0
HI
how about
Code:
Sub test() 
Dim a As Variant
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "[A-Z]+\d+"
        Set m = .Execute(Cells(1))
        ReDim a(1 To m.Count)
        For i = 0 To m.Count - 1
            a(i + 1) = m(i)
        Next
    End With
    Cells(1, 2).Resize(UBound(a)) = Application.Transpose(a)
End Sub
 
Last edited:
Upvote 0
Welcome to the MrExcel board!

What would be the verbal 'rule' for what you want out of any text?

Is it "any 2 upper case letters followed by 1 or more digits" or you specify your requirement.

What is the longest text that might need to be extracted? The longest of the examples is 9 characters.

Can there be more, or less, than two items to extract from a text?

Is the bold text you quoted all in a single cell?

Is a VBA solution acceptable?

Hi Peter, the verbal rule is "any amount of upper or lower case letters followed by 1 or more digits"
The longest text is probably no longer than 22 characters.
Yes it can be more or less than 2 items to extract from a text
The text is not necessary bold, can be just normal text too.
The text can be across the cells, but can be all copied in one cell if needed.
VBA solution might be accessible

Thank you
 
Upvote 0
Thanks for the clarifications. See if this user-defined function helps. To implement ..
1. Right click the sheet name tab and choose "View Code".
2. In the Visual Basic window use the menu to Insert|Module
3. Copy and Paste the code below into the main right hand pane that opens at step 2.
4. Close the Visual Basic window.
5. Enter the formula as shown in the screen shot below and copy across & down.
6. Your workbook will need to be saved as a macro-enabled workbook (*.xlsm)

Rich (BB code):
Function GetParts(s As String, num As Long) As String
  Static RX As Object, m As Object
  
  If RX Is Nothing Then
    Set RX = CreateObject("VBScript.RegExp")
    RX.Global = True
  End If
  RX.Pattern = "[A-Z]+\d+"
  Set m = RX.Execute(s)
  If num <= m.Count Then GetParts = m.Item(num - 1)
End Function

Excel Workbook
ABCDE
1In literary theory, a text is any object that can be "read", whether this object is a work of literature, TK8881122 a street sign, an arrangement of buildings on a city block, or styles of clothing.It is a coherent set of signs that transmits some kind of informative message.[1]This set of signs is considered in terms of the informative message's content, rather than in terms of its physical form or the medium in which it is represented.Within the field of literary criticism, "text" also refers to the original information content of a NM12345 particular piece of writing; that is, the "text" of a work is that primal symbolic arrangement of letters as originally composed, apart from later alterations, deterioration, commentary, translations, paratext, etc.TK8881122NM12345
2Nothing here meets the pattern
3
4text HGFTY76509876 more text F0, and more G876567404323.HGFTY76509876F0G876567404323
Sheet1
 
Last edited:
Upvote 0
Hi again
What about
Code:
Sub test()
    Dim a, m As Variant
    Dim lr, t, i As Long
    lr = Cells(Rows.Count, 1).End(xlUp).Row
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "[A-Z]+\d+"
        For t = 1 To lr
            If .test(Cells(t, 1)) Then
                Set m = .Execute(Cells(t, 1))
                ReDim a(1 To m.Count)
                For i = 0 To m.Count - 1
                    a(i + 1) = m(i)
                Next
                Cells(t, 2).Resize(, UBound(a)) = a
            End If
        Next
    End With
End Sub
 
Upvote 0
Thanks a lot for the code Peter, I just wanted to ask if you can please modify the code for "any amount of upper or lower case letters followed by 1 or more digits or symbols like - _ / ; \ # etc"

Also for each number found and filtered it will be shown in individual cell, however is there a chance all those numbers can be shown in one cell at once with space between them? Only if that is possible. Thank you very much

Thanks for the clarifications. See if this user-defined function helps. To implement ..
1. Right click the sheet name tab and choose "View Code".
2. In the Visual Basic window use the menu to Insert|Module
3. Copy and Paste the code below into the main right hand pane that opens at step 2.
4. Close the Visual Basic window.
5. Enter the formula as shown in the screen shot below and copy across & down.
6. Your workbook will need to be saved as a macro-enabled workbook (*.xlsm)

Rich (BB code):
Function GetParts(s As String, num As Long) As String
  Static RX As Object, m As Object
  
  If RX Is Nothing Then
    Set RX = CreateObject("VBScript.RegExp")
    RX.Global = True
  End If
  RX.Pattern = "[A-Z]+\d+"
  Set m = RX.Execute(s)
  If num <= m.Count Then GetParts = m.Item(num - 1)
End Function

Sheet1

ABCDE
In literary theory, a text is any object that can be "read", whether this object is a work of literature, TK8881122 a street sign, an arrangement of buildings on a city block, or styles of clothing.
It is a coherent set of signs that transmits some kind of informative message.[1]
This set of signs is considered in terms of the informative message's content, rather than in terms of its physical form or the medium in which it is represented.
Within the field of literary criticism, "text" also refers to the original information content of a NM12345 particular piece of writing; that is, the "text" of a work is that primal symbolic arrangement of letters as originally composed, apart from later alterations, deterioration, commentary, translations, paratext, etc.
TK8881122NM12345
Nothing here meets the pattern
text HGFTY76509876 more text F0, and more G876567404323.HGFTY76509876F0G876567404323

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:501px;"><col style="width:136px;"><col style="width:87px;"><col style="width:120px;"><col style="width:68px;"></colgroup><tbody>
[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=cacaca]#cacaca[/URL] , align: center"]1[/TD]

[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=cacaca]#cacaca[/URL] , align: center"]2[/TD]

[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=cacaca]#cacaca[/URL] , align: center"]3[/TD]

[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=cacaca]#cacaca[/URL] , align: center"]4[/TD]

</tbody>

Spreadsheet Formulas
CellFormula
B1=GetParts($A1,COLUMNS($B1:B1))

<tbody>
</tbody>

<tbody>
</tbody>


Excel tables to the web >> Excel Jeanie HTML 4
 
Upvote 0
Hi what about
Code:
Sub test()
    Dim a, m As Variant
    Dim lr, t, i As Long
    lr = Cells(Rows.Count, 1).End(xlUp).Row
    With CreateObject("VBScript.RegExp")
        .Global = True
        .Pattern = "[.A-Za-z]+.?\d+."
        For t = 1 To lr
            If .test(Cells(t, 1)) Then
                Set m = .Execute(Cells(t, 1))
                ReDim a(1 To m.Count)
                For i = 0 To m.Count - 1
                    a(i + 1) = m(i)
                Next
                Cells(t, 3) = Join(a, " ")
            End If
        Next
    End With
End Sub
 
Upvote 0
I just wanted to ask if you can please modify the code for "any amount of upper or lower case letters followed by 1 or more digits or symbols like - _ / ; \ # etc"

I think you will need to tighten up the above rule somewhat or else explain why message.[1] was not outputted from this sentence in your cell A1 example text...

"It is a coherent set of signs that transmits some kind of informative message.[1]"
 
Upvote 0
... followed by 1 or more digits or symbols like - _ / ; \ # etc"
First thing for me is to clarify this part.
a) Is that red "-" part of the list of characters to look for or not?
b) What exactly does "etc" mean at the end. We don't know what symbols you want to include in your search and what to exclude. You have to tell us.

A few more short samples with expected results might also help clarify.

BTW, best not to fully quote long posts as it makes the thread harder to read/navigate. If you want to quote, quote small, relevant parts only.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,885
Messages
6,175,182
Members
452,615
Latest member
bogeys2birdies

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top