Extract a string of text from a cell

RattlingCarp3048

Board Regular
Joined
Jan 12, 2022
Messages
202
Office Version
  1. 365
Platform
  1. Windows
this one is tripping me up big time. via formula or VBA. Column P contains numerous rows ranging from 1-1500 in any given month. Each cell in column P contains a paragraph of information. Somewhere within each paragraph is a brief statement "x of x cores". i need to extract that statement from each cell. unfortunately "x of x cores" can be any combination of "zero of one core" or "two of four cores" or "four of four cores" or "zero of three cores" etc all the way up to four of four and the numbers are always spelled out.

any ideas how to make this happen?

Column P random paragraphextracted result
;jlasijfaoij asolifdj ; ashdfa hfosdf ;oa ufoa two of two cores aosihfo ahsodfh sofah ashdf otwo of two cores
klaslfjas sldkjf asjd asljfoij fofheoiwehf J;OASDHF OW j;asld % aofh # of % one of three oiaskdn aoshf a oiasofione of three cores
oiafj;as oasfan aofa oai foajf oaiaf two of four cores aosifal oais f oai f oaif oj;oiashodfu %two of four
 
disregard the part about all the variations.... i have a solution to make those spacing corrections on our end.

if there is a way to account for the statement being more than once i think that would give me the best outcome.
So, the new requirements are:
1. there are 2 patternn: "X of x core(s)" & "X out of x core(s)"
2. there could be multiple occurrences in one cell, like: This is one of one core & that's two of three cores.

Is that correct?
 
Upvote 0

Excel Facts

Format cells as date
Select range and press Ctrl+Shift+3 to format cells as date. (Shift 3 is the # sign which sort of looks like a small calendar).
Upload Mini-sheet
So, the new requirements are:
1. there are 2 patternn: "X of x core(s)" & "X out of x core(s)"
2. there could be multiple occurrences in one cell, like: This is one of one core & that's two of three cores.

Is that correct?
Yes. I wasn't aware of the 2nd pattern or the multiple occurrences. We have apparently been missing Datta as we haven't been searching for the additional patterns or occurrences
 
Upvote 0
Is it ok that the result:

1. always show "core" instead of "cores", like:

two of four cores somethingtwo of four core


2. not always in the same order as the sentence, like:

This is three of four cores & that's two of three cores.two of three core, three of four core
 
Upvote 0
Is it ok that the result:

1. always show "core" instead of "cores", like:

two of four cores somethingtwo of four core


2. not always in the same order as the sentence, like:

This is three of four cores & that's two of three cores.two of three core, three of four core
yes "core" in the final result is perfectly fine.
 
Upvote 0
In regards to post #9, Why did you put the function in a sub? That's your problem.
 
Upvote 0
Try this one:

VBA Code:
Sub RattlingCarp3048_4()
Dim i As Long, j As Long, k As Long
Dim tx As String
Dim va, vx, ary, vb
Dim regEx As Object

    ary = Split("zero one two three four")
    ReDim vx(1 To 30, 1 To 1)
    For i = 0 To UBound(ary)
        For j = i To UBound(ary)
            k = k + 1
            vx(k, 1) = ary(i) & " of " & ary(j) & " core"
            k = k + 1
            vx(k, 1) = ary(i) & " out of " & ary(j) & " core"
        Next
    Next
    
    va = Range("A1", Cells(Rows.Count, "A").End(xlUp))
    ReDim vb(1 To UBound(va, 1), 1 To 1)
    
        Set regEx = CreateObject("VBScript.RegExp")
    
        For i = 1 To UBound(va, 1)
            tx = LCase(va(i, 1))
            For Each x In vx
                regEx.Pattern = "\b" & x & "[s]{0,1}\b"
                If regEx.test(tx) Then
                    If InStr(tx, x) > 0 Then
                        vb(i, 1) = vb(i, 1) & ", " & x
                    End If
                End If
            Next
            If vb(i, 1) <> "" Then vb(i, 1) = Mid(vb(i, 1), 3, 50000)
        Next
    
Range("B1").Resize(UBound(vb, 1), 1) = vb
End Sub
Example:
Please, check notes in col C.
RattlingCarp3048 - regex.xlsm
ABC
1data
2One of one core random paragraphone of one core
3Bone of one core random paragraph<-- rejected because of bone
4Y got what two of three cores he bid fortwo of three core
5Two out of four cores somethingtwo out of four core
6two out of four coress something<-- rejected because of coress
7random paragraph zero out of zero corezero out of zero core
8This is four of four cores & that's two of three cores.two of three core, four of four core<-- order of occurrences is different from the sentence
9This is two of three cores & that's four of four cores.two of three core, four of four core
10two of three cores & two of three corestwo of three core<-- Multiple occurrences are counted as one occurrence.
11something of this core
12This is the cores of his message.
13The score gap is too high.
Sheet1
 
Upvote 0
I've amended my code a bit because InStr can be replaced with matches:
VBA Code:
Sub RattlingCarp3048_5()
Dim i As Long, j As Long, k As Long
Dim tx As String
Dim va, vx, ary, vb
Dim regEx As Object, Matches As Object

    ary = Split("zero one two three four")
    ReDim vx(1 To 30, 1 To 1)
    For i = 0 To UBound(ary)
        For j = i To UBound(ary)
            k = k + 1
            vx(k, 1) = ary(i) & " of " & ary(j) & " core"
            k = k + 1
            vx(k, 1) = ary(i) & " out of " & ary(j) & " core"
        Next
    Next
    
    va = Range("A1", Cells(Rows.Count, "A").End(xlUp))
    ReDim vb(1 To UBound(va, 1), 1 To 1)
    
        Set regEx = CreateObject("VBScript.RegExp")
        With regEx
        For i = 1 To UBound(va, 1)
            tx = LCase(va(i, 1))
            For Each x In vx
                .Pattern = "\b" & x & "[s]{0,1}\b"
                If .test(tx) Then
                       Set Matches = .Execute(tx)
                       vb(i, 1) = vb(i, 1) & ", " & Matches(0)
                End If
            Next
            If vb(i, 1) <> "" Then vb(i, 1) = Mid(vb(i, 1), 3, 50000)
        Next
        End With
Range("B1").Resize(UBound(vb, 1), 1) = vb
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,189
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top