How to extract multiple copies of the same word from a single cell?

DataQuestioner

Board Regular
Joined
Sep 12, 2013
Messages
115
This request is pretty simple >
I want to extract every occurrence of a word, via a search string, from a single cell with text.
The word(s) could occur anywhere in the cell text.
The word could occur many times, or even not at all!
The returned result should display the total count occurrence.
The returned result must be case sensitive.

Using FIND & REPLACE will not solve this - it only returns the cell count, not the word count.
Using SORT & FILTER will not solve this - it only returns the cell count, not the word count.
Using the FIND or SEARCH Functions (on their own) doesn't solve this - they only find the first occurrence within a cell.

I'm guessing that either a stacked user defined function or an array formula will provide the answer.

e.g. Cell A1 contains "This is a test which will test how to test this request."
a) If the search string was "test" then I would be expecting a returned result of "3".
b) If the search string was "this" then I would be expecting a returned result of "1" (the initial "This" would be ignored because of the case sensitive search).
c) If the search string was "testing" then I would be expecting a "word not found" response. I presume this can be done by using the IFERROR function.

Thanks.
 
Another question for you >
Is it possible to replace the "search text" in quotes, as part of the UDF syntax, by pointing it to another cell where the actual search text is typed in?
When you say "search text", you mean the second argument (TextToFind), correct? Yes, just put the text in the cell without quote marks.
 
Upvote 0

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Thanks Rick.
Yes, the second argument. I've tried it and it works - great!

I've done some more testing with your UDF and I have noticed one little flaw...sorry!

If the TextToFind is the first word in the RangeToSearch then it is not being returned in the final word count.

Try this little A1 cell text that I use to experiment with > "Testing this test is testing! Test this TEST? Yes, this test is tested as a testimony!"

If the TextToFind was "Testing" then the result should be "1" - the UDF is returning "0" - I believe this is because it is not including the first word in the cell, which just happens to be the actual word we're looking for.
Let me know what you think.
 
Upvote 0
I've done some more testing with your UDF and I have noticed one little flaw...sorry!

If the TextToFind is the first word in the RangeToSearch then it is not being returned in the final word count.
That would have been true for single cell as the first argument... the code would have worked correctly if multiple cells were specified as the first argument. The fix is easy as it was a simple oversight on my part...
Code:
Function CountExactWords(RangeToSearch As Variant, TextToFind As String, _
                         Optional CaseSensitive As Boolean) As Long
  Dim X As Long, Combined As String, Words() As String
  If RangeToSearch.Columns.Count > 1 Then
    MsgBox "Single column data only!", vbCritical
  ElseIf RangeToSearch.Rows.Count = 1 Then
    Combined = " " & RangeToSearch.Value & " "
  Else
    Combined = " " & Join(Application.Transpose(RangeToSearch.Value), " ") & " "
  End If
  Words = Split(Combined, TextToFind, , 1 + CaseSensitive)
  For X = 0 To UBound(Words) - 1
    If Right(Words(X), 1) Like "[!A-Za-z0-9]" And _
       Left(Words(X + 1), 1) Like "[!A-Za-z0-9']" Then
      CountExactWords = CountExactWords + 1
    End If
  Next
End Function
 
Upvote 0
XOR LX,

Were you able to amend your nested UDF to allow for the case sensitive scenario?
I'm still interested in your solution.

Apologies. On holiday at the moment. Won't be able to have a look for another week.

Looks like you've got some far more efficient VBA solutions in any case. However, if you're still interested, reply to this thread in one week's time and I'll see what I can do.

Regards
 
Upvote 0
With the text in A2 and the search string in B2, does =(LEN(A2)-LEN(TRIM(SUBSTITUTE(A2," "&B2&" "," "))))/(LEN(B2)+1) work
 
Last edited:
Upvote 0
With the text in A2 and the search string in B2, does =(LEN(A2)-LEN(TRIM(SUBSTITUTE(A2," "&B2&" "," "))))/(LEN(B2)+1) work
The OP wants to search multiple cells, not just a single cell (see Message #24).
 
Upvote 0
The function in post #10 was reported to be very slow when working with real data. This code – still based on a regular expression – processes the cells A1:A31000 each with the red sentence in post #1 and puts the result in B1 in about 1 sec.


Code:
Sub WordCount()
Dim c As Range
Dim fc, lc As String
Dim wcount As Long
wd = "test"   'Search word
fc = "A1"      'First cell
lc = "A31000"    'Last cell
rc = "B1"   'Cell with the result
 
        With CreateObject("VBScript.Regexp")
                .Global = True
                 .Pattern = "\b" & wd & "\b" & "[^']"
                   If wd Like "*'" Then .Pattern = "\b" & wd
                    For Each c In Range(fc, lc)
                        Set mymatches = .Execute(c & " ")
                        wcount = wcount + mymatches.Count
                    Next
       End With
Range(rc) = wcount
End Sub
 
Upvote 0
Rick,
I've ran a test on just over 31,000 cells of text and your UDF works really well.
It takes between 8-10 secs to complete the search.
How many times did you run it? Just once for that time?? I set up a test sheet with 31000 text strings and put the words one, two, three, four and five randomly throughout and then counted the word "four"... there were 6221 of them and my code took took between 0.06 and 0.07 seconds to find them on my computer (I repeated the test 10 times to get that time range). Unless you have a very old computer, I cannot believe my code took that long to do a single count. By comparison, István's latest code (Message #38) took between 0.09 and 0.11 seconds to do the same count. I would be interested to know how his latest code performed on your computer doing the same count as you reported the 8-10 second for using my code.
 
Upvote 0
Hello Rick.
Yes I have re-ran the UDF many times now and discovered the following re "run time speed" >
When Excel 2010 is run for the first time (from app launch) the UDF is super fast...less than a second.
If Excel has ran multiple searches with the UDF, about 20 or more, the run time slowed to 8-10secs.
I think it's something to do with the Excel cache files being stored after each search, which for a 31,000 cell search must be accummulating quite a lot of "back-up" data.

Anyhow, if I reboot Excel, to clear the cache, or whatever else is running in the background, the problem goes away and we're back to super fast searches again.
Hope this clarifies for you.

And one other thing, I'm using an i7 quad processor.
 
Upvote 0

Forum statistics

Threads
1,225,743
Messages
6,186,770
Members
453,370
Latest member
juliewar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top