Possible to find frequency of words used in a worksheet, sans specifying any word to be found?

Jennifre

Board Regular
Joined
Jan 11, 2011
Messages
160
I have a worksheet full of data, across which I'd like to see what words are used most often in it?

Then to see all results, from most frequently used, to least frequently used words?

*I'd like to be able to do this both across a worksheet, as well as along a column, or row?


I know the stars here at Mr Excel can figure anything out, so I thank you in advance: Thank you!! <3
___________________________________________________________________________________
There used to be a post here like this, with a file attached demonstrating this. The post was called, below, but is no longer in existence at its link.
Finding most common words in column WITHOUT specifying what words


(Simplified my last question. Do not see how to delete it?)
 

Excel Facts

Using Function Arguments with nested formulas
If writing INDEX in Func. Arguments, type MATCH(. Use the mouse to click inside MATCH in the formula bar. Dialog switches to MATCH.
Rick Rothstein wrote an interest UDF I saw once (and kept) that will find the unique words from a column of sentences.
If your sentences start in A2 and continue in the A column, use this UDF in, say, C2:C40 (or something save). It's an array formula (CTRL+SHIFT+Enter).
You would have to select the range C2:C40, enter this in the formula bar and resolve with CTRL+Shift+Enter.

Code:
=ListOfWords($A$2:$A$4,TRUE)

Rick's slick function is:

Code:
Function ListOfWords(Rng As Range, Optional CaseSensitive As Boolean) As Variant
  Dim X As Long, Index As Long, List As String, Words() As String, LoW As Variant
  With WorksheetFunction
    Words = Split(.Trim(Replace(Join(.Transpose(Rng)), Chr(160), " ")))
    LoW = Split(Space(.Max(UBound(Words), Application.Caller.Count) + 1))
    For X = 0 To UBound(Words)
      If InStr(1, Chr(1) & List & Chr(1), Chr(1) & Words(X) & Chr(1), 1 - Abs(CaseSensitive)) = 0 Then
        List = List & Chr(1) & Words(X)
        If CaseSensitive Then
          LoW(Index) = Words(X)
        Else
          LoW(Index) = StrConv(Words(X), vbProperCase)
        End If
        Index = Index + 1
      End If
    Next
    ListOfWords = .Transpose(LoW)
  End With
End Function

For the frequency, you can put this in D2, for example, and fill down:

Code:
=IFERROR(SUMPRODUCT(LEN($A$2:$A$4)-LEN(SUBSTITUTE($A$2:$A$4,C2,"")))/LEN(C2),"")

Hope this gets you started.
 
Upvote 0
An alternative for you

Code:
Sub CountWords()
    Dim arr As Variant, a As Long, cel As Range
    With CreateObject("Scripting.Dictionary")
        For Each cel In Range("A2", Cells(Rows.Count, "A").End(xlUp))
            arr = Split(cel.Value, " ")
            For a = LBound(arr) To UBound(arr)
                If Not .exists(arr(a)) Then .Add arr(a), 1 Else .Item(arr(a)) = .Item(arr(a)) + 1
            Next a
        Next cel
        Range("B2").Resize(.Count) = Application.Transpose(.keys)
        Range("C2").Resize(.Count) = Application.Transpose(.items)
        ActiveSheet.Range("B2").Resize(.Count, 2).Sort key1:=ActiveSheet.Range("C2"), order1:=2 key2:=ActiveSheet.Range("B2"), order2:=1
    End With
End Sub


BEFORE

Excel 2016 (Windows) 32 bit
[Table="width:, class:head"][tr=bgcolor:#E0E0F0][th] [/th][th]
A
[/th][th]
B
[/th][th]
C
[/th][/tr]
[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
1
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
2
[/td][td]The cat sat on the mouse[/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
3
[/td][td]the cat ate the mouse[/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
4
[/td][td]the mouse sat under the cat[/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
5
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
6
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
7
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
8
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
9
[/td][td][/td][td][/td][td][/td][/tr]
[/table]
[Table="width:, class:grid"][tr][td]Sheet: Sheet3[/td][/tr][/table]

AFTER

Excel 2016 (Windows) 32 bit
[Table="width:, class:head"][tr=bgcolor:#E0E0F0][th] [/th][th]
A
[/th][th]
B
[/th][th]
C
[/th][/tr]
[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
1
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
2
[/td][td]The cat sat on the mouse[/td][td]the[/td][td]
5​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
3
[/td][td]the cat ate the mouse[/td][td]cat[/td][td]
3​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
4
[/td][td]the mouse sat under the cat[/td][td]mouse[/td][td]
3​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
5
[/td][td][/td][td]sat[/td][td]
2​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
6
[/td][td][/td][td]The[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
7
[/td][td][/td][td]on[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
8
[/td][td][/td][td]ate[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
9
[/td][td][/td][td]under[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
10
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
11
[/td][td][/td][td][/td][td][/td][/tr]
[/table]
[Table="width:, class:grid"][tr][td]Sheet: Sheet3[/td][/tr][/table]
 
Last edited:
Upvote 0
oops
- some finger trouble when pasting!!
- a comma has disappeared

Code:
ActiveSheet.Range("B2").Resize(.Count, 2).Sort key1:=ActiveSheet.Range("C2"), order1:=2[SIZE=3][COLOR=#ff0000],[/COLOR] [/SIZE]key2:=ActiveSheet.Range("B2"), order2:=1
 
Last edited:
Upvote 0
Slight mod if you prefer it to be case insensitive

Code:
Sub CountWords()
    Dim arr As Variant, a As Long, cel As Range
    With CreateObject("Scripting.Dictionary")
        For Each cel In Range("A2", Cells(Rows.Count, "A").End(xlUp))
            arr = Split(cel.Value, " ")
            For a = LBound(arr) To UBound(arr)
                If Not .exists(LCase(arr(a))) Then .Add LCase(arr(a)), 1 Else .Item(arr(a)) = .Item(arr(a)) + 1
            Next a
        Next cel
        Range("B2").Resize(.Count) = Application.Transpose(.keys)
        Range("C2").Resize(.Count) = Application.Transpose(.items)
        Range("B2").Resize(.Count, 2).Sort key1:=Range("C2"), order1:=2, key2:=Range("B2"), order2:=1
    End With
End Sub

Returns this

Excel 2016 (Windows) 32 bit
[Table="width:, class:head"][tr=bgcolor:#E0E0F0][th] [/th][th]
A
[/th][th]
B
[/th][th]
C
[/th][/tr]
[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
1
[/td][td][/td][td][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
2
[/td][td]The cat sat on the mouse[/td][td]the[/td][td]
6​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
3
[/td][td]the cat ate the mouse[/td][td]cat[/td][td]
3​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
4
[/td][td]the mouse sat under the cat[/td][td]mouse[/td][td]
3​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
5
[/td][td][/td][td]sat[/td][td]
2​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
6
[/td][td][/td][td]ate[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
7
[/td][td][/td][td]on[/td][td]
1​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
8
[/td][td][/td][td]under[/td][td]
1​
[/td][/tr]
[/table]
[Table="width:, class:grid"][tr][td]Sheet: Sheet3[/td][/tr][/table]
 
Upvote 0
Thank you!!

Initially I'd like to find words.

However, could this be modified to find most-used phrases -- both specified + not specified?

Thank you!!! :)
 
Upvote 0
Slight mod if you prefer it to be case insensitive

Rich (BB code):
Sub CountWords()
    Dim arr As Variant, a As Long, cel As Range
    With CreateObject("Scripting.Dictionary")
        For Each cel In Range("A2", Cells(Rows.Count, "A").End(xlUp))
            arr = Split(cel.Value, " ")
            For a = LBound(arr) To UBound(arr)
                If Not .exists(LCase(arr(a))) Then .Add LCase(arr(a)), 1 Else .Item(arr(a)) = .Item(arr(a)) + 1
            Next a
        Next cel
        Range("B2").Resize(.Count) = Application.Transpose(.keys)
        Range("C2").Resize(.Count) = Application.Transpose(.items)
        Range("B2").Resize(.Count, 2).Sort key1:=Range("C2"), order1:=2, key2:=Range("B2"), order2:=1
    End With
End Sub
The line of code I highlighted in red above does not need to be that complex. It appears that if a dictionary key does not exist, it will be created automatically when referenced. Given that, the above highlighted line of code can be replace by this one...

.Item(LCase(arr(a))) = .Item(LCase(arr(a))) + 1
 
Last edited:
Upvote 0
Hi – I’m trying to find the same solution but have struggled to get either of these options to work. I have a long list of sentences (4k rows) that I want to find the frequency of words.

I have tried both options but not seeing any values being returned - just getting a #NAME? error to Rick’s function – I have the sentences in ColA; the ListOfWords function in ColB (and set up as an array); and the frequency formula in ColC. All of this is in a new XLS workbook with the ListOfWords function in the only VBA module in the workbook.

I’m a VBA novice so I may be missing a step and am using excel365 on a windows laptop.

Any ideas???

Thanks!

wordcount test2.xlsm
ABC
2StoriesList of WordsFrequency
3If winning Wimbledon was the Holy Grail, then enduring fulfilment was secured in January#NAME? 
4Perhaps the decision has been accelerated by the Covid-19 pandemic#NAME? 
5Anna Sorokin, an intern at a fashion magazine, becomes Anna Delvey, a wealthy heiress.#NAME? 
6Carlos Ortiz was "nailed with a beer can" after making a hole-in-one on the par-three 16th at Scottsdale#NAME? 
7The Phoenix Open, held in February, has developed a reputation for having more of a 'party atmosphere' than other PGA Tour events.#NAME? 
8Michelle Visage goes deep with comedy legend Dawn French on doing things her way.#NAME? 
9Everton said the 30-year-old "will undergo surgery next week before starting his rehabilitation".#NAME? 
10The Colombia international was ruled out for up to 10 weeks after suffering the injury in a league defeat by Newcastle United on 8 February.#NAME? 
11Last week, his lawyers requested for the case to be dismissed, according to documents obtained by the PA news agency.#NAME? 
12The second allegation was tweeted by a woman who claimed Bieber sexually assaulted her in a New York hotel in May 2015.#NAME? 
Sheet1
Cell Formulas
RangeFormula
B3:B12B3=ListOfWords(A3:A12,TRUE)
C3:C12C3=IFERROR(SUMPRODUCT(LEN($A$3:$A$12)-LEN(SUBSTITUTE($A$3:$A$12,B3,"")))/LEN(B3),"")
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
@jrshaw15
See if this helps:
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,286
Members
452,631
Latest member
a_potato

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top