most common text within cells

shakethingsup

Board Regular
Joined
May 21, 2017
Messages
64
Office Version
  1. 365
Platform
  1. Windows

I don't think this is possible with just formulas but let's say I have 10 cells, and I want to find the top 5 most common words

Hello my name is bob
This is great, I wonder what I'm doing here
What are you doing bob?
Bob's my uncle
Bob's burger
now it's getting confusing
Is it bob or bob's
Job Bob
Bob jobs
Jim bob joe bob

So is there a formula in excel to identify the most common word - for example, bob, Bob's, Bob, I, is, etc. I'm okay with making the assumption that a " " = start of a new word/end of the previous one.

or is the easiest thing to do each time is to:
1. text to columns
2. but then I have many columns and then what's next?
 

Excel Facts

Create a chart in one keystroke
Select the data and press Alt+F1 to insert a default chart. You can change the default chart to any chart type
Would you consider a macro instead?

This should list the top N plus ties.

VBA Code:
Sub TopNMostCommon()
  Dim d As Object, AL As Object
  Dim a As Variant, b As Variant, itm As Variant
  Dim i As Long
  
  Const TopN As Long = 5  '<- Edit as required
  
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set AL = CreateObject("System.Collections.ArrayList")
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value
  ReDim b(1 To TopN)
  For i = 1 To UBound(a)
    For Each itm In Split(a(i, 1))
      d(itm) = d(itm) + 1
    Next itm
  Next i
  For Each itm In d.Keys
    AL.Add Format(d(itm), "000000") & " " & itm
  Next itm
  AL.Add "000000 "
  AL.Sort
  AL.Reverse
  For i = 1 To TopN
    b(i) = AL.Item(i - 1)
  Next i
  Do Until i = AL.Count Or Split(AL.Item(i - 1))(0) < Split(AL.Item(i - 2))(0)
    ReDim Preserve b(1 To i)
    b(i) = AL.Item(i - 1)
    i = i + 1
  Loop
  With Range("B2").Resize(UBound(b))
    .Value = Application.Transpose(b)
    .TextToColumns DataType:=xlDelimited, Semicolon:=False, Comma:=False, Space:=True, Other:=False
    .Cells(0).Resize(, 2).Value = Array("Count", "Word")
  End With
End Sub

Results with your sample data & Top N set to 5
shakethingsup.xlsm
ABC
1DataCountWord
2Hello my name is bob6bob
3This is great, I wonder what I'm doing here3is
4What are you doing bob?3Bob's
5Bob's my uncle2what
6Bob's burger2my
7now it's getting confusing2doing
8Is it bob or bob's
9Job Bob
10Bob jobs
11Jim bob joe bob
Sheet1
 
Upvote 0
Would you consider a macro instead?

This should list the top N plus ties.

VBA Code:
Sub TopNMostCommon()
  Dim d As Object, AL As Object
  Dim a As Variant, b As Variant, itm As Variant
  Dim i As Long
 
  Const TopN As Long = 5  '<- Edit as required
 
  Set d = CreateObject("Scripting.Dictionary")
  d.CompareMode = 1
  Set AL = CreateObject("System.Collections.ArrayList")
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value
  ReDim b(1 To TopN)
  For i = 1 To UBound(a)
    For Each itm In Split(a(i, 1))
      d(itm) = d(itm) + 1
    Next itm
  Next i
  For Each itm In d.Keys
    AL.Add Format(d(itm), "000000") & " " & itm
  Next itm
  AL.Add "000000 "
  AL.Sort
  AL.Reverse
  For i = 1 To TopN
    b(i) = AL.Item(i - 1)
  Next i
  Do Until i = AL.Count Or Split(AL.Item(i - 1))(0) < Split(AL.Item(i - 2))(0)
    ReDim Preserve b(1 To i)
    b(i) = AL.Item(i - 1)
    i = i + 1
  Loop
  With Range("B2").Resize(UBound(b))
    .Value = Application.Transpose(b)
    .TextToColumns DataType:=xlDelimited, Semicolon:=False, Comma:=False, Space:=True, Other:=False
    .Cells(0).Resize(, 2).Value = Array("Count", "Word")
  End With
End Sub

Results with your sample data & Top N set to 5
shakethingsup.xlsm
ABC
1DataCountWord
2Hello my name is bob6bob
3This is great, I wonder what I'm doing here3is
4What are you doing bob?3Bob's
5Bob's my uncle2what
6Bob's burger2my
7now it's getting confusing2doing
8Is it bob or bob's
9Job Bob
10Bob jobs
11Jim bob joe bob
Sheet1
let me try this on Monday/Tuesday and get back to you! open to Macros, good to learn for me as well!
 
Upvote 0

Forum statistics

Threads
1,224,814
Messages
6,181,120
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top