Macro for finding most frequent occuring strings in a list

nn992

New Member
Joined
Jul 28, 2016
Messages
47
Hi everyone,

I have a project where I am working with really large datasets, and I have to use excel for it.

One of the tasks to do is to extract most frequently occuring string, second most frequent string, third etc...

I know that there is a formula to do this, but I would like to keep it simpler and do it with macro..
Any experts here who can help?

Thanks in advance
 

Excel Facts

Get help while writing formula
Click the italics "fx" icon to the left of the formula bar to open the Functions Arguments dialog. Help is displayed for each argument.
Here's one way of doing it.


Code:
Sub print_unique()


Const oprange = "H1"


' Lists DISTINCT list of items on spreadsheet using columns H:I in order found in source data; Col H = Item, Col I =  Frequency it appears in the source data
    
    Dim v
    
    v = getUniqueArray(Range("b2:b30"))
    If IsArray(v) Then
       Range(oprange).Resize(UBound(v), 2) = v
    End If


End Sub


Sub test()


' Takes the source data and asks for the highest 10 frequent things.
' Output is shown as a text string showing: [Item Name] is in position [Order of Frequency] (i.e. Highest = 1, Second Highest = 2 and so on) with a count of [No of times found]
' IMPORTANT NOTE: Because I used a RANGE to sort the data (rather than, say, a sort routine) the GetMFOS routine this calls uses range K:L


' This can be avoided by adding a sort routine to the array and not using the range object. Additionally, this means a UDF (User Defined Function can be used!)


    Const TempRange = "K1"
    Const oprange = "F"


    Dim OP As String
    Dim FreqCntr As Integer
    
    For FreqCntr = 1 To 10
        GetMFOS Range("b2:b30"), FreqCntr, OP, TempRange
        Range(oprange & FreqCntr).Value = OP
    Next FreqCntr




End Sub




Sub GetMFOS(ByVal StrRange As Range, ByVal FreqSought As Integer, ByRef OutputStr As String, ByVal TempRange As String)


    Dim v
    Dim r As Range
    
    v = getUniqueArray(StrRange)
    If IsArray(v) Then
    
        If FreqSought > UBound(v) Then
            OutputStr = "There are only " & UBound(v) & " items. Cannot find item at position " & FreqSought
            Exit Sub
        End If
    
    
        Set r = Range(TempRange).Resize(UBound(v), 2)
    
        For cntr = 1 To UBound(v)
            r.Cells(cntr, 1).Value = v(cntr, 1)
            r.Cells(cntr, 2).Value = v(cntr, 2)
        Next cntr
        
        r.Sort Key1:=r.Cells(1, 2), order1:=xlDescending, Header:=xlNo
    
        OutputStr = r.Cells(FreqSought, 1) & " is in position " & FreqSought & " with a count of " & r.Cells(FreqSought, 2) & " Items Found"
    


    End If




End Sub






Public Function getUniqueArray(inputRange As Range, _
                                Optional skipBlanks As Boolean = True, _
                                Optional matchCase As Boolean = True, _
                                Optional prepPrint As Boolean = True _
                                ) As Variant
               
Dim vDic As Object
Dim tArea As Range
Dim tArr As Variant, tVal As Variant, tmp As Variant
Dim noBlanks As Boolean
Dim cnt As Long
Dim CntArray()
                      
On Error GoTo exitFunc:
If inputRange Is Nothing Then GoTo exitFunc


With inputRange
    If .Cells.Count < 2 Then
        ReDim tArr(1 To 1, 1 To 1)
        tArr(1, 1) = .Value2
        getUniqueArray = tArr
        GoTo exitFunc
    End If


    Set vDic = CreateObject("scripting.dictionary")
    If Not matchCase Then vDic.compareMode = vbTextCompare
    
    noBlanks = True
    
    For Each tArea In .Areas
        tArr = tArea.Value2
        For Each tVal In tArr
            If tVal <> vbNullString Then
                vDic.Item(tVal) = vDic.Item(tVal) + 1
                If vDic.Exists(tVal) Then Debug.Print vDic(tVal)
            ElseIf noBlanks Then
                noBlanks = False
            End If
        Next
    Next
End With


If Not skipBlanks Then If Not noBlanks Then vDic.Item(vbNullString) = Empty


If prepPrint Then
    ReDim tmp(1 To vDic.Count, 1 To 2)
    For Each tVal In vDic.Keys
        cnt = cnt + 1
        tmp(cnt, 1) = tVal
    Next
    cnt = 0
    For Each tVal In vDic.items
        cnt = cnt + 1
        tmp(cnt, 2) = tVal
    Next
    getUniqueArray = tmp


End If


exitFunc:
Set vDic = Nothing
End Function




Excel 2010
ABCDEFGHIJKL
Dog Ate Postman is in position 1 with a count of 9 Items FoundExplosionDog Ate Postman
ExplosionSaturns RingsExplosion is in position 2 with a count of 6 Items FoundDog Ate PostmanExplosion
Dog Ate PostmanJoe Blogs EngineeringLate Delivery is in position 3 with a count of 5 Items FoundLate DeliveryLate Delivery
ExplosionEuropaOrder Lost is in position 4 with a count of 4 Items FoundOrder LostOrder Lost
Late DeliverySaturns RingsWrong Item is in position 5 with a count of 3 Items FoundWrong ItemWrong Item
Dog Ate PostmanJoe Blogs EngineeringWrong Planet is in position 6 with a count of 2 Items FoundWrong PlanetWrong Planet
ExplosionEuropaThere are only 6 items. Cannot find item at position 7
Dog Ate PostmanEuropaThere are only 6 items. Cannot find item at position 8
Order LostBetty CaféThere are only 6 items. Cannot find item at position 9
Order LostMoonThere are only 6 items. Cannot find item at position 10
Order LostJoe Blogs Engineering
Wrong ItemJoe Blogs Engineering
Wrong PlanetMars Station
Late DeliveryBetty Café
ExplosionBetty Café
ExplosionMoon
Dog Ate PostmanSaturns Rings
Wrong ItemSaturns Rings
Dog Ate PostmanMars Station
ExplosionSaturns Rings
Wrong ItemMars Station
Wrong PlanetSaturns Rings
Dog Ate PostmanEuropa
Dog Ate PostmanMars Station
Dog Ate PostmanEuropa
Late DeliveryJoe Blogs Engineering
Late DeliveryEuropa
Order LostEuropa
Dog Ate PostmanSaturns Rings
Late DeliveryMoon

<tbody>
[TD="align: center"]1[/TD]
[TD="bgcolor: #D9D9D9"]Date[/TD]
[TD="bgcolor: #D9D9D9"]Reason[/TD]
[TD="bgcolor: #D9D9D9"]Location[/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]6[/TD]
[TD="align: right"][/TD]

[TD="align: right"]9[/TD]

[TD="align: center"]2[/TD]
[TD="align: center"]22/01/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]9[/TD]
[TD="align: right"][/TD]

[TD="align: right"]6[/TD]

[TD="align: center"]3[/TD]
[TD="align: center"]17/10/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]5[/TD]
[TD="align: right"][/TD]

[TD="align: right"]5[/TD]

[TD="align: center"]4[/TD]
[TD="align: center"]20/09/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]4[/TD]
[TD="align: right"][/TD]

[TD="align: right"]4[/TD]

[TD="align: center"]5[/TD]
[TD="align: center"]01/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]3[/TD]
[TD="align: right"][/TD]

[TD="align: right"]3[/TD]

[TD="align: center"]6[/TD]
[TD="align: center"]09/03/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]

[TD="align: right"]2[/TD]
[TD="align: right"][/TD]

[TD="align: right"]2[/TD]

[TD="align: center"]7[/TD]
[TD="align: center"]24/08/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]8[/TD]
[TD="align: center"]21/06/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]9[/TD]
[TD="align: center"]08/09/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]10[/TD]
[TD="align: center"]01/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]11[/TD]
[TD="align: center"]11/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]12[/TD]
[TD="align: center"]25/08/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]13[/TD]
[TD="align: center"]14/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]14[/TD]
[TD="align: center"]03/04/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]15[/TD]
[TD="align: center"]16/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]16[/TD]
[TD="align: center"]25/02/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]17[/TD]
[TD="align: center"]21/11/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]18[/TD]
[TD="align: center"]28/11/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]19[/TD]
[TD="align: center"]23/10/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]20[/TD]
[TD="align: center"]15/12/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]21[/TD]
[TD="align: center"]08/12/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]22[/TD]
[TD="align: center"]13/11/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]23[/TD]
[TD="align: center"]05/01/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]24[/TD]
[TD="align: center"]14/03/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]25[/TD]
[TD="align: center"]19/12/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]26[/TD]
[TD="align: center"]05/10/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]27[/TD]
[TD="align: center"]21/02/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]28[/TD]
[TD="align: center"]25/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]29[/TD]
[TD="align: center"]11/05/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

[TD="align: center"]30[/TD]
[TD="align: center"]03/07/2017[/TD]

[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]
[TD="align: right"][/TD]

</tbody>







There's data in columns A:C

Find the frequency of the items in column B


Run the print_unique macro and you get the output data in H:I showing a DISTINCT list of entries and the number of times each was found.



Run the test macro and you get output data in K:L (a byproduct of using a range to sort the data).. and the output (asks for ten highest frequencies), outputs in column F





Much Kudos goes to CHIRP from just over 5 years ago... whose distinct list code I amended in order to get the frequency and quantity.

I did add a link to the original code but the forum won't let me post - something about [FONT=&quot]Chrome detected unusual code on this page and blocked it to protect your personal information (for example, passwords, phone numbers and credit cards).[/FONT]
 
Upvote 0

Forum statistics

Threads
1,224,827
Messages
6,181,195
Members
453,021
Latest member
pingpong7117

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top