Frequency of values in a huge asymmetrical range/ table

sauronbaggins

New Member
Joined
May 7, 2020
Messages
5
Office Version
  1. 2016
Platform
  1. Windows
  2. MacOS
Hi guys,

I have tried googling and the things that I know. However, I am not getting the solution. I have an asymmetrical data set of names. It has approx 20000 rows. Column A is never empty for any of the rows. The farthest column that holds a value in it is "HZ" (not all rows have values until HZ, meaning a LOT of the columns are empty. I made it a "Table" and tried to summarize with a pivot table but that doesn't work. What's the best way to find frequency of all the names in such a huge range?

To give you an idea of how the data looks, please see this table.

GameColumn 1Column 2Column 3
TennisBadmintonChess
Golf
TennisBaseball
BasketballTable Tennis
TennisGolf
Hockey

The output I want is this:

GameCount
Tennis3
Golf2
Hockey1
Badminton1
Baseball1
Basketball1
Chess1
Table Tennis1

If there is a way to consolidate/ transpose all the values across this range in a single column, that might also work. I do not have the list of all the unique "sport" values. The unique values run in a few thousands, so it is not possible to do it manually. I first need to create a list of unique values and then find the frequency of each of those across the range.
 

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).
Welcome to the forum!

You'll need a macro to do this. Try this:

Open a copy of your workbook. Press Alt-F11 to open the VBA editor. Press Alt-IM to Insert a Module. Paste the following code into the window that opens:

Rich (BB code):
Sub GetUniques()
Dim ur As Variant, op() As Variant, r As Long, c As Long
Dim dict As Object, dicta As Variant, dictb As Variant

    ur = Sheets("Sheet1").UsedRange.Value
    Set dict = CreateObject("Scripting.Dictionary")
    
    For r = 1 To UBound(ur)
        For c = 1 To UBound(ur, 2)
            If ur(r, c) <> "" Then
                dict(ur(r, c)) = dict(ur(r, c)) + 1
            End If
        Next c
    Next r
    
    ReDim op(1 To dict.Count, 1 To 2)
    dicta = dict.keys
    dictb = dict.items
    For r = 1 To UBound(op)
        op(r, 1) = dicta(r - 1)
        op(r, 2) = dictb(r - 1)
    Next r
    
    Sheets("Sheet2").Range("A1").Resize(dict.Count, 2) = op
    
End Sub

Change the sheet name in red to the sheet with the data, and the sheet name and cell in blue to where you want the results to go. Press Alt-Q to close the editor.

In Excel, press Alt-F8 to open the macro selector. Select GetUniques and click Run. That should do it. Take a look and see what you think. We can ignore a header row if there is one, and we can sort the results alphabetically or by frequency too. Hope this helps.
 
Upvote 0
Welcome to the forum!

You'll need a macro to do this. Try this:

Open a copy of your workbook. Press Alt-F11 to open the VBA editor. Press Alt-IM to Insert a Module. Paste the following code into the window that opens:

Rich (BB code):
Sub GetUniques()
Dim ur As Variant, op() As Variant, r As Long, c As Long
Dim dict As Object, dicta As Variant, dictb As Variant

    ur = Sheets("Sheet1").UsedRange.Value
    Set dict = CreateObject("Scripting.Dictionary")
   
    For r = 1 To UBound(ur)
        For c = 1 To UBound(ur, 2)
            If ur(r, c) <> "" Then
                dict(ur(r, c)) = dict(ur(r, c)) + 1
            End If
        Next c
    Next r
   
    ReDim op(1 To dict.Count, 1 To 2)
    dicta = dict.keys
    dictb = dict.items
    For r = 1 To UBound(op)
        op(r, 1) = dicta(r - 1)
        op(r, 2) = dictb(r - 1)
    Next r
   
    Sheets("Sheet2").Range("A1").Resize(dict.Count, 2) = op
   
End Sub

Change the sheet name in red to the sheet with the data, and the sheet name and cell in blue to where you want the results to go. Press Alt-Q to close the editor.

In Excel, press Alt-F8 to open the macro selector. Select GetUniques and click Run. That should do it. Take a look and see what you think. We can ignore a header row if there is one, and we can sort the results alphabetically or by frequency too. Hope this helps.


This worked perfectly! Thanks so much! :)
 
Upvote 0
This is an alternative code that is shorter & might be slightly faster. This will ignore header row data ...

VBA Code:
Sub CntUnique()
Dim a: a = [A1].CurrentRegion.Offset(1)
    With CreateObject("scripting.dictionary")
        For Each i In a
            If Not IsEmpty(i) Then If Not .exists(i) Then .Add i, 1 Else .Item(i) = .Item(i) + 1
        Next
        Sheets.Add(, ActiveSheet).[A1].Resize(.Count, 2) = Application.Transpose(Array(.keys, .items))
    End With
End Sub
 
Upvote 0
This is an alternative code that is shorter & might be slightly faster. This will ignore header row data ...

It's true that your code is somewhat shorter, but it appears that the biggest difference is your use of the Application.Transpose function. I don't know if you're aware, but that function has a limit of 32K values. Given that the OP said that the sheet has over 20K rows, with multiple entries per row, I considered it possible that the 32K limit would be exceeded. That's why I eschewed the Transpose function and used a longer but more robust method.
 
Upvote 0
I don't know if you're aware, but that function has a limit of 32K values.

Hi Eric, I was just experimenting to see how would excel behave if I have data with over 32k records but it was working fine ?. I did some search & it seems that the limit is 65,536 which once exceeded, excel stars to act weird as it shows some records until some point (way before the 65K limit) then it shows #N/A. Just wanted to share :)
 
Upvote 0
It's true that your code is somewhat shorter, but it appears that the biggest difference is your use of the Application.Transpose function. I don't know if you're aware, but that function has a limit of 32K values. Given that the OP said that the sheet has over 20K rows, with multiple entries per row, I considered it possible that the 32K limit would be exceeded. That's why I eschewed the Transpose function and used a longer but more robust method.

Good point. For sure it would have exceeded 32k. The total unique values I got was ~51k.
 
Upvote 0
What about if we want to extract the last 2 comma separated values from right from each row?

For example, of the table is like the one below:

GameColumn 1Column 2Column 3
TennisBadmintonChess
Golf
TennisBaseball
BasketballTennisTable Tennis
TennisGolf
Hockey

The number of comma separated values in the rows is not constant, i.e., some row might have as many as 100 comma separated values and other might only have 1. Most have more than 1 though.

Now, I need to extract the first 2 (from left) and the last 2 (from right) comma separated values for all rows. Once I have that for all the rows, I will once again do a frequency calculation.

Getting the 1st 2 values is fairly easy - just "distribute to columns" and copy the 1st 2 columns. How do I get the last 2 ones from the right (given that the number of comma separated values is not constant)?
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,175
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top