VBA - Identify / mark distinct values in a range without COUNTIF

bartmaster

New Member
Joined
Jan 30, 2019
Messages
21
Office Version
  1. 365
  2. 2016
Platform
  1. Windows
Please see below example table.

Column A has random numbers between 1 to 100
Column B has the following formula copied down starting from B2:
Excel Formula:
=IF(COUNTIF($A$2:A2,A2)=1,1,0)

I would like to get value of 1 in column B if the row in column A is distinct, else 0.

This works well only for small data sets but I need this to work efficiently for data sets exceeding 100k rows.

I have tried the below VBA code for 50k rows that would basically do the same thing as the formula but also replace formula with the cell value to speed up calculations (I think this is correct).
I have also printed how much time it took to get to a particular row number.

VBA Code:
Sub get_unique()

Dim startTime As Double

startTime = Now

Dim cell As Range, xRng As Range, lr As Long
lr = Sheet1.Cells(Sheet1.Rows.Count, 4).End(xlUp).Row
Set xRng = Sheet1.Range("b2:b50000")

For Each cell In xRng.Cells
    cell.Formula = "=IF(COUNTIF($A$2:A" & cell.Row & ",A" & cell.Row & ")=1,1,0)"
    cell.Value = cell.Value
    Select Case cell.Row
        Case Is = 100
            
            Range("e2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 1000
            Range("f2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 10000
            Range("g2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 20000
            Range("h2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 30000
            Range("i2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 40000
            Range("j2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
        Case Is = 50000
            Range("k2").Value = Round(((Now - startTime) * 60 * 60 * 24), 0)
            DoEvents
    End Select
    
Next cell

MsgBox "refresh time: " & Round(((Now - startTime) * 60 * 60 * 24), 0) & " seconds"



End Sub



Processing Time:
cell E1 - row 1000: 1 sec
cell F1 - row 5'000: 6 sec
cell G1 - row 10'000: 69 sec
cell H1 - row 20'000: 148 sec
cell I1 - row 30'000: 237 sec
cell J1 - row 40'000: 338 sec
cell K1 - row 50'000: 450 sec

Is there any quicker way to achieve checking if a value in a range is a distinct value (that would work for both: numeric and string values)?

I looked through the forum and only found how to extract unique values from a range.
I do not want to copy my data anywhere or delete duplicated rows.
Data set that is being checked (column A) must remain unchanged.

Attached is the screenshot of the example table.

Thanks
 

Attachments

  • TABLE.jpg
    TABLE.jpg
    64.2 KB · Views: 18

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Just to let you know I have tried running second version of my code but this time replaced this:
VBA Code:
cell.Formula = "=IF(COUNTIF($A$2:A" & cell.Row & ",A" & cell.Row & ")=1,1,0)"
    cell.Value = cell.Value
for this:
VBA Code:
    cell.Value = Application.WorksheetFunction.CountIf(checkRange, CurrentValue)
    If cell.Value = 1 Then
        cell.Value = 1
    Else
        cell.Value = 0
    End If


The total refresh time has been reduced by about 16%: from 450 seconds to 377.
Is there any way to improve this even more using some "advanced" code / functions / dictionary or anything else?
 

Attachments

  • processing_Times.jpg
    processing_Times.jpg
    35.4 KB · Views: 14
Upvote 0
Try this code which uses the dictionary object and is very fast, it gives you a count of how many times a value occurs obviously you can modify it to get 0 and 1 if you prefer that. I thought the count might useful. It is very fast!!!
VBA Code:
Sub test()
ary = Worksheets("Sheet1").Range("A1:B50000")
   Dim Dic As Object
   Set Dic = CreateObject("Scripting.dictionary")
   For i = 1 To UBound(ary, 1)
       If Not (Dic.exists(ary(i, 1))) Then
         Dic(ary(i, 1)) = i
         ary(i, 2) = 1
       Else
        ary(Dic(ary(i, 1)), 2) = ary(Dic(ary(i, 1)), 2) + 1
        ary(i, 2) = ary(Dic(ary(i, 1)), 2)
       End If
   Next i
 Worksheets("Sheet1").Range("A1:B50000") = ary
 
End Sub
 
Upvote 0
The counter was giving me some unexpected results so I tweaked @offthelip's code to give you the 1 for unique and 0 for not unique.

VBA Code:
Sub test_Mod()

    Dim ary As Variant
    Dim i As Long
    Dim sht As Worksheet
    
    Set sht = ActiveSheet
    
    ary = sht.Range("A2:B" & sht.Cells(Rows.Count, "A").End(xlUp).Row)
    Dim Dic As Object
    Set Dic = CreateObject("Scripting.dictionary")
    For i = 1 To UBound(ary, 1)
        If Not (Dic.exists(ary(i, 1))) Then
          Dic(ary(i, 1)) = i
          ary(i, 2) = 1
        Else
         ary(Dic(ary(i, 1)), 2) = 0
         ary(i, 2) = 0
        End If
    Next i
    sht.Range("A2").Resize(UBound(ary, 1), 2) = ary
 
End Sub
 
Upvote 0
@offthelip
Thank you so much, the dictionary is exactly what I was looking for. I will read more to understand it better as it has got enormous potential.
I've tested the use of dict for 83k rows.
Using my old method (with formulas) it took 21 mins.
Using dict took 39 seconds!
Unbelievable! ;)

@Alex Blakenburg

To be exact, I wanted to place "1" next to a value that appeared for the first time and 0 if it already exists in dict.

I tweaked @offthelip code to get the following that returned exactly the same result as I was expecting.

shLines on my code is ThisWorkbook.Worksheets ("Sheet1")

VBA Code:
ary = shLines.Range("b5:c82880")
   Dim Dic As Object
   Set Dic = CreateObject("Scripting.dictionary")
   For i = 1 To UBound(ary, 1)
       If Not (Dic.exists(ary(i, 1))) Then
         Dic(ary(i, 1)) = i
         ary(i, 2) = 1
       Else
        Dic(ary(i, 1)) = i
        ary(i, 2) = 0
       End If
   Next i
shLines.Range("b5:c82880") = ary

Many thanks to both of you for your input, much appreciated!
 
Upvote 0
Solution
@Alex Blakenburg
I can see my efforts to count the number of occurrences falls over in that it doesn't upate the second occurence when the third occurence happens. Thanks for pointing that out.
@bartmaster
Dictionaries are worth learning about, I don't think they are a difficult or a complicated as the documentation seems to make out. I always think of them as being a single dimension array with a variable index instead of a integer index.
 
Upvote 0

Forum statistics

Threads
1,225,734
Messages
6,186,715
Members
453,369
Latest member
positivemind

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top