VBA: Check for duplicates in many to one (or two) relationships

SteveOranjin

Board Regular
Joined
Dec 18, 2017
Messages
170
Hello,

I would like to be able to write a macro that looks for duplicates across "Models". What I mean by this specifically is that, there are certain properties that can exist within our products that must of necessity only occur once. However, due to a different concept of what constitutes a model, often times companies that send us data do not respect this mutually exclusive relationship. I'd like some help uncovering these violations and reporting them. Let me explain what I mean:

A product must of necessity have only one occurrence of any given finish across a model. When we say model, we mean "Subaru Impreza". When we refer to a particular model, or a SKU, we mean "Subaru Impreza - Blue". A model may have 1,2,5 or 10 variations in "SKUs" that represent that model. Often times however, due to poor data standardizations that exist within my industry, companies have different concepts of what a model is. As a result, often times, I will load up a brand's products, and will find that a given model of product has duplicates of the same finish. So in certain sections of our website, where users are not supposed to be able to find such duplicates as these, they are often seeing, "Subaru Impreza - Blue, Subaru Impreza - Green, Subaru Impreza - Blue".

An example of this is below:

[TABLE="width: 500, align: center"]
<tbody>[TR]
[TD][/TD]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Model[/TD]
[TD]SKU[/TD]
[TD]Name[/TD]
[TD]Finish[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]111[/TD]
[TD]111-21[/TD]
[TD]Toilet[/TD]
[TD]Blue[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]111[/TD]
[TD]111-22[/TD]
[TD]Toilet[/TD]
[TD]Green[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]111[/TD]
[TD]111-23[/TD]
[TD]Toilet[/TD]
[TD]Purple[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]111[/TD]
[TD]111-24[/TD]
[TD]Toilet[/TD]
[TD]Blue[/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD]222[/TD]
[TD]222-21[/TD]
[TD]Large Toilet[/TD]
[TD]Blue[/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]222[/TD]
[TD]222-22[/TD]
[TD]Large Toilet[/TD]
[TD]Green[/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]222[/TD]
[TD]222-23[/TD]
[TD]Large Toilet[/TD]
[TD]Purple[/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]222[/TD]
[TD]222-24[/TD]
[TD]Large Toilet[/TD]
[TD]Blue[/TD]
[/TR]
</tbody>[/TABLE]

What I'd like help writing a macro with is a macro that can spot when there are duplicates of finish type within a model. So in this case, I'm hoping it is possible to create a macro that will spot the duplicates of a blue finish and return a count of those occurrences to the user, so that they can know to go and look for them. Hoping for some help. If there is a good samaritan who is more skilled than me in VBA and is kind enough to help me, then I would be greatly appreciative of it.

Thanks
 

Excel Facts

How to show all formulas in Excel?
Press Ctrl+` to show all formulas. Press it again to toggle back to numbers. The grave accent is often under the tilde on US keyboards.
To be clear on this:

"As a user, I would like to be able to spot duplicates of a particular finish in a model and have a count of those returned to me"


  • The data the macro will be scanning can have as few as say, three columns, and as many as 60.
  • It should be able to spot more than one duplicate of finish within a model. So in our example above, if there were two blues and two greens for each model than it would be able to detect that.
  • The data it will be looking through can have as few as say, twenty lines, and could have as many as 30,000 thousand.
  • The columns it will be analyzing will be analyzed by header. It will be looking to compare the relationship between the column with the header, "Model" and "mFinish"
 
Last edited:
Upvote 0
Hello SteveOranjin,

The macro below will search "Sheet1" row 1 for the "Model" and "mFinish". "Sheet2" will contain the duplicate information about these two columns. The report will list the model, finish, number of duplicates, and the row number where the duplicates were found.

You can download an example workbook from my MediaFire account. You should be able to adapt this code to your workbook easily. If you have any problems, please let me know.

Here is the link to the workbook Find Duplicates Using Column Labels

Code:
Sub FindDuplicates()


    Dim cnt     As Long
    Dim Data    As Variant
    Dim DupeCnt As Long
    Dim Dict    As Object
    Dim Finish  As Range
    Dim Key     As Variant
    Dim Headers As Range
    Dim Model   As Range
    Dim Rng     As Range
    Dim RowCnt  As Long
    Dim WksIn   As Worksheet
    Dim WksOut  As Worksheet
    
        Set WksIn = ThisWorkbook.Worksheets("Sheet1")
        Set WksOut = ThisWorkbook.Worksheets("Sheet2")
        
        Set Rng = WksIn.Range("A1").CurrentRegion
        
        Set Headers = Rng.Rows(1).Cells
        
        Set Model = Headers.Find("Model", , xlValues, xlWhole, xlByColumns, xlNext, False, False, False)
            If Model Is Nothing Then
                MsgBox "Model Header was Not Found.", vbCritical: Exit Sub
            Else
                Set Model = Rng.Columns(Model.Column).Cells
            End If
        
        Set Finish = Headers.Find("mFinish")
            If Finish Is Nothing Then
                MsgBox "mFinish Header was Not Found.", vbCritical: Exit Sub
            Else
                Set Finish = Rng.Columns(Finish.Column).Cells
            End If
        
        Set Dict = CreateObject("Scripting.Dictionary")
            Dict.CompareMode = vbTextCompare
            
            For RowCnt = 1 To Model.Rows.Count
                Key = Trim(Model.Cells(RowCnt, 1) & Finish.Cells(RowCnt, 1))
                If Not Dict.Exists(Key) Then
                    ReDim Data(1 To 4)
                    Dict.Add Key, Data
                Else
                    Data = Dict(Key)
                        If Data(1) = Empty Then Data(1) = Model.Cells(RowCnt, 1)
                        If Data(2) = Empty Then Data(2) = Finish.Cells(RowCnt, 1)
                        Data(3) = Data(3) + 1
                        If Data(4) = Empty Then
                            Data(4) = Model.Rows(RowCnt).Row & ""
                        Else
                            Data(4) = Data(4) & "," & Model.Rows(RowCnt).Row
                        End If
                    Dict(Key) = Data
                End If
            Next RowCnt
        
        Set Rng = Intersect(WksOut.UsedRange, WksOut.UsedRange.Offset(1, 0))
        If Not Rng Is Nothing Then Rng = Empty
        
        Application.ScreenUpdating = False
        
        For Each Key In Dict.Keys
            If Dict(Key)(3) > 0 Then
                WksOut.Range("A2:D2").Offset(cnt, 0).Value = Dict(Key)
                cnt = cnt + 1
            End If
        Next Key
        
        Application.ScreenUpdating = True
        
End Sub
 
Upvote 0
Also,

is there a way you can turn those results into hyperlink? So I can click each occurence and then investigate it.

Hope you are well,

Steve
 
Upvote 0
Hello Steve,

I downloaded the workbook. The problem is in Module3. You have defined a Sub Procedure called Trim. I disabled the Sub in my copy for two reasons. First, your Sub conflicts with the VBA keyword Trim used for string functions. Secondly, It does not appear that you use anywhere else in the project.

On the Data Sheet columns K and L are both labeled mFinish. My understanding was there would be only one column named mFinish. Was I wrong?
 
Upvote 0
You were right. I wasn't able to get those macros working before. So I just removed them and it is working.

Is there a way that the results on the "Finish Duplicates" tab could be turned into hyperlinks?

Hope you are well
 
Upvote 0
Also, you were right. There is only supposed to be one column. I removed that and with those two steps it worked.
 
Upvote 0
Hello Steve,

Do you want the worksheet "Finish Duplicates" to be a single column with hyperlinks to each duplicate?
 
Upvote 0
Hello Steve,

Do you want the worksheet "Finish Duplicates" to be a single column with hyperlinks to each duplicate?

Is there a way to set it up so the line numbers you are providing all links? That is, they are hyperlinks that I can select and it will automatically take me to the right place? If not, that might be better. Having said that, I like how it returns some information to me regarding that what/where.

If there is some alternative that is easier for you, and the hyperlinks for each line number are too hard, than let's go with that.

Hope you are well,

Steve
 
Upvote 0

Forum statistics

Threads
1,223,239
Messages
6,170,947
Members
452,368
Latest member
jayp2104

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top