Macro to search duplicates by matching any 3 of the 5 fields in a record.

Abhijeet_Chib

New Member
Joined
Feb 27, 2014
Messages
1
Hello

I have a database with a lot of records and each record has 5 fields namely INVOICE NUMBER, DATE, AMOUNT, SHIPMENT NUMBER and QUANTITY.

How do i write a Macro that searches for duplicates. The way to searching duplicates is that if any three fields of the record match then the record is highlighted in a separate colour. The matching fields are not specified, they can be any three fields.

For example, if INVOICE NUMBER, SHIPMENT NUMBER and QUANTITY match in some records then its shown as duplicates and highlighted. Similarly if INVOICE NUMBER, QUANTITY and DATE matches in a few records then they are listed as duplicates as well.

So in the database, out of the 5 fields if any 3 match then they are listed as duplicates and highlighted in a different colour.

Can anyone please help me write a Macro that does that.
Cheers !!
Abhii
 

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"
Hello Abhii and welcome to the Board

If your database looks like this:

Plan1

*ABCDE

<colgroup><col style="font-weight:bold; width:30px; "><col style="width:112px;"><col style="width:85px;"><col style="width:76px;"><col style="width:128px;"><col style="width:64px;"></colgroup><tbody>
[TD="bgcolor: #cacaca, align: center"]1[/TD]
[TD="align: center"]Invoice Number[/TD]
[TD="align: center"]Date[/TD]
[TD="align: center"]Amount[/TD]
[TD="align: center"]Shipment Number[/TD]
[TD="align: center"]Quantity[/TD]

[TD="bgcolor: #cacaca, align: center"]2[/TD]
[TD="align: center"]1W6[/TD]
[TD="align: center"]25/07/2012[/TD]
[TD="align: center"]500[/TD]
[TD="align: center"]1[/TD]
[TD="align: center"]10[/TD]

[TD="bgcolor: #cacaca, align: center"]3[/TD]
[TD="align: center"]6T4[/TD]
[TD="align: center"]07/11/2013[/TD]
[TD="align: center"]700[/TD]
[TD="align: center"]23[/TD]
[TD="align: center"]120[/TD]

[TD="bgcolor: #cacaca, align: center"]4[/TD]
[TD="align: center"]9G5[/TD]
[TD="align: center"]20/02/2015[/TD]
[TD="align: center"]2500[/TD]
[TD="align: center"]21[/TD]
[TD="align: center"]30[/TD]

[TD="bgcolor: #cacaca, align: center"]5[/TD]
[TD="align: center"]9T6[/TD]
[TD="align: center"]04/04/2014[/TD]
[TD="align: center"]1100[/TD]
[TD="align: center"]27[/TD]
[TD="align: center"]40[/TD]

[TD="bgcolor: #cacaca, align: center"]6[/TD]
[TD="align: center"]6T4[/TD]
[TD="align: center"]17/09/2017[/TD]
[TD="align: center"]1300[/TD]
[TD="align: center"]9[/TD]
[TD="align: center"]50[/TD]

[TD="bgcolor: #cacaca, align: center"]7[/TD]
[TD="align: center"]8H7[/TD]
[TD="align: center"]31/12/2018[/TD]
[TD="align: center"]1500[/TD]
[TD="align: center"]13[/TD]
[TD="align: center"]70[/TD]

[TD="bgcolor: #cacaca, align: center"]8[/TD]
[TD="align: center"]8H7[/TD]
[TD="align: center"]14/04/2020[/TD]
[TD="align: center"]1700[/TD]
[TD="align: center"]13[/TD]
[TD="align: center"]70[/TD]

[TD="bgcolor: #cacaca, align: center"]9[/TD]
[TD="align: center"]9G5[/TD]
[TD="align: center"]04/04/2014[/TD]
[TD="align: center"]1900[/TD]
[TD="align: center"]27[/TD]
[TD="align: center"]40[/TD]

[TD="bgcolor: #cacaca, align: center"]10[/TD]
[TD="align: center"]6T4[/TD]
[TD="align: center"]10/11/2022[/TD]
[TD="align: center"]1300[/TD]
[TD="align: center"]17[/TD]
[TD="align: center"]50[/TD]

[TD="bgcolor: #cacaca, align: center"]11[/TD]
[TD="align: center"]1W6[/TD]
[TD="align: center"]25/07/2012[/TD]
[TD="align: center"]500[/TD]
[TD="align: center"]19[/TD]
[TD="align: center"]100[/TD]

[TD="bgcolor: #cacaca, align: center"]12[/TD]
[TD="align: center"]9G5[/TD]
[TD="align: center"]07/06/2025[/TD]
[TD="align: center"]2500[/TD]
[TD="align: center"]21[/TD]
[TD="align: center"]110[/TD]

[TD="bgcolor: #cacaca, align: center"]13[/TD]
[TD="align: center"]6T4[/TD]
[TD="align: center"]20/09/2026[/TD]
[TD="align: center"]2700[/TD]
[TD="align: center"]23[/TD]
[TD="align: center"]120[/TD]

[TD="bgcolor: #cacaca, align: center"]14[/TD]
[TD="align: center"]9G5[/TD]
[TD="align: center"]03/01/2028[/TD]
[TD="align: center"]500[/TD]
[TD="align: center"]1[/TD]
[TD="align: center"]10[/TD]

[TD="bgcolor: #cacaca, align: center"]15[/TD]
[TD="align: center"]7S4[/TD]
[TD="align: center"]04/04/2014[/TD]
[TD="align: center"]3100[/TD]
[TD="align: center"]27[/TD]
[TD="align: center"]40[/TD]

</tbody>


Excel tables to the web >> Excel Jeanie HTML 4

then this code should do it:

Code:
Option Explicit: Option Base 1


Sub GroupRecords()
Dim comb, vis As Range, j%, i%, lr%, r%, g%, b%, dup As Boolean, k%
' the combinations
comb = Array(1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 3, 4, 1, 3, 5, _
1, 4, 5, 2, 3, 4, 2, 3, 5, 2, 4, 5, 3, 4, 5)
lr = Range("a" & Rows.Count).End(xlUp).Row
Range("a1:e" & lr).Font.Color = RGB(0, 0, 0)
For k = 2 To lr         ' the records
    On Error Resume Next
    ActiveSheet.ShowAllData
    On Error GoTo 0
    dup = False
    For j = 1 To 10     ' the combinations
        Cells(1, 8).Value = Cells(1, comb(3 * j - 2)).Value
        Cells(1, 9).Value = Cells(1, comb(3 * j - 1)).Value
        Cells(1, 10).Value = Cells(1, comb(3 * j)).Value
        Cells(2, 8).Value = Cells(k, comb(3 * j - 2)).Value
        Cells(2, 9).Value = Cells(k, comb(3 * j - 1)).Value
        Cells(2, 10).Value = Cells(k, comb(3 * j)).Value
        Range("A1:E" & lr).AdvancedFilter Action:=xlFilterInPlace, _
        CriteriaRange:=Range("H1:J2"), Unique:=False
        Set vis = Range("a1:e" & lr).SpecialCells(xlCellTypeVisible)
        If vis.Areas.Count > 2 Then dup = True
        If vis.Areas.Count = 2 Then
            If vis.Areas(1).Rows.Count > 1 Or vis.Areas(2).Rows.Count > 1 Then dup = True
        End If
        If dup Then             ' a group was detected
            r = Int(250 * Rnd)  ' random colors
            g = Int(250 * Rnd)
            b = Int(250 * Rnd)
            For i = 1 To vis.Areas.Count
                If vis.Areas(i).Cells(1, 1).Font.Color = RGB(0, 0, 0) Then _
                vis.Areas(i).Font.Color = RGB(r, g, b)
            Next
        End If
        If dup Then Exit For
    Next
Next
Range("a1:e1").Font.Color = RGB(0, 0, 0)
On Error Resume Next: ActiveSheet.ShowAllData
End Sub
 
Upvote 0
@Abhii
If Worf's code does not do what you want, you may need to provide us with a small set of sample data and the expected results with explanation in relation to that sample.


@Worf
I must admit I'm not quite sure what the OP expects or what the data is like but I've changed your sample data little and run your code over it, see below.
I've manually highlighted the two yellow rows and they certainly have at least 3 columns in common (A, B, E).
Similarly the two greeny rows have at least 3 columns in common (A, B, D).
However, rows 6 and 10 also have 3 columns in common (A, C, D) but have been coloured differently by your code.

What the OP would expect if that data is possible I don't know as ..
row 10 is a 'duplicate' of row 3,
row 6 is a 'duplicate' of row 10,
row 13 is a 'duplicate' of row 6,
but
row 13 is not a 'duplicate' of row 3

Excel Workbook
ABCDE
1Invoice NumberDateAmountShipment NumberQuantity
21W625/07/12500110
36T47/11/1370023120
49G520/02/1525002130
59T64/04/1411002740
66T417/09/1713001750
78H731/12/1815001370
88H714/04/2017001370
99G54/04/1419002740
106T47/11/13130017120
111W625/07/1250019100
129G57/06/25250021110
136T417/09/1727001722
149G53/01/28500110
157S44/04/1431002740
Match at least 3




@VBA Geek
Would you care to share what Conditional Formatting you have in mind?
 
Upvote 0
Hello Peter,

from what I understand, his wish is to highlight the duplicate rows, rows are duplicate if 3 or more fields match. so i would use

=ISNUMBER(MATCH(1,IF(((($A2=$A$2:$A$15)+($B2=$B$2:$B$15)+($C2=$C$2:$C$15)+($D2=$D$2:$D$15)+($E2=$E$2:$E$15))>=3)*((ROW($A$2:$A$15)-ROW($A$1))<>ROWS(A$2:A2)),1),0))

if instead he wants to have for each duplicate a diffent colour then of course vba is the way to go :)
 
Upvote 0
from what I understand, his wish is to highlight the duplicate rows, rows are duplicate if 3 or more fields match. so i would use

=ISNUMBER(MATCH(1,IF(((($A2=$A$2:$A$15)+($B2=$B$2:$B$15)+($C2=$C$2:$C$15)+($D2=$D$2:$D$15)+($E2=$E$2:$E$15))>=3)*((ROW($A$2:$A$15)-ROW($A$1))<>ROWS(A$2:A2)),1),0))

if instead he wants to have for each duplicate a diffent colour then of course vba is the way to go :)
I'd have to say I wouldn't call that a "simple conditional format" ;)

But thanks for sharing - now we can see if it suits the OP's requirements.
 
Last edited:
Upvote 0
Hi Peter

What I understood from the OP’s description was:

- If any three fields on a record match another record, they belong to the same family
- Each family will have a different color, and it will apply to the whole record
- The database is to be flagged in place
- If a record can belong to more than one family, I simply choose one, because a record cannot have two colors at the same time
- So per the above, the following pairings would be acceptable, (6,13) or (3,10) or (6,10).How would you treat this situation?

- I guess we will have to wait and see what Abhii has to say…
 
Upvote 0
How would you treat this situation?

- I guess we will have to wait and see what Abhii has to say…
I don't have an answer to the first point here, so I'm agreeing with the second at this stage. :)
 
Upvote 0

Forum statistics

Threads
1,223,911
Messages
6,175,324
Members
452,635
Latest member
laura12345

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top