VBA for finding duplicated rows of data with changes

Abgar

Active Member
Joined
Jun 20, 2009
Messages
265
Hi all,

Hoping someone can help with a quick VBA code (or point me in the right direction).

I have a data worksheet, which contains many rows of data, spreading across multiple columns. Each day, a new extract of data is added to the spreadsheet (added as new rows at the bottom), and there will be duplicated data.
We have been using conditional formatting on Column A (a unique ID number) to show the duplicates, and then we delete the duplicates.
When adding the new data, sometimes the Column A will be the same, but there will be a different value in Column H (based on changes in the system we extract the data from).

I'm hoping to essentially have a VBA code that will run through each row and check if ALL data for the row in columns A-H is the same. If it is all the same, highlight the duplicated row of data in Fill Colour RED
If Column A is the same, but there is any difference in data from Column B-H, then highlight the row in Fill Colour Green.
And if Column A is unique (not duplicated), then don't highlight that row at all.

Per the below example, the top 8 rows are the original data, then the new data is added below. Rows 11, 13 and 15 are highlighted green, becuase the ID in column A is a duplicate, but some of the other data is different.
Rows 12 and 14 are highlighted red becuase Column A is a duplicate, but all the other data is the same.
Columns 16-18 are not highlighted, becuase Column A is a new unique value, and is not duplicated.

Book1
ABCDEFGH
1Unique IDSurnameFirst nameClientStateCategoryContactDate
2302383SlaterTomBills BoatsTasmaniaLockyer17/03/2021
3297842SmithPeterBills BoatsQueenslandLockyer17/03/2021
4302097TemplePaulToms TomatosNew South WalesBiggestLockyer17/03/2021
5302272RoughMarkBetteys BeerTasmaniaTrident17/03/2021
6299573GroutJohnBills BoatsTasmaniaBaddestTrident17/03/2021
7302900WilsonLukeToms TomatosQueenslandGoodall17/03/2021
8301106SmithMatthewBills BoatsNew South WalesPeters17/03/2021
9302163BleekerSallyBetteys BeerTasmaniaSmith17/03/2021
10
11302383SlaterTomBills BoatsTasmaniaLockyer18/03/2021
12297842SmithPeterBills BoatsQueenslandLockyer17/03/2021
13302097TemplePaulToms TomatosNew South WalesBiggestSmith17/03/2021
14302272RoughMarkBetteys BeerTasmaniaTrident17/03/2021
15299573GroutJohnToms TomatosTasmaniaBaddestTrident17/03/2021
16123456WilsonLukeToms TomatosQueenslandGoodall17/03/2021
17654321SmithMatthewBills BoatsNew South WalesPeters17/03/2021
18987645BleekerSallyBetteys BeerTasmaniaSmith17/03/2021
Sheet1



Thanks in advance for any info you can share.

Cheers,
ABGar
 

Excel Facts

VLOOKUP to Left?
Use =VLOOKUP(A2,CHOOSE({1,2},$Z$1:$Z$99,$Y$1:$Y$99),2,False) to lookup Y values to left of Z values.
I loop from last row so that I would mark the duplicate or almost similar at later rows and not in the early rows. It works for this small sample

VBA Code:
Sub MarkDupe()

Dim n As Long, m As Long, k As Long, eRow As Long
Dim ArryData(), Element As Variant

eRow = Range("A" & Rows.Count).End(xlUp).Row

For n = eRow To 2 Step -1
    For m = eRow To 2 Step -1
        If Range("A" & m).Interior.ColorIndex = xlColorIndexNone And Not n = m Then
            If Range("A" & m) = Range("A" & n) Then
                For k = 2 To 8
                    If Not Cells(m, k) = Cells(n, k) Then
                        Range("A" & n, "H" & n).Interior.ColorIndex = 10
                        GoTo NotAll
                    End If
                Next
NotAll:
                If Range("A" & n).Interior.ColorIndex = xlColorIndexNone Then
                Range("A" & n, "H" & n).Interior.ColorIndex = 3
                End If
            End If
        End If
    Next
Next

End Sub
 
Upvote 0
@Zot - sorry, when processing on the live data (much larger data range), we realised that some of the data in columns E and F will not match, (this is due to our systems) whereas all other data does match.
Is it possible to amend the script to exclude comparing columns E and F, but still compare A-D and G-H?
 
Upvote 0
@Zot - sorry, when processing on the live data (much larger data range), we realised that some of the data in columns E and F will not match, (this is due to our systems) whereas all other data does match.
Is it possible to amend the script to exclude comparing columns E and F, but still compare A-D and G-H?
The column correspond to variable k in the code. B = 2 (second column) to H = 8 (eight column). Now need to skip 5 and 6.

Instead of using IF k= 5 0r 6, I use array to define value of k. You can modify the Arryk for column to be compared.

VBA Code:
Sub MarkDupe()

Dim n As Long, m As Long, k As Long, eRow As Long
Dim ArryData(), Arryk(), Element As Variant

eRow = Range("A" & Rows.Count).End(xlUp).Row
Arryk = Array(2, 3, 4, 7, 8)

For n = eRow To 2 Step -1
    For m = eRow To 2 Step -1
        If Range("A" & m).Interior.ColorIndex = xlColorIndexNone And Not n = m Then
            If Range("A" & m) = Range("A" & n) Then
                For k = 0 To 4
                    If Not Cells(m, Arryk(k)) = Cells(n, Arryk(k)) Then
                        Range("A" & n, "H" & n).Interior.ColorIndex = 10
                        GoTo NotAll
                    End If
                Next
NotAll:
                If Range("A" & n).Interior.ColorIndex = xlColorIndexNone Then
                Range("A" & n, "H" & n).Interior.ColorIndex = 3
                End If
            End If
        End If
    Next
Next

End Sub
 
Upvote 0
Solution
Thanks Zot - that works perfectly now :)

Can you just confirm for me, this line:
For k = 0 To 4

Is that now referencing the columns listed in the array? i.e. k0 = array column 2, and k4 = array column 8?
 
Upvote 0
Thanks Zot - that works perfectly now :)

Can you just confirm for me, this line:
For k = 0 To 4

Is that now referencing the columns listed in the array? i.e. k0 = array column 2, and k4 = array column 8?
k is not the index number for array content where array starts from Arryk(0), Arryk(1), Arryk(2), Arryk(3) and Arryk(4) correspond to columns 2, 3, 4, 7, 8 in this code. Array by default starts with 0 unless you have Option Base 1 in declaration.
 
Upvote 0
G
k is not the index number for array content where array starts from Arryk(0), Arryk(1), Arryk(2), Arryk(3) and Arryk(4) correspond to columns 2, 3, 4, 7, 8 in this code. Array by default starts with 0 unless you have Option Base 1 in declaration.
Great, thanks so much for the helpful education. Really appreciate your time/efforts to this for me.

Have a wonderful day.
 
Upvote 0
G

Great, thanks so much for the helpful education. Really appreciate your time/efforts to this for me.

Have a wonderful day.
You too. Right now I'm stuck on my own code too. ?

Thanks for update
 
Upvote 0

Forum statistics

Threads
1,225,760
Messages
6,186,870
Members
453,380
Latest member
ShaeJ73

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top