Delete duplicate rows based on more than one column with excel macro

Shwapx

New Member
Joined
Sep 28, 2022
Messages
48
Office Version
  1. 365
Platform
  1. Windows
I'm trying to create an excel macro that can delete duplicate rows and their original ones based on more than one column.

In the table below you will find the example data and the output for which I'm looking. I need to delete all duplicate rows but only look at some columns like in the example the first 3 headers.

Input table

header1header2header3header4
Test50201
Test50202
Test30203
110204
220305
Test23556
Test23557

Output table after executing the macro

header1header2header3header4
Test30203
110204
220305
 

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
This should be OK if you don't have too many rows - if you do there are faster ways. Assumes your headers are in row 1, and header 1 is in column A. Just change the sheet name to suit.

VBA Code:
Option Explicit
Sub Shwapx()
    Dim ws As Worksheet
    Dim LRow As Long, LCol As Long
    Set ws = Worksheets("Sheet1")   '<< change to actual sheet name
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row
    LCol = ws.Cells.Find("*", , xlFormulas, , 2, 2).Column + 1
    Application.ScreenUpdating = False
   
    With ws.Range(ws.Cells(2, LCol), ws.Cells(LRow, LCol))
        .Value = ws.Evaluate(Replace("A2:A#&"" | ""&B2:B#&"" | ""&C2:C#", "#", LRow))
    End With
   
    Dim ArrIn, ArrOut, i As Long
    ArrIn = ws.Range(ws.Cells(1, LCol), ws.Cells(LRow + 1, LCol))
    ReDim ArrOut(1 To UBound(ArrIn, 1) + 1, 1 To 1)
   
    For i = 2 To UBound(ArrIn, 1) - 1
        If ArrIn(i, 1) = ArrIn(i + 1, 1) Or ArrIn(i, 1) = ArrIn(i - 1, 1) Then ArrOut(i, 1) = 1
    Next i
    ws.Cells(1, LCol).Resize(UBound(ArrOut, 1)).Value = ArrOut
   
    With ws.Cells(1).CurrentRegion
        .AutoFilter LCol, 1
        .Offset(1).EntireRow.Delete
        .AutoFilter
    End With
    Application.ScreenUpdating = True
End Sub

Before:
Shwapx.xlsm
ABCD
1Header1Header2Header3Header4
2Test50201
3Test50202
4Test30203
5110204
6220305
7Test23556
8Test23557
Sheet1


After:
Shwapx.xlsm
ABCD
1Header1Header2Header3Header4
2Test30203
3110204
4220305
5
6
7
8
Sheet1
 
Upvote 0
Hello, @kevin9999 first I want to say thank you for that. I just have a couple of questions.

I need to make that macro in a lot of excel files in which I need to compare different columns in which I need to compare more than 2, 3,4, or even 5,6.
Can you help me explain how I can modify it? My actual data starts usually at row 6 not like in the example I gave row 2 and actually when I run the macro I'm not getting the expected result.
But that should be something I need to find how to adjust to each table.

So in below is an example of a real-world table that I have. So in that table, I need to look out for duplicates in Header3, Header6, Header7, Header8, and Header9.

I was thinking I might be able to modify the one which you gave me but seems too complicated for me. Thanks in advance.

A1B1C1D1E1F1G1H1I1J1K1
A2B2C2D2E2F2G2H2I2J2K2
some info about the date
Header1Header2Header3Header4Header5Header6Header7Header8Header9Header10Header11
20210801Addtesttest
20000​
Test23Test44Test55Test66
5​
10​
20210801Deletetesttest1
10000​
Test23Test44Test55Test66
10​
20​
 
Upvote 0
See how you go with this. If you can't make the changes work, I will need a copy of your actual sheet using the XL2BB add in before I'm willing to assist you further. Any line marked as 'irrelevant' shouldn't need to be changed.

VBA Code:
Option Explicit
Sub Shwapx_Explained()
    Dim ws As Worksheet                                         '<< irrelevant
    Dim LRow As Long, LCol As Long                              '<< irrelevant
    Set ws = Worksheets("Sheet1")                               '<< *** change to actual sheet name
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row         '<< irrelevant
    LCol = ws.Cells.Find("*", , xlFormulas, , 2, 2).Column + 1  '<< irrelevant
    Application.ScreenUpdating = False                          '<< irrelevant
    
    '*** The next line starts at row 2 (i.e. "Cells(2,..)
    '*** Change the 2 to whatever row your data actually starts on - not the header row
    With ws.Range(ws.Cells(2, LCol), ws.Cells(LRow, LCol))
    '*** The following line is joining the values in columns A, B and C
    '*** You can change these columns to whatever you want - add some, remove some
    '*** It currrently refers to row 2 (e.g. "A2:..) - Change this to the first row that
    '*** data starts on - not the header row, the actual first row of data
        .Value = ws.Evaluate(Replace("A2:A#&"" | ""&B2:B#&"" | ""&C2:C#", "#", LRow))
    End With
    
    Dim ArrIn, ArrOut, i As Long                                    '<< irrelevant
    '*** The next line gets the array values starting from the header row
    '*** or at least the row ABOVE the first row of data - Change this to suit
    ArrIn = ws.Range(ws.Cells(1, LCol), ws.Cells(LRow + 1, LCol))   '<< SEE BELOW
    ReDim ArrOut(1 To UBound(ArrIn, 1) + 1, 1 To 1)                 '<< irrelevant
    
    '*** The next block of code should not need to be changed
    For i = 2 To UBound(ArrIn, 1) - 1
        If ArrIn(i, 1) = ArrIn(i + 1, 1) Or ArrIn(i, 1) = ArrIn(i - 1, 1) Then ArrOut(i, 1) = 1
    Next i
    '*** The 1 used here (i.e. "Cells(1,..) refers to the row of the headers
    '*** Or at least the row ABOVE the first row of data
    '*** Make sure it matches the row you indicated in the line above I marked "SEE BELOW"
    ws.Cells(1, LCol).Resize(UBound(ArrOut, 1)).Value = ArrOut
    
    '*** This may need to be changed.  The ws.Cells(1) refers to cell A1 in worksheet ws
    '*** For example, if your header row is on row 6, change it to ws.Cells(6,1)
    '*** This method won't work if there's data in the row above the headers
    '*** In which case you'll need to use something like the following line (assumes your headers
    '*** are on row 5 and there isn't a blank row between the headers and the data
    '   With ws..Range(Cells(6, 1), Cells(6, LCol))
    With ws.Cells(1).CurrentRegion
        .AutoFilter LCol, 1
        .Offset(1).EntireRow.Delete
        .AutoFilter
    End With
    Application.ScreenUpdating = True       '<< irrelevant
End Sub
 
Upvote 0
See how you go with this. If you can't make the changes work, I will need a copy of your actual sheet using the XL2BB add in before I'm willing to assist you further. Any line marked as 'irrelevant' shouldn't need to be changed.

VBA Code:
Option Explicit
Sub Shwapx_Explained()
    Dim ws As Worksheet                                         '<< irrelevant
    Dim LRow As Long, LCol As Long                              '<< irrelevant
    Set ws = Worksheets("Sheet1")                               '<< *** change to actual sheet name
    LRow = ws.Cells.Find("*", , xlFormulas, , 1, 2).Row         '<< irrelevant
    LCol = ws.Cells.Find("*", , xlFormulas, , 2, 2).Column + 1  '<< irrelevant
    Application.ScreenUpdating = False                          '<< irrelevant
   
    '*** The next line starts at row 2 (i.e. "Cells(2,..)
    '*** Change the 2 to whatever row your data actually starts on - not the header row
    With ws.Range(ws.Cells(2, LCol), ws.Cells(LRow, LCol))
    '*** The following line is joining the values in columns A, B and C
    '*** You can change these columns to whatever you want - add some, remove some
    '*** It currrently refers to row 2 (e.g. "A2:..) - Change this to the first row that
    '*** data starts on - not the header row, the actual first row of data
        .Value = ws.Evaluate(Replace("A2:A#&"" | ""&B2:B#&"" | ""&C2:C#", "#", LRow))
    End With
   
    Dim ArrIn, ArrOut, i As Long                                    '<< irrelevant
    '*** The next line gets the array values starting from the header row
    '*** or at least the row ABOVE the first row of data - Change this to suit
    ArrIn = ws.Range(ws.Cells(1, LCol), ws.Cells(LRow + 1, LCol))   '<< SEE BELOW
    ReDim ArrOut(1 To UBound(ArrIn, 1) + 1, 1 To 1)                 '<< irrelevant
   
    '*** The next block of code should not need to be changed
    For i = 2 To UBound(ArrIn, 1) - 1
        If ArrIn(i, 1) = ArrIn(i + 1, 1) Or ArrIn(i, 1) = ArrIn(i - 1, 1) Then ArrOut(i, 1) = 1
    Next i
    '*** The 1 used here (i.e. "Cells(1,..) refers to the row of the headers
    '*** Or at least the row ABOVE the first row of data
    '*** Make sure it matches the row you indicated in the line above I marked "SEE BELOW"
    ws.Cells(1, LCol).Resize(UBound(ArrOut, 1)).Value = ArrOut
   
    '*** This may need to be changed.  The ws.Cells(1) refers to cell A1 in worksheet ws
    '*** For example, if your header row is on row 6, change it to ws.Cells(6,1)
    '*** This method won't work if there's data in the row above the headers
    '*** In which case you'll need to use something like the following line (assumes your headers
    '*** are on row 5 and there isn't a blank row between the headers and the data
    '   With ws..Range(Cells(6, 1), Cells(6, LCol))
    With ws.Cells(1).CurrentRegion
        .AutoFilter LCol, 1
        .Offset(1).EntireRow.Delete
        .AutoFilter
    End With
    Application.ScreenUpdating = True       '<< irrelevant
End Sub
Thank you for the explanation, I've managed to make changes and make the macro start taking data from row6 and so on. But I got an error runtime error 13 type mismatch. And in the last column, I can see this value on each row - #VALUE!.

One thing about the last part:

'*** This may need to be changed. The ws.Cells(1) refers to cell A1 in worksheet ws
'*** For example, if your header row is on row 6, change it to ws.Cells(6,1)
'*** This method won't work if there's data in the row above the headers
'*** In which case you'll need to use something like the following line (assumes your headers
'*** are on row 5 and there isn't a blank row between the headers and the data
' With ws..Range(Cells(6, 1), Cells(6, LCol))
With ws.Cells(1).CurrentRegion
.AutoFilter LCol, 1
.Offset(1).EntireRow.Delete
.AutoFilter

My headers are at row 4 then I have one empty row 5 and then I have the actual data. I've changed that to:

With ws.Cells(5,1).CurrentRegion
.AutoFilter LCol, 1
.Offset(1).EntireRow.Delete
.AutoFilter

Thank you really much for your support you are awesome and you are helping me a lot!
 
Upvote 0
And well done to you for making the changes and getting it to work :)
 
Upvote 0
And well done to you for making the changes and getting it to work :)
Thank you, but in the end, it's not deleting the rows which should be deleted since I got a mismatch error and I got the same #VALUE! data inside the last column.
 
Upvote 0
Here is where I got that mismatch error:

1664436581459.png
 
Upvote 0

Forum statistics

Threads
1,223,228
Messages
6,170,871
Members
452,363
Latest member
merico17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top