Struggling With Duplicates

Rob1200

Board Regular
Joined
Mar 4, 2016
Messages
50
Hi guys

Long time lurker, got some really useful info from this site.

I have had a look round regarding removing duplicates for my needs but cannot find a precise match and I need to get this resolved ASAP.

I have an extract of data with multiple columns

Column A contains item number
Column B contains customer name
Column C contains the date valid
Column D contains the price

The way I have been provided the data I need to make sure there are only unique items per customer using the newest date example below

Item1 - London Clinic Hospital - 01/01/14 - 50
Item1 - London Clinic Hospital - 01/01/15 - 60
Item1 - London Clinic Hospital - 01/01/16 - 70


So I need a way to remove safely the 2014 and 2015 records and leave only the newest record i.e. in this example 2016

Any help much appreciated

Thanks
 
Can I confirm that it is just the latest date for each Customer that we need to keep, no matter the Item code?
So for ALDER001, we only need to keep that top row of those 6 rows for that customer?


(BTW, you cannot attach files but to post slightly better screen shots, see my signature block below.)


Edit: I don't want or need your actual sheet with 12,000 rows. What you have given should be fine, apart from maybe a question or two more.


Correct, the most recent date needs to be kept when there are same item numbers for the same customer.

For customer ALDER001 - as there are no duplicate rows for the same item all 6 would be kept.

If the same item appeared twice, so two rows, then the one with the newest date needs to be kept.
 
Upvote 0

Excel Facts

Shade all formula cells
To shade all formula cells: Home, Find & Select, Formulas to select all formulas. Then apply a light fill color.
they are array formula, enter without the curly brackets {} and with Ctrl+Shift+Enter.

that wouldn't work for you anyway as I've posted before your explanation in post #5.

your formula with no customer number did work however and returned me a value. i sorted via item number so that it groups them together and just manually removing the older entry
 
Upvote 0
..the most recent date needs to be kept when there are same item numbers for the same customer.

For customer ALDER001 - as there are no duplicate rows for the same item all 6 would be kept.
So the sample data provided is not very representative as there are no rows to delete in the whole sample?
Some sample data where there are at least some deletions to be done would have been more useful. ;)

Should have something shortly.
 
Upvote 0
Try this in a copy of your workbook. This is for the first problem.

Rich (BB code):
Sub KeepLatestCustomer()
  Dim nc As Long
  
  nc = Cells.Find(What:="*", After:=Cells(1, 1), LookIn:=xlValues, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, SearchFormat:=False).Column + 1
  Application.ScreenUpdating = False
  With Range("A2", Range("A" & Rows.Count).End(xlUp)).Resize(, nc)
    .Sort Key1:=.Columns(2), Order1:=xlAscending, Key2:=.Columns(1), Order2:=xlAscending, Key3:=.Columns(3), Order3:=xlDescending, Header:=xlNo, _
          OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, DataOption1:=xlSortNormal, DataOption2:=xlSortNormal, DataOption3:=xlSortNormal
    .Columns(nc).Value = Evaluate(Replace(Replace(Replace(Replace("if(#=@,if(%=^,1,""""),"""")", _
                            "#", .Columns(2).Address), "@", .Columns(2).Offset(-1).Address), "%", .Columns(1).Address), "^", .Columns(1).Offset(-1).Address))
    .Sort Key1:=.Cells(1, nc), Order1:=xlAscending, Header:=xlNo, OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, DataOption1:=xlSortNormal
    On Error Resume Next
    .Columns(nc).SpecialCells(xlConstants, xlNumbers).EntireRow.Delete
    On Error GoTo 0
  End With
  Application.ScreenUpdating = True
End Sub
 
Upvote 0
second try, highlighted green cells are altered for testing.
again, its array formula enter with shift-control-enter without the {} brackets.


Excel 2012
ABCDEQRS
1Item CodeCustomerDatePricePresent in F4101
2DWF614BABBEY00102/10/20151,525DWF614BDWF614B
3STDLOANABBEY0010#N/ASTDLOAN
4CSI07ALDER00104/08/2013525CSI07CSI07
5CSI08ALDER00104/08/2013525CSI08CSI08
6CSI09ALDER00104/08/2013525CSI09
7CSI09ALDER00106/08/2013525CSI10
8CSI09ALDER00108/08/2013525CSI11
9CSI09ALDER00110/08/2013525CSI12CSI09
10DWH039BMI00902/10/20151,200DWH039DWH039
11DWH041BMI00902/10/20151,200DWH041DWH041
12DWH043BMI00902/10/20151,200DWH043DWH043
13DWH046BMI00902/10/20151,200DWH046DWH046
14DWH048BMI00902/10/20151,200DWH048DWH048
15DWH050BMI00902/10/20151,200DWH050DWH050
16DWH139BMI00902/10/20151,200DWH139DWH139
17DWH141BMI00902/10/20151,200DWH141
18DWH141BMI00903/10/20151,200DWH143
19DWH141BMI00904/10/20151,200DWH146
20DWH148BMI00902/10/20151,200DWH148DWH148
21DWH150BMI00902/10/20151,200DWH150DWH150
22BCP030320BURN00112/10/2012750#N/ABCP030320
23NSO2713CHARI00101/07/201688#N/ANSO2713
24DWH141BMI00904/11/201588#N/ADWH141
25VSO011CHARI00209/08/2014144#N/AVSO011
26VSO012CHARI00209/08/2014144#N/AVSO012
Sheet1
Cell Formulas
RangeFormula
S2{=IF($D2=MAX(IF($A$2:$A$26=$A2,IF($B$2:$B$26=B2,$D$2:$D$26))),$A2,"")}
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
similarily for data without customer name

{=IF($D2=MAX(IF($A$2:$A$26=$A2,$D$2:$D$26)),$A2,"")}
 
Upvote 0
Try this in a copy of your workbook. This is for the first problem.

Rich (BB code):
Sub KeepLatestCustomer()
  Dim nc As Long
  
  nc = Cells.Find(What:="*", After:=Cells(1, 1), LookIn:=xlValues, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, SearchFormat:=False).Column + 1
  Application.ScreenUpdating = False
  With Range("A2", Range("A" & Rows.Count).End(xlUp)).Resize(, nc)
    .Sort Key1:=.Columns(2), Order1:=xlAscending, Key2:=.Columns(1), Order2:=xlAscending, Key3:=.Columns(3), Order3:=xlDescending, Header:=xlNo, _
          OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, DataOption1:=xlSortNormal, DataOption2:=xlSortNormal, DataOption3:=xlSortNormal
    .Columns(nc).Value = Evaluate(Replace(Replace(Replace(Replace("if(#=@,if(%=^,1,""""),"""")", _
                            "#", .Columns(2).Address), "@", .Columns(2).Offset(-1).Address), "%", .Columns(1).Address), "^", .Columns(1).Offset(-1).Address))
    .Sort Key1:=.Cells(1, nc), Order1:=xlAscending, Header:=xlNo, OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, DataOption1:=xlSortNormal
    On Error Resume Next
    .Columns(nc).SpecialCells(xlConstants, xlNumbers).EntireRow.Delete
    On Error GoTo 0
  End With
  Application.ScreenUpdating = True
End Sub


I pasted that into view code on the worksheet and ran, what should i expect to happen or see?

forgive my lack of understanding
 
Upvote 0
without customer number appears to be fine, however when i then filter on the item numbre column and remove duplicates i get 230 duplicates removed but nothing is actually deleted the row count remains the same, very strange.
 
Upvote 0
I pasted that into view code on the worksheet and ran, what should i expect to happen or see?

forgive my lack of understanding
You should expect to see any duplicates gone. :)

However, I think you don't have the code in the best place. Ideally it should be in a standard module, so once in the code window use the menu to Insert|Module and move the code to that module (probably Module1)

You'll need to save as a macro-enabled workbook and ensure you do have macros enabled.

Go back to your worksheet and press Alt+F8 to bring up the macro dialog, choose the macro and click 'Run'.

Here's my altered sample data with a few duplicates highlighted.

Excel Workbook
ABCDEFGHIJK
1Item CodeCustomerDatePricePresent in F4101
2DWF614BABBEY00110/2/20151,525NoYesCustomer1EACHYesDWF614B
3STDLOANABBEY0010NoYesCustomer0Yes#N/A
4CSI07ALDER0016/4/2013525NoYesCustomer1EACHYesCSI07
5CSI08ALDER0018/4/2013525NoYesCustomer1EACHYesCSI08
6CSI08ALDER0018/4/2013525NoYesCustomer1EACHYesCSI09
7CSI08ALDER0017/4/2013525NoYesCustomer1EACHYesCSI10
8CSI07ALDER0018/4/2013525NoYesCustomer1EACHYesCSI11
9CSI12ALDER0018/4/2013525NoYesCustomer1EACHYesCSI12
10DWH039BMI00910/2/20151,200NoYesCustomer1EACHYesDWH039
11DWH041BMI00910/2/20151,200NoYesCustomer1EACHYesDWH041
12DWH043BMI00910/2/20151,200NoYesCustomer1EACHYesDWH043
13DWH046BMI00910/2/20151,200NoYesCustomer1EACHYesDWH046
14DWH046BMI00910/2/20151,200NoYesCustomer1EACHYesDWH048
15DWH046BMI00910/2/20151,200NoYesCustomer1EACHYesDWH050
16DWH139BMI00910/2/20151,200NoYesCustomer1EACHYesDWH139
17DWH141BMI00910/2/20151,200NoYesCustomer1EACHYesDWH141
18BCP030320BURN00110/12/2012750NoYesCustomer1EACHYes#N/A
19NSO2713CHARI00131/1/201488NoYesCustomer1EACHYes#N/A
20NSO2713CHARI00131/1/201688NoYesCustomer1EACHYes#N/A
21VSO011CHARI0028/9/2014144NoYesCustomer1EACHYes#N/A
22VSO012CHARI0028/9/2014144NoYesCustomer1EACHYes#N/A
Sheet1




Here it is again after the code:

Excel Workbook
ABCDEFGHIJK
1Item CodeCustomerDatePricePresent in F4101
2DWF614BABBEY00110/2/20151,525NoYesCustomer1EACHYesDWF614B
3STDLOANABBEY0010NoYesCustomer0Yes#N/A
4CSI07ALDER0018/4/2013525NoYesCustomer1EACHYesCSI11
5CSI08ALDER0018/4/2013525NoYesCustomer1EACHYesCSI08
6CSI12ALDER0018/4/2013525NoYesCustomer1EACHYesCSI12
7DWH039BMI00910/2/20151,200NoYesCustomer1EACHYesDWH039
8DWH041BMI00910/2/20151,200NoYesCustomer1EACHYesDWH041
9DWH043BMI00910/2/20151,200NoYesCustomer1EACHYesDWH043
10DWH046BMI00910/2/20151,200NoYesCustomer1EACHYesDWH046
11DWH139BMI00910/2/20151,200NoYesCustomer1EACHYesDWH139
12DWH141BMI00910/2/20151,200NoYesCustomer1EACHYesDWH141
13BCP030320BURN00110/12/2012750NoYesCustomer1EACHYes#N/A
14NSO2713CHARI00131/1/201688NoYesCustomer1EACHYes#N/A
15VSO011CHARI0028/9/2014144NoYesCustomer1EACHYes#N/A
16VSO012CHARI0028/9/2014144NoYesCustomer1EACHYes#N/A
Sheet1
 
Last edited:
Upvote 0
when i... and remove duplicates...
If you are using Excel's Remove Duplicates from the Data ribbon, then I recommend you stop doing that as it is notoriously unreliable. For a simple, repeatable example of its failure, see post #8 here.
 
Upvote 0

Forum statistics

Threads
1,225,488
Messages
6,185,281
Members
453,285
Latest member
Wullay

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top