Remove reverse duplicate rows

Cwillson

New Member
Joined
Oct 1, 2015
Messages
36
Office Version
  1. 2019
Platform
  1. Windows
Urgh, my brain is starting to hurt on this one!! Need some help from the amazing hive mind please!

I have a table which contains pairs of records, identified using fuzzy logic. Each row contains a potential match, which needs to be manually reviewed.

Each Record has a reference, and the process has matched all records against all records, therefore some rows are repeated but reversed, i.e. Record A ref appears first then Record B, and the same records appear on another row with Record B reference first and Record A second. N.B. There are numerous other columns in my table, which are what need to be manually reviewed once I've whittled the list down to remove reverse duplicates.

I'm trying to find a way to highlight the first pair, irrespective of which direction it's in, so that I can filter out any reverse pairs. I was trying to do this by creating a concatenated GUID for each row A-to-B, and then searching for the reverse B-to-A using a countif statement. This clearly doesn't work though, as both the original and the reverse records will be highlighted.

Perhaps it's something to do with identifying the minimum row number for each pair and it's reverse? As part of the manual review, rows may be filtered and sorted, so row number probably wont work as this will change.

Here's an example of my data...

Book2.xlsx
BCDE
1Record ARecord BA-to-B ConcatB-to-A Concat
2abcdefabc|defdef|abc
3defabcdef|abcabc|def
4abcfghabc|fghfgh|abc
Sheet1
Cell Formulas
RangeFormula
D2:D4D2=CONCAT(B2,"|",C2)
E2:E4E2=CONCAT(C2,"|",B2)
 

Excel Facts

Which lookup functions find a value equal or greater than the lookup value?
MATCH uses -1 to find larger value (lookup table must be sorted ZA). XLOOKUP uses 1 to find values greater and does not need to be sorted.
In the above example, I'd like to highlight row 2 as being a reverse duplicate of row 1, but leave rows 1 and 3 as needing manual review.

Hope that all makes sense!!
 
Upvote 0
Not sure it's the most elegant solution, but I've found one....

Adding a GUID column (A) with increasing references, e.g.1, 2, 3 etc enabled me to use this formula in cell F2 to return the GUID for the first entry of a reverse pair;

=IFERROR(MIN(INDEX($A$2:$A$4,MATCH(D2,$E$2:$E$4,0),1),INDEX($A$2:$A$4,MATCH(E2,$E$2:$E$4,0),1)),A4)

I then used this formula in cell G2, to highlight the first instance of each unique row;

=IF(COUNTIF($F$2:$F$4,F2)>1,IF(COUNTIF($F$2:F2,F2)=1,"Y",""),"Y")

Copying down these formula in cells F2 and G2 allows me to filter for "Y" in column G - the unique pairs which require manual review! :)

Very happy if someone can propose an improvement though! :)
 
Upvote 0
Solution

Forum statistics

Threads
1,223,227
Messages
6,170,853
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top