Formula to remove duplicate co-ordinate pairs

eejwo

New Member
Joined
Nov 29, 2018
Messages
12
Office Version
  1. 365
Platform
  1. Windows
I have a list of x,y co-ordinate pairs (x1,y1 x2,y2) that describe the two end points of vertices for adjoining polygonal cells.

However within this list I have duplicates, as adjoining cells share vertices. Where the co-ordinate pairs are direct duplicates, obviously they are very easy to remove, but the x,y pairs can also be reversed and describe the same vertex. e.g. Vertex "x" for cell "x" maybe described in the list by x1,y1 x2,y2, whilst vertex "x" for cell "y" (i.e. the same vertex) might be in the list as x2,y2 x1,y1, therefore appearing as unique in the list but describing the same vertex.

I want to be left with a list of co-ordinate pairs that don't contain any duplicate vertices.

I can easily identify which pairs have duplicates by reversing the order of the x,y pairs and comparing to the original. But once identified, does someone have a method for removing and leaving unique values?

Thanks
 

Excel Facts

Excel Joke
Why can't spreadsheets drive cars? They crash too often!
could you post a small sample of the data here?
 
Upvote 0
Hi,

Some example data to hopefully make the problem clearer:

[TABLE="width: 500"]
<tbody>[TR]
[TD]x1[/TD]
[TD]y1[/TD]
[TD]x2[/TD]
[TD]y2[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD]3[/TD]
[TD]4[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]4[/TD]
[TD]1[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]6[/TD]
[TD]7[/TD]
[TD]8[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]6[/TD]
[TD]7[/TD]
[TD]8[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]3[/TD]
[TD]5[/TD]
[TD]7[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]4[/TD]
[TD]6[/TD]
[TD]8[/TD]
[/TR]
</tbody>[/TABLE]

I can easily remove row 4, which is a direct duplicate of row 3. However rows 1 & 2 also describe the same vertex, so would like to be able to remove the duplicate here also.
 
Upvote 0
this will pick up the matched co-ords to the other pair, adjust ranges to suit


Book1
ABCDEF
1x1y1x2y2
21234TRUE
33412TRUE
45678
55678
61357
72468
Sheet5
Cell Formulas
RangeFormula
F2{=IFERROR(AND(MATCH(A2&"/"&B2,C$2:C$100&"/"&D$2:D$100,0),MATCH(C2&"/"&D2,A$2:A$100&"/"&B$2:B$100,0)),"")}
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
Thanks, this looks to correctly identify the matched pairs.

How would you then go about removing the duplicate pairs. i.e. remove row 2 or 3 in your example above. Is it possible to somehow mark and keep just the first occurrence?

I have ~ 35,000 rows of data with ~ 10,000 duplicates.
 
Upvote 0
this will mark up the first match then you can use a filter to get them


Book1
ABCDEF
1x1y1x2y2
21234
33412
45678TRUE
55678
61357
72468
83478
Sheet5
Cell Formulas
RangeFormula
F4{=IFERROR(AND(MATCH(A4&"/"&B4,A5:$A$10&"/"&B5:$B$10,0)=MATCH(C4&"/"&D4,$C5:C$10&"/"&$D5:D$10,0)),"")}
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
Cols F and G find the paired co-ord, and Col H combined them together


Book1
ABCDEFGH
1x1y1x2y2
21234 TRUETRUE
33412FALSE
45678TRUETRUE
55678FALSE
61357FALSE
72468FALSE
83478FALSE
Sheet5
Cell Formulas
RangeFormula
F2{=IFERROR(AND(MATCH(A2&"/"&B2,A3:$A$10&"/"&B3:$B$10,0)=MATCH(C2&"/"&D2,$C3:C$10&"/"&$D3:D$10,0)),"")}
G2{=IFERROR(AND(MATCH(A2&"/"&B2,C2:C10&"/"&D2:D10,0)=MATCH(C2&"/"&D2,A2:A10&"/"&B2:B10,0)),"")}
H2{=OR(IFERROR(AND(MATCH(A2&"/"&B2,A3:$A$10&"/"&B3:$B$10,0)=MATCH(C2&"/"&D2,$C3:C$10&"/"&$D3:D$10,0)),"")=TRUE,IFERROR(AND(MATCH(A2&"/"&B2,C2:C10&"/"&D2:D10,0)=MATCH(C2&"/"&D2,A2:A10&"/"&B2:B10,0)),"")=TRUE)}
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
I can't quite see how this works. Well, I see why your formula give the result they do...

However, the only reason that cell G3 isn't reading TRUE is because the lookup array in the MATCH formula isn't absolute? The lookup array needs to be absolute e.g. "$C$2:$C$10&"/"&$D$2:$D$10" otherwise the only reason it's not reading true is because the matched pair isn't within the lookup array?

I'm glad this problem isn't as simple as I feared it might be...
 
Upvote 0
yes, you're right. by using absolute you can pick up the both sets
 
Upvote 0
Sorry, I'm confused. Perhaps my example data was too simplified.

I don't want to pick up both sets. But if I use your suggested formulae over 30,000 rows of data, where the matching pairs of data are not necessarily in adjacent rows, then not having the absolute reference in the MATCH lookup array will mean I miss potential matches.
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,289
Members
452,631
Latest member
a_potato

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top