Identifying duplicate values in the same cell, where the data is separated by a pipe.

E2Marv

New Member
Joined
Nov 17, 2020
Messages
5
Office Version
  1. 365
Platform
  1. Windows
I have a large Excel file with over 85k rows, that I feed into our system to read and apply the data.

I have a field where my values are separated by a pipe as shown below:
NU001|NU002|NU003
BA001|BA001
BE001|CS001

I am trying to figure out how to identify rows where the data on either side of the pipe is the exact same (such as the second row, BA001|BA001).

I would prefer not to have to delimit the cells because ultimately, that is the format the data should be in, but I understand if that is not possible. I just need to find either a formula, or a way to conditional format the column to make it obvious to me which cells contain duplicate data.

sometimes the data has 3 values with 2 duplicates such as:

BA001|BA001|PR001

Thanks in advance
 
Last edited by a moderator:

Excel Facts

When did Power Query debut in Excel?
Although it was an add-in in Excel 2010 & Excel 2013, Power Query became a part of Excel in 2016, in Data, Get & Transform Data.
I have a large Excel file with over 85k rows, that I feed into our system to read and apply the data.

I have a field where my values are separated by a pipe as shown below:
NU001|NU002|NU003
BA001|BA001
BE001|CS001

I am trying to figure out how to identify rows where the data on either side of the pipe is the exact same (such as the second row, BA001|BA001).

I would prefer not to have to delimit the cells because ultimately, that is the format the data should be in, but I understand if that is not possible. I just need to find either a formula, or a way to conditional format the column to make it obvious to me which cells contain duplicate data.

sometimes the data has 3 values with 2 duplicates such as:

BA001|BA001|PR001

Thanks in advance

Try this:

Excel Formula:
=IF(LEN(A1)-LEN(SUBSTITUTE(A1,"|",""))=1,
LEFT(A1,SEARCH("|",A1)-1)=MID(A1,SEARCH("|",A1)+1,SEARCH("|",A1)-1),
AND(LEFT(A1,SEARCH("|",A1)-1)=MID(A1,SEARCH("|",A1)+1,SEARCH("|",A1)-1),
LEFT(A1,SEARCH("|",A1)-1)=RIGHT(A1,LEN(A1)- SEARCH("|",A1,SEARCH("|",A1)+1))))
 
Upvote 0
Another option
+Fluff v2.xlsm
AB
1
2NU001|NU002|NU003FALSE
3BA001|BA001TRUE
4BE001|CS001FALSE
5BA001|BA001|PR001TRUE
6
Sheet3
Cell Formulas
RangeFormula
B2:B5B2=OR(LEN(SUBSTITUTE(A2,LEFT(A2,5),""))<>LEN(A2)-5,LEN(SUBSTITUTE(A2,MID(A2,7,5),""))<>LEN(A2)-5)
 
Upvote 0
I'm not sure whether you have given us enough samples to show the possible variety, but if
  • all codes are 5 characters
  • there is a max of 3 codes per cell and a minimum of 2 like your samples
  • the dupes must be adjacent ("identify rows where the data on either side of the pipe"). That is you do not count BA001|BA002|BA001 as a dupe,
then try column B.
If my 3rd bullet point is incorrect but the others hold then try column C

20 11 18.xlsm
ABC
1
2NU001|NU002|NU003FALSEFALSE
3BA001|BA001TRUETRUE
4BE001|CS001FALSEFALSE
5BA001|BA001|PR001TRUETRUE
6PR001|BA001|BA001TRUETRUE
7BA001|BA002|BA001FALSETRUE
Dupes
Cell Formulas
RangeFormula
B2:B7B2=OR(MID(A2,7,5)=LEFT(A2,5),MID(A2,7,5)=MID(A2,13,5))
C2:C7C2=OR(MID(A2,7,5)=LEFT(A2,5),MID(A2,7,5)=MID(A2,13,5),LEFT(A2,5)=MID(A2,13,5))
 
Upvote 0
Solution
Try this:

Excel Formula:
=IF(LEN(A1)-LEN(SUBSTITUTE(A1,"|",""))=1,
LEFT(A1,SEARCH("|",A1)-1)=MID(A1,SEARCH("|",A1)+1,SEARCH("|",A1)-1),
AND(LEFT(A1,SEARCH("|",A1)-1)=MID(A1,SEARCH("|",A1)+1,SEARCH("|",A1)-1),
LEFT(A1,SEARCH("|",A1)-1)=RIGHT(A1,LEN(A1)- SEARCH("|",A1,SEARCH("|",A1)+1))))
Thank you so much for your help, this solution in combination with the other two was exactly what I needed!! Thank you so very much for this!
 
Upvote 0
Thank you all so much for your help with this! These formulas all actually worked to identify the duplicates. I ended up using a combination of Peter_SSs's formula, and the one from JB2020.

Thank you all again for your assistance, and for being so speedy! It is so appreciated.

All the best,
 
Upvote 0
This
I'm not sure whether you have given us enough samples to show the possible variety, but if
  • all codes are 5 characters
  • there is a max of 3 codes per cell and a minimum of 2 like your samples
  • the dupes must be adjacent ("identify rows where the data on either side of the pipe"). That is you do not count BA001|BA002|BA001 as a dupe,
then try column B.
If my 3rd bullet point is incorrect but the others hold then try column C

20 11 18.xlsm
ABC
1
2NU001|NU002|NU003FALSEFALSE
3BA001|BA001TRUETRUE
4BE001|CS001FALSEFALSE
5BA001|BA001|PR001TRUETRUE
6PR001|BA001|BA001TRUETRUE
7BA001|BA002|BA001FALSETRUE
Dupes
Cell Formulas
RangeFormula
B2:B7B2=OR(MID(A2,7,5)=LEFT(A2,5),MID(A2,7,5)=MID(A2,13,5))
C2:C7C2=OR(MID(A2,7,5)=LEFT(A2,5),MID(A2,7,5)=MID(A2,13,5),LEFT(A2,5)=MID(A2,13,5))
was incredible helpful! Column C was what I needed, thank you so much!
 
Upvote 0
Another option
+Fluff v2.xlsm
AB
1
2NU001|NU002|NU003FALSE
3BA001|BA001TRUE
4BE001|CS001FALSE
5BA001|BA001|PR001TRUE
6
Sheet3
Cell Formulas
RangeFormula
B2:B5B2=OR(LEN(SUBSTITUTE(A2,LEFT(A2,5),""))<>LEN(A2)-5,LEN(SUBSTITUTE(A2,MID(A2,7,5),""))<>LEN(A2)-5)
This formula was also very helpful, thank you so much for your assistance!
 
Upvote 0
Glad we could help & thanks for the feedback.
 
Upvote 0
was incredible helpful! Column C was what I needed, thank you so much!
You're welcome. Given the same two initial assumptions from before, here is a shorter one with less function calls required that I think also does the job?

20 11 18.xlsm
AB
1
2NU001|NU002|NU003FALSE
3BA001|BA001TRUE
4BE001|CS001FALSE
5BA001|BA001|PR001TRUE
6PR001|BA001|BA001TRUE
7BA001|BA002|BA001TRUE
8BA001|BA001|BA001TRUE
Dupes
Cell Formulas
RangeFormula
B2:B8B2=OR(FIND(MID(A2,7,5),A2)<7,FIND(MID(A2&"..",13,5),A2&"..")<12)
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,284
Members
452,630
Latest member
OdubiYouth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top