Hello everyone,
I am having trouble with counting instances where a specific value appears within a subgroup of a group.
An example of the data are below:
[TABLE="class: grid, width: 300"]
<tbody>[TR]
[TD]GROUP[/TD]
[TD]ID[/TD]
[TD]DATA[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]1st St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]2[/TD]
[TD]1st St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]2[/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
Context:
I am resolving duplicates within my data, there are instances where the same person has two separate IDs and records. I ran an automated check to ID possible duplicates, which are tagged by the GROUP field; the ID field are the individual ID numbers. Address information is a very easy way to pick out duplicates.
I need to write something which does the following:
For each GROUP number, look at all the unique IDs within. For each unique ID within a group, look to see if the address in the DATA field appears for any other of the unique IDs within the group. If so, mark it 1 / TRUE / etc.
I have written a COUNTIFS formula that gets most of the way there, but it does not exclude the row of the formula.
It is COUNTIFS(A:A,A2, B:B,"<>&", C:C,C2)
It should, for group 1 row 2, produce a 1, b/c Main Street appears in Group1 ID1, and also 1x in Group1 ID2. I cannot get it to ignore row 2, I am stuck with the formula giving me a 2 for a result, b/c Main Street appears twice in Group1.
Any help is appreciated.
Thank you!
I am having trouble with counting instances where a specific value appears within a subgroup of a group.
An example of the data are below:
[TABLE="class: grid, width: 300"]
<tbody>[TR]
[TD]GROUP[/TD]
[TD]ID[/TD]
[TD]DATA[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]1[/TD]
[TD]1st St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]2[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD]Main St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]1[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]2[/TD]
[TD]1st St[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]2[/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
Context:
I am resolving duplicates within my data, there are instances where the same person has two separate IDs and records. I ran an automated check to ID possible duplicates, which are tagged by the GROUP field; the ID field are the individual ID numbers. Address information is a very easy way to pick out duplicates.
I need to write something which does the following:
For each GROUP number, look at all the unique IDs within. For each unique ID within a group, look to see if the address in the DATA field appears for any other of the unique IDs within the group. If so, mark it 1 / TRUE / etc.
I have written a COUNTIFS formula that gets most of the way there, but it does not exclude the row of the formula.
It is COUNTIFS(A:A,A2, B:B,"<>&", C:C,C2)
It should, for group 1 row 2, produce a 1, b/c Main Street appears in Group1 ID1, and also 1x in Group1 ID2. I cannot get it to ignore row 2, I am stuck with the formula giving me a 2 for a result, b/c Main Street appears twice in Group1.
Any help is appreciated.
Thank you!