Counting Unique and Duplicate number instances in a range

Professor_k

New Member
Joined
Mar 2, 2018
Messages
3
Hello peeps,

First time poster looking for some help :biggrin:

In my representative scenario I have a range of items (Apples, Oranges and pears), each repeated in a list with an associated quantity (number of pips).

I want to find out:
a) the number of unique entries for each item (Unique Apples, Unique Oranges, Unique Pears)
b) the number of duplicated entries for each item (duplicate Apples, duplicate Oranges, duplicate Pears)

and have constructed the sheet similar to the table below.


[TABLE="class: grid, width: 200"]
<tbody>[TR]
[TD][/TD]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[TD]F[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Item[/TD]
[TD]Pips[/TD]
[TD][/TD]
[TD]Fruit[/TD]
[TD]No. of unique[/TD]
[TD]No. of Duplicate[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]Apple[/TD]
[TD]2[/TD]
[TD][/TD]
[TD]Apple[/TD]
[TD]2[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]Apple[/TD]
[TD]3[/TD]
[TD][/TD]
[TD]Orange[/TD]
[TD]2[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]Apple[/TD]
[TD]1[/TD]
[TD][/TD]
[TD]Pear[/TD]
[TD]4[/TD]
[TD]0[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]Apple[/TD]
[TD]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD]Apple[/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]Apple[/TD]
[TD]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]Orange[/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]Orange[/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]10[/TD]
[TD]Orange[/TD]
[TD]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]11[/TD]
[TD]Orange[/TD]
[TD]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]12[/TD]
[TD]Pear[/TD]
[TD]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]13[/TD]
[TD]Pear[/TD]
[TD]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]14[/TD]
[TD]Pear[/TD]
[TD]4[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]15[/TD]
[TD]Pear[/TD]
[TD]5[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]


To populate cell E2, I have used the formula
Code:
{[SIZE=2]=SUM(--FREQUENCY(IF($A$2:$A$15=D2,$B$2:$B$15),$B$2:$B$15)=1))} [/SIZE]
with (possibly the incorrect) understanding that:


  • the IF function will pick out entries in cells A2 to A15 that match cell D2 (all "Apple") and return an array of values in B2 to B15 (pips) associated with the matching (Apple) row entries [2;3;1;2;4;3]
  • the FREQUENCY statement will then use this array of values and count how many times each value is repeated against the values in the same range (B2 to B15) but for each subsequent repeat value it will set that occurrence to 0 as it has already been accounted for. Calculated frequency array would be [2;2;1;0;1;0;0;0;0;0;0;0;0;0]
  • There is then a check to see if the frequency count is = 1, meaning that the value (pips) appears once (therefore unique) and the check is therefore TRUE. If the check returns a value > 1 then must be duplicated and therefore check would be FALSE. Array would now be [FALSE;FALSE;TRUE;FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE]
  • "--" turns the TRUE and FALSE'S, into numeric 1 and 0 respectively. Array is now [0;0;1;0;1;0;0;0;0;0;0;0;0;0]
  • SUM function adds up the 1's (TRUE's) in array which gives the sum of matching unique values! (=2)

This formula appears to work for the first item in the list (Apples) when entered into cell E2 and illustrates the requisite array values above when stepping through the calculation using the 'Evaluate Formula" in Formulas toolbar, however for the subsequent items (Oranges, Pears), it does not seem to work in same way :(

When copying the formula down to E3 to match "Orange" entries, the formula changes to
Code:
{=SUM(--(FREQUENCY(IF($A$2:$A$15=[COLOR=#ff0000]D3[/COLOR],$B$2:$B$15),$B$2:$B$15)=1)}
as expected.


  • the IF function will pick out entries in cells A2 to A15 that match cell D3 (all "Orange") and return an array of values in B2 to B15 (pips) associated with the matching (Orange) row entries [4;4;2;3]
  • the FREQUENCY statement will then use this array of values and count how many times each value is repeated against the values in the same range (B2 to B15) but for each subsequent repeat value it will set that occurrence to 0 as it has already been accounted for. Calculated frequency array I thought would be [0;0;0;0;0;0;2;0;1;1;0;0;0;0] (2x4's, 1x2's, 1x3's).

However,
when I look at 'Evaluate Formula" in Formulas toolbar, at this point, the array has values [1;1;0;0;2;0;0;0;0;0;0;0;0;0] which I don't understand, it looks like it is counting out of sequence or offset?

It is giving me the same end result but I don't follow the logic for this second item (Orange) and other items (Pears).

This is ever more significant as when I try to use a related formula in F2 to find the duplicates
Code:
{=SUM(--(FREQUENCY(IF($A$4:$A$17=D5,$B$4:$B$17),$B$4:$B$17)[COLOR=#ff0000]>1[/COLOR]))}
logic being that any duplicates would have a frequency of occurrence greater than 1, I get strange results for the subsequent items (Oranges and Pears) in the list, I don't get the expected end result.​


Hopefully the above isn't too complex to understand, would greatly appreciate some help with this as it is driving me mad :rofl:.

Many thanks in advance,

Professor
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
Maybe:
Formula for unique is an array formula and must be entered with CTRL-SHIFT-ENTER.
Drag formulas down as needed.
Excel Workbook
ABCDEF
1ItemPipsFruitNo. of uniqueNo. of Duplicate
2Apple2Apple42
3Apple3Orange31
4Apple1Pear40
5Apple2
6Apple4
7Apple3
8Orange4
9Orange4
10Orange2
11Orange3
12Pear2
13Pear3
14Pear4
15Pear5
Sheet
 
Upvote 0
Hello AhoyNC, thanks for your reply,

Unfortunately, this is not quite what I am looking for, in Column E I am trying to calculate truly unique entries - e.g. not including any the first instance of an entry that is duplicated.

For example:
Row 2 (A2:B2) has the same detail as Row 5 (A5:B5) (apples with 2 pips) and thus is not truly unique - I don't want this included in the unique calculation
Row 3 (A2:B2) has the same detail as Row 7 (A5:B5) (apples with 3 pips) and thus is not truly unique - I don't want this included in the unique calculation

Therefore truly unique entries with "Apple" item would be rows 4 and 6, meaning 2 unique entries in total - this is the calculation I am requiring.

I should've elaborated in my initial post, but the duplication calculation in column F, I want to be independent from Column E - that is, the value calculated in Column F is not derived from Column E but is independently derived from the source data.

Thanks again,
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top