Sumproduct return unique values only not working

doug firr

Board Regular
Joined
Mar 28, 2011
Messages
140
This formula is to return the unique values in a range. The range in question is a list of email addresses.

=SUMPRODUCT(1/COUNTIF($P$2:$P$70348,$P$2:$P$70348))

I have used this formula to count unique values before and have not had problems.

It returns 0 when it should be closer to 50 or 60k.

Any ideas?
 
Your gutless old machine is awesome at slightly less than 1 second. My work PC takes about 4 minutes. I guess that is why I don't use array formulas - stopped using them ~10 years ago in fact.

For me the UDF calculates as for you.
Related...

Find unique value

Demonstrates how testing for conditions that may not exist can be costly.
 
Last edited:
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
Related...

Find unique value

Demonstrates how testing for conditions that may not exist can be costly.

I don't understand, sorry.

I had a sample of 60,000 rows of data. The formula that took about one second on your machine took about four minutes on mine.

If there are special conditions on the data I missed that bit. I had - values only, no formulas - 60,000 text entries (to simulate the email address). The data was not sorted or anything done to it at all.

Given the huge difference in calculation times I assume there is something different in out setups: I don't know what it is though.

regards
 
Upvote 0
I don't understand, sorry.

I had a sample of 60,000 rows of data. The formula that took about one second on your machine took about four minutes on mine.

If there are special conditions on the data I missed that bit. I had - values only, no formulas - 60,000 text entries (to simulate the email address). The data was not sorted or anything done to it at all.

Given the huge difference in calculation times I assume there is something different in out setups: I don't know what it is though.

regards
In the link, there is a big difference in calc times between these 2 formulas:

=SUM(IF(FREQUENCY(IF(A2:A1000000<>"",MATCH("~"&A2:A1000000,A2:A1000000&"",0)),ROW(A2:A1000000)-ROW(A2)+1),1))


=SUM(IF(FREQUENCY(IF(A2:A1000000<>"",MATCH(A2:A1000000,A2:A1000000,0)),ROW(A2:A1000000)-ROW(A2)+1),1))

Concatenating the tilde and the blank are expensive and probably not needed yet lot's of folks still do it simply because they've seen other folks do it and, they probably don't even know what those chars are doing.

Back to the OPs question, although they didn't specifically say there were no empty cells involved, based on the info they provided we can determine that there are no empty cells in the range so the solution for them should not inclued a test for empty cells which will naturally improve performance.

I guess what I'm trying to say is: use the minimum formula that will do the job!
 
Last edited:
Upvote 0
In the link, there is a big difference in calc times between these 2 formulas:

=SUM(IF(FREQUENCY(IF(A2:A1000000<>"",MATCH("~"&A2:A1000000,A2:A1000000&"",0)),ROW(A2:A1000000)-ROW(A2)+1),1))


=SUM(IF(FREQUENCY(IF(A2:A1000000<>"",MATCH(A2:A1000000,A2:A1000000,0)),ROW(A2:A1000000)-ROW(A2)+1),1))

Concatenating the tilde and the blank are expensive and probably not needed yet lot's of folks still do it simply because they've seen other folks do it and, they probably don't even know what those chars are doing.

Back to the OPs question, although they didn't specifically say there were no empty cells involved, based on the info they provided we can determine that there are no empty cells in the range so the solution for them should not inclued a test for empty cells which will naturally improve performance.

I guess what I'm trying to say is: use the minimum formula that will do the job!

Understood, now, thank you very much. Appreciate the explanation.

Though I'm still curious why it takes one second on your computer and four minutes for me!
 
Upvote 0
Though I'm still curious why it takes one second on your computer and four minutes for me!
In Excel 2007...

I filled the range P2:P70348 with random numbers from 0 to 100 (I have a macro that does this so it's easy to do!).

I entered this array formula in R2:

=SUM(IF(FREQUENCY(MATCH($P$2:$P$70348,$P$2:$P$70348,0),ROW($P$2:$P$70348)-ROW($P$2)+1),1))

I then ran Charles Williams range_timer procedure 5 times. The reported calc times were:

Book1
A
20.99241
30.99252
40.99851
50.99445
60.99436
Sheet1

And the average calc time was 0.99445 seconds.
 
Upvote 0
Thanks, again. That clarifies the issue.

You tested with numbers, and that was quick. Was it just 100 different integers?

I tested with text strings - like the email addresses in the original question, btw.

When I test with 100 different integers the formula is quick, as for you.

When I test with thousands of different strings, it takes some minutes.

regards
 
Last edited:
Upvote 0
Thanks, again. That clarifies the issue.

You tested with numbers, and that was quick. Was it just 100 different integers?

I tested with text strings - like the email addresses in the original question, btw.

When I test with 100 different integers the formula is quick, as for you.

When I test with thousands of different strings, it takes some minutes.

regards
Yeah, what I did may not be a good representative example.

With a total of 101 possible unique values distributed in 70347 rows, the MATCH function probably finds it's matches within the first 2000 rows which makes the formula relatively "fast".
 
Upvote 0

Forum statistics

Threads
1,223,958
Messages
6,175,636
Members
452,662
Latest member
Aman1997

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top