Aladin Akyurek
MrExcel MVP
- Joined
- Feb 14, 2002
- Messages
- 85,210
What follows is triggered by a recent thread on unique count:
http://www.mrexcel.com/board2/viewtopic.php?t=295052
As the above timing results with FastExcel based on three runs show, the formula with COUNTDIFF is the winner, the one with FREQUENCY/MATCH (a recent addition to the set of formulas for unique/distinct count) the second best. The formula which is apparently posted on "JWalk" site, that is,
{=SUM(IF(COUNTIF(A2:A13225,A2:A13225)=0,"",1/COUNTIF(A2:A13225,A2:A13225)))}
is the worst performer.
Note that COUNTDIFF is case-senstive. That is why it includes UPPER to simulate the others.
Note also that the CountDiff formula will yield correct a correct result in face of special meaning chars like<, *, ? that could figure in strings. The formulas with CountIf won't. The formula with Frequency/Match while less sensitive to such chars, it's not immune against trailing "*".
http://www.mrexcel.com/board2/viewtopic.php?t=295052
Performance of Formulas for Unique Count.xls | ||||||||
---|---|---|---|---|---|---|---|---|
A | B | C | D | E | F | |||
1 | {=COUNTDIFF(UPPER(A2:A13225),TRUE,"")} | |||||||
2 | Calc Time(Millisec) | % | Book | MicroSecs | ||||
3 | Re Calc | Full Calc | Volatile | OvHead | /Formula | |||
4 | 0.0 | 57.1 | 0.1% | 5.2 | 57,093.6 | |||
5 | 0.0 | 48.9 | 0.1% | None | 48,939.7 | |||
6 | 0.0 | 48.8 | 0.1% | None | 48,771.8 | |||
7 | MEDIAN | 0.0 | 48.9 | 0.1% | 515.9% | 48,939.7 | ||
8 | ||||||||
9 | {=SUM(IF(A2:A13225<>"",1/COUNTIF(A2:A13225,A2:A13225)))} | |||||||
10 | Calc Time(Millisec) | % | Book | MicroSecs | ||||
11 | Re Calc | Full Calc | Volatile | OvHead | /Formula | |||
12 | 0.0 | 8,537.2 | 0.0% | 75.9 | 8,537,211.3 | |||
13 | 0.0 | 8,524.5 | 0.0% | None | 8,524,544.3 | |||
14 | 0.0 | 8,417.5 | 0.0% | 15.6 | 8,417,450.5 | |||
15 | MEDIAN | 0.0 | 8,524.5 | 0.0% | 45.7 | 8,524,544.3 | ||
16 | ||||||||
17 | =SUMPRODUCT((A2:A13225<>"")/COUNTIF(A2:A13225,A2:A13225&"")) | |||||||
18 | Calc Time(Millisec) | % | Book | MicroSecs | ||||
19 | Re Calc | Full Calc | Volatile | OvHead | /Formula | |||
20 | 0.0 | 8,390.3 | 0.0% | None | 8,390,289.5 | |||
21 | 0.0 | 8,394.9 | 0.0% | 1.3 | 8,394,944.9 | |||
22 | 0.0 | 8,500.7 | 0.0% | 14.8 | 8,500,686.0 | |||
23 | MEDIAN | 0.0 | 8,394.9 | 0.0% | 8.1 | 8,394,944.9 | ||
24 | ||||||||
25 | {=SUM(IF(COUNTIF(A2:A13225,A2:A13225)=0,"", 1/COUNTIF(A2:A13225,A2:A13225)))} | |||||||
26 | Sheets | Calc Time(Millisec) | % | Book | MicroSecs | |||
27 | Count | Re Calc | Full Calc | Volatile | OvHead | /Formula | ||
28 | 3 | 0.0 | 16,890.6 | 0.0% | 85.6 | 16,890,555.1 | ||
29 | 3 | 0.0 | 16,938.9 | 0.0% | None | 16,938,900.1 | ||
30 | 3 | 0.0 | 16,937.9 | 0.0% | None | 16,937,873.7 | ||
31 | MEDIAN | 0.0 | 16,937.9 | 0.0% | 85.6 | 16,937,873.7 | ||
32 | ||||||||
33 | {=SUM(IF(FREQUENCY(IF(A2:A13225<>"", MATCH("~"&A2:A13225,A2:A13225&"",0)), ROW(A2:A13225)-ROW(A2)+1),1))} | |||||||
34 | Calc Time(Millisec) | % | Book | MicroSecs | ||||
35 | Re Calc | Full Calc | Volatile | OvHead | /Formula | |||
36 | 0.0 | 677.3 | 0.0% | 0.1 | 677,286.6 | |||
37 | 0.0 | 677.6 | 0.0% | 0.8 | 677,571.3 | |||
38 | 0.0 | 676.8 | 0.0% | None | 676,776.0 | |||
39 | MEDIAN | 0.0 | 677.3 | 0.0% | 0.4 | 677,286.6 | ||
FastXL |
As the above timing results with FastExcel based on three runs show, the formula with COUNTDIFF is the winner, the one with FREQUENCY/MATCH (a recent addition to the set of formulas for unique/distinct count) the second best. The formula which is apparently posted on "JWalk" site, that is,
{=SUM(IF(COUNTIF(A2:A13225,A2:A13225)=0,"",1/COUNTIF(A2:A13225,A2:A13225)))}
is the worst performer.
Note that COUNTDIFF is case-senstive. That is why it includes UPPER to simulate the others.
Note also that the CountDiff formula will yield correct a correct result in face of special meaning chars like<, *, ? that could figure in strings. The formulas with CountIf won't. The formula with Frequency/Match while less sensitive to such chars, it's not immune against trailing "*".