to Aladin Akyurek
Posted by kristel on December 06, 2001 5:01 AM
Aladin, as i was browsing this site, i read your solution for a multi conditional sum if, using sumproduct.
I didn't know one can include conditions in this function.
could you tell me where i can find some documentation about this subject ?
Is it possible to use thsi "trick" with other functions too?
thanks in advance!
Kristel
Posted by Aladin Akyurek on December 09, 2001 10:34 AM
Condition-driven computing
Kristel --
The day you posted your query I had to give a class on multiconditional counting and summing. What follows is more or less an extract from that class.
COUNT (or COUNTA) is used for counting, SUM for summing when no condition involved.
COUNTIF is used for counting, SUMIF for summing when a single condition involved.
SUMPRODUCT or an array formula is used for counting as well as for summing when two or more conditions involved. I call these respectively multiconditional count and multiconditional sum.
The syntax of COUNTIF:
COUNTIF(Range,Condition)
where Condition is built up with comparison operators and either a constant (in a cell) or a computation.
Examples:
=COUNTIF(A1:A10,">"&B1)
counts entries in A1:A10 greater than the value in B1;
=COUNTIF(A1:A10,">" & AVERAGE(A1:A10))
counts entries in A1:A10 greater than the average of A1:A10 (a computation).
The syntax of SUMIF:
SUMIF(Range1,Condition,Range2)
sums the cells in Range2 associated with (in the same row as) the cells of Range1 for which Condition holds. Note that Range2 can be the same as Range1.
Examples:
=SUMIF(A1:A10,D1,B1:B10)
sums each entry in B1:B10 when the associated entry in A1:A10 is equal to the value in D1;
=SUMIF(E1:E10,">" & AVERAGE(E1:E10),E1:E10)
which can be shortened just to
=SUMIF(E1:E10,">" & AVERAGE(E1:E10))
sums every number in E1:E10 greater than the average of E1:E10.
Syntax of SUMPRODUCT (from Excel's Index under Help):
" Multiplies corresponding components in the given arrays, and returns the sum of those products.
Syntax
SUMPRODUCT(array1,array2,array3, ...)
Array1, array2, array3, ... are 2 to 30 arrays whose components you want to multiply and then add.
¥ The array arguments must have the same dimensions. If they do not, SUMPRODUCT returns the #VALUE! error value.
¥ SUMPRODUCT treats array entries that are not numeric as if they were zeros.
Example
[ A1:B3 houses {3,4;8,6;1,9}, D1:E3 {2,7;6,7;5,3}. ]
The following formula multiplies all the components of the two arrays on the preceding worksheet [example] and then adds the products Ñ that is, 3*2 + 4*7 + 8*6 + 6*7 + 1*5 + 9*3.
SUMPRODUCT({3,4;8,6;1,9}, {2,7;6,7;5,3}) equals 156
Remark
The preceding example returns the same result as the formula SUM(A1:B3*D1:E3) entered as an array. Using arrays provides a more general solution for doing operations similar to SUMPRODUCT. For example, you can calculate the sum of the squares of the elements in A1:B3 by using the formula SUM(A1:B3^2) entered as an array. "
A larger example follows.
Suppose that A1:E24 houses the sample monthly sales data (took the sample from Walkenbach) in a worksheet named Data:
={"Month","Sales Rep","Region","Contacts", "Sales";"Jan","Bob","North",58,283800; "Jan","Frank","North",35,507200; "Jan","Paul","South",25,107600; "Jan","Randy","South",47,391600; "Jan","Mary","South",39,226700; "Feb","Bob","North",44,558400; "Feb","Jill","North",46,350400; "Feb","Frank","North",74,411800; "Feb","Paul","South",29,154200; "Feb","Randy","South",45,258000; "Feb","Mary","South",52,233800; "Mar","Bob","North",30,353100; "Mar","Jill","North",44,532100; "Mar","Frank","North",57,258400; "Mar","Paul","South",13,286000; "Mar","Randy","South",14,162200; "Mar","Mary","South",36,134300; "Apr","Bob","North",54,595500; "Apr","Jill","North",44,480100; "Apr","Frank","North",79,555500; "Apr","Paul","South",36,328200; "Apr","Randy","South",31,154200; "Apr","Mary","South",22,200600}
Note 1. Copy this including the =-sign. Activate A1 in a worksheet that you name Data, paste it in the Formula Bar, and hit enter. Activate A1, select the range A1:E24, go to the Formula Bar, and hit CONTROL+SHIFT+ENTER at the same time. Copy the range A1:E24 then activate just A1 and do Edit|Paste Special -> Values.
Notice that in month Jan the Sales Rep Bob had 58 clients (Contacts) in Region North, achieving a Sales amount of $ 283,800. Also, different reps are active in a region in each and every month.
Note 2. Give the following names to respective ranges via the Name Box or via Insert|Name|Define:
MONTHS for A2:A24
REPS for B2:B24
REGIONS for C2:C24
CONTACTS for D2:D24
SALES for E2:E24
Intermezzo: Suppose that we want a summary of Contacts and Sales per Sales Rep. SUMIF will serve this purpose tout court. The condition for totaling (Summing) is the name of the Sales Rep. In order to obtain this summary, create in a worksheet (which you can name SUMIF) the following in B4:B12:
{"Sales Rep"; "Bob"; "Frank"; "Paul"; "Randy"; "Mary"; "Jill"; "Ron"; "Tim"} [ a uniquified list of Sales Rep that can be extracted from the Sales Rep column of Data with Advanced Filter ]
and the following list of labels in C4:D4:
{"Contacts","Sales"}
In C5 enter: =SUMIF(REPS,B5,CONTACTS)
In D5 enter: =SUMIF(REPS,B5,SALES)
Select C5:D5 and copy down. You'll get:
{"Sales Rep","Contacts","Sales"; "Bob",560,4403700; "Frank",644,4373400; "Paul",392,2992300; "Randy",380,3394000; "Mary",476,2730300; "Jill",548,4153900; "Ron",51,297244; "Tim",73,400305}
End Intermezzo. Now back to SUMPRODUCT.
Create in yet another worksheet (which you can name SUMPRODUCT) the following list of months in B10:B21:
{"Jan"; "Feb"; "Mar"; "Apr"; "May"; "Jun"; "Jul"; "Aug"; "Sep"; "Oct"; "Nov"; "Dec"}
the following list of labels in C9:E9:
{"Sales Reps","Contacts","Sales"}
an in B1 the following label: Region.
The last sheet is prepared to create a summary per region per month.
If we enter "north" in B2, the summary will cover region "north". If we enter nothing in B2 (we leave it empty), the summary will cover all regions taken together. Now the SUMPRODUCT formulas:
In C10 enter and copy down till month Dec:
=IF(LEN($B$2),
SUMPRODUCT((MONTHS=B10)*(REGIONS=$B$2)),
COUNTIF(MONTHS,B10))
If B2 is 'north', the SUMPRODUCT part of this IF-formula will be executed, resulting in a count showing that 2 Sales Reps have been active in month 'Jan' in region 'north'.
The SUMPRODUCT part consists here of two boolean terms: (MONTHS=B10) and (REGIONS=$B$2) which are ANDed, that is, * between these terms is a boolean AND that is similar to logical function AND [ Remark: AND(MONTHS=B10,REGIONS=$B$2) is not allowed, becasue AND doesn't produce a constant array in Excel, just a single logical result. ]
What the SUMPRODUCT part above does is a multiconditional count. A row of a month and a region is counted if the month-value is "Jan" and if the region-value is "north". How it works?
Each cell in the MONTHS range is evaluated to see if it's "Jan", leading to the array:
{TRUE; TRUE; TRUE; TRUE; TRUE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE; FALSE}
and each cell in the REGIONS range is evaluated leading to the array:
{TRUE; TRUE; FALSE; FALSE; FALSE; TRUE; TRUE; TRUE; FALSE; FALSE; FALSE; TRUE; TRUE; TRUE; FALSE; FALSE; FALSE; TRUE; TRUE; TRUE; FALSE; FALSE; FALSE}
Multiplying (ANDing) these two arrays leads to the array:
{1; 1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0}
where TRUE*TRUE = 1, TRUE*FALSE = 0, and FALSE*FALSE = 0 [ that is, Excel converts TRUE to 1, FALSE to 0, and executes the multiplication ]
then the resulting array of 1's and 0's is summed, which generates 2, the count we are looking for.
I leave to you what the COUNTIF part of the IF-formula is supposed to do.
In D10 enter and copy down till month Dec:
=IF(LEN($B$2),
SUMPRODUCT((MONTHS=B10)*(REGIONS=$B$2),(CONTACTS)),
SUMIF(MONTHS,B10,CONTACTS))
The SUMPRODUCT part of the foregoing IF-formula produces a total of all clients/contacts per region and per month, which is a multiconditional sum. That is, it sums all values in CONTACTS if the associated MONTHS values are "Jan" and REGIONS values are "north".
If B2 is 'north', the SUMPRODUCT part will be executed, resulting in a total (sum) showing that 93 clients/Contacts have been served/invoiced in month 'Jan' in region 'north'.
We have already seen the result array from (MONTHS=B10)*(REGIONS=$B$2) :
{1; 1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0}
multiplied with
{58; 35; 25; 47; 39; 44; 46; 74; 29; 45; 52; 30; 44; 57; 13; 14; 36; 54; 44; 79; 36; 31; 22} consisting of values in CONTACTS, we get the array (see the MS explanation at the beginning) :
{58; 35; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0}
which is then summed, resulting in 93.
I leave again to you what the SUMIF part of the preceeding IF-formula is supposed to do.
Some more examples:
Mike Winters (see 8198.html ) wanted to compute the number of people who succedeed on 3 tests they have taken. Note that there are also blanks cells which indicate that a testee has not (yet) taken the target test. A row of 3 values count as one if each cell/test score is higher than or equal to 90, which is a clear case of a multiconditional count of ANDing type:
=SUMPRODUCT((AL4:AL110>=90) * (AM4:AM110>=90) * (AN4:AN110>=90))
where AL4:AL100 houses the scores on the first test, AM4:AM110 the scores of the second test, and AN4:AN110 the scores on the first test.
What if you want to compute the number of people that failed these tests (see: 6922.html). One fails a test if the score is lower than 60. This is also a multiconditional count, but an OR'ing type. One score below 60 on one of the three test scores is sufficient to be counted as failed. The formula that follows just does that:
=SUMPRODUCT( (ISNUMBER(AL4:AL110)) * (ISNUMBER(AM4:AM110)) * (ISNUMBER(AN4:AN110)) * (((AL4:AL110 < 60) + (AM4:AM110 < 60) + (AN4:AN110 < 60)) = {1,2,3}))
The ISNUMBER test is necessary because SUMPRODUCT treats a blank cell as zero. If we leave out the ISNUMBER test, the number of failed people will be conflated. And, two or three scores in a row should be treated as one failure. Whence + in the formula (boolean OR) along with ={1,2,3}.
Another nice question was posed by John A. McGraw (see: 6922.html). He wanted "to know the rank of 5 feet tall kids among only the 10 year old kids". The question can be paraphrased as a conditional PERCENTRANK as he nicely put it. The SUMPRODUCT formula that follows meets that need:
=SUMPRODUCT((A1:A100=10) * (B1:B100 < 5)) / (COUNTIF(A1:A100,10)-1)
where A1:A100 houses ages, B1:B100 heights. One can do everything reviewed above with the array formulas. But for this one I'd go for SUMPRODUCT.
Alan Bunyan wanted to get rid of the numeric part of a bunch of strings like
{"vbn52";
"vb263";
"s45"}
The desired result is:
{"vbn";
"vb";
"s"}
=SUBSTITUTE(A1, RIGHT(A1, SUMPRODUCT((LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9},""))))),"")
does the job. [ I believe I have been the first to set up the last formulas; you are free to use them as long as the due credit is given. :) ]
Some people were puzzled about the workings of the last formula. What follows is a nice explanation by Harlan Grove for which I'm grateful.
"As used in Aladin's formula, the SUBSTITUTE array result is an argument to
the LEN function, and that in turn is part of an array expression that's an
argument to SUMPRODUCT. SUMPRODUCT then reduces this to a single number. If
A1 contained 'VBN1234567890', Aladin's formula works as follows.
SUBSTITUTE(A1,{0,1,2,3,4,5,6,7,8,9},"") returns the array
{"VBN123456789", "VBN234567890", "VBN134567890", "VBN124567890", "VBN123567890",
"VBN123467890", "VBN123457890", "VBN123456890", "VBN123456790", "VBN123456780"}
LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9},"")) returns the array
{12, 12, 12, 12, 12, 12, 12, 12, 12, 12}
(LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9},""))) returns the array
{1,1,1,1,1,1,1,1,1,1}
SUMPRODUCT((LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9},"")))) returns
10
The rest I leave to you."
All this is too long. Hope it covers what you needed to know about multiconditional count and sum.
Regards,
Aladin
========