Multi-condition array formulas

artofchemistry

New Member
Joined
Jan 7, 2019
Messages
4
Hi all

I've been using multi-conditional array formulas to extract specific data from multiple tabs in a spreadsheet that meet certain criteria. They work absolutely perfect, however, they require so much processing power that I've had to switch from automatic to manual calculations for the entire workbook.

I work in the engineering industry so my application is a bit more complicated than the dataset below, but I'll just use this as a simple example.

Looking at the data-set, what I would want to calculate is, for example, the average income for employees whose postcode is between 2200 and 2500 AND who's age is less than 50. This is pretty easy to do with an '{=AVERAGE(IF((postcode>2200)*(postcode<2500)*(age<50),(INCOME)))}' where the words are just the cell-ref arrays.


[TABLE="width: 454"]
<tbody>[TR]
[TD]Employee Name[/TD]
[TD]Pet's Name[/TD]
[TD]Income[/TD]
[TD]Postcode[/TD]
[TD]Age[/TD]
[/TR]
[TR]
[TD]A[/TD]
[TD]Alpha[/TD]
[TD] $ 53,000[/TD]
[TD="align: right"]2215[/TD]
[TD="align: right"]45[/TD]
[/TR]
[TR]
[TD]B[/TD]
[TD]Bravo[/TD]
[TD] $ 74,000[/TD]
[TD="align: right"]2310[/TD]
[TD="align: right"]64[/TD]
[/TR]
[TR]
[TD]C[/TD]
[TD]Charlie[/TD]
[TD] $ 39,000[/TD]
[TD="align: right"]2165[/TD]
[TD="align: right"]21[/TD]
[/TR]
[TR]
[TD]D[/TD]
[TD]Delta[/TD]
[TD] $ 98,000[/TD]
[TD="align: right"]2946[/TD]
[TD="align: right"]88[/TD]
[/TR]
[TR]
[TD]E[/TD]
[TD]Echo[/TD]
[TD] $ 225,000[/TD]
[TD="align: right"]2000[/TD]
[TD="align: right"]33[/TD]
[/TR]
[TR]
[TD]F[/TD]
[TD]Foxtrot[/TD]
[TD] $ 48,000[/TD]
[TD="align: right"]2445[/TD]
[TD="align: right"]46[/TD]
[/TR]
[TR]
[TD]G[/TD]
[TD]Charlie[/TD]
[TD] $ 85,000[/TD]
[TD="align: right"]2646[/TD]
[TD="align: right"]30[/TD]
[/TR]
</tbody>[/TABLE]

So, my question is, is there a more efficient or less processing intensive way to perform this calculation in excel?

Look forward to any responses.

many thanks :)

For reference, an example of an actual formula I use looks something like this: "=IF(COUNTBLANK(D$9)=1,"",IFERROR(AVERAGE(IF(('Op Data'!$H$6:$EU$6='Summary'!$B18)*('Op Data'!$H$7:$EU$7='Summary'!$C18)*('Op Data'!$H$5:$EU$5='Summary'!D$9)*('Op Data'!$H$1:$EU$1='Summary'!$B$1)*('Op Data'!$H$8:$EU$1398<>0),'Op Data'!$H$8:$EU$1398)),"-"))"
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Welcome to the forum.

This is certainly simpler. The functions like AVERAGEIFS were created to be more effciient than the AVERAGE(IF...) constructions that required CTRL+Shift+Enter. I also note that workbooks containing large numbers of array formulas will be slow, as will ones with even moderate occurrences of IFERROR (which is notoriously slow.)


Book1
ABCDEFGHIJ
1Employee NamePet's NameIncomePostcodeAgePostal Code range22002500
2AAlpha$53,000221545Age50
3BBravo$74,00023106450500
4CCharlie$39,000216521
5DDelta$98,000294688
6EEcho$225,000200033
7FFoxtrot$48,000244546
8GCharlie$85,000264630
Sheet1
Cell Formulas
RangeFormula
J3=AVERAGEIFS(C2:C8,D2:D8,"<="&I1,D2:D8,">="&H1,E2:E8,"<="&H2)
<strike>
</strike>
 
Last edited:
Upvote 0
Hi DrSteele,

Thanks for your quick reply.

I remember trying to use AVERAGEIFS when I was developing the spreadsheet, and have tried again to use it. I don't think it can work in my situation as the range is a matrix (x rows by y columns where x and y are >1) not just a single range as per my simple example above. This is why I went with the arrays originally. A better example would be similar to above but rather than the range being "C2:C8" it would be "C2:D8" and there'd be an extra condition to return the average "Income" or "Bonus" in addition to the other criteria - any thoughts on this?

[TABLE="width: 546"]
<colgroup><col><col><col span="2"><col><col></colgroup><tbody>[TR]
[TD]Employee Name[/TD]
[TD]Pet's Name[/TD]
[TD]Income[/TD]
[TD]Bonus[/TD]
[TD]Postcode[/TD]
[TD]Age[/TD]
[/TR]
[TR]
[TD]A[/TD]
[TD]Alpha[/TD]
[TD] $ 53,000[/TD]
[TD] $ 5,000[/TD]
[TD="align: right"]2215[/TD]
[TD="align: right"]45[/TD]
[/TR]
[TR]
[TD]B[/TD]
[TD]Bravo[/TD]
[TD] $ 74,000[/TD]
[TD] $ 10,000[/TD]
[TD="align: right"]2310[/TD]
[TD="align: right"]64[/TD]
[/TR]
[TR]
[TD]C[/TD]
[TD]Charlie[/TD]
[TD] $ 39,000[/TD]
[TD] $ 3,500[/TD]
[TD="align: right"]2165[/TD]
[TD="align: right"]21[/TD]
[/TR]
[TR]
[TD]D[/TD]
[TD]Delta[/TD]
[TD] $ 98,000[/TD]
[TD] $ 535[/TD]
[TD="align: right"]2946[/TD]
[TD="align: right"]88[/TD]
[/TR]
[TR]
[TD]E[/TD]
[TD]Echo[/TD]
[TD] $ 225,000[/TD]
[TD] $ 300[/TD]
[TD="align: right"]2000[/TD]
[TD="align: right"]33[/TD]
[/TR]
[TR]
[TD]F[/TD]
[TD]Foxtrot[/TD]
[TD] $ 48,000[/TD]
[TD] $ 3,900[/TD]
[TD="align: right"]2445[/TD]
[TD="align: right"]46[/TD]
[/TR]
[TR]
[TD]G[/TD]
[TD]Charlie[/TD]
[TD] $ 85,000[/TD]
[TD] $ 8,000[/TD]
[TD="align: right"]2646[/TD]
[TD="align: right"]30[/TD]
[/TR]
</tbody>[/TABLE]


As for the IFERRORs being notorious for hogging processing power, thanks for the tip! I'll try cull these down as I do have ALOT of them in my spreadsheet. Do you think if I was to replace them with conditional formatting it would speed up? (ie anywhere there's an error, have the text colour take on the cell background colour. I don't want #N/As or DIVOs throughout my data tables).

Many thanks for your help :D
 
Upvote 0
At first glance, AVERAGEIFS looks suitable. (So long as no users of the spreadsheet have an Excel version that predates AVERAGEIFS.)

Another aproach is to use a query (table). for the specific example, the specific SQL would be
Code:
SELECT Avg(Income) AS [Average Income]
FROM YourDefinedName
WHERE Postcode BETWEEN 2200 AND 2500 AND Age < 50

Queries are excellent when there are large datasets.
VBA functions might be suitable too.

In practice, the approach that is best will depend on the specifics of what is required.
(BTW, I'm guessing that a solution with a lot of array formulas applied over large datasets will be the slowest approach.)
 
Last edited:
Upvote 0
Let's see. Function AGGREGATE is fast, does not require CSE and handles arrays well. It does not handle arrays for the AVERAGE sub-function, but we can trick it by using the PERCENTILE sub-function (that's the 18 in argument1) with the k value being set to 0.5. So, we get the average by creating an array of values AND errors (which AGGREGATE will ignore when we put 6 in argument2) and calculating the 50th percentile.

So with our above data that now includes Bonus in ColumnD and all other columns shifted rightways by one, try this formula:

Code:
[FONT=Verdana,Arial,Tahoma,Calibri,Geneva,sans-serif]=AGGREGATE(18,6,C2:D8/((E2:E8<=J1)*(E2:E8>=I1)*(F2:F8<=I2)),0.5)[/FONT]
 
Upvote 0
If you have Excel365 Insider edition, the new function FILTER makes this task simple and very efficient.

[FONT=Verdana,Arial,Tahoma,Calibri,Geneva,sans-serif]=AVERAGE(FILTER(C2:D8,(E2:E8<=J1)*(E2:E8>=I1)*(F2:F8<=I2)))[/FONT]
 
Upvote 0
Hi guys, sorry I meant to tag you in my reply but tagged you in this thread instead!

DrSteele,

Thanks once again for your response. The aggregate function works much better, but unfortunately it does not appear to return the true average of the dataset. The 50%ile returns the median number if the dataset has an odd number of data points or the average of the two central data points if there is an even number. So its CLOSE but not the true average, which unfortunately isn't OK for my application. Any idea of any other subfunctions we could use to get the actual average of the array?

Also, I don't have Excel365 insider edition so FILTER is not an option for me :(

Fazza,

Thanks for your response. I've considered using SQL but am kind of scared of having to learn how to use it. Would you recommend it for someone with decent excel skills and basic programming abilities? It may be worth the effort if its as efficient as i've heard.

Cheers and many thanks to you both.
 
Upvote 0
Sorry my mistake. That percentile operation does not work. Before the end of Winter/19, MS will have all of the new functions available to all 365 users. So if you can get 365, you'll have an efficient solution to your problem.
 
Upvote 0
SQL is nothing to be scared of. It is simple - I think the solution shows that.

For most queries (basically other than UPDATE, DELETE & INSERT), the basic form is

SELECT fields you want
FROM table
WHERE criteria

text is delimited like 'this'

The value in learning SQL will depend on what you need to do. Likely you will find on-line help - like this forum - to give you solutions without actually needing to learn the specifics. regards

PS. SQL can be used in Excel (directly without programming) in query tables and pivot tables. When used in VBA there are further options like ADO.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,828
Messages
6,181,214
Members
453,024
Latest member
Wingit77

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top