Hi All,
I'm looking for a fast formula to perform a lookup with multiple criteria, where one of the criteria is sorted and the other is not. Any ideas? Currently I have a formula that is giving the correct results, but with large datasets it is too slow.
The spreadsheet has a raw data sheet that is arranged in batches. Each row has a unique (but not always incrementing) BatchID, a Sample Date/Time that is always incrementing (well at least is never decrementing, it is possible for there to be two identical Sample Date/Time values), and then columns with the values. I have no control over the raw data sheet.
For a given date I want to lookup the values, so I have a formatted data sheet with a row for each day, I need to lookup the raw datasheet and find the ph, flow and other values from that day (irrespective of the time of day). What is making this tricky is:
• Not all dates are present in the raw data. In this case I want to display a blank cell in the formatted data sheet
• Some dates have multiple batches. For example the pH and Flow may have been entered at different times. So the lookup needs to check not only for the correct date, but also that there is data in the desired column.
Examples:
Raw data sheet:
Formatted data sheet
My current formula to lookup pH is (i.e. formula in cell B2):
=IFERROR(INDEX('Raw Data'!C$2:C$5000,MATCH(1,(TRUNC('Raw Data'!$B$2:$B$5000,0)=$A2)*('Raw Data'!C$2:C$5000<>""),0)),"")
Adding a helper column in an intermediate sheet to truncate the date/time reduced the execution time by about half. This helps, but I'm looking for about 100 x reduction in execution time.
Just looking up the date and using an approximate match gave the speed increase I'm looking for but can't handle the case of multiple batches on the same day.
Any ideas?
I'm not able to use vba/macros in the workbook.
I'm looking for a fast formula to perform a lookup with multiple criteria, where one of the criteria is sorted and the other is not. Any ideas? Currently I have a formula that is giving the correct results, but with large datasets it is too slow.
The spreadsheet has a raw data sheet that is arranged in batches. Each row has a unique (but not always incrementing) BatchID, a Sample Date/Time that is always incrementing (well at least is never decrementing, it is possible for there to be two identical Sample Date/Time values), and then columns with the values. I have no control over the raw data sheet.
For a given date I want to lookup the values, so I have a formatted data sheet with a row for each day, I need to lookup the raw datasheet and find the ph, flow and other values from that day (irrespective of the time of day). What is making this tricky is:
• Not all dates are present in the raw data. In this case I want to display a blank cell in the formatted data sheet
• Some dates have multiple batches. For example the pH and Flow may have been entered at different times. So the lookup needs to check not only for the correct date, but also that there is data in the desired column.
Examples:
Raw data sheet:
Problem example 2.xlsx | ||||||
---|---|---|---|---|---|---|
A | B | C | D | |||
1 | BatchID | Sample Date/Time | pH | Flow | ||
2 | 36 | 01/07/2013 10:30:00 | 6.59 | 254 | ||
3 | 39 | 02/07/2013 06:00:00 | 6.45 | |||
4 | 40 | 02/07/2013 09:30:00 | 236 | |||
5 | 41 | 03/07/2013 07:30:00 | 6.71 | 171 | ||
6 | 42 | 05/07/2013 08:00:00 | 6.35 | 191 | ||
7 | 43 | 06/07/2013 06:00:00 | 6.16 | 221 | ||
8 | 44 | 07/07/2013 08:00:00 | 6.15 | 233 | ||
Raw Data |
Formatted data sheet
Problem example 2.xlsx | |||||
---|---|---|---|---|---|
A | B | C | |||
1 | Date | pH | Flow | ||
2 | 1/07/2013 | 6.59 | 254 | ||
3 | 2/07/2013 | 6.45 | 236 | ||
4 | 3/07/2013 | 6.71 | 171 | ||
5 | 4/07/2013 | ||||
6 | 5/07/2013 | 6.35 | 191 | ||
7 | 6/07/2013 | 6.16 | 221 | ||
8 | 7/07/2013 | 6.15 | 233 | ||
Formatted Data |
Cell Formulas | ||
---|---|---|
Range | Formula | |
B2:C8 | B2 | =IFERROR(INDEX('Raw Data'!C$2:C$5000,MATCH(1,(TRUNC('Raw Data'!$B$2:$B$5000,0)=$A2)*('Raw Data'!C$2:C$5000<>""),0)),"") |
A3:A8 | A3 | =A2+1 |
My current formula to lookup pH is (i.e. formula in cell B2):
=IFERROR(INDEX('Raw Data'!C$2:C$5000,MATCH(1,(TRUNC('Raw Data'!$B$2:$B$5000,0)=$A2)*('Raw Data'!C$2:C$5000<>""),0)),"")
Adding a helper column in an intermediate sheet to truncate the date/time reduced the execution time by about half. This helps, but I'm looking for about 100 x reduction in execution time.
Just looking up the date and using an approximate match gave the speed increase I'm looking for but can't handle the case of multiple batches on the same day.
Any ideas?
I'm not able to use vba/macros in the workbook.