I'm sitting here dumbfounded at the idea that I might have been doing things the dumb/slow way for 10 years. I've always used the MATCH function with 0 (exact match) as the 3rd (match_type) variable, e.g.
MATCH(lookup_value, lookup_array, 0).
Not sure why, that's just how I was taught. But I've recently learned about Binary vs Linear search, specifically how much faster the former is for large data sets, but with the one important stipulation being that the lookup_array must be sorted in ascending (or A-Z) order or the lookup may not return the correct value.
The speed difference is striking (my lookup arrays are often 20,000 to 100,000 rows long). So much so that it has me wondering whether I should be using a Binary search 100% of the time, even when it means adding the extra step of sorting the lookup_array ascending/alphabetical. I found this page that seems to suggest/test that exact theory (and seems to advocate doing exactly that.)
- Is that more or less correct?
- Are there any pitfalls to doing it this way?
- My lookup array values sometimes don't start in Row 1...e.g. there might be 4 empty (or null > "") cells before my lookup array values begin in A5 for example. Could that compromise this process? Are empty/null-value cells counted as "high" or "low" values for the purposes of ascending/alphabetical sort order?
MATCH(lookup_value, lookup_array, 0).
Not sure why, that's just how I was taught. But I've recently learned about Binary vs Linear search, specifically how much faster the former is for large data sets, but with the one important stipulation being that the lookup_array must be sorted in ascending (or A-Z) order or the lookup may not return the correct value.
The speed difference is striking (my lookup arrays are often 20,000 to 100,000 rows long). So much so that it has me wondering whether I should be using a Binary search 100% of the time, even when it means adding the extra step of sorting the lookup_array ascending/alphabetical. I found this page that seems to suggest/test that exact theory (and seems to advocate doing exactly that.)
- Is that more or less correct?
- Are there any pitfalls to doing it this way?
- My lookup array values sometimes don't start in Row 1...e.g. there might be 4 empty (or null > "") cells before my lookup array values begin in A5 for example. Could that compromise this process? Are empty/null-value cells counted as "high" or "low" values for the purposes of ascending/alphabetical sort order?