I'm not Barry, but I will take a stab at explaining his brilliance (someone correct me if I mess anything up). This formula makes use of not one, but several obscure facets of Excel so it's easier to understand if you break it out into pieces and digest one piece at a time.
First, let's talk about SEARCH. A lot of us are already familiar with SEARCH; the SEARCH function returns the character position where a search value (“find_text”) is found at. So if you enter:
Code:
=SEARCH("Bar","Foo Bar Baz Bar")
SEARCH will return 5. This is because the first character of the
first instance of Bar occurs at the 5th character of "Foo
Bar Baz Bar" (“within_text”). The third argument of SEARCH won’t matter to this discussion so I am going to ignore that for now. Pretty simple so far, right?
Ok so let’s dig a little deeper. What few people know is that
the first parameter of SEARCH can take multiple find_text values. What!? It's true. If you pass the "find_text" an Array or a Range, SEARCH will check each entry against the "within_text" to see if any cell/element in the “find_text" Range/Array is found. Pretty neat right? There is a catch though: The return value comes in an array.
Allow me to illustrate. If you enter
Code:
=SEARCH({"Baz","Bar"},"Foo Bar Baz")
you would get a result (in the form of an array) reading: {9,5}. This is because "Baz" was found at the 9th character while "Bar" was found at the 5th.
So what can you do with that? Of course if you enter the formula in a cell, you will only see "9". This is because a cell may only have one value. If a value is an array, the cell will treat the first element of the array as it’s value.
However, any formula capable of receiving an array as a parameter can work with the full array.
For example: INDEX returns the
Nth element of an array. If you had an array: {6,3,9}
returns 3 (the 2nd element).
Code:
=INDEX(SEARCH({"Baz","Bar"},"Foo Bar Baz"),2)
returns 5 as this is the second element of the array returned by SEARCH.
Got all that? Eyes glazed over yet?
Good
Now that you know the pertinent details of SEARCH, let’s consider what we need to know about LOOKUP.
The first thing to know about lookup (this is important) is that if LOOKUP can't find the lookup_value,
it uses “the largest value in the array that is less than or equal to lookup_value”. Read it again, I’ll wait.
OK,so this means if you enter
the return value will be 4. 4 is an exact match for 4, so 4 is returned. If you were to enter
the return value will be 2. 2 is the greatest number in the array smaller than 4. So the return value is 2. This specific principle will be why you see the magic 2^15. More on that later.
Moving on. LOOKUP can take a third argument called “result_vector”. This argument is also an Array/Range. If the third argument of LOOKUP is used, rather than returning the “the largest value in the array that is
less than or equal to lookup_value” LOOKUP returns the element of the "result_vector" having the same index as the element found in the "lookup_vector". Did your mind just boggle? I’ll give an example:
Code:
=LOOKUP(4,{0,2,4,1,5},{4,7,2,1,0})
returns 2. 4 is an exact match for 4, found at the 3rd element of the lookup_vector parameter. Because the lookup_value was found at the 3rd position of the look_vector, the
return[/] value is the 3rd position of the return_vector. In this case that value is 2.
If you were to look up 3 instead of 4 the found element would be 2 (because there is no 3) so the return value would be... 7.
Alrighty, now that you know how all the pieces work, let's put it all together. To refresh your memory, here is Barry's very elegant formula again:
Code:
=LOOKUP(2^15,SEARCH(D$2:D$10,A4),E$2:E$10)
For the purposes of this explanation I will assume you pasted this formula into cell B2 of the example file and copied down.
Notice that SEARCH is looking for any/all of the colors in column D. You can also see that because each formula evaluates only one cell, SEARCH is looking for those colors in just the cell next to the formula. SEARCH will check the phrase for any color, and will return an array filled with the results. The results for B3 would look something like this: {14,#Value!,#Value!,#Value!,...}. Blue will be found at the 14th character, but every other color searched for will not be found (thus #Value!). This array will now be passed into LOOKUP as the “lookup_vector” argument.
Now about that magic “2^15”: LOOKUP has the results of SEARCH to sift through, but what should we be looking up? Because the array is a result of the SEARCH function, we know that the array’s values will be limited to numbers or the #Value! Error. And we know we don’t care about the colors that aren’t found, so we are looking for a number.
Well, that’s all well and good, but of course we have no way of knowing what that number will be! The color could be found anywhere in the phrase! And this is why it’s important to know that when LOOKUP can't make an Exact match it will return the next largest number. So if we feed it number that is guaranteed to be bigger than anything in the array, LOOKUP will ignore the #Value! Errors and find the biggest number in the array as the match.
So how big a number do we need? SEARCH returns the character it found the find_text at. So it can’t return any number larger than the maximum length of a cell. The maximum length is 32,767 characters. You may recognize this number as 1 less than 2 to the 15th power (2^15=32,768). So if we sift through array returned by SEARCH for 32,768 SEARCH will return the biggest number in the array. This nicely filters out all of those #Value! entries the represent value not found. And ensures that the element returned defiantly represent a found value. (Barry’s a clever guy.) Now that we have a found element, LOOKUP will return the corresponding lookup vector which happens to be E$2:E$10... And Viola! The formula is compete.
What? What happens if you have multiple values in a phrase? Darn, I thought you would miss that. Because LOOKUP is returning the biggest number smaller than an impossibly large number the entry matched will always be the one with the largest number. Right? Right?! Well… No. The matching part is true, but there is a caveat. If you are using the vector implementation of LOOKUP (the one with 3 arguments) then there is one other thing you need to know.
The vector version of LOOKUP assumes the values it gets are pre-sorted in ascending order. LOOKUP then begins checking entries from the final element backwards until it encounters a value less than or equal to the lookup_value. Do you see what happens? Because all the values are going to be less than 32,768 LOOKUP will stop at the first non-error value it encounters. In other words it will always match the color listed last. No matter where in the find_text the color is found, if it is found, the color position in the array returned by SEARCH is pegged to the order of the range. Thus Red takes precedence over blue, yellow over red etc. So if you need to, you can raise a color's precedence changing the list's order.
So that's it. Barry’s elegant little one-liner took my about 3 pages to explain (and possibly only slightly coherently) thanks to the hour. But that's the explanation. The only thing I can think to add is that if you are using this formula in very many cells, I would hard-code 2^15 so Excel doesn’t have to do that same math over and over again.
Once again kudos to Barry for such an outstanding solution.