Hi T.Valko
That is very kind of you and yes I do agree that it is the only way we will all learn.
It was just that I thought that as you were kind enough to post various solutions (as others have been) to my excel query problems that I thought I may be over stepping the mark in also asking for an explanation.
I look forward to reading your explanation and thanks once again. Top class in my opinion.
Kind Regards
It took me forever but I finally got around to it.
Hopefully this helps...
Ok here's how this works.
We want to extract the first group of consecutive numbers from a larger string.
A2 = Delgany Blaze 12 P
B2 = formula
=LOOKUP(1E100,--MID(SUBSTITUTE(SUBSTITUTE(UPPER(A2),"P","x"),"A","x"),MIN(FIND({0,1,2,3,4,5,6,7,8,9},A2&"0123456789")),ROW(INDIRECT("1:"&LEN(A2)))))
Returns 12
The first thing we need to know is that if there is no numeric string within the cell entry then the formula will return #N/A.
We'll assume that every cell entry contains at least one numeric substring.
In summary, what the formula does is it finds the first number in the string. Starting from that location it searches substrings of characters of increasing length until it finds the numeric string that we're interested in.
The first thing we need to do is find the starting position in the string where the numeric substring is located. Since we don't know what digits may be present in the string we have to search for all of them. We do that with the FIND function:
FIND({0,1,2,3,4,5,6,7,8,9}
FIND will return the position (the character number) in the string where the first instance of the lookup value is located. If a particular digit is not found in the string then FIND will return a #VALUE! error and that would be the result of the formula. Since the string probably won't contain every digit we need to prevent the FIND function from generating the #VALUE! error. To do that we need to make sure it can find all 10 of the digits even if the string doesn't contain all 10 of the digits.
We do that by concatenating all of the digits to the end of the string:
FIND({0,1,2,3,4,5,6,7,8,9},A2&"0123456789")
In the formula, the string now looks like this:
Delgany Blaze 12 P0123456789
FIND looks for each of the 10 digits and returns this array of results:
FIND(0,A2&"0123456789") = 19
FIND(1,A2&"0123456789") = 15
FIND(2,A2&"0123456789") = 16
FIND(3,A2&"0123456789") = 22
FIND(4,A2&"0123456789") = 23
FIND(5,A2&"0123456789") = 24
FIND(6,A2&"0123456789") = 25
FIND(7,A2&"0123456789") = 26
FIND(8,A2&"0123456789") = 27
FIND(9,A2&"0123456789") = 28
{19,15,16,22,23,24,25,26,27,28}
This array is then passed to the MIN function which will return the minimum value from the array:
MIN({19,15,16,22,23,24,25,26,27,28})
=15
15 is the character location in the string Delgany Blaze 12 P where the first digit is found.
Once we know where the first digit is located we use that in the MID function as the starting point of where to look in the string for the numeric string we want to extract.
MID(string, starting_positon, number_of_characters)
MID("Delgany Blaze 12 P", 15, number_of_characters)
The number_of_characters argument is defined as:
ROW(INDIRECT("1:"&LEN(A2)))
What this does is it generates an array of numbers from 1 to the total length of the string. The length of the string (number of characters) is measured using the LEN function:
LEN(A2) = 18
So, ROW(INDIRECT("1:"&LEN(A2))) generates the array:
{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18}
This means we're going to step through our string starting from character position 15 (the result of the MIN function) and return 1 character, then 2 characters, then 3 characters, then 4 characters, etc., etc. , then 18 characters.
Before we do that we have to account for that quirk in Excel where it just loves to identify times like you discovered when the formula should have returned 12 but instead it returned 0.5.
We do that by virtually replacing any Ps and As in the string. If there are no Ps or As then Excel can't get confused and evaluate 12 P as 12:00 PM or 1 A as 1:00 AM.
We do that using the SUBSTITUTE function:
SUBSTITUTE(SUBSTITUTE(UPPER(A2),"P","x"),"A","x")
Since we need to replace 2 characters we have to use one instance of SUBSTITUTE for each character. SUBSTITUTE is case sensitive so we have to virtually convert the string to be all the same case so we use the UPPER function.
UPPER("Delgany Blaze 12 P") returns DELGANY BLAZE 12 P
Then we replace all the uppercase Ps and As with characters that won't have any impact on the formula evaluation process. In this application I chose to replace them with "x" characters. So, from:
SUBSTITUTE(SUBSTITUTE("DELGANY BLAZE 12 P","P","x"),"A","x")
We end up with:
"DELGxNY BLxZE 12 x"
We now have defined all the parameters of our search function (finally!)
From the string "DELGxNY BLxZE 12 x" we start at character positon 15 (the result of the MIN function) and step through the remaning characters the number of characters defined by the expression ROW(INDIRECT("1:"&LEN(A2))).
The number of strings we generate is basically an arbitrary number. We don't want to do too many but we need to make sure we do enough to get our result. So, we generate a number based on how long the string is, in this case 18.
Here are the strings generated:
MID("DELGxNY BLxZE 12 x",15,1) = 1
MID("DELGxNY BLxZE 12 x",15,2) = 12
MID("DELGxNY BLxZE 12 x",15,3) = 12[space]
MID("DELGxNY BLxZE 12 x",15,4) = 12 x
MID("DELGxNY BLxZE 12 x",15,5) = 12 x
MID("DELGxNY BLxZE 12 x",15,6) = 12 x
MID("DELGxNY BLxZE 12 x",15,7) = 12 x
MID("DELGxNY BLxZE 12 x",15,8) = 12 x
MID("DELGxNY BLxZE 12 x",15,9) = 12 x
MID("DELGxNY BLxZE 12 x",15,10) = 12 x
MID("DELGxNY BLxZE 12 x",15,11) = 12 x
MID("DELGxNY BLxZE 12 x",15,12) = 12 x
MID("DELGxNY BLxZE 12 x",15,13) = 12 x
MID("DELGxNY BLxZE 12 x",15,14) = 12 x
MID("DELGxNY BLxZE 12 x",15,15) = 12 x
MID("DELGxNY BLxZE 12 x",15,16) = 12 x
MID("DELGxNY BLxZE 12 x",15,17) = 12 x
MID("DELGxNY BLxZE 12 x",15,18) = 12 x
As you can see, once we progress and use all the remaining characters the string just keeps repeating.
Now, we have a bunch of subtrings from which we'll find our number string. The MID function is a text function and the returned data type is TEXT even though it may look like a number. So, we want to convert the TEXT number strings to numeric numbers. One way to do that is to use the double unary minus --.
Text numbers will be converted to numeric numbers and text strings will generate #VALUE! errors.
--("100") = 100 as a numeric number
--("text") = #VALUE!
Here are the converted strings:
--MID("DELGxNY BLxZE 12 x",15,1) = 1 (as a numeric number)
--MID("DELGxNY BLxZE 12 x",15,2) = 12 (as a numeric number)
--MID("DELGxNY BLxZE 12 x",15,3) = 12 (as a numeric number)
--MID("DELGxNY BLxZE 12 x",15,4) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,5) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,6) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,7) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,8) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,9) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,10) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,11) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,12) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,13) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,14) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,15) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,16) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,17) = #VALUE!
--MID("DELGxNY BLxZE 12 x",15,18) = #VALUE!
We finally have something that looks like what we're trying to achieve! We can actually see where our result is located!
This array is then passed to the LOOKUP function.
LOOKUP(1E100,{1;12;12;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!;#VALUE!})
The way that LOOKUP works in this application is that it will return the *LAST* number in the array that is less than or equal to the lookup value and will ignore the errors (unless the array is all errors).
In this application the lookup value is 1E100 which is scientific notation, "shorthand", for the very large number 1 followed by 100 zeros. We don't know how big the numbers in the string will be so we need to use a number that is guaranteed to be greater than any numbers in the array. 1E100 is guaranteed to be greater than any numbers in the string.
The *LAST* number in the array is the 12 at the 3rd element.
So, we want to extract the first number string from the cell entry Delgany Blaze 12 P.
A2 = Delgany Blaze 12 P
=LOOKUP(1E100,--MID(SUBSTITUTE(SUBSTITUTE(UPPER(A2),"P","x"),"A","x"),MIN(FIND({0,1,2,3,4,5,6,7,8,9},A2&"0123456789")),ROW(INDIRECT("1:"&LEN(A2)))))
Returns 12
Ok, let's refresh our memories here. Remember when I told you that a string like 12 P would be evaluated as the time value 12:00 PM?
If we didn't use the SUBSTITUTE functions to replace the Ps and As here's what the strings returned by the MID function would look like:
MID("Delgany Blaze 12 P",15,1) = 1
MID("Delgany Blaze 12 P",15,2) = 12
MID("Delgany Blaze 12 P",15,3) = 12[space]
MID("Delgany Blaze 12 P",15,4) = 12 P
MID("Delgany Blaze 12 P",15,5) = 12 P
MID("Delgany Blaze 12 P",15,6) = 12 P
MID("Delgany Blaze 12 P",15,7) = 12 P
MID("Delgany Blaze 12 P",15,8) = 12 P
MID("Delgany Blaze 12 P",15,9) = 12 P
MID("Delgany Blaze 12 P",15,10) = 12 P
MID("Delgany Blaze 12 P",15,11) = 12 P
MID("Delgany Blaze 12 P",15,12) = 12 P
MID("Delgany Blaze 12 P",15,13) = 12 P
MID("Delgany Blaze 12 P",15,14) = 12 P
MID("Delgany Blaze 12 P",15,15) = 12 P
MID("Delgany Blaze 12 P",15,16) = 12 P
MID("Delgany Blaze 12 P",15,17) = 12 P
MID("Delgany Blaze 12 P",15,18) = 12 P
And then when we use the double unary minus on these strings we get:
--MID("Delgany Blaze 12 P",15,1) = 1
--MID("Delgany Blaze 12 P",15,2) = 12
--MID("Delgany Blaze 12 P",15,3) = 12
--MID("Delgany Blaze 12 P",15,4) = 0.5
--MID("Delgany Blaze 12 P",15,5) = 0.5
--MID("Delgany Blaze 12 P",15,6) = 0.5
--MID("Delgany Blaze 12 P",15,7) = 0.5
--MID("Delgany Blaze 12 P",15,8) = 0.5
--MID("Delgany Blaze 12 P",15,9) = 0.5
--MID("Delgany Blaze 12 P",15,10) = 0.5
--MID("Delgany Blaze 12 P",15,11) = 0.5
--MID("Delgany Blaze 12 P",15,12) = 0.5
--MID("Delgany Blaze 12 P",15,13) = 0.5
--MID("Delgany Blaze 12 P",15,14) = 0.5
--MID("Delgany Blaze 12 P",15,15) = 0.5
--MID("Delgany Blaze 12 P",15,16) = 0.5
--MID("Delgany Blaze 12 P",15,17) = 0.5
--MID("Delgany Blaze 12 P",15,18) = 0.5
The formula result will be the *LAST* number in that array that is less than or equal to the lookup value which is the number 0.5 at element 18.
You can test this behavior with this simple formula:
=--"12 P" = 0.5 and when formatted as time = 12:00 PM
This will happen with all numbers from 1 to 12 followed by either a P or an A.
=--"3 A"= 0.125 and when formatted as time = 3:00 AM