Ahmed asks how to do a lookup that only looks at the text portion of the lookup value. Mike and Bill compare methods in this Dueling Excel Episode 1169.
Transcript of the video:
Bill: Hey, I'm Bill Jelen from MrExcel.com, and I've got a cool Excel tip for you today!
Mike: Hey this is Mike Gel Girvin, I’m ExcelIsFun in YouTube, and I have a different way to do that!
Bill: Hey welcome back, I'm Bill Jelen from MrExcel.
Well here it is, the last podcast of 2009, of course it's a Dueling Excel podcast, so Mike Girvin will be joining us.
Now Mike had someone send in this question, I think was Ahmed, Ahmed needs to do a lookup, but he needs to do a lookup on only the text portion.
Alright, and Mike already gave me a heads up that the formula solution is horrible, so I'm going to do what I always do and that's VBA.
Let's switch over to VBA, Alt+F11, we’ll insert a module, and I'll say Function TextOnly(Entry), so Entry is going to be the argument that gets passed.
And what we're going to do is we're going to say For i = 1 To Len(Entry), so you know, depending on how long the entry is.
ThisChar = Mid(Entry),i,1 . So this is just like using the MID function in Excel.
And then we're going to do Select Case Asc(ThisChar), now the A-S-C function in VBA is just like the CODE function in Excel, see, I started to head off in the wrong direction there.
Select Case is pretty cool.
What I want to do is I want to find all the digits, so the digits are Case 48 (that's a 0), 49, 50, 51, 52, 53, 54, 55, 56, 57, alright!
If we get any of those, what do I want to do?
I don't want to do anything because I want to ignore the digits.
What I'm really interested in is everything else, Case Else, and we're going to say TextOnly = the old TextOnly value, which initially will be nothing, & ThisChar, alright, that's it, that's the code.
So this is going to run through each characters, go ahead, take a look, and see if it is one of the digits 0-9.
If it is, it's going to do nothing, otherwise it's going to build the string of text.
Let's try it here, we'll try our function, make sure that's working, so =TEXTONLY that value, ah look at that, It's getting just the letters, perfect, exactly what we need.
Now that it's, you know, from here it's a no-brainer, right?
So VBA, not many lines of code, you see, it didn't take long to knock it out, simple little loop, check to see if it's digits, seems to be working relatively fast.
Now, I'm like, it's going to have a much more elegant solution, I'm already going to give him the point.
Let's see what Mike has!
Mike: Thanks MrExcel!
Elegant?
That VBA blows away!
What I'm about to do right here, this is a messy formula, and here's the trouble with this: When you're trying to extract certain parts of a text string like this, usually we have something like a pattern we can exploit, right?
A dash, with a dash you could do something like this LEN-FIND and the LEFT.
If it was always the same 3 lengths for prefix, you just use LEFT(3), but in both these cases there's no dash, and it's not just LEFT(3), sometimes there's 2 characters, sometimes there’s 1.
Finally, you could use LOOKUP a maximum number of characters in the cell, This SEARCH construction, I've done video on this before.
But that's not going to work either because we have duplicate letters, there's an S there and an S there.
So when that construction there is looking for S, it gets a TRUE here and a TRUE here.
Oh!
So the only pattern were left with, is there's letters and there's numbers.
Now this is going to get messy, I'm going to start from the inside of the formula and build our way out.
The first thing is, we need to extract all the individual letters and numbers, and then start doing some analysis on that.
So I'm going to use ROW(INDIRECT, and I'm going to construct, “1:”&LEN, because I need a string of numbers exactly the length of this to then extract the letters.
So this way, it'll give us a varying length, sometimes it'll be 1-2-3-4-5, sometimes it'll be 1-2-3-4-5-6-7. “”, and if you highlight this and hit the F9 key, you can see we get that there, further down we'll get 1-2-3-4-5, Ctrl+Z.
What are we going to do with that, we’ll use the MID because our goal is to extract all the individual letters.
So we use MID of this, comma, and we usually say “Start in position 5 or 4.” But this is an array, and this is how we'll get all of them.
So there's all the starting positions, comma, and how long do we want each one, 1, so close parenthesis.
If we F9 this, you can see, we finally extracted the individual letters, Ctrl+Z, now we have letters.
What happens if I multiply this *1, I'm going to get where the numbers are, a number, but where the text is, I get an error!
So if I hit F9, ah, so now we have something that differentiates the text and the numbers, Ctrl+Z.
So from that I'm going to say “Which one are ISERROR?” And finally, now I have a text string, I'm sorry, an array of TRUEs and FALSEs, that's the first place where there's a number, that's the last place where there's a letter, Ctrl+Z.
So from that now, I need to find, let me just do that again, because here's the one conceptual trick, that FALSE, I need that position, and match when there's duplicates like FALSE, we’ll find the first one.
So we can use MATCH and look up FALSE, Ctrl+Z, so I use MATCH and look up FALSE, comma.
And I need to do an exact match because we want the first one, and exact match will find, with duplicates, it always finds the first one.
If I hit F9, finally, I get to a 4, but I really want LEFT how many, so I need to Ctrl+Z, -1, alright. and that's how many for the LEFT function.
So I'm going to come to the beginning, LEFT of what?
This!
Comma.
Alright, so let's see if this works, this is an array formula, Ctrl+Shift+Enter, double-click and send it down, and sure enough, you can see, it gets that now, I'm going to just slap that.
In essence, this big monster here, is the lookup value, so VLOOKUP, that'll be the lookup value, right?
So I come to the ending, comma, the table array (typing) that little thing right there, F4, comma, the second column is what we want to return, ,0 for FALSE, close parenthesis.
Ctrl+Shift+Enter, that’s looking up- Oh, so it's actually looking up.
But what do we want?
This one divided by that!
What a big nasty formula, Ctrl+Shift+Enter, double-click and send it down.
And there we have it, VBA wins the point and the elegant award.
Alright, see you next trick!
Bill: You have to love the way that Mike can explain that formula starting from the inside out, you can actually kind of follow what's going along.
Most people, I think, who encounter this formula would just be like “What the heck is this?” Now I'm interested to see which is faster, like using native array formulas here or the VBA.
So in the past podcast, we've had this little trick where we can turn things back to manual calculation, and actually time how long it takes to calculate a range.
So let's run that using Mike's method, 0.023 seconds, and let's try the VBA to see how that works out, choose the same range, “Fx”, 0.04.
So look at that, Mike's method calculates in just about half the time as the VBA method.
Now you know, unless you have a hundred thousand rows, it's an almost immaterial difference, .02 of a second, but much faster to use that complicated array formula than a little bit of VBA.
Well hey, this is, uh, this was a great Dueling Excel podcast, two very different ways of going at it.
I would never think to start to write that formula, I would just switch over to VBA, Mike was able to figure out the formula, so that's excellent.
I want to thank everyone for stopping by, we'll see you next week for another Dueling Excel podcast from MrExcel and ExcelIsFun!
Mike: Hey this is Mike Gel Girvin, I’m ExcelIsFun in YouTube, and I have a different way to do that!
Bill: Hey welcome back, I'm Bill Jelen from MrExcel.
Well here it is, the last podcast of 2009, of course it's a Dueling Excel podcast, so Mike Girvin will be joining us.
Now Mike had someone send in this question, I think was Ahmed, Ahmed needs to do a lookup, but he needs to do a lookup on only the text portion.
Alright, and Mike already gave me a heads up that the formula solution is horrible, so I'm going to do what I always do and that's VBA.
Let's switch over to VBA, Alt+F11, we’ll insert a module, and I'll say Function TextOnly(Entry), so Entry is going to be the argument that gets passed.
And what we're going to do is we're going to say For i = 1 To Len(Entry), so you know, depending on how long the entry is.
ThisChar = Mid(Entry),i,1 . So this is just like using the MID function in Excel.
And then we're going to do Select Case Asc(ThisChar), now the A-S-C function in VBA is just like the CODE function in Excel, see, I started to head off in the wrong direction there.
Select Case is pretty cool.
What I want to do is I want to find all the digits, so the digits are Case 48 (that's a 0), 49, 50, 51, 52, 53, 54, 55, 56, 57, alright!
If we get any of those, what do I want to do?
I don't want to do anything because I want to ignore the digits.
What I'm really interested in is everything else, Case Else, and we're going to say TextOnly = the old TextOnly value, which initially will be nothing, & ThisChar, alright, that's it, that's the code.
So this is going to run through each characters, go ahead, take a look, and see if it is one of the digits 0-9.
If it is, it's going to do nothing, otherwise it's going to build the string of text.
Let's try it here, we'll try our function, make sure that's working, so =TEXTONLY that value, ah look at that, It's getting just the letters, perfect, exactly what we need.
Now that it's, you know, from here it's a no-brainer, right?
So VBA, not many lines of code, you see, it didn't take long to knock it out, simple little loop, check to see if it's digits, seems to be working relatively fast.
Now, I'm like, it's going to have a much more elegant solution, I'm already going to give him the point.
Let's see what Mike has!
Mike: Thanks MrExcel!
Elegant?
That VBA blows away!
What I'm about to do right here, this is a messy formula, and here's the trouble with this: When you're trying to extract certain parts of a text string like this, usually we have something like a pattern we can exploit, right?
A dash, with a dash you could do something like this LEN-FIND and the LEFT.
If it was always the same 3 lengths for prefix, you just use LEFT(3), but in both these cases there's no dash, and it's not just LEFT(3), sometimes there's 2 characters, sometimes there’s 1.
Finally, you could use LOOKUP a maximum number of characters in the cell, This SEARCH construction, I've done video on this before.
But that's not going to work either because we have duplicate letters, there's an S there and an S there.
So when that construction there is looking for S, it gets a TRUE here and a TRUE here.
Oh!
So the only pattern were left with, is there's letters and there's numbers.
Now this is going to get messy, I'm going to start from the inside of the formula and build our way out.
The first thing is, we need to extract all the individual letters and numbers, and then start doing some analysis on that.
So I'm going to use ROW(INDIRECT, and I'm going to construct, “1:”&LEN, because I need a string of numbers exactly the length of this to then extract the letters.
So this way, it'll give us a varying length, sometimes it'll be 1-2-3-4-5, sometimes it'll be 1-2-3-4-5-6-7. “”, and if you highlight this and hit the F9 key, you can see we get that there, further down we'll get 1-2-3-4-5, Ctrl+Z.
What are we going to do with that, we’ll use the MID because our goal is to extract all the individual letters.
So we use MID of this, comma, and we usually say “Start in position 5 or 4.” But this is an array, and this is how we'll get all of them.
So there's all the starting positions, comma, and how long do we want each one, 1, so close parenthesis.
If we F9 this, you can see, we finally extracted the individual letters, Ctrl+Z, now we have letters.
What happens if I multiply this *1, I'm going to get where the numbers are, a number, but where the text is, I get an error!
So if I hit F9, ah, so now we have something that differentiates the text and the numbers, Ctrl+Z.
So from that I'm going to say “Which one are ISERROR?” And finally, now I have a text string, I'm sorry, an array of TRUEs and FALSEs, that's the first place where there's a number, that's the last place where there's a letter, Ctrl+Z.
So from that now, I need to find, let me just do that again, because here's the one conceptual trick, that FALSE, I need that position, and match when there's duplicates like FALSE, we’ll find the first one.
So we can use MATCH and look up FALSE, Ctrl+Z, so I use MATCH and look up FALSE, comma.
And I need to do an exact match because we want the first one, and exact match will find, with duplicates, it always finds the first one.
If I hit F9, finally, I get to a 4, but I really want LEFT how many, so I need to Ctrl+Z, -1, alright. and that's how many for the LEFT function.
So I'm going to come to the beginning, LEFT of what?
This!
Comma.
Alright, so let's see if this works, this is an array formula, Ctrl+Shift+Enter, double-click and send it down, and sure enough, you can see, it gets that now, I'm going to just slap that.
In essence, this big monster here, is the lookup value, so VLOOKUP, that'll be the lookup value, right?
So I come to the ending, comma, the table array (typing) that little thing right there, F4, comma, the second column is what we want to return, ,0 for FALSE, close parenthesis.
Ctrl+Shift+Enter, that’s looking up- Oh, so it's actually looking up.
But what do we want?
This one divided by that!
What a big nasty formula, Ctrl+Shift+Enter, double-click and send it down.
And there we have it, VBA wins the point and the elegant award.
Alright, see you next trick!
Bill: You have to love the way that Mike can explain that formula starting from the inside out, you can actually kind of follow what's going along.
Most people, I think, who encounter this formula would just be like “What the heck is this?” Now I'm interested to see which is faster, like using native array formulas here or the VBA.
So in the past podcast, we've had this little trick where we can turn things back to manual calculation, and actually time how long it takes to calculate a range.
So let's run that using Mike's method, 0.023 seconds, and let's try the VBA to see how that works out, choose the same range, “Fx”, 0.04.
So look at that, Mike's method calculates in just about half the time as the VBA method.
Now you know, unless you have a hundred thousand rows, it's an almost immaterial difference, .02 of a second, but much faster to use that complicated array formula than a little bit of VBA.
Well hey, this is, uh, this was a great Dueling Excel podcast, two very different ways of going at it.
I would never think to start to write that formula, I would just switch over to VBA, Mike was able to figure out the formula, so that's excellent.
I want to thank everyone for stopping by, we'll see you next week for another Dueling Excel podcast from MrExcel and ExcelIsFun!