Find Last Dash
July 07, 2017 - by Bill Jelen
Today's question is a crazy one. You have a column of part numbers. There are anywhere from 4 to 7 dashes in each part number. You want to extract only the portion of the part number after the first dash and up to but not including the last dash. This is a Dueling Excel episode.
Watch Video
- Goal is to find the first & last dash and keep everything in between
- The hard part here is finding the last dash
- Bill Method 1: Flash Fill
- Manually fill in the first few (including some with different numbers of dashes)
- Select the blank cell below that
- Ctrl + E to Flash Fill
- Mike Method 2:
- Use Power Query
- In Excel 2016, Power Query is in the Get & Transform group on the Data tab
- In Excel 2010 & 2013, download Power Query from Microsoft. It creates a new Power Query tab in the Ribbon
- Convert your data to a table using Ctrl + T
- Use Split Data in Power Query - first to split at the leftmost dash, then to split at the right-most dash
- Bill Method 3:
- VBA Function that iterates from end of the cell backwards to find the last dash
- Mike Method 4:
- Use SUBSTITUTE to find the location of the Nth dash
- SUBSTITUTE is the only text function that allows you to specify an Instance number
- To find which instance number to use, count the dashes:
=LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))
Video Transcript
Bill: Hey. Welcome back. It’s time for another Dueling Excel podcast. I’m Bill Jelen from MrExcel. I’ll be joined by Mike Girvin from ExcelIsFun. This is our episode 185: extract from the first - to the last -.
Alright. Today's question is sent in by Anvar on YouTube: how can I extract everything from the first - to the last -? Check out this data he has here. There are a huge number of dashes, anywhere from 3, 5, 6, 7 dashes, alright?
So, my first thought is, well, hey, it's really easy to find the first -, right? = LEFT or = MID of the FIND of A2 and then the -, +1, alright, but to get to the last -, that's going to make my head hurt, right, because, well, how many dashes do we have? We could take the SUBSTITUTE of A2, replacing the dashes, and compare the length of that to the original length. That tells me the number of dashes, so now I know which - to find, the 2nd, 3rd, 4th, 5th, but do I use FIND?
I was ready to go to VBA, right? That's my knee-jerk reaction. I said, wait a second. I said, Anvar, what version of Excel are you in? He says, I'm in Excel 2016. I said, that's beautiful. If you're in Excel 2013 or newer, we could use this great new feature called flash fill. With flash fill, we just have to give it a pattern, and I'm going to give it enough of a pattern so it's not just that I'm taking one with two dashes and doing that a couple of times. I want to make sure that I have a few different dashes that way. Chad on the Excel team knows what I'm looking for. Chad's the guy that wrote the logic for flash fill. So, I get about 3 of them in there and then CONTROL+E is the shortcut for using DATA and then FLASH FILL, and, sure enough, it looks like it did the right thing. Alright, Mike. Let’s see what you have.
Mike: Thanks, MrExcel. Yeah. Flash fill wins. That feature right there, flash fill, is one of the modern Excel tools that is simply amazing. If it's a one-time deal and you have a consistent pattern, hey, that's the way I would do it.
Hey, let's go over to the next sheet. Now, instead of using flash fill, we can actually use power query. Now, I'm using Excel 2016 so I have the GET & TRANSFORM group. That's power query. In earlier versions, 2013 and 2010, you actually have to download the free power query add-in.
Now, in order to get power query to work, this has to be converted to an Excel table. Now, again, I would use flash fill if this was a one-time deal. When would you use power query? Well, if you had really big data or it was coming from an external source, this would be the way to go, or you might even like this better than having to type 3 or 4 examples for flash fill because, with power query, we can specifically say find the first - and find the last -.
Now, I'm going to convert this to an Excel table. I have a single cell selected, empty cells all the way around. I go to INSERT, TABLE, or you use the keyboard, CONTROL+T. I can click OK or ENTER. I want to name this table, so I’ll go up to TABLE TOOLS, DESIGN, up into PROPERTIES. I’m going to call this STARTKEYTABLE and ENTER. Now I can go back to DATA, bring it into power query using the FROM TABLE button. There's my column. There's the name. I don't want to keep this name because the output will be exported to Excel and I want to give it a different name. So, I'll call it CLEANEDKEYTABLE. I don't need that CHANGED TYPE. I'm just looking at the source. Now I can click on the column and, right up in HOME, there's the SPLIT button. I can say SPLIT, BY DELIMITER. Looks like it already guessed. I'm going to say LEFT-MOST. Click OK.
Now, if I look over here I see CHANGED TYPE. I don't need that so I'm going to get rid of that step. I only have SPLIT COLUMN BY DELIMITER. Now, I'm going to do this again but, instead of using the SPLIT button up here, right click down to SPLIT COLUMN, BY DELIMITER, and look at that. We can choose to split it by the RIGHT-MOST DELIMITER. Click OK. Now, I don't need these two columns so I'm going to right click the column I want to keep, REMOVE OTHER COLUMNS. I'm actually going to X this CHANGED TYPE out. It's going to say ARE YOU SURE YOU WANT TO DELETE THIS? I'm going to say, yes, DELETE. There's my clean data.
Now I can come up to CLOSE & LOAD. CLOSE & LOAD TO. This is the new IMPORT dialog box. It used to say LOAD TO but I want to load it to a table, on an EXISTING WORKSHEET. Click the collapse button. I'm going to select C1, uncollapse, click OK, and there we go. Power query to clean our data and get just the data we want. Alright. I'll throw it back to MrExcel.
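For readers who think in code, the two splits Mike performs can be mirrored outside of Power Query. What follows is a minimal Python sketch of the same idea, not Power Query itself: split once at the left-most dash, then once at the right-most dash. The function name `between_first_and_last_dash` and the sample part number are hypothetical and are not taken from the workbook.

```python
def between_first_and_last_dash(part_number: str) -> str:
    """Mirror the two splits: left-most dash first, then right-most dash."""
    # Split at the left-most dash and keep what follows it.
    _, _, after_first = part_number.partition("-")
    # Split the remainder at the right-most dash and keep what precedes it.
    before_last, _, _ = after_first.rpartition("-")
    return before_last


# Hypothetical part number, not taken from the sample workbook.
print(between_first_and_last_dash("AB-1234-X-77-ZZ"))  # 1234-X-77
```

As in the Power Query steps, nothing here depends on how many dashes sit in the middle of the text.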
Bill: There’s the point right there, RIGHT-MOST DELIMITER in the SPLIT COLUMN BY DELIMITER, one of the cool features in power query. That's awesome.
Alright. My knee-jerk reaction -- VBA UDF [unintelligible – 05:34] really easy to do VBA. Switch over to ALT+F11. INSERT a MODULE. In that module, type this code. I'm going to create a brand new function, I’m going to call it MIDPART, and I'm going to pass it some text, and then what I'm going to do is I'm going to go from the last character in that cell, from the length of MYTEXT back to 1, STEP -1, and look at that character. So, the MID of MYTEXT, that variable i, tells us which character we're looking at, for a length of 1. Is it a -? As soon as I find a -, I'm going to take the LEFT of MYTEXT for i - 1 characters, so I get rid of everything from that last - all the way out, and then, to make sure I don't keep looking for more dashes, the EXIT FOR will get me out of this [unintelligible – 06:17] loop, and from there is the easy part. We're just going to take the MID of MYTEXT, where I use the function FIND to find the first -, go 1 more than that, and return that back.
So, let's go back, ALT+Q, to return to Excel. = MIDPART tab of that, and it looks like it's working. Copy that down. Mike, do you have another one? [=MIDPart(A2)]
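The loop Bill narrates can be sketched roughly as follows. This is a Python illustration of the same backward scan, not the VBA stored in the workbook; `mid_part` is a hypothetical stand-in for MIDPART, and returning an empty string when no dash remains is an assumption, since the narration does not cover that case.

```python
def mid_part(my_text: str) -> str:
    """Scan backward for the last dash, trim there, then drop through the first dash."""
    trimmed = my_text
    # Walk from the last character back to the first (the Step -1 loop in the narration).
    for i in range(len(my_text) - 1, -1, -1):
        if my_text[i] == "-":
            # Keep everything to the left of the last dash, then stop looking.
            trimmed = my_text[:i]
            break
    # Find the first dash and return what follows it.
    first = trimmed.find("-")
    # Assumption: with no dash left, return an empty string.
    return trimmed[first + 1:] if first != -1 else ""


# Hypothetical part number, not taken from the sample workbook.
print(mid_part("AB-1234-X-77-ZZ"))  # 1234-X-77
```

Scanning from the end means the loop stops at the last dash without ever needing to know how many dashes there are.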
Mike: Well, I do have another one, MrExcel, but it's going to be one long formula -- not as short as that UDF. Alright, let's go over to the next sheet. Now, if we're going to do a formula and we have some text and there are always a different number of delimiters, somehow, I need to get the position of that last delimiter.
Now, this is going to take a few steps but I'm going to start with the SUBSTITUTE function. I'm going to look through that text, comma, the old text I want to find is, in quotes, that -, comma, and what do I want to put in its place or substitute? Two double quotes. That will put nothing in. Now, if I close parentheses and CONTROL+ENTER, what is that going to do? [=SUBSTITUTE(A2,"-","")]
Well, now I can take the length of this and subtract it from the length of this item. That will tell me how many delimiters there are. F2, and right at the beginning, I'm going to type the length of that. That will give me the full length minus the length of that dashless text, close parentheses, CONTROL+ENTER, double click, and send it down. That tells me how many delimiters there are for this text. There are 6. [=LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))]
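The length-difference trick is easy to check by hand. Here is a small Python sketch of the same arithmetic; `count_dashes` and the sample text are hypothetical names chosen for illustration.

```python
def count_dashes(text: str) -> int:
    """Count dashes as the difference between the full and the dashless length."""
    dashless = text.replace("-", "")   # plays the role of SUBSTITUTE(A2,"-","")
    return len(text) - len(dashless)   # plays the role of LEN(A2)-LEN(...)


# Hypothetical part number: 15 characters, 11 once the dashes are removed.
print(count_dashes("AB-1234-X-77-ZZ"))  # 4
```

Because each dash is a single character, every dash removed shortens the text by exactly one, so the difference in lengths is the dash count.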
Now I'm going to use that 6 inside of SUBSTITUTE to put a different character right at the sixth instance of the delimiter. F2, and if I type SUBSTITUTE, what we want to notice is this function has an instance number. If you look at other text functions like SEARCH and FIND, they don't have an instance number. SUBSTITUTE is the only one I can think of that actually lets you specifically say which instance of a delimiter you want to deal with. Here's the text, comma. Old text is, in quotes, a -, and I need to pick for the new text some character that will never be in this text string. I'm going to choose, like, ^ or something like that, comma, and that's where instance number comes in, close parentheses, CONTROL+ENTER, and there it is. If I double click and send it down, it's always putting that ^ in the position of the last delimiter. [=SUBSTITUTE(A2,"-","^",LEN(A2)-LEN(SUBSTITUTE(A2,"-","")))]
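To make the instance-number idea concrete, here is a hedged Python sketch of a helper that replaces only the Nth occurrence of a delimiter. Python's built-in `str.replace` has no such argument, so `substitute_instance` is a hypothetical helper written for this illustration; it is meant to approximate the behavior described above, not to reproduce every edge case of SUBSTITUTE.

```python
def substitute_instance(text: str, old: str, new: str, instance_num: int) -> str:
    """Replace only the instance_num-th occurrence of old (counting from 1)."""
    if instance_num < 1:
        raise ValueError("instance_num must be 1 or greater")
    if not old:
        return text  # nothing to search for, so nothing to replace
    start = 0
    position = -1
    for _ in range(instance_num):
        position = text.find(old, start)
        if position == -1:
            return text  # fewer occurrences than requested: leave the text as-is
        start = position + len(old)
    return text[:position] + new + text[position + len(old):]


# Hypothetical part number with 4 dashes; mark the 4th (last) one with ^.
print(substitute_instance("AB-1234-X-77-ZZ", "-", "^", 4))  # AB-1234-X-77^ZZ
```

Passing the dash count as the instance number always targets the last dash, which is the whole point of the step above.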
Now I need to figure out, in each one of these, what position it is in. F2. I'm going to use the SEARCH function. SEARCH. I type S and tab. Now, SEARCH and FIND are the same except that SEARCH is not case-sensitive. In this case, either one would be fine because the text I'm looking for is, in quotes, that ^, comma, within that text. By the way, the reason that I use SEARCH instead of FIND is because S tab gets me SEARCH but F I tab will get me FIND. So, it's like one character less when typing it out. CONTROL+ENTER, double click and send it down, and now it tells me, in the 27th position is that last delimiter. [=SEARCH("^",SUBSTITUTE(A2,"-","^",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))]
Now, I'm going to take this approach for these text items. I'm now going to use the LEFT function and get everything from the very beginning all the way up to that position. That will get rid of that last little bit. Now, actually, SEARCH tells us 27, which is right there, and we only want to go to 26. So, F2, and, at the end, I'm going to type - 1, CONTROL+ENTER, double click and send it down. Now, I can use the LEFT function. F2. LEFT. There it is, left of that, comma. That's how many characters. Close parentheses, CONTROL+ENTER, double click and send it down. So, now, we have gotten rid of the last little bit after the last delimiter in every cell. [=LEFT(A2,SEARCH("^",SUBSTITUTE(A2,"-","^",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))-1)]
Now all I need to do is replace the first four characters, first four characters, first three characters. Now, I can use the SEARCH function on the original text because it can find the -, which is three, and I'll tell REPLACE, please go, from the first character, three characters in and replace it with nothing. F2, and right at the beginning, I'm going to type REPLACE. There's the old text. Now watch this. I want to give myself a little bit more breathing room. I'm just going to artificially pick a space, ALT+ENTER. That's kind of like we do in DAX. Now I just have more breathing room. That's the old text, comma. The starting number, I need to always start at the first position so I simply type 1, comma, and I need to find that first -, which represents the number of characters. So, S tab, "-", comma, through…within that text, that SEARCH will find 4, 4, 3. That will work. Close parentheses and then, comma, new text, two double quotes. That will put nothing in those first characters. Close parentheses. I have the entire column highlighted so I can populate this edited formula with CONTROL+ENTER, and there we go. All the way down, we’re extracting everything between the first and the last -. [=REPLACE(LEFT(A2,SEARCH("^",SUBSTITUTE(A2,"-","^",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))-1),1,SEARCH("-",A2),"")]
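Because the finished formula is deeply nested, it can help to see the same steps laid out one at a time. The Python sketch below follows the formula from the inside out and uses 1-based positions the way SEARCH does. The name `extract_middle`, the sample text, and the early return for fewer than two dashes are assumptions made for this illustration; the marker must be a character that never appears in the data, as Mike notes for ^.

```python
def extract_middle(text: str, marker: str = "^") -> str:
    """Follow the nested formula step by step, using 1-based positions."""
    # Step 1: count the dashes (full length minus dashless length).
    dash_count = len(text) - len(text.replace("-", ""))
    if dash_count < 2:
        return ""  # assumption: fewer than two dashes leaves nothing in between

    # Step 2: swap the last dash (instance number = dash_count) for the marker.
    index = -1
    for _ in range(dash_count):
        index = text.find("-", index + 1)
    marked = text[:index] + marker + text[index + 1:]

    # Step 3: SEARCH gives a 1-based position, so add 1 to the 0-based index.
    last_dash_pos = marked.find(marker) + 1

    # Step 4: LEFT(text, position - 1) drops the last dash and what follows it.
    left_part = text[:last_dash_pos - 1]

    # Step 5: REPLACE(..., 1, position of first dash, "") removes the leading piece.
    first_dash_pos = text.find("-") + 1
    return left_part[first_dash_pos:]


# Hypothetical part number, not taken from the sample workbook.
print(extract_middle("AB-1234-X-77-ZZ"))  # 1234-X-77
```

Each numbered step corresponds to one layer of the formula, so a wrong result can be traced to the layer that produced it.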
Now, the only reason we want to be crazy like that with formulas is if we wanted the formula result to instantly update whenever we changed anything, so if I type -00, instantly it updates. Power query and flash fill will not automatically update, alright? Send it back to MrExcel.
Bill: Well, that was one heck of a formula. Like, SUBSTITUTE was the trick. I had used SUBSTITUTE in the first step but didn't see that it had the instance number. Alright, so, we have four different methods here today. My first method is flash fill. Fill in the first few, select the blank cell below that, and then CONTROL+E to flash fill. Mike's method, use power query. I love that, especially the split data letting you use the leftmost - and then the rightmost -. My live seminars always talk about this one feature. Should be a finalist for the Nobel Prize for the best Excel feature. It wouldn't win but it would be in one of the top five, I'm sure. My method number three, VBA function, a UDF (user-defined function) that iterates from the end of the cell, and then, Mike's method, the awesome formula method. Use SUBSTITUTE to find out how many dashes there are and then pass that answer back into SUBSTITUTE as the instance number to replace. Brilliant.
Well, there you go. I want to thank everyone for stopping by. We’ll see you next time for another Dueling Excel podcast from MrExcel and ExcelIsFun.
Download File
Download the sample file here: Duel185.xlsm
Title Photo: Alexas_Fotos / Pixabay