Hamilton sends in a question about parsing data that has leader......lines between the columns. This ends up being trickier than you might think. Episode 782 shows you how I approached the problem.
This blog is the video podcast companion to the book, Learn Excel 97-2007 from MrExcel. Download a new two minute video every workday to learn one of the 377 tips from the book!
This blog is the video podcast companion to the book, Learn Excel 97-2007 from MrExcel. Download a new two minute video every workday to learn one of the 377 tips from the book!
Transcript of the video:
Hey. Alright. Welcome back to the MrExcel netcast. I'm Bill Jelen.
Question of the day from Hamilton.
Hamilton sent in this text file, said that he's trying to parse it out, and we basically have a description over here and then a P/N: and a part number, and, between those, we have the leader lines.
We have the row of dots.
Now, you know, Word can do this but it's causing just a real problem here.
If we would use DATA, TEXT TO COLUMNS, and say that our delimiter was a . -- those dots -- so we click the OTHER here and choose the dot, and then click FINISH, you're going to see what's going to happen is, because we have a different number of periods in each row, then the part numbers are going to just be all in different columns out here.
Not going to be a good situation at all.
The next thing I took a look at was, well, maybe these are really perfectly fixed width, and so, if we switch from ARIAL to a nice COURIER NEW font, I had hoped that things were going to line up perfectly but, in fact, they don't.
Another approach in DATA, TEXT TO COLUMNS is to say that it's DELIMITED by a . but we want to TREAT CONSECUTIVE DELIMITERS AS ONE.
That's going to force all of those leader lines to basically be treated as a single ., but when we click FINISH here, we still have a few problems because there were some periods in the product descriptions.
So, here, we had ELECT, period, REMOTE, and that forced…things did not work out exactly.
So, I think what I'm going to do, as I look at this data, we’ll see that after the part number, there is a : and I don't see any colons back here in the description.
So, I'm going to make my delimiter be a : and kind of attack this in a couple of steps.
So, DATA, TEXT TO COLUMNS, it’s DELIMITED by a :, and click FINISH.
Now, what we get is the part numbers are now perfectly over here in column B but we still have a lot of problems in column A.
At this point, I'm going to use EDIT, REPLACE, and I'm going to change every occurrence of .. to nothing and REPLACE ALL, and then, I'm going to change every occurrence of .P/N to nothing, REPLACE ALL, and then finally change every occurrence of just P/N, REPLACE ALL, and we end up with the final result.
Now, notice because I was able to ask for .P/N, that avoided some of the other periods -- like here, this period in cell A10 -- from being detected.
So, kind of a hassle that we need to go through 3 or 4 steps.
However, I suspect that Hamilton really has pages and pages and pages of this data, and certainly those 3 or 4 steps are a lot easier than going through and dealing with thousands or tens of thousands of rows when they all end up in the various columns.
So there you have it.
Thanks for stopping by.
We'll see you next time for another netcast from MrExcel.
Question of the day from Hamilton.
Hamilton sent in this text file, said that he's trying to parse it out, and we basically have a description over here and then a P/N: and a part number, and, between those, we have the leader lines.
We have the row of dots.
Now, you know, Word can do this but it's causing just a real problem here.
If we would use DATA, TEXT TO COLUMNS, and say that our delimiter was a . -- those dots -- so we click the OTHER here and choose the dot, and then click FINISH, you're going to see what's going to happen is, because we have a different number of periods in each row, then the part numbers are going to just be all in different columns out here.
Not going to be a good situation at all.
The next thing I took a look at was, well, maybe these are really perfectly fixed width, and so, if we switch from ARIAL to a nice COURIER NEW font, I had hoped that things were going to line up perfectly but, in fact, they don't.
Another approach in DATA, TEXT TO COLUMNS is to say that it's DELIMITED by a . but we want to TREAT CONSECUTIVE DELIMITERS AS ONE.
That's going to force all of those leader lines to basically be treated as a single ., but when we click FINISH here, we still have a few problems because there were some periods in the product descriptions.
So, here, we had ELECT, period, REMOTE, and that forced…things did not work out exactly.
So, I think what I'm going to do, as I look at this data, we’ll see that after the part number, there is a : and I don't see any colons back here in the description.
So, I'm going to make my delimiter be a : and kind of attack this in a couple of steps.
So, DATA, TEXT TO COLUMNS, it’s DELIMITED by a :, and click FINISH.
Now, what we get is the part numbers are now perfectly over here in column B but we still have a lot of problems in column A.
At this point, I'm going to use EDIT, REPLACE, and I'm going to change every occurrence of .. to nothing and REPLACE ALL, and then, I'm going to change every occurrence of .P/N to nothing, REPLACE ALL, and then finally change every occurrence of just P/N, REPLACE ALL, and we end up with the final result.
Now, notice because I was able to ask for .P/N, that avoided some of the other periods -- like here, this period in cell A10 -- from being detected.
So, kind of a hassle that we need to go through 3 or 4 steps.
However, I suspect that Hamilton really has pages and pages and pages of this data, and certainly those 3 or 4 steps are a lot easier than going through and dealing with thousands or tens of thousands of rows when they all end up in the various columns.
So there you have it.
Thanks for stopping by.
We'll see you next time for another netcast from MrExcel.