An annoying bug (?) when using text to columns

markaval

New Member
Joined
Nov 2, 2015
Messages
13
Hi,

First time on this forum so looking forward to improving my Excel skills here.

I'm not an Excel newbie but I'm not an expert either.

I've been building a tabular dataset using results posted on ResultsDB detailing English Premiership football results going back to season 2000-01. The ultimate aim is to have an extensive dataset to build a decent dashboard to practice my dashboard building skills.

I copied the data for the relevant season into a text file (380 rows per season) and then imported the file into a single column formatted as general with no delimiters. I built a macro to reformat the date to dd/mm/yyyy format, separate out teams into separate columns, and to split out score into home and away scores. The macro worked perfectly until I hit a particular variation on team name: Queens Park Rangers. This meant that my macro needed modifying to account for a team name that had three elements. I won't go into the details of the macro but I'll outline the issue I ran into:

1. I imported the text file into Excel (no problems); an example of a row of data is
[TABLE="width: 411"]
<tbody>[TR]
[TD] [TABLE="width: 433"]
<tbody>[TR]
[TD]April 27th 2013 Southampton - West Brom 0-3[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[/TR]
</tbody>[/TABLE]

2. I then did a Text to Columns: I used a space delimiter to separate out April-27th-2013 into three separate columns so I could reformat the date in dd/mm/yyyy.
3. I left out all the other columns.

I've used the text to columns tool many times but what happened next was weird: on a number of instances (but not all) where Queens Park Rangers was associated with the line of data, there would occasionally be an entry in the otherwise empty next column with either a score or a partial date in it. Here's an example:

February 23rd 2013 Queens Park Rangers - Manchester United 0-2 (original line of data)
February 23rd 2013 0-2 (extracted data using text to columns)

The weird bit is the 0-2 after the date. I couldn't understand why this was appearing as I only wanted the date elements. I thought I'd made a mistake in defining criteria in text to columns so I did it again but the same issue kept occurring. Whilst I was scratching my head trying to work out what had happened I was scrolling down the data in the Text to Columns window Step 3 of 3 when I noticed that despite having selected all columns that I didn't want to import (or so I thought), when I scrolled down the data another column would occasionally appear that coincided with the partial date/score entry I mentioned above. The new column would then disappear when I scrolled past the entry. I've never seen anything like it before and if anyone understands what is going on and can explain it I would be really grateful!

I did work out a way around it, and my revised macro now formats all the seasons data as I need it, however I'd like to know if I've done something wrong or whether this is a bug in the way that Excel deals with columns of data that might only have a few entries in it. Any feedback would be gratefully received.

Thanks in advance

Markaval

<tbody>
</tbody>

<tbody>
</tbody>
 

Excel Facts

Wildcard in VLOOKUP
Use =VLOOKUP("Apple*" to find apple, Apple, or applesauce
Are you sure it isn't a problem with the data?

Where are you getting it from?
 
Upvote 0
Are you sure it isn't a problem with the data?

Where are you getting it from?

Hi Norie, thanks for the quick reply :)

The data is perfect - I've got a macro that runs perfectly even with the data that was showing the issue when run through text to columns.

The actual data itself comes from www.resultdb.com/english-premier-league which is a pretty reputable source.

The process I've used for importing the data as a text file and this normally minimises potential data corruption for this type of exercise.

Hope that explains the data source.

Thanks again for replying.

Markaval
 
Upvote 0
The problem is here.
in the Text to Columns window Step 3 of 3 when I noticed that despite having selected all columns that I didn't want to import (or so I thought), when I scrolled down the data another column would occasionally appear that coincided with the partial date/score entry I mentioned above.

You have inconsistent dataset (text to columns depends on consistency)
When you first get to the 3rd window of TTC, the sample set of data shown in that window is only a few rows.
And it does not include the 'extra' column due to the occasional 3 word names.
So selecting all the columns at the end, is only really parsing the 'visible' rows within the TTC window.

If you scroll that window down far enough to actually see one of those oddball rows, THEN select all the columsn to the end and choose Do Not Import.

That should do the trick.
 
Upvote 0
The problem is here.


You have inconsistent dataset (text to columns depends on consistency)
When you first get to the 3rd window of TTC, the sample set of data shown in that window is only a few rows.
And it does not include the 'extra' column due to the occasional 3 word names.
So selecting all the columns at the end, is only really parsing the 'visible' rows within the TTC window.

If you scroll that window down far enough to actually see one of those oddball rows, THEN select all the columsn to the end and choose Do Not Import.

That should do the trick.

Hi Jonmo1,

Ok, I think I understand what you're saying. I'll try it out and drop you a line to let you know how I get on.

Thanks to forum members for quick responses!

Markaval
 
Upvote 0
Hi Jonmo1,

Just wanted to let you know that you were completely right. I tried out what you described and it worked. I would say that I would have thought that the geniuses at Microsoft would have figured out a way to give you the option to display columns that might have blanks in them by now :). Anyway, your advice helped, so thank you very much.

Regards

Mark
 
Upvote 0
You're welcome.

I replied to another almost identicle post, and I posted a nice little workaround.
Hang on, I'll see if I can find it.
 
Upvote 0

Forum statistics

Threads
1,223,901
Messages
6,175,277
Members
452,629
Latest member
SahilPolekar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top