Excel 2024: Break Apart Data
February 13, 2024 - by Bill Jelen
You have just seen how to join data, but people often ask about the opposite problem: how to parse data that is all in a single column. Say you wanted to sort the data in the figure below by zip code:
Tip
After March 2022, you could easily isolate the Zip code with =TEXTAFTER(A2," ",-1)
. See Text Before or After a Specific Delimiter.
Select the data in A2:A99 and choose Data, Text to Columns. Because some city names, such as Sioux Falls, are two words, you cannot break the data at each occurrence of a space. Instead, you need to use a comma to get the city in column A and the state and zip code in column B, so choose Delimited in step 1 of the wizard and click Next.
In step 2 of the wizard, deselect Tab and select Comma. The preview at the bottom of the dialog shows what your data will look like. Click Next. Aha! For the rest of the day after you use Text to Columns, Excel will remember the choices you've chosen in step 2 of the Convert Text to Columns Wizard. If you copy data from Notepad and paste to Excel, it will be split at the comma. This is often maddening because most days, the data is not parsed at the comma, but for the rest of today, it will be. To fix it, close and re-open Excel. |
|
Step 3 of the wizard asks you to declare each column as General, Text, or Date. It is fine to leave the columns set as General. |
|
After you've split the state and zip code to column B, select B2:B99 and again choose Data, Text to Columns. This time, since each state is two characters, you can use Fixed Width in step 1 of the wizard. To preserve leading zeros in the zip code, select the second column and choose Text as the data type in step 3 of the wizard. New in 2023: Go to File, Options, Data. Turn off Remove Leading Digits and Convert to a Number. |
Tip
A lot of data will work well with Fixed Width, even it doesn't look like it lines up. In the next figure, the first three rows are in Calibri font and don't appear to be lined up. But if you change the font to Courier New, as in rows 4:7, you can see that the columns are perfectly lined up.
Sometimes, you will find a data set where someone used Alt+Enter to put data on a new line within a cell. You can break out each line to a new column by typing Ctrl+j in the Other box in step 2 of the wizard, as shown below. Why Ctrl+j? Back in the 1980's IBM declared Ctrl+j to be a linefeed. Ctrl+j also can be typed in the Find & Replace dialog box.
There are three special situations that Text to Columns handles easily:
- Dates in YYYYMMDD format can be changed to real dates. In step 3 of the wizard, click the column heading in the dialog, choose Date, then choose YMD from the dropdown.
- If you have negative numbers where the minus sign shows up after the number, go to step 3 of the wizard, click the Advanced Button, and choose Trailing Minus for Negative Numbers.
- Data copied from a Table of Contents will often have dot leaders that extend from the text to the page number as shown below. In step 2 of the wizard, choose Other, type a period, and then select the checkbox for Treat Consecutive Delimiters as One.
This article is an excerpt from MrExcel 2024 Igniting Excel
Title photo by Libby Penner on Unsplash