Data manipulation to remove undesired character strings

FlexicareTommy

New Member
Joined
Apr 4, 2013
Messages
11
Hello Mr. Excel experts! I'm in need of help to manipulate a data report that populates the entire information into one cell ("Text to columns" does not work unless there's some amazing trick I don't know how to use). This is a credit card financial statement data. Below is a small sub-section sample.

12/04 THE PHOTO LAB COSTA MESA CA 1021.62
12/05 CKE*CALIENTE SOUTHWE 271 COSTA MESA CA 9.70
12/06 SP * TASTY SHOP HTTPSTASTYSHO NY 28.00
12/07 WAL-MART #5687 IRVINE CA 18.92
12/07 REI.COM 800-426-4840 WA 287.70
12/07 APL* ITUNES.COM/BILL 866-712-7753 CA .99[TABLE="width: 64"]
<tbody>[TR]
[/TR]
[TR]
[/TR]
[TR]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
I can easily break the transaction date by using a LEFT() formula because the string is always 5 characters long. However the vendor, city, state, and value are always dynamically changing in character length for thousands of rows of data.

The state is always 2 characters in length followed by a space and then the value of the transaction - anywhere from 3-8 characters depending on the actual dollar value. I cant come up with a consistent way to get all information into their respective columns because the vendor data is always so drastically different even including numbers and special characters (two ex: REI.com with phone number and ITUNES which doesn't have a proper city name)

The format I'm trying to break this into is minimum four columns - Date, Vendor, State, Value.
City would be nice, but since it's not always available, this might be impossible. Can anyone solve this puzzle?
 

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Date, State, and amount are easily accomplished. The problem with the city, without any real delimiter, is because cities can be more than 2 words, there is no real way to determine where whatever comes before the city ends and the city name starts.
Formula in B2: =LEFT(A2,5)
Formula in C2: =LEFT(REPLACE(A2,1,FIND("~",SUBSTITUTE(A2," ","~",(LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))),""),2)
Formula in D2: =RIGHT(SUBSTITUTE(A2," ",REPT(" ",255)),255)+0
Format column D as number with 2 decimal places.
Copy down.


Excel 2010
ABCD
1DateStateAmount
212/04 THE PHOTO LAB COSTA MESA CA 1021.6212/04CA1021.62
312/05 CKE*CALIENTE SOUTHWE 271 COSTA MESA CA 9.7012/05CA9.70
412/06 SP * TASTY SHOP HTTPSTASTYSHO NY 28.0012/06NY28.00
512/07 WAL-MART #5687 IRVINE CA 18.9212/07CA18.92
612/07 REI.COM 800-426-4840 WA 287.7012/07WA287.70
712/07 APL* ITUNES.COM/BILL 866-712-7753 CA .9912/07CA0.99
Sheet1
Cell Formulas
RangeFormula
B2=LEFT(A2,5)
B3=LEFT(A3,5)
B4=LEFT(A4,5)
B5=LEFT(A5,5)
B6=LEFT(A6,5)
B7=LEFT(A7,5)
C2=LEFT(REPLACE(A2,1,FIND("~",SUBSTITUTE(A2," ","~",(LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))),""),2)
C3=LEFT(REPLACE(A3,1,FIND("~",SUBSTITUTE(A3," ","~",(LEN(A3)-LEN(SUBSTITUTE(A3," ",""))-1))),""),2)
C4=LEFT(REPLACE(A4,1,FIND("~",SUBSTITUTE(A4," ","~",(LEN(A4)-LEN(SUBSTITUTE(A4," ",""))-1))),""),2)
C5=LEFT(REPLACE(A5,1,FIND("~",SUBSTITUTE(A5," ","~",(LEN(A5)-LEN(SUBSTITUTE(A5," ",""))-1))),""),2)
C6=LEFT(REPLACE(A6,1,FIND("~",SUBSTITUTE(A6," ","~",(LEN(A6)-LEN(SUBSTITUTE(A6," ",""))-1))),""),2)
C7=LEFT(REPLACE(A7,1,FIND("~",SUBSTITUTE(A7," ","~",(LEN(A7)-LEN(SUBSTITUTE(A7," ",""))-1))),""),2)
D2=RIGHT(SUBSTITUTE(A2," ",REPT(" ",255)),255)+0
D3=RIGHT(SUBSTITUTE(A3," ",REPT(" ",255)),255)+0
D4=RIGHT(SUBSTITUTE(A4," ",REPT(" ",255)),255)+0
D5=RIGHT(SUBSTITUTE(A5," ",REPT(" ",255)),255)+0
D6=RIGHT(SUBSTITUTE(A6," ",REPT(" ",255)),255)+0
D7=RIGHT(SUBSTITUTE(A7," ",REPT(" ",255)),255)+0
 
Last edited:
Upvote 0
Nice job @Scott Huish

And perhaps to round it off:

Cell E2 (HelperColumn)
Code:
=TRIM(RIGHT(A2,LEN(A2)-LEN(B2)))

Cell F2 Balance of text
Code:
=TRIM(LEFT(E2,((FIND(" "&C2&" ",E2))-1)))

Cheers

pvr928
 
Last edited:
Upvote 0
Hi,

Got interrupted while posting above.
Since you say some rows will have City and some won't, the Vendor column may or may not contain the City:


Book1
ABCDEF
1DataDateVendorStateValue
212/04 THE PHOTO LAB COSTA MESA CA 1021.6212/04THE PHOTO LAB COSTA MESACA1021.62
312/05 CKE*CALIENTE SOUTHWE 271 COSTA MESA CA 9.7012/05CKE*CALIENTE SOUTHWE 271 COSTA MESACA9.7
412/06 SP * TASTY SHOP HTTPSTASTYSHO NY 28.0012/06SP * TASTY SHOP HTTPSTASTYSHONY28
512/07 WAL-MART #5687 IRVINE CA 18.9212/07WAL-MART #5687 IRVINECA18.92
612/07 REI.COM 800-426-4840 WA 287.7012/07REI.COM 800-426-4840WA287.7
712/07 APL* ITUNES.COM/BILL 866-712-7753 CA .9912/07APL* ITUNES.COM/BILL 866-712-7753CA0.99
Sheet14
Cell Formulas
RangeFormula
C2=LEFT(A2,5)
D2=MID(A2,7,FIND(" "&E2&" ",A2)-7)
E2=TRIM(LEFT(RIGHT(SUBSTITUTE(A2," ",REPT(" ",100)),200),150))
F2=RIGHT(SUBSTITUTE(A2," ",REPT(" ",100)),50)+0


Formulae copied down.
 
Last edited:
Upvote 0
Nice one @jtakw - I love the approach you take in E2 and F2 - just figured out how it works. So simple and so much easier than what I tend to use.

Cheers

pvr928
 
Upvote 0

Forum statistics

Threads
1,224,820
Messages
6,181,155
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top