diversification
New Member
- Joined
- Jun 24, 2020
- Messages
- 37
- Office Version
- 365
- Platform
- Windows
Hi there, I'm interested in setting up VBA that I can use to completely clean account numbers and format them identically, no matter the data source. Just to explain my usecase a bit further, I'm bringing in account data from multiple locations and then running a lot of INDEX MATCH type of stuff to pull it all into one place. Obviously this requires that the functions I'm using correctly identify the account numbers from various sources as matching correctly, and this is impeded by things like spaces, non-printing characters, and improper formatting. I'm trying to create a magic bullet here to clean things up categorically, no matter the source of the data.
The way I've been doing this til now is using Text To Columns, which apparently strips out all unruly characters (save for spaces iirc?) and then lets excel choose what format it thinks is best. The account numbers I'm dealing with sometimes contain letters, and sometimes are all numbers, so as long as I run Text To Columns on all account number columns, they should match up as necessary. Obviouls this could result in a non-homogenously formatted column (some as-text, others as-values) which makes me a bit nervous. It's also a bit tedious to run Text To Columns numerous times.
I think this excellent VBA code (courtesy of @Rick Rothstein ) creates a UDF that is probably my foundation. Rick's post is copied / pasted below to save from having to click to that other thread. Hopefully I didn't butcher the formatting too badly.
I've used the code and already can confirm it pulls out the issues that I run into most (things like the notorious non-breaking space, aka CHAR 160, and so on.)
Here is my remaining concern:
Can this VBA be modified to clean up all unicode characters that might show up and break a MATCH type function? I honestly don't know if this is an issue, but seeing it brought up in that same thread has me concerned.
I also have a couple requests:
Thank you!
The way I've been doing this til now is using Text To Columns, which apparently strips out all unruly characters (save for spaces iirc?) and then lets excel choose what format it thinks is best. The account numbers I'm dealing with sometimes contain letters, and sometimes are all numbers, so as long as I run Text To Columns on all account number columns, they should match up as necessary. Obviouls this could result in a non-homogenously formatted column (some as-text, others as-values) which makes me a bit nervous. It's also a bit tedious to run Text To Columns numerous times.
I think this excellent VBA code (courtesy of @Rick Rothstein ) creates a UDF that is probably my foundation. Rick's post is copied / pasted below to save from having to click to that other thread. Hopefully I didn't butcher the formatting too badly.
Here is a UDF (user defined function) that I developed which will clean and trim the text passed into it. The trim operation is identical to Excel's worksheet TRIM function; however, the clean is slightly different. It cleans some additional non-printing characters that Excel's CLEAN function does not handle. Those additional characters are delineated here...
Remove spaces and nonprinting characters from text - Support - Office.com
I also included an optional argument to convert non-breaking spaces (ASCII 160) to real spaces (ASCII 32). Because non-breaking spaces are such a problem when copying text from the web, I defaulted this optional argument to True (meaning non-breaking space will be converted into true spaces and then handled, along with existing spaces, by the trim operation).
VBA Code:Function CleanTrim(ByVal S As String, Optional ConvertNonBreakingSpace As Boolean = True) As String Dim X As Long, CodesToClean As Variant CodesToClean = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, _ 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 127, 129, 141, 143, 144, 157) If ConvertNonBreakingSpace Then S = Replace(S, Chr(160), " ") For X = LBound(CodesToClean) To UBound(CodesToClean) If InStr(S, Chr(CodesToClean(X))) Then S = Replace(S, Chr(CodesToClean(X)), "") Next CleanTrim = WorksheetFunction.Trim(S) End Function
For those reading this thread who are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use CleanTrim just like it was a built-in Excel function. For example,
=CleanTrim(A1)
I've used the code and already can confirm it pulls out the issues that I run into most (things like the notorious non-breaking space, aka CHAR 160, and so on.)
Here is my remaining concern:
Can this VBA be modified to clean up all unicode characters that might show up and break a MATCH type function? I honestly don't know if this is an issue, but seeing it brought up in that same thread has me concerned.
I also have a couple requests:
- I'd like a way to make this code Macro friendly, in addition to having it setup as a UDF. This would give me the flexibility to run it on selected cells / columns, or to implement it via formula.
- I'd like a way to force a specific type of formatting -- likely either values or as text. As I mentioned above, non-homogenous formatting makes me a bit nervous. Then again, perhaps I'm creating a greater risk by trying to force formatting, so if this is a silly idea it can be skipped.
Thank you!