Need Help with replacing invalid characters in a cell

RoccoM

New Member
Joined
Apr 22, 2019
Messages
19
I have a column that is basically a description field. It may or may not contain both invalid characters, and hidden, or non-printable characters.
I want to use regular expression to remove the invalid characters and replace them with " " (a space).
so for example a-z, A-Z 0-9 are acceptable all other characters, and hidden characters need to be replaced with a space.

I have already enabled Microsoft VBScript Regular Expressions 5.5 in VBA project.
How can I use regular expression to solve this?

Here is an example of what my data looks like. This happens to be in Column B
[TABLE="width: 603"]
<tbody>[TR]
[TD]3Com, 3C1206-0 - SuperStack II Transceiver Module - Female AUI[/TD]
[/TR]
[TR]
[TD]3Com, 3C1206-3 - SuperStack II Transceiver Module - 10Base-T (UTP)[/TD]
[/TR]
[TR]
[TD]3Com, 3C1206-4 - SuperStack II Transceiver Module - Fan Out (Male AUI)[/TD]
[/TR]
[TR]
[TD]3Com, 3C1206-5 - SuperStack II Transceiver Module - Fibre Optic (ST)[/TD]
[/TR]
[TR]
[TD]3Com, 3C1206-7 - SuperStack II Transceiver Module - 10Base-FB[/TD]
[/TR]
[TR]
[TD]3Com, 3C16074 - SuperStack II Advanced RPS, Type 2 100W Power Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C16411 - SuperStack III Baseline Dual Speed Hub, 24-Port[/TD]
[/TR]
[TR]
[TD]3Com, 3C16593A - SuperStack II Baseline Dual Speed Hub, 24 Port RJ45[/TD]
[/TR]
[TR]
[TD]3Com, 3C16610 - SuperStack II Dual Speed Hub 500 12x RJ45[/TD]
[/TR]
[TR]
[TD]3Com, 3C16683 - SuperStack II 100Base-TX Distance Extender Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C16684 - SuperStack II 100Base-FX Distance Extender Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C16685 - SuperStack II Dual Speed Hub 500 Manag.-M. for3C16610, 3C16611[/TD]
[/TR]
[TR]
[TD]3Com, 3C16700A - OfficeConnect Ethernet Hub 8/TPO, 8x 10BASE-T[/TD]
[/TR]
[TR]
[TD]3Com, 3C16701A - OfficeConnect Ethernet Hub 8C, 8x 10BASE-T[/TD]
[/TR]
[TR]
[TD]3Com, 3C16942A - SuperStack II Switch 3000 Ethernet, 12 x RJ45[/TD]
[/TR]
[TR]
[TD]3Com, 3C17111 - SuperStack III Switch 4300 Series, 100Base-FX Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C17121 - SuperStack III Switch 4300 Series, 10/100/1000Base-T Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C17131 - SuperStack III Switch 4300 Series, 1000Base-SX Module[/TD]
[/TR]
[TR]
[TD]3Com, 3C17203 - SuperStack III Switch 4400, 24 port 10BASE-T/100BASE-TX[/TD]
[/TR]
</tbody>[/TABLE]

Thank you,
Rocco
 

Excel Facts

Does the VLOOKUP table have to be sorted?
No! when you are using an exact match, the VLOOKUP table can be in any order. Best-selling items at the top is actually the best.
And when the procedure is done, you want the first one to look like this, correct?

3Com 3C1206 0 SuperStack II Transceiver Module Female AUI
 
Upvote 0
Code:
Function CleanStr(ByVal str As String) As String
Dim n&, temp$
For n = 1 To Len(str)
    temp = Mid(str, n, 1)
    Select Case Asc(temp)
        Case 32, 48 To 57, 65 To 90, 97 To 122
            'do nothing
        Case Else
            str = Replace(str, temp, " ")
    End Select
Next n
CleanStr = WorksheetFunction.Trim(str)
End Function
 
Upvote 0
This does not use Regular Expressions, but it should be quite fast. What the code does is handle the entire column of data all at once. You did not say what column the data is in, so I assumed Column A (change the red highlighted column letters as needed). You also did not make clear if you wanted the original data overwritten or if you want the output in a different column. I assumed a different column and used Column B for the output (change the blue highlighted column letter as needed... you can even change it to the column letter containing the original data if you want that original data overwritten).
Code:
Sub AlphaNumericOnly()
  Dim R As Long, X As Long, Data As Variant
  Data = Range("[B][COLOR="#FF0000"][SIZE=2]A[/SIZE][/COLOR][/B]1", Cells(Rows.Count, "[B][COLOR="#FF0000"][SIZE=2]A[/SIZE][/COLOR][/B]").End(xlUp))
  For R = 1 To UBound(Data)
    For X = 1 To Len(Data(R, 1))
      If Mid(Data(R, 1), X, 1) Like "[!A-Za-z0-9]" Then Mid(Data(R, 1), X) = " "
    Next
    Data(R, 1) = Application.Trim(Data(R, 1))
  Next
  Range("[B][COLOR="#0000FF"][SIZE=2]B[/SIZE][/COLOR][/B]1").Resize(UBound(Data)) = Data
End Sub
 
Last edited:
Upvote 0
It seems that there is a list of invalid characters.
With the following you can eliminate them and leave all the caracreres that you could need.


Code:
Sub ReplaceSpecial()
    Dim wCell As Range, i As Long, special As Variant, wValue As String
    special = Array(127, 129, 141, 143, 144, 157, 160)
    For Each wCell In ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, 23)
        wValue = wCell.Value
        For i = 0 To UBound(special)
            wValue = Replace(wValue, Chr(special(i)), "")
        Next
        For i = 0 To 31
            wValue = Replace(wValue, Chr(i), "")
        Next
        wCell.Value = wValue
    Next
End Sub
 
Upvote 0
It seems that there is a list of invalid characters.
With the following you can eliminate them and leave all the caracreres that you could need.


Code:
Sub ReplaceSpecial()
    Dim wCell As Range, i As Long, special As Variant, wValue As String
    special = Array(127, 129, 141, 143, 144, 157, 160)
    For Each wCell In ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, 23)
        wValue = wCell.Value
        For i = 0 To UBound(special)
            wValue = Replace(wValue, Chr(special(i)), "")
        Next
        For i = 0 To 31
            wValue = Replace(wValue, Chr(i), "")
        Next
        wCell.Value = wValue
    Next
End Sub
I don't think your code does what the OP wants. Your code replaces only certain invalid characters, but the OP said...
I want to use regular expression to remove the invalid characters and replace them with " " (a space).
so for example a-z, A-Z 0-9 are acceptable all other characters, and hidden characters need to be replaced with a space.
Also see my question about his first example (Message#2) and the OP's response (Message#3).
 
Upvote 0
I don't think your code does what the OP wants. Your code replaces only certain invalid characters, but the OP said...

Also see my question about his first example (Message#2) and the OP's response (Message#3).

Oh, you're right, then I add the space

Code:
Sub ReplaceSpecial()
    Dim wCell As Range, i As Long, special As Variant, wValue As String
    special = Array(127, 129, 141, 143, 144, 157, 160)
    For Each wCell In ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, 23)
        wValue = wCell.Value
        For i = 0 To UBound(special)
            wValue = Replace(wValue, Chr(special(i)), [COLOR=#ff0000]" "[/COLOR])
        Next
        For i = 0 To 31
            wValue = Replace(wValue, Chr(i), [COLOR=#ff0000]" "[/COLOR])
        Next
        wCell.Value = wValue
    Next
End Sub

I want to use regular expression to remove the invalid characters and replace them with " " (a space).
Let's see if they are the invalid characters , In any case, there is a macro to remove only invalid characters ;)
 
Upvote 0

Forum statistics

Threads
1,223,243
Messages
6,170,967
Members
452,371
Latest member
Frana

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top