Function to return list of non-alpha character?

joshman108

Active Member
Joined
Jul 6, 2016
Messages
310
My desire is to return all sequences of non-alpha characters from a range and seperate them by char(10) so they are a list. My intention with char(10) is that I would basically like to be able to return all such non-alpha sequences into a column where each result is in a single cell. If that can be done without char(10) that's fine. But as for an example, if cell a1 = tish!!osi100sdg9sss then I would want this returned:

!!
100
9

where each non-alpha sequence is defined by the border of the nearest alpha characters on each side.

I would like to loop through all characters in all cells in a selection. And I hate to be picky but performance is desired as I have 400k rows in my desired selection.

I have found snippets of code online that get at the general idea of things, but am not yet able to craft something to be needs yet. Any ideas? Thanks. And I called it a function but ask for a macro (basically). Either works really, whichever is easiest.
 
Last edited:

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
Here's a macro that should do what you want.

Rich (BB code):
Sub ChangeIt()
Dim MyData As Variant, i As Long, j As Long, str1 As String, x As String

    MyData = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row).Value
    
    For i = 1 To UBound(MyData)
        str1 = ""
        For j = 1 To Len(MyData(i, 1))
            x = Mid(MyData(i, 1), j, 1)
            str1 = str1 & IIf(x Like "[a-zA-Z]", " ", x)
        Next j
        MyData(i, 1) = Replace(WorksheetFunction.Trim(str1), " ", Chr(10))
    Next i
    
    Range("B1:B" & UBound(MyData)).Value = MyData
    
End Sub
Change the source column A to your source column, and the output to where you want it. The output can be the same as the input column.

To install this, open a COPY of your workbook to the sheet with your data. Press Alt-F11 to open the VBA editor. From the menu select Insert > Module. Paste the above code into that sheet. Change the references as needed. Close the editor (Alt-Q or the red X). In Excel press Alt-F8, select ChangeIt, click Run.

I ran this on 5000 rows and it finished in a fraction of a second. Your mileage may vary. Let me know if this works for you.
 
Upvote 0
I tested on small sample and it worked great! Tested on whole set and it was chugging. I restarted excel and now it just keeps crashing it every time I try to run it, no matter how small the data set. I should get it working at some point, so thanks again! Super helpful
 
Upvote 0
Is it possible that any possible character, when encountered, would cause this to crash excel? It seems to work fine on dummy data, and a small sample of my real data, but fails any time I try to run it on 10,000 rows. It works on 1000. I suspect there is a character hiding in there somewhere that it doesn't like. The behaviour is really consistent. Any ideas?
 
Upvote 0
There's very little in there that should cause a crash. You can try this version:

Code:
Sub ChangeIt()
Dim MyData As Variant, i As Long, j As Long, str1 As String, x As String


    MyData = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row).Value
    
    On Error GoTo Oops:
    
    For i = 1 To UBound(MyData)
        str1 = ""
        For j = 1 To Len(MyData(i, 1))
            x = Mid(MyData(i, 1), j, 1)
            str1 = str1 & IIf(x Like "[a-zA-Z]", " ", x)
        Next j
        MyData(i, 1) = Replace(WorksheetFunction.Trim(str1), " ", Chr(10))
    Next i
    
    Range("B1:B" & UBound(MyData)).Value = MyData
    Exit Sub
Oops:
    MsgBox "Error on row " & i
    
End Sub
Same basic code, but if it encounters an error in the course of the program, it will tell you what line it's on. This really doesn't explain why it crashes Excel though. What version of Windows and Excel do you have, and how much ram do you have? Do you have anything else running?



Edit: Glad to see you got it. What was the issue? How long does it take to run?
 
Last edited:
Upvote 0
Rich (BB code):
Sub ChangeIt()
Dim MyData As Variant, i As Long, j As Long, str1 As String, x As String


    MyData = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row).Value
    
    On Error GoTo Oops:
    
    For i = 1 To UBound(MyData)
        str1 = ""
        For j = 1 To Len(MyData(i, 1))
            x = Mid(MyData(i, 1), j, 1)
            str1 = str1 & IIf(x Like "[a-zA-Z]", " ", x)
        Next j
        MyData(i, 1) = Replace(WorksheetFunction.Trim(str1), " ", Chr(10))
    Next i
    
    Range("B1:B" & UBound(MyData)).NumberFormat = "@"
    Range("B1:B" & UBound(MyData)).Value = MyData
    Exit Sub
Oops:
    MsgBox "Error on row " & i
    
End Sub
You should add the line of code I show in red, otherwise embedded numbers with leading zeroes will lose their leading zeroes. For example, this...

abc0009def

becomes 9, not 0009 (assuming the cell did not become formatted as Text from a previous run), in the cell. Here is another macro that will do what your code does (once you make the change I suggested) with using any loops (I use Regular Expressions to avoid them)...
Code:
[table="width: 500"]
[tr]
	[td]Sub NonLettersOnly()
  Dim Addr As String, Txt As String
  Addr = "A1:A" & Cells(Rows.Count, "A").End(xlUp).Row
  With CreateObject("VBScript.RegExp")
    .Global = True
    .Pattern = "[A-Za-z ]"
    Txt = Replace(Application.Trim(.Replace(Join(Application.Transpose(Range(Addr)), Chr(1)), " ")), " ", vbLf)
  End With
  Txt = Replace(Replace(Txt, vbLf & Chr(1), Chr(1)), Chr(1) & vbLf, Chr(1))
  Range(Addr).Offset(, 1).NumberFormat = "@"
  Range(Addr).Offset(, 1) = Application.Transpose(Split(Txt, Chr(1)))
End Sub[/td]
[/tr]
[/table]
 
Last edited:
Upvote 0
Edit: Glad to see you got it. What was the issue? How long does it take to run?

I had not read the code yet more than enough to see the char(10).. and I decided to change my mind and replace it with a space so a space was a delimiter. It was not only until later I noticed it was clearly replacing a space. So the circularity caused the crashes. But even after fixing, it still generally refuses to calculate on anything over 10k rows at a time. Excel 15.38 Mac, 16gigs gram. It manages 10k rows in about 5-10 seconds. But when trying 20k or more rows it will run for maybe a minute.. and keeps going. I don't have patience to see how long it takes, i just iterated manually :p :p :p


Thanks rick, but I actually don't care so much about numbers, particularly not leading/trailing as found in my data. Good call though!
 
Last edited:
Upvote 0
.. and I decided to change my mind and replace it with a space so a space was a delimiter.

But even after fixing, it still generally refuses to calculate on anything over 10k rows at a time. Excel 15.38 Mac, 16gigs gram. It manages 10k rows in about 5-10 seconds. But when trying 20k or more rows it will run for maybe a minute.. and keeps going. I don't have patience to see how long it takes, i just iterated manually :p :p :p
I don't have a Mac, so I do not know if the code I posted earlier works or not... I am not sure whether the CreateObject(VBScript.RegExp") works on a Mac or not. If it does work, then here is the code modified to use a space delimiter instead of a Line Feed delimiter...
Code:
[table="width: 500"]
[tr]
	[td]Sub NonLettersOnly()
  Dim Addr As String, Txt As String
  Addr = "A1:A" & Cells(Rows.Count, "A").End(xlUp).Row
  With CreateObject("VBScript.RegExp")
    .Global = True
    .Pattern = "[A-Za-z]"
    Range(Addr).Offset(, 1) = Application.Transpose(Split(Application.Trim(.Replace(Join(Application.Transpose(Range(Addr)), Chr(1)), " ")), Chr(1)))
  End With
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
I had actually started playing around with RegEx after my first version. Mostly for self-education since I haven't had a lot of experience with them. Here's what I came up with:

Code:
Sub ChangeIt2()
Dim MyData As Variant, i As Long
Dim RegEx As Object

    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Pattern = "[a-z]+"
        .ignorecase = True
        .Global = True
        .MultiLine = True
    End With
    
    MyData = Range("A1:A" & Cells(Rows.Count, "A").End(xlUp).Row).Value
    
    For i = 1 To UBound(MyData)
        MyData(i, 1) = Replace(Trim(RegEx.Replace(MyData(i, 1), " ")), " ", vbLf)
        'MyData(i, 1) = Trim(RegEx.Replace(MyData(i, 1), " "))
    Next i
    
    Range("B1:B" & UBound(MyData)).Value = MyData
    
End Sub
It runs about 3 times faster than my original macro (which doesn't mean a whole lot, since my original ran in under 1 second for 20K lines). I suspect Rick's version would run even faster. However, like Rick, I know very little about VBA on Macs. I don't know if the RegEx exists there. If you try it, then you can use the second line in the loop, which is currently commented out, to use just a space between elements.

Rick, I tried both of your versions, and they both died on the main line. I tracked it down to the Trim function, which apparently has an upper limit for size. I reduced my test data, and it worked fine. Again, I'm no RegEx expert, but if you put a + on the end of your pattern, it might run a tiny bit more efficiently.
 
Upvote 0

Forum statistics

Threads
1,225,757
Messages
6,186,845
Members
453,379
Latest member
gabriellegonzalez

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top