Hey there,
I'm trying to write a program that automatically finds and removes a set of strings (stored in one column) from a large dataset. Here's my code:
I was able to run the code many times without fail (on about 300-500 lines/1800-3000 cells of data at once), but now I keep getting the out of memory issue. I'm thinking I need to flush some of the memory, but I would also love to know how I can make the code cleaner to avoid creating this problem again no matter how many iterations I run the code. I am getting the error on this line:
I've added this near the end of the routine, but can't even reach it anymore. Would this help anyway?
Thanks so much!
I'm trying to write a program that automatically finds and removes a set of strings (stored in one column) from a large dataset. Here's my code:
Code:
Sub Redact()
Dim strPattern As String
Dim strReplace As String: strReplace = "[Redacted]"
Dim regEx As New RegExp ' For detection
Dim nameRegEx As New RegExp ' For extraction
Dim strInput As String
Dim NameRange As Range
Dim ReplaceRange As Range
Set ReplaceRange = Application.InputBox("Select the range that you want to redact names from", "Obtain Range Object", Type:=8)
Set NameRange = Application.InputBox("Select the range that contains the names you want redacted", "Obtain Range Object", Type:=8)
NameRange.Sort Key1:=NameRange.Worksheet.Range("A1"), Order1:=xlDescending
Application.ScreenUpdating = False
Application.EnableEvents = False
With regEx
.MultiLine = True
.IgnoreCase = False
End With
With nameRegEx
.Global = True
.MultiLine = True
.IgnoreCase = False
End With
Dim currName As String 'Keeps track of the name currently being redacted
For Each cell2 In NameRange
currName = cell2.Value
strPattern = "(" + currName + "[\s\.\?!:;,]([A-Z]\.? )?)|(" + currName + "$)|(" + currName + "'s)"
For Each cell In ReplaceRange
If strPattern <> "" Then
strInput = cell.Value
regEx.Pattern = strPattern
If regEx.Test(strInput) Then
nameRegEx.Pattern = currName + "((\s[A-Z]\.?$)|(\s[A-Z]\.?[^a-z]))?[^a-z]|(" + currName + "((\s[A-Z]\.?$)|(\s[A-Z]\.?[^a-z]))?$)"
cell.Value = nameRegEx.Replace(strInput, strReplace)
End If
End If
Next
Next
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
I was able to run the code many times without fail (on about 300-500 lines/1800-3000 cells of data at once), but now I keep getting the out of memory issue. I'm thinking I need to flush some of the memory, but I would also love to know how I can make the code cleaner to avoid creating this problem again no matter how many iterations I run the code. I am getting the error on this line:
Code:
cell.Value = nameRegEx.Replace(strInput, strReplace)
I've added this near the end of the routine, but can't even reach it anymore. Would this help anyway?
Code:
Set regEx = Nothing
Set nameRegEx = Nothing
Thanks so much!