Let me tell you what I'm basically trying to do.
I have a folder with a lot of text files. 3000 txt files, each about 10MB. What I"m trying to do is extract a few pieces of data from each of these and build those into a single table. In the end I'd expect the table to be about 5 columns/10K records total - just to give you a ballpark.
I've written a macro, posted below, that covers the basic mechanics of this. In summary it performs the following steps:
1) Loops through each text file in a designated folder. One at a time it:
2) Opens the text file.
3) Copies the data to a designated worksheet
4) Enters, Copies, and then clears formulas that will identify the records we want.
5) Autofilter/copies the data we want and appends it to a data table on another sheet.
6) Clears the data on the original data sheet
7) Closes the text file
8) repeat until out of text files.
The macro seems to work okay but only when running it over a small number of text files. It runs reasonably well over 5 text files but it gets progressively more sluggish as I try to run it over more files.
With 5 text files to loop through the code averages about 17 seconds per file.
With 25 text files the code averaged about 2:30 per file.
With 100 text files the code took about 7 hours and crashed excel in the process.
So I figure I must have some fatal flaw in my code. Here is the code I'm using - I'd appreciate any optimization ideas.
I have a folder with a lot of text files. 3000 txt files, each about 10MB. What I"m trying to do is extract a few pieces of data from each of these and build those into a single table. In the end I'd expect the table to be about 5 columns/10K records total - just to give you a ballpark.
I've written a macro, posted below, that covers the basic mechanics of this. In summary it performs the following steps:
1) Loops through each text file in a designated folder. One at a time it:
2) Opens the text file.
3) Copies the data to a designated worksheet
4) Enters, Copies, and then clears formulas that will identify the records we want.
5) Autofilter/copies the data we want and appends it to a data table on another sheet.
6) Clears the data on the original data sheet
7) Closes the text file
8) repeat until out of text files.
The macro seems to work okay but only when running it over a small number of text files. It runs reasonably well over 5 text files but it gets progressively more sluggish as I try to run it over more files.
With 5 text files to loop through the code averages about 17 seconds per file.
With 25 text files the code averaged about 2:30 per file.
With 100 text files the code took about 7 hours and crashed excel in the process.
So I figure I must have some fatal flaw in my code. Here is the code I'm using - I'd appreciate any optimization ideas.
Code:
Sub LoopThroughText()
Dim path As String
Dim ThisWB As String
Dim Filename As String
Dim Wkb As Workbook
Dim lastR As Long
Dim i As Long
Application.EnableEvents = False
Application.ScreenUpdating = False
ThisWB = ActiveWorkbook.Name
path = "P:\Reports\call data\DumpingGround"
Filename = Dir(path & "\*.txt", vbNormal)
If Len(Filename) = 0 Then Exit Sub
Do Until Filename = vbNullString
If Not Filename = ThisWB Then
Set Wkb = Workbooks.Open(Filename:=path & "\" & Filename)
Cells.Select
Selection.Copy
Windows(ThisWB).Activate
Worksheets("Crunch").Activate
lastR = Range("A" & Rows.Count).End(xlUp).Row
Range("A1").PasteSpecial
Application.CutCopyMode = False
Columns("A:A").Select
Selection.Replace What:="=--", Replacement:="", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Columns("B:AI").Select
Selection.ClearContents
Range("B1").Value = "Catch"
Range("C1").Value = "Number"
Range("D1").Value = "From"
Range("E1").Value = "From2"
Range("F1").Value = "To"
Range("G1").Value = "Date"
Range("H1").Value = "Flag"
Range("B2").Formula = "=IF(OR(LEFT(A2,14)=""Invite SIP:248"",AND(B1=1,LEFT(A2,10)<>""Supported:"")),1,0)"
Range("C2").Formula = "=IF(LEFT(A2,11)=""Invite sip:"",MID(A2,12,FIND(""@"",A2)-12),IF(B2=1,C1,""""))"
Range("D2").Formula = "=IF(LEFT(A2,5)=""From:"",MID(A2,6,FIND(""<"",A2)-6),IF(B2=1,D1,""""))"
Range("E2").Formula = "=IF(LEFT(A2,5)=""From:"",MID(A2,FIND(""sip:"",A2)+4,FIND(""@"",A2)-FIND(""sip:"",A2)-4),IF(B2=1,E1,""""))"
Range("F2").Formula = "=IF(LEFT(A2,9)=""To: <sip:"",MID(A2,10,FIND(""@"",A2)-10),IF(B2=1,F1,""""))"
Range("G2").Formula = "=IF(LEFT(A2,5)=""Date:"",MID(A2,FIND("","",A2)+2,FIND("" GMT"",A2)-FIND("","",A2)-2)+0,IF(B2=1,G1,""""))"
Range("H2").Formula = "=1*OR(ISERROR(B2),AND(B2=1,B3=0))"
Range("B2:H2").Copy
Range(Cells(2, 2), Cells(Range("A200000").End(xlUp).Row, 8)).PasteSpecial (xlPasteFormulas)
Range(Cells(1, 2), Cells(Range("A200000").End(xlUp).Row, 8)).Copy
Range(Cells(1, 2), Cells(Range("A200000").End(xlUp).Row, 8)).PasteSpecial (xlPasteValues)
Range("H1").Select
Selection.AutoFilter
ActiveSheet.Range("H1").CurrentRegion.AutoFilter Field:=8, Criteria1:="=1"
If Range("A300000").End(xlUp).Row > 1 Then
Range(Cells(2, 2), Cells(Range("A200000").End(xlUp).Row, 8)).Select
Selection.Copy
Sheets("EndTable").Activate
Range("A" & Range("A200000").End(xlUp).Row + 1).PasteSpecial (xlPasteValues)
Sheets("Crunch").Activate
Range("H1").Select
Selection.AutoFilter
End If
Range("A1").Copy
Application.ScreenUpdating = True
Sheets("EndTable").Activate
Range("I" & Range("I200000").End(xlUp).Row + 1).PasteSpecial (xlPasteValues)
Range("J" & Range("J200000").End(xlUp).Row + 1).Formula = "=now()"
Range("J" & Range("J200000").End(xlUp).Row).Copy
Range("J" & Range("J200000").End(xlUp).Row).PasteSpecial (xlPasteValues)
Application.ScreenUpdating = False
Sheets("Crunch").Activate
Columns("A:AZ").Select
Selection.ClearContents
Wkb.Close False
End If
Filename = Dir()
Loop
Application.EnableEvents = True
Application.ScreenUpdating = True
MsgBox "Done"
End Sub