Clean VBA / Improve Performance - Cleans Excel Workbooks for Access, Runs Queries, Exports to Excel

audan2009

New Member
Joined
Aug 14, 2013
Messages
38
Hello! For the past week I've been trying to automate a tedious process. I got this beast of glued together code working but I'm not good enough to make it clean, run quickly or make it more dynamic to spreadsheet sizes. I'll explain what it does and walk through all the modules.

I made it to clean my exports from SAP for import in to Access, run my access queries and export it as a formatted report. They come out as "|" deliminated CSV files with run information and page breaks.

1) Delete all old files after the last run:

Code:
Sub KillOldData()
Dim fname As String
fname = "C:\Users\username\Desktop\ITP\Current Data\"
Kill fname & "*.xlsx"
End Sub

2a) Covert all .CSV files to .XLSX,
2b) reopen .XLSX files
2c) delete all rows until it gets to the first row to keep as column headings
2d) Combine columns (required because SAP breaks the report in to multiple columns)
2e) Move new combined columns to a new sheet, delete old sheet
2f) Text-to-columns
2g) Clean pages by removing all page breaks (SAP prints out about 100 rows then breaks each page with infomation like this is "PAGE 100" and leaves like 8 blank rows and row 1 and 8ish will have "-----" breaks - not good for access)
2h) Trim all cells (remove all spaces - SAP will put in a bunch of random spaces before and after)

Code:
Option Compare Database
Sub csvToXlsx()
Set appExcelstart = CreateObject("Excel.Application")
appExcelstart.Application.Visible = True
appExcelstart.Application.DisplayAlerts = False
Dim wb As Workbook
Dim strFile As String, strDir As String
strDir = "C:\Users\username\Desktop\ITP\Current Data\"
strFile = Dir(strDir & "*.csv")
Do While strFile <> ""
Set wb = Workbooks.Open(FileName:=strDir & strFile, Local:=True)
   With wb
        .SaveAs Replace(wb.FullName, ".csv", ".xlsx"), 51 'UPDATE:
        .Close True
    End With
    Set wb = Nothing
strFile = Dir
Loop
'reopen and run
Dim wbk As Workbook
Dim srcfolder As String, fname As String
srcfolder = "C:\Users\username\Desktop\ITP\Current Data\"
fname = Dir(srcfolder & "*.xlsx")  ' get first file name to match pattern
Do Until Len(fname) = 0  ' stop loop when no more files found
Set wbk = Workbooks.Open(FileName:=srcfolder & fname, Local:=True)
 Set shtsave = Worksheets(1) ' set sheet object, change sheet name to suit, or use index
 ' do all processing with workbook
'delete rows until pipe
Const colA      As Long = 1
Dim lngRow      As Long
Dim lngLastRow  As Long
lngLastRow = Cells.SpecialCells(xlCellTypeLastCell).Row
lngRow = 1
lngLastRow = 50
Do While lngRow <= lngLastRow
    If (Left(Cells(lngRow, colA), 3) <> "| M" And Left(Cells(lngRow, colA), 3) <> "| 0") And (Left(Cells(lngRow, colA), 2) <> "|M" And Left(Cells(lngRow, colA), 2) <> "|0") And (Left(Cells(lngRow, colA), 2) <> "|Z" And Left(Cells(lngRow, colA), 2) <> "|0") Then
        Cells(lngRow, 1).EntireRow.Delete
        lngLastRow = lngLastRow - 1
        Else: lngRow = lngRow + 1
    End If
    Loop
'combine columns
    Dim LastRow As Long
    Dim Ws As Worksheet
    For Each Ws In Worksheets
    LastRow = Ws.Range("A" & Ws.Rows.Count).End(xlUp).Row
    Ws.Range("AA1:AA" & LastRow).Formula = "=a1&b1&c1&d1&e1&f1&g1&h1&i1&j1&k1&l1&m1&n1&o1&p1&q1&r1&s1&t1&u1&v1&w1&x1&y1&z1"
  Next Ws
'paste values to new worksheet and delete old sheets
  Columns("AA:AA").Select
    Selection.Copy
    Sheets.Add After:=Sheets(Sheets.Count)
    DoCmd.SetWarnings False
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=False
Worksheets(1).Delete
    
'perform text to columns
    For Each Ws In Worksheets
        Ws.Columns("A:A").TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
        TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
        Semicolon:=False, Comma:=False, Space:=False, Other:=True, OtherChar _
        :="|", FieldInfo:=Array(Array(1, 2), Array(2, 2), Array(3, 1), Array(4, 1), Array(5, _
        1), Array(6, 1), Array(7, 1), Array(8, 1), Array(9, 1), Array(10, 1), Array(11, 1), Array(12 _
        , 1), Array(13, 1), Array(14, 1), Array(15, 1), Array(16, 1), Array(17, 1), Array(18, 1), _
        Array(19, 1), Array(20, 1), Array(21, 1), Array(22, 1), Array(23, 1), Array(24, 1), Array( _
        25, 1), Array(26, 1), Array(27, 1), Array(28, 1), Array(29, 1), Array(30, 1), Array(31, 1), _
        Array(32, 1), Array(33, 1), Array(34, 1), Array(35, 1), Array(36, 1), Array(37, 1), Array( _
        38, 1), Array(39, 1), Array(40, 1), Array(41, 1), Array(42, 1), Array(43, 1), Array(44, 1), _
        Array(45, 1), Array(46, 1), Array(47, 1), Array(48, 1), Array(49, 1), Array(50, 1), Array( _
        51, 1), Array(52, 1), Array(53, 1), Array(54, 1), Array(55, 1), Array(56, 1), Array(57, 1), _
        Array(58, 1), Array(59, 1), Array(60, 1), Array(61, 1), Array(62, 1), Array(63, 1), Array( _
        64, 1), Array(65, 1), Array(66, 1), Array(67, 1), Array(68, 1), Array(69, 1), Array(70, 1), _
        Array(71, 1), Array(72, 1), Array(73, 1)), TrailingMinusNumbers:=True
     Next Ws
'clean pages
Rows("1:1").Select
    Selection.AutoFilter
    ActiveSheet.Range("$A$1:$BZ$700000").AutoFilter Field:=2, Criteria1:= _
        "=Material", Operator:=xlOr, Criteria2:="="
    Range("A60").Select
    Range(Selection, ActiveCell.SpecialCells(xlLastCell)).Select
    Selection.EntireRow.Delete
    ActiveSheet.Range("$A$1:$BZ$700000").AutoFilter Field:=2
    ActiveSheet.Range("$A$1:$BZ$700000").AutoFilter Field:=3, Criteria1:="="
    Range("A2").Select
    Range(Selection, ActiveCell.SpecialCells(xlLastCell)).Select
    Selection.EntireRow.Delete
    Columns("A:A").Select
    Selection.Delete Shift:=xlToLeft
    ActiveSheet.Range("$A$1:$BZ$300000").AutoFilter Field:=2
'trim column headings
 Dim a(), c&, cs&, r&, rs&, Rng2 As Range
  Set Rng2 = Range("A1:DD70000")
  a() = Rng2.Value
  rs = UBound(a, 1)
  cs = UBound(a, 2)
  For r = 1 To rs
    For c = 1 To cs
      If VarType(a(r, c)) = vbString Then a(r, c) = Trim(a(r, c))
    Next
  Next
  Rng2.Value = a()
    
With wbk
    .Save
    .Close True
End With
Set wbk = Nothing
fname = Dir
 
Loop
appExcelstart.Quit
End Sub

3) Import in to Access

Code:
Option Compare Database
Sub ImportToAccess()

Dim strSaveNameForExport As String
DoCmd.DeleteObject acTable, "Deliveries"
DoCmd.DeleteObject acTable, "Sales"
DoCmd.DeleteObject acTable, "Inventory"
DoCmd.DeleteObject acTable, "ItemInfo"
DoCmd.DeleteObject acTable, "UnScheduledDel"
DoCmd.DeleteObject acTable, "DD"
DoCmd.DeleteObject acTable, "ID"
DoCmd.DeleteObject acTable, "Repair"

strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\Deliveries.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "Deliveries", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\sales.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "Sales", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\inventory.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "Inventory", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\ItemInfo.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "ItemInfo", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\UnScheduledDel.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "UnScheduledDel", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\DD.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "DD", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\ID.xlsx"
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "ID", strSaveNameForExport, True, "Sheet1$"
strSaveNameForExport = "C:\Users\username\Desktop\ITP\Current Data\Repair\Repair.xlsx"

DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, "Repair", strSaveNameForExport, True, "Sheet1$"
DoCmd.DeleteObject acTable, "Sheet1$_ImportErrors"
DoCmd.DeleteObject acTable, "Sheet1$_ImportErrors1"

End Sub

4) Export back to Excel:

Code:
Option Compare Database
 Public Function ExportAdvanced()
 
 Dim strWorksheet As String
 Dim strWorkSheetPath As String
 Dim appExcel As Excel.Application
 
 Dim sht As Excel.Worksheet
 Dim wkb As Excel.Workbook
 Dim Rng As Excel.Range
 Dim strTable As String
 Dim strRange As String
 Dim strSaveName As String
 Dim strPrompt As String
 Dim strTitle As String
 Dim strDefault As String
 
Set appExcel = CreateObject("Excel.Application")
appExcel.Application.Visible = True
 
 strTable = "Summary"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "Summary"
 strTable1 = "SOH vs BO"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "SOH vs BO"
 
 strTable2 = "Short Alert"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "Short Alert"
 
 strTable3 = "Repair"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "Repair"
 
 strTable4 = "SchDelDetail"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "SchDelDetail"
 
 strTable5 = "UnSchDelDetail"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "UnSchDelDetail"
 
 strTable6 = "Defect"
 DoCmd.SetWarnings Flase
 DoCmd.OpenQuery "Defect"
 
 GetWorksheetsPath = "C:\Users\username\Desktop\ITP\"
 strWorkSheetPath = GetWorksheetsPath
 strWorksheet = "SPIDER Results"
 strSaveName = strWorkSheetPath & strWorksheet & ".xls "
 Debug.Print "Worksheet save name" & strSaveName
 
 On Error Resume Next
 
 Kill strSaveName
 
 
 
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable1, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable2, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable3, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable4, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable5, FileName:=strSaveName, _
    hasfieldnames:=True
    
 DoCmd.TransferSpreadsheet transfertype:=acExport, _
    spreadsheettype:=acSpreadsheetTypeExcel9, _
    TableName:=strTable6, FileName:=strSaveName, _
    hasfieldnames:=True
    
Set appExcel = GetObject(, "Excel.Application")
appExcel.Workbooks.Open (strSaveName)
Set wkb = appExcel.ActiveWorkbook
Set sht = appExcel.ActiveSheet
Set sht1 = appExcel.Worksheets(2)
Set sht2 = appExcel.Worksheets(3)
Set sht3 = appExcel.Worksheets(4)
Set sht4 = appExcel.Worksheets(5)
Set sht5 = appExcel.Worksheets(6)
Set sht6 = appExcel.Worksheets(7)

sht.Activate
With sht 'Summary
    
 appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:I").ColumnWidth = 14
    appExcel.Columns("J:J").ColumnWidth = 20
    appExcel.Columns("K:K").ColumnWidth = 9
    appExcel.Columns("L:O").ColumnWidth = 20
    appExcel.Columns("P:P").ColumnWidth = 13
    appExcel.Columns("Q:T").ColumnWidth = 12
    appExcel.Columns("A:W").HorizontalAlignment = xlCenter
    appExcel.Columns("A:W").VerticalAlignment = xlCenter
End With
sht1.Activate
With sht1 'SOH vs BO
    
   appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:I").ColumnWidth = 14
    appExcel.Columns("A:I").HorizontalAlignment = xlCenter
    appExcel.Columns("A:I").VerticalAlignment = xlCenter
End With
sht2.Activate
With sht2 'Short Alert
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:D").ColumnWidth = 14
    appExcel.Columns("E:E").ColumnWidth = 16
    appExcel.Columns("F:F").ColumnWidth = 12
    appExcel.Columns("G:G").ColumnWidth = 20.57
    appExcel.Columns("H:H").ColumnWidth = 9
    appExcel.Columns("I:I").ColumnWidth = 15
    appExcel.Columns("J:Q").ColumnWidth = 14
    appExcel.Columns("A:Q").HorizontalAlignment = xlCenter
    appExcel.Columns("A:Q").VerticalAlignment = xlCenter
End With
sht3.Activate
With sht3 'Repair
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:B").ColumnWidth = 14
    appExcel.Columns("C:E").ColumnWidth = 10.5
    appExcel.Columns("F:F").ColumnWidth = 20.57
    appExcel.Columns("G:J").ColumnWidth = 13
    appExcel.Columns("G:J").ColumnWidth = 13
    appExcel.Columns("K:K").ColumnWidth = 20.57
    appExcel.Columns("L:M").ColumnWidth = 13
    appExcel.Columns("A:M").HorizontalAlignment = xlCenter
    appExcel.Columns("A:M").VerticalAlignment = xlCenter
End With
sht4.Activate
With sht4 'SchDel
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:F").ColumnWidth = 14
    appExcel.Columns("G:H").ColumnWidth = 24
    appExcel.Columns("I:J").ColumnWidth = 9
    appExcel.Columns("K:N").ColumnWidth = 21
    appExcel.Columns("O:P").ColumnWidth = 13
    appExcel.Columns("Q:S").ColumnWidth = 12
    appExcel.Columns("T:U").ColumnWidth = 21
    appExcel.Columns("V:W").ColumnWidth = 12.5
    appExcel.Columns("X:Y").ColumnWidth = 11.5
    appExcel.Columns("Z:AF").ColumnWidth = 21
    appExcel.Columns("AG:AI").ColumnWidth = 14
    appExcel.Columns("AJ:AN").ColumnWidth = 10
    appExcel.Columns("AO:AQ").ColumnWidth = 14
    appExcel.Columns("AR:AR").ColumnWidth = 18
    appExcel.Columns("A:AQ").HorizontalAlignment = xlCenter
    appExcel.Columns("A:AQ").VerticalAlignment = xlCenter
End With
sht5.Activate
With sht5 'UnSchd Del
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:E").ColumnWidth = 14
    appExcel.Columns("F:G").ColumnWidth = 11
    appExcel.Columns("H:I").ColumnWidth = 12
    appExcel.Columns("I:I").ColumnWidth = 9.5
    appExcel.Columns("J:J").ColumnWidth = 21
    appExcel.Columns("K:K").ColumnWidth = 27
    appExcel.Columns("L:L").ColumnWidth = 9.5
    appExcel.Columns("M:M").ColumnWidth = 18
    appExcel.Columns("N:N").ColumnWidth = 19
    appExcel.Columns("O:Q").ColumnWidth = 9.5
    appExcel.Columns("R:R").ColumnWidth = 21
    appExcel.Columns("S:S").ColumnWidth = 9.5
    appExcel.Columns("T:U").ColumnWidth = 21
    appExcel.Columns("V:AA").ColumnWidth = 14
    appExcel.Columns("A:AA").HorizontalAlignment = xlCenter
    appExcel.Columns("A:AA").VerticalAlignment = xlCenter
End With
sht6.Activate
With sht6 'Defect
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects.Add(xlSrcRange, Range("A1").CurrentRegion, , xlYes).Name = _
        "Table6"
    appExcel.Range("A1").End(xlDown).Select
    appExcel.ActiveSheet.ListObjects("Table6").TableStyle = "TableStyleMedium2"
    appExcel.Range("A1").End(xlUp).Select
    appExcel.Columns("A:G").ColumnWidth = 15
    appExcel.Columns("A:G").HorizontalAlignment = xlCenter
    appExcel.Columns("A:G").VerticalAlignment = xlCenter
End With
sht.Name = "Summary"
sht1.Name = "Inventory vs Backorders"
sht2.Name = "Short Alert"
sht3.Name = "Defect Detail"
sht4.Name = "Schd Del Detail"
sht5.Name = "UnSchDelDetail"
sht6.Name = "Defect"

appExcel.Application.Visible = True
strPrompt = _
    "Enter file name and path for saving worksheet"
strTitle = "File Name"
strDefault = strSaveName
strSaveName = InputBox(prompt:=strPrompt, _
    Title:=strTitle, Default:=strDefault)
    
wkb.SaveAs FileName:=strSaveName
appExcel.Application.Visible = True
 
 End Function

Thanks for your help or at least considering! I know this is a beast.
 
Set a flag: use a boolean and set it to true after the first occurance of the M| then once true the others are deleted.

Not sure how to do that....

Before you told me that, I deleted that 2nd loop and decided to try to do an autofilter then do the delete. So, I replaced some code. This is the new clean pages code:


Code:
'clean pages
            Columns("A:A").Select
            Selection.Delete Shift:=xlToLeft
    
            Dim rngTable As Range
            Dim ws As Worksheet
            Dim StartCell As Range
            
            Const ColumntoFilter1 As Integer = 1
            Const FilterCriteria1 As String = "Material"
            
            Set ws = ActiveSheet
            'Set the starting position (Top-left most position) of your data range
            Set StartCell = ws.Range("A1")
            
            'Turn off autofilter in case it's already active
            ws.AutoFilterMode = False
            'Define data table
            Set rngTable = StartCell.CurrentRegion
            'Filter and delete data
            With rngTable
                .AutoFilter Field:=ColumntoFilter1, Criteria1:=FilterCriteria1
                .Offset(1, 0).EntireRow.Delete
            End With
            
            'Turn filters off again
            ws.AutoFilterMode = False
            
            Set rngTable = Nothing
            Set StartCell = Nothing
            Set ws = Nothing
            
            
            'trim column headings. Do this in array to speed up
 
Upvote 0

Excel Facts

What is the shortcut key for Format Selection?
Ctrl+1 (the number one) will open the Format dialog for whatever is selected.
Is there anyway to make the loop run faster? Some of these workbooks have over 400,000 rows. It's been stuck on this one workbook for like 6 minutes now. Still going.
 
Upvote 0
Not sure how to do that....
Soemthing like:
Code:
if x.value= y then
   if bFlag then
       x.entirerow.delete
   else
      bFlag = true
   end if
end if

Yes your last solution will be a lot quicker, but you need lots of memory as well...

What it does is it puts the whole! table into an array (in memory) and then you can do everything in memory. The last step should be to write the array back to (in this case I think best) a new sheet.

Why it is faster is that reading and writing to excel takes time, so if you have 400000 rows and each takes a millisecond to read and write then you are 400 seconds in the running. With the array you have one (big)read and one (big) write, which will still take some time, say 10 - 20 seconds each.

But you have to do a lot more yourself, because there is no such thing as delete row in the array. SO you can either work with two arrays (redim the 2nd one same size as first) and then do the reverse: loop through the 1st array and copy the rows you want to keep to the 2nd. (each array will need a pointer to keep track where it is). Then dump the 2nd array to the sheet.

The 2nd array will have been oversized, but that is quicker than redim preserve each time
 
Upvote 0
Also make sure you don't have any other sheets open. If these have calculated functions, they would be recalculating with every write. (You can disable this with
Code:
Application.EnableEvents = False
Don't forget to set to true at the end
 
Upvote 0
Not confident in my ability to set that flag correctly. I think the autofilter after the text-to-columns is quick enough for now.

It's the looping that is the bottleneck right now.

I'm reading that a "For Each" loop might be faster than a "Do While" loop, is that true?

The way I had it before was faster but not as clean.
 
Upvote 0
working with an array will be significantly faster however it needs a lot of memory. If we were to combine all the columns first, pop themin an array to find the lines to delete, then rough calculation at say 100 characters per line = 200 Bytes, then 400,000 lines = 80* 10^6 bytes = 80 MB

However Excel has more up its sleeve. Very much faster than looping through each cell is using Range.Find(). We can search for your start strings (and add PAGE, and ---- at the same time) and delete lines that contain that. So then we have probably reduced the 400,000 to 8,000 or so items tto be checked...

Code:
Sub csvToXlsx()
' Sub loops through all csv files in folder and prepares _
  them for import into access. Saved as .Xlsx file
'
'---------------------------------------------------------
    Dim wbWB As Workbook
    Dim Rng2 As Range, rFnd As Range
    Dim appExcelstart As Object
    Dim strFile As String, strDir As String
    Dim lngLastRow As Long, c As Long, _
        cMax As Long, r As Long, rMax As Long
    Dim vArr As Variant
    Dim i As Integer

    ' open Excel
    Set appExcelstart = CreateObject("Excel.Application")
    With appExcelstart
        .Application.Visible = True
        .Application.ScreenUpdating = False ' avoid screen flikker
        .Application.DisplayAlerts = False
    End With
    strDir = "C:\Users\username\Desktop\ITP\Current Data\"
    strFile = Dir(strDir & "*.csv")
    
    'loop through all *.csv files to convert to .xlsx
    Do While strFile <> ""
        Set wbWB = Workbooks.Open(Filename:=strDir & strFile, Local:=True)
           With wbWB
                .SaveAs Replace(wbWB.FullName, ".csv", ".xlsx"), 51 '51=.xlsx type
    
        ' brand new sheet, so no formatting or hidden objects. _
          that means the following can be used to find the real _
          last row.
        lngLastRow = Cells.SpecialCells(xlCellTypeLastCell).Row
         ' now do all processing with workbook
         ' delete rows until pipe
         ' ******You can add 'PAGE' or '----' in this list _
           to get rid of those lines as well *****
        vArr = Array("| M", "|M", "| 0", "|0", "|Z")
        ' loop to find each in turn
        For i = LBound(vArr, 1) To UBound(vArr, 1)
            Set rFnd = Range("A1:A" & lngLastRow).Find(What:=vArr(1), _
                    LookIn:=xlValues, lookat:=xlPart, _
                    searchorder:=xlRows, searchdirection:=xlNext, _
                    MatchCase:=True, After:=Range("A1"))
                    
            If Not rFnd Is Nothing Then ' value found
                Do
                    'check that it is the first characters
                    If Left(rFnd.Value, Len(vArr(i))) = vArr(i) Then
                        rFnd.EntireRow.Delete
                    End If
                    Set rFnd = Range("A1:A" & lngLastRow).Find(What:=vArr(1), _
                            LookIn:=xlValues, lookat:=xlPart, _
                            searchorder:=xlRows, searchdirection:=xlNext, _
                            MatchCase:=True, After:=Range("A1"))
                Loop While rFnd.Address <> Range("A1").Address
            End If
        Next i
        
        
        'combine columns to one long string
        'recalc last row
        lngLastRow = Cells.SpecialCells(xlCellTypeLastCell).Row
        'insert new column A
        Range("A1").EntireColumn.Insert
        With Range("A1:A" & lngLastRow)
            .Formula = _
                "=b1&c1&d1&e1&f1&g1&h1&i1&j1&k1&l1&m1&n1&o1" & _
                "&p1&q1&r1&s1&t1&u1&v1&w1&x1&y1&z1&aa1"
            'paste values over formula
            .Value = .Value
            ' delete old columns
            Range("B1:AA" & lngLastRow).Clear
            
            'perform text to columns
           .TextToColumns Destination:=Range("A1"), _
                    DataType:=xlDelimited, _
                    TextQualifier:=xlDoubleQuote, _
                    ConsecutiveDelimiter:=False, _
                    Tab:=False, Semicolon:=False, _
                    Comma:=False, Space:=False, Other:=True, OtherChar:="|", _
                    FieldInfo:= _
                    Array(Array(1, 2), Array(2, 2), Array(3, 1), Array(4, 1), Array(5, 1), _
                    Array(6, 1), Array(7, 1), Array(8, 1), Array(9, 1), Array(10, 1), _
                    Array(11, 1), Array(12, 1), Array(13, 1), Array(14, 1), Array(15, 1), _
                    Array(16, 1), Array(17, 1), Array(18, 1), Array(19, 1), Array(20, 1), _
                    Array(21, 1), Array(22, 1), Array(23, 1), Array(24, 1), Array(25, 1), _
                    Array(26, 1), Array(27, 1), Array(28, 1), Array(29, 1), Array(30, 1), _
                    Array(31, 1), Array(32, 1), Array(33, 1), Array(34, 1), Array(35, 1), _
                    Array(36, 1), Array(37, 1), Array(38, 1), Array(39, 1), Array(40, 1), _
                    Array(41, 1), Array(42, 1), Array(43, 1), Array(44, 1), Array(45, 1), _
                    Array(46, 1), Array(47, 1), Array(48, 1), Array(49, 1), Array(50, 1), _
                    Array(51, 1), Array(52, 1), Array(53, 1), Array(54, 1), Array(55, 1), _
                    Array(56, 1), Array(57, 1), Array(58, 1), Array(59, 1), Array(60, 1), _
                    Array(61, 1), Array(62, 1), Array(63, 1), Array(64, 1), Array(65, 1), _
                    Array(66, 1), Array(67, 1), Array(68, 1), Array(69, 1), Array(70, 1), _
                    Array(71, 1), Array(72, 1), Array(73, 1)), _
                    TrailingMinusNumbers:=True
        End With
        
        'clean pages
        Columns("A:A").EntireColumn.Delete

        Const ColumntoFilter1 As Integer = 1
        Const FilterCriteria1 As String = "Material"
        
        'Turn off autofilter in case it's already active
        ActiveSheet.AutoFilterMode = False
        'Define data table
        Set Rng2 = ws.Range("A1").CurrentRegion
        'Filter and delete data
        With Rng2
            .AutoFilter Field:=ColumntoFilter1, Criteria1:=FilterCriteria1
            .Offset(1, 0).EntireRow.Delete
        End With
        
        'Turn filters off again
        ActiveSheet.AutoFilterMode = False
        
        
        'trim column headings. Do this in array to speed up
        Set Rng2 = Range("A1").CurrentRegion
        ' load array
        vArr = Rng2.Value
        rMax = UBound(vArr, 1)
        cMax = UBound(vArr, 2)
        For r = 1 To rMax
            For c = 1 To cMax
                If VarType(vArr(r, c)) = vbString Then
                    vArr(r, c) = Trim(vArr(r, c))
                End If
            Next
        Next
        ' write array back to sheet
        Rng2.Value = vArr
            
        ' now save the Workbook & close
        wbWB.Close savechanges:=True
            
        ' get next .csv file
        strFile = Dir
    Loop
    
    'Clean up
    Set wbWB = Nothing
    Set Rng2 = Nothing
    With appExcelstart
        .Application.ScreenUpdating = True
        .Application.DisplayAlerts = True
        .Quit
    End With
    Set appExcelstart = Nothing
    
End Sub


See the comments in the code.
 
Last edited by a moderator:
Upvote 0
Ya, I was thinking about that but I didn't know how to articulate it. I just tested it and it's getting stuck in the new Range.Find() loop. It won't even get past the 1st file.

Previous script took about 25 seconds on the 1st file, about 1 minute on the 2nd and over 5 minutes (never completed) on the 3rd file. There are 3 more after but they are much smaller, they probably take 30 to 2 minutes each.
 
Upvote 0
is it possible to put one of these.csv files in dropbox or so? If you want you can post the URL in a private message. it is very difficult for me to test the code otherwise.
 
Upvote 0

Forum statistics

Threads
1,225,761
Messages
6,186,883
Members
453,381
Latest member
CGDobyns

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top