VBA: Speed up macro (for large file)

Mange

New Member
Joined
Sep 8, 2016
Messages
20


I wrote a macro that compares the columns B, which contains file numbers, in two worksheets. There are three possibilities: the file number exists in both columns, the file number exists only in the first column and the file number exists only in the second column. If e.g. the file number exists in both columns, the macro should copy/paste the entire row to another sheet. Same for the other two scenario's.
My code work perfect for a small file (around 500 rows, 236 columns), but for the large files it doesn't work. It takes way too long, and at the end it just crashes. I already tried the usual tricks to speed up the macro.

Code:
Option Explicit
Sub CopyPasteWorksheets()
Dim wbDec As Workbook, wbJune As Workbook, wbAnalysis As Workbook
Dim wsDec As Worksheet, wsJune As Worksheet
Dim PresPres As Worksheet, PresAbs As Worksheet, AbsPres As Worksheet
'Stop screen from updating to speed things up
Application.ScreenUpdating = False
Application.EnableEvents = False
'Add 3 new worksheets. They each represent a different category, namely the one with already existing insurances, one with new insurances
'and one with the insurances that are closed due to mortality, lapse or maturity. Add two (temporary) worksheets to paste the databases.
Worksheets.Add().Name = "PresPres"
Worksheets.Add().Name = "PresAbs"
Worksheets.Add().Name = "AbsPres"
Worksheets.Add().Name = "DataDec"
Worksheets.Add().Name = "DataJune"
'Define the active workbook
Set wbAnalysis = ThisWorkbook
'Define the first database. Copy/paste the sheet and close them afterwards.
Set wbDec = Workbooks.Open(Filename:="F:\Risk_Management_2\Embedded_Value\2015\20151231\Data\DLL\Master Scala\Extract.xlsx")
wbDec.Sheets("SCALA").Range("A1").CurrentRegion.Copy
wbAnalysis.Sheets("DataDec").Range("A1").PasteSpecial xlPasteValues
wbDec.Close
'We have to do the same for the other database. We cannot do it at the same time, because both files have the same name,
'and can't be opened at the same time.
Set wbJune = Workbooks.Open(Filename:="F:\Risk_Management_2\Embedded_Value\2016\20160630\Data\DLL\Master Scala\extract.xlsx")
wbJune.Sheets("SCALA").Range("A1").CurrentRegion.Copy
wbAnalysis.Sheets("DataJune").Range("A1").PasteSpecial xlPasteValues
    
wbJune.Close
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub
Sub Compare()
Dim DataDec As Worksheet, DataJune As Worksheet
Dim lastRowDec As Long
Dim lastRowJune As Long
Dim lastRowPresAbs As Long
Dim lastRowPresPres As Long
Dim lastRowAbsPres As Long
Dim foundTrue As Boolean
Dim i As Long, j As Long, k As Long, l As Long
'Define the last row of the different sheets
lastRowDec = Sheets("DataDec").Cells(Sheets("DataDec").Rows.Count, "B").End(xlUp).Row
lastRowJune = Sheets("DataJune").Cells(Sheets("DataJune").Rows.Count, "B").End(xlUp).Row
lastRowPresAbs = Sheets("PresAbs").Cells(Sheets("PresAbs").Rows.Count, "B").End(xlUp).Row
lastRowPresPres = Sheets("PresPres").Cells(Sheets("PresPres").Rows.Count, "B").End(xlUp).Row
lastRowAbsPres = Sheets("AbsPres").Cells(Sheets("AbsPres").Rows.Count, "B").End(xlUp).Row
'Compare the file numbers in column B of both sheets. If they are the same, copy/paste the entire row to sheet PresPres,
'if they are not, copy/paste the entire row to sheet PresAbs.
For i = 1 To lastRowDec
foundTrue = False
For j = 1 To lastRowJune
    If Sheets("DataDec").Cells(i, 1).Value = Sheets("DataJune").Cells(j, 1).Value Then
        foundTrue = True
        Sheets("PresPres").Rows(lastRowPresPres + 1) = Sheets("DataDec").Rows(i)
        lastRowPresPres = lastRowPresPres + 1
        Exit For
    End If
Next j
If Not foundTrue Then
    Sheets("DataDec").Rows(i).Copy Destination:= _
    Sheets("PresAbs").Rows(lastRowPresAbs + 1)
    lastRowPresAbs = lastRowPresAbs + 1
End If
Next i

'Look if there are file numbers that are only present in June's database. If so, copy/paste entire row to sheet AbsPres.
For k = 1 To lastRowJune
foundTrue = False
For l = 1 To lastRowDec
    If Sheets("DataJune").Cells(k, 1).Value = Sheets("DataDec").Cells(l, 1).Value Then
        foundTrue = True
        Exit For
    End If
    
Next l
If Not foundTrue Then
    Sheets("DataJune").Rows(k).Copy Destination:= _
    Sheets("AbsPres").Rows(lastRowAbsPres + 1)
    lastRowAbsPres = lastRowAbsPres + 1
End If
Next k
'Stop screen from updating to speed things up.
Application.ScreenUpdating = True
Application.EnableEvents = True
End Sub

I've added some comments to explain what I'm trying to do. I'm relatively new to VBA so I believe I'm not coding very efficient.
Could someone have a look and try to make it work?
Thanks!!



<tbody>
</tbody>
 

Excel Facts

Format cells as currency
Select range and press Ctrl+Shift+4 to format cells as currency. (Shift 4 is the $ sign).
I think the problem was with how the data was being copied. Assigning arrays isn't very efficient and I suspect Excel doesn't free up the allocated memory. This is where I am right now:

Code:
Sub Compare()

'Sheets
Dim wsDec As Worksheet
Dim wsJun As Worksheet
Dim wsPP As Worksheet
Dim wsPA As Worksheet
Dim wsAP As Worksheet

'Last rows
Dim lrDec As Long
Dim lrJun As Long
Dim lrPP As Long
Dim lrPA As Long
Dim lrAP As Long

'Dictionary
Dim JunData As Object

'Utility variables
Dim i As Long
Dim fileNum As Variant
Dim foundValue As Double

'Get worksheets
Set wsDec = Worksheets("DataDec")
Set wsJun = Worksheets("DataJune")
Set wsPP = Worksheets("PresPres")
Set wsPA = Worksheets("PresAbs")
Set wsAP = Worksheets("AbsPres")

'Define the last row of the different sheets
lrDec = wsDec.Cells(wsDec.Rows.Count, 2).End(xlUp).Row
lrJun = wsJun.Cells(wsJun.Rows.Count, 2).End(xlUp).Row
lrPP = wsPP.Cells(wsPP.Rows.Count, 2).End(xlUp).Row
lrPA = wsPA.Cells(wsPA.Rows.Count, 2).End(xlUp).Row
lrAP = wsAP.Cells(wsAP.Rows.Count, 2).End(xlUp).Row

'Get list of file numbers for June
Set JunData = CreateObject("Scripting.Dictionary")
For i = 2 To lrJun
    fileNum = wsJun.Cells(i, 2).Value
    JunData(fileNum) = i
Next i

'Stop screen from updating to speed things up.
Application.ScreenUpdating = False
Application.EnableEvents = False

'Now process all rows on the December sheet
For i = 2 To lrDec
    fileNum = wsDec.Cells(i, 2).Value
    If JunData.Exists(fileNum) Then
        lrPP = lrPP + 1
        wsDec.Rows(i).Copy wsPP.Cells(lrPP, 1)
        JunData.Remove fileNum
    Else
        lrPA = lrPA + 1
        wsDec.Rows(i).Copy wsPA.Cells(lrPA, 1)
    End If
    'Remove these two lines to make it quicker!
    Application.StatusBar = i & " / " & lrDec
    DoEvents
Next i

'Process all the rows on the June sheet
For Each fileNum In JunData.Keys
    i = JunData(fileNum)
    lrAP = lrAP + 1
    wsJun.Rows(i).Copy wsAP.Cells(lrAP, 1)
    'Remove these two lines to make it quicker!
    Application.StatusBar = i & " / " & lrJun
    DoEvents
Next fileNum

'Re-enable screen updating.
Application.ScreenUpdating = True
Application.EnableEvents = True

End Sub

As noted, you can take some lines out to speed it up if necessary.

WBD
 
Upvote 0
Does this work for you? It's still running here.. I don't get the error message though.
 
Upvote 0
It was going to take a long time that's for sure! I reduced the data set size and ran a test and observed Excel memory usage which stayed pretty low. I suspect it will take 30 minutes or so depending on your processor speed.

WBD
 
Upvote 0
You appear to have disabled events and screen updating have you tried turning off auto calculations at the top and enable at bottom otherwise excel will calculate the whole worksheet for every iteration of a copy disable/enable will only calculate once on completion
 
Upvote 0
Yes, but as there aren't any formulas in the worksheets, it doesn't have a big impact.
 
Upvote 0
With the original files, it took 23 minutes 42 seconds to run :cool: Thank you for the solution. I'm glad it doesn't crash anymore.

If any one has an idea to make it faster, don't hesitate to respond.
 
Upvote 0
Quick outline of logic:


  1. Create a hash table (dictionary) with all the file numbers from June.
  2. For each file number from December, check whether it exists in the June list.
    • If it does, add it to the "PresPres" sheet and remove the entry from the hash table (makes it quicker to search and is re-used later on).
    • If it doesn't add it to the "PresAbs" sheet.
  3. Now work through what's left of the hash table; these must be June entries that didn't occur in December (or they would have been removed). Add them to the "AbsPres" sheet.

Hash tables should be quick so I'm not sure how/if the code can be further optimized. I did consider copying only the cells that had data (rather than the whole row) but I don't think there will be much to gain from that.

WBD
 
Upvote 0
How many columns are in the current region of sheets?
 
Upvote 0

Forum statistics

Threads
1,221,418
Messages
6,159,793
Members
451,589
Latest member
Harold14

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top