Make VBA loop more efficient - needs to handle a lot of data

davio565

New Member
Joined
Jan 19, 2017
Messages
23
Hi Guys,

So I have a VBA macro which I can rearrange data with but need some help to make it more efficient.

So i have a lot of source data (~900,000 rows), populating A2:AD900,000. My code works down column AM which is populated with names. So the code takes the name from AM2 and loops down column I and every time it matches AM2, it copies that row of data and pastes to the right of AM2 until there are no more matched in column I. then it goes to AM3 etc. etc. The problem is I have a lot of names (~400k) but the code only runs in a reasonable time up to about AM50. If there are more it struggles. So i need to make it quicker but am really struggling to add additional things into this code.

So the code is:

Code:
Sub finddata()


Dim name As String
Dim finalrow As Long
Dim finalrowdata As Long
Dim i As Long


Sheets("Sheet1").Range("BI2:CAA5000").ClearContents
    
finalrow = Sheets("sheet1").Range("AM100000").End(xlUp).Row
finalrowdata = Sheets("sheet1").Range("C1000000").End(xlUp).Row


Dim cell As Range


For Each cell In ActiveSheet.Range("AM2:AM" & finalrow)
   
For i = cell.Row + 1 To finalrowdata
If Cells(i, 9) = cell.Value Then
    Range(Cells(i, 1), Cells(i, 30)).Copy
    Cells(cell.Row, Columns.Count).End(xlToLeft).offset(, 1).PasteSpecial xlPasteValues
    End If
Next i


  Next cell
  
End Sub

Firstly, I want the code to stop looking for more matches of AM2 if it reaches 40 , and move onto AM3 etc. I have tried a couple of things but honestly I don't think I'm even close with putting this in the correct place in the code, let alone the correct commands.

Secondly, to reduce the number of cells the code loops through, I have been trying to re-order the source data alphabetically and then define 'finalrowdata' and a new Dim 'startrow' so that the first letter of the name in AM2 would be defined by 'Dim strLeft As String' and strLeft = Left(str, 1). So if the name was 'Bill Smith' it would recognise 'B' as the first letter then only look in the rows where column I names begin with 'B' (rows 40k-70k for example).

I have tried playing around with the code however I'm really struggling to get anything to work at all as I think the basic structure of the code needs revising and I am out of my depth.

If anyone could help to add these to my code it would be a massive help. Also if anyone has any other ideas to make it more efficient that would be great too. I'm not expecting to do all 400k names in one go but even batches of 5-10k would be much better than the lifetime I'm currently looking at...

Hope this isn't too rambling and makes sense :)

Thanks
Dave
 

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Give this version a try.

Code:
Sub t()
Dim c As Range, rng As Range, r As Range
With ActiveSheet
    For Each c In .Range("AM2", .Cells(Rows.Count, "AM").End(xlUp))
        Intersect(.Range("A:AD"), .UsedRange).AutoFilter 9, c.Value
        Set rng = Intersect(.Range("A:AD"), .UsedRange).Offset(1).SpecialCells(xlCellTypeVisible)
        For Each r In rng.Rows
            If Application.CountA(r) <> 0 Then
                r.Copy .Cells(c.Row, Columns.Count).End(xlToLeft).Offset(, 1)
            End If
        Next
        Set rng = Nothing
        .AutoFilterMode = False
    Next
End With
End Sub
 
Last edited:
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Hi,
One more solution, should be fast:
Rich (BB code):
Sub DuplicateMatched()
 
  Const SortColumn = "I"  ' <-- This column will be sorted in A:BQ at the end
 
  Dim arrI(), arrAM()
  Dim i As Long, j As Long, k As Long
  Dim s As String
 
  ' Disable screen updating
  Application.ScreenUpdating = False
 
  ' Copy data of I and AM columns to arrays
  With Sheets("Sheet1")
   
    ' Find last data rows
    i = .Columns("I").Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row
    j = .Columns("AM").Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row
    k = .Columns("AD").Column
   
    ' Clear destination range
    .Range("AN2").Resize(i - 1, k).ClearContents
   
    ' Copy data of I and AM into arrays
    arrI() = .Range("I2:I" & i).Value
    arrAM() = .Range("AM2:AM" & j).Value
   
  End With
 
  ' Main
  With CreateObject("Scripting.Dictionary")
    .CompareMode = 1
   
    ' Create dictionary of AM
    For i = 1 To UBound(arrAM)
      s = Trim(arrAM(i, 1))
      If Len(s) > 0 Then .Item(s) = 0
    Next
   
    ' Put 1 into arrI if value of I present in AM else put empty value
    For i = 1 To UBound(arrI)
      s = Trim(arrI(i, 1))
      If Len(s) > 0 Then
        If .Exists(s) Then
          arrI(i, 1) = 1
        Else
          arrI(i, 1) = Empty
        End If
      End If
    Next
   
    ' Do final stage
    If .Count > 0 Then
     
      With Sheets("Sheet1").Range("A2:AN" & UBound(arrI) + 1)
       
        ' Put arrI into AN column
        .Columns("AN").Value = arrI()
       
        ' Sort A:AN by AN
        .Sort .Cells(1, "AN"), xlAscending, Header:=xlNo
       
        ' Find last row with 1 in AN
        i = .Columns("AN").Find(What:=1, LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row - 1
       
        ' Copy matched data to the dest range
        .Columns("AN").Resize(i, k).Value = .Columns("A:AD").Resize(i).Value
       
        ' Sort A:BQ by SortColumn
        With .Resize(, .Columns.Count + k)
          .Sort .Cells(1, SortColumn), xlAscending, Header:=xlNo
        End With
       
      End With
     
    End If
   
  End With
 
  ' Restore screen updating
  Application.ScreenUpdating = True
 
End Sub
Regards
 
Last edited:
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

You may also use this modified part of the code to exclude memory limitation at copying:
Rich (BB code):
        ' Copy matched data to the dest range
        '.Columns("AN").Resize(i, k).Value = .Columns("A:AD").Resize(i).Value
        For j = 1 To k
          .Columns(.Columns.Count + j - 1).Resize(i).Value = .Columns(j).Resize(i).Value
        Next
 
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Give this version a try.

Code:
Sub t()
Dim c As Range, rng As Range, r As Range
With ActiveSheet
    For Each c In .Range("AM2", .Cells(Rows.Count, "AM").End(xlUp))
        Intersect(.Range("A:AD"), .UsedRange).AutoFilter 9, c.Value
        Set rng = Intersect(.Range("A:AD"), .UsedRange).Offset(1).SpecialCells(xlCellTypeVisible)
        For Each r In rng.Rows
            If Application.CountA(r) <> 0 Then
                r.Copy .Cells(c.Row, Columns.Count).End(xlToLeft).Offset(, 1)
            End If
        Next
        Set rng = Nothing
        .AutoFilterMode = False
    Next
End With
End Sub

Hi JLGWhiz!

This version is awesome thanks. I've tested it a fair bit against the original and it runs consistently about 35-40% faster which is great. To make my task feasible, I'm also going to cut down the source data by about half to 450k rows which helps considerably.

So one tweak that may be possible with this version, since the script filters the first 30 columns by each name in AM, would it be possible to add a condition where, if for example 55 matches are found, only the first 30 iterations are copied across to the right? Obviously this is only going to help if it speeds the execution and most names average about 20ish i think with very few over 70. I'm not sure adding another variable will speed things but the A column has dates in which might be used with xlTop30Items once the block is filtered by AM value. Again I have tried to get this function in but I am struggling to do so with out error.

Thanks again for all your help and this is awesome!!

David
 
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Hi,
One more solution, should be fast:
Rich (BB code):
Sub DuplicateMatched()
 
  Const SortColumn = "I"  ' <-- This column will be sorted in A:BQ at the end
 
  Dim arrI(), arrAM()
  Dim i As Long, j As Long, k As Long
  Dim s As String
 
  ' Disable screen updating
  Application.ScreenUpdating = False
 
  ' Copy data of I and AM columns to arrays
  With Sheets("Sheet1")
   
    ' Find last data rows
    i = .Columns("I").Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row
    j = .Columns("AM").Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row
    k = .Columns("AD").Column
   
    ' Clear destination range
    .Range("AN2").Resize(i - 1, k).ClearContents
   
    ' Copy data of I and AM into arrays
    arrI() = .Range("I2:I" & i).Value
    arrAM() = .Range("AM2:AM" & j).Value
   
  End With
 
  ' Main
  With CreateObject("Scripting.Dictionary")
    .CompareMode = 1
   
    ' Create dictionary of AM
    For i = 1 To UBound(arrAM)
      s = Trim(arrAM(i, 1))
      If Len(s) > 0 Then .Item(s) = 0
    Next
   
    ' Put 1 into arrI if value of I present in AM else put empty value
    For i = 1 To UBound(arrI)
      s = Trim(arrI(i, 1))
      If Len(s) > 0 Then
        If .Exists(s) Then
          arrI(i, 1) = 1
        Else
          arrI(i, 1) = Empty
        End If
      End If
    Next
   
    ' Do final stage
    If .Count > 0 Then
     
      With Sheets("Sheet1").Range("A2:AN" & UBound(arrI) + 1)
       
        ' Put arrI into AN column
        .Columns("AN").Value = arrI()
       
        ' Sort A:AN by AN
        .Sort .Cells(1, "AN"), xlAscending, Header:=xlNo
       
        ' Find last row with 1 in AN
        i = .Columns("AN").Find(What:=1, LookIn:=xlFormulas, SearchOrder:=xlByRows, _
                      SearchDirection:=xlPrevious, SearchFormat:=False).Row - 1
       
        ' Copy matched data to the dest range
        .Columns("AN").Resize(i, k).Value = .Columns("A:AD").Resize(i).Value
       
        ' Sort A:BQ by SortColumn
        With .Resize(, .Columns.Count + k)
          .Sort .Cells(1, SortColumn), xlAscending, Header:=xlNo
        End With
       
      End With
     
    End If
   
  End With
 
  ' Restore screen updating
  Application.ScreenUpdating = True
 
End Sub
Regards

Hi ZVI,

Wow i really wanted this to work as the code looks really cool but it didn't execute as intended unfortunately. So the rows of source data identified from the I column, were copied, however each row of 30 that matches with AM2, should be pasted to the right of AM2, keeping all matches on row 2. Then AM3, all matched rows copied to the right of row 3 etc. The matched rows in your script ended up being copied to different areas down the sheet from column AN. So the rows matching 'Bill Smith' were copied and pasted on the same row numbers as 'Bill Smith' in the original source data (which ends up being ordered alphabetically).

The fault may be mine however as perhaps the sheet is set up is slightly different then you imagined, however I tried to change my sheet to how you may have set up yours but couldn't get the script to run as I hoped. The solution JLGWhiz posted works well so you don't need to spend time trying to fix unless it's a personal challenge :).

Thanks for posting though!!

David
 
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Hi JLGWhiz!

This version is awesome thanks. I've tested it a fair bit against the original and it runs consistently about 35-40% faster which is great. To make my task feasible, I'm also going to cut down the source data by about half to 450k rows which helps considerably.

So one tweak that may be possible with this version, since the script filters the first 30 columns by each name in AM, would it be possible to add a condition where, if for example 55 matches are found, only the first 30 iterations are copied across to the right? Obviously this is only going to help if it speeds the execution and most names average about 20ish i think with very few over 70. I'm not sure adding another variable will speed things but the A column has dates in which might be used with xlTop30Items once the block is filtered by AM value. Again I have tried to get this function in but I am struggling to do so with out error.

Thanks again for all your help and this is awesome!!

You're welcome. Regarding the limitation on number of rows copied over v. speed of procedure: That would require code to run multiple filters on segmented groups of data, making the code more complex.
Regards, JLG
 
Last edited:
Upvote 0
Re: Help needed to make VBA loop more efficient - needs to handle a lot of data

Not sure if you still care, but assuming you are still working this issue, this code should be significantly faster. You will just need to update the findNamesAddress for each run.

Code:
Sub daFind()
    Application.ScreenUpdating = False
    thetime = Now()
'first we load an array with the names we are searching for, and the data list, for the results, we will store the first 40 rows where there are a mathc seperated by a coma.
    Dim findNamesArr, dataNamesArr, resultsArr() As String, resultsCount, dataStartRowNum, findNamesStartRow, currentPasteRow, currentPasteColumn, dataAddress, findNamesAddress
    
    findNamesAddress = "am2:am51" 'UPDATE AS NECCESSARY 'this should have the addrress of the current block of names you are searching for --
    dataAddress = "i2:i1000000" 'hold the range address where all your data names are.
        
    With Worksheets("sheet1")
        findNamesArr = .Range(findNamesAddress)
        dataNamesArr = .Range(dataAddress)
        dataStartRowNum = .Range(dataAddress).Row 'need to know the row start number in order to calculate what row the results will be on
        findNamesStartRow = .Range(findNamesAddress).Row
        ReDim resultsArr(LBound(findNamesArr) To UBound(findNamesArr))
        For q = LBound(findNamesArr) To UBound(findNamesArr)
            resultsCount = 0 'reset count for each new name we search.
            For w = LBound(dataNamesArr) To UBound(dataNamesArr)
                If LCase(findNamesArr(q, 1)) = LCase(dataNamesArr(w, 1)) Then  ' we have a match, store the result
                    resultsCount = resultsCount + 1
                    If resultsCount = 1 Then
                        resultsArr(q) = dataStartRowNum + w - 1 'figure out which row on the sheet is the match and store it -- the first result has no coma in front
                    Else
                        resultsArr(q) = resultsArr(q) & "," & dataStartRowNum + w - 1 'figure out which row on the sheet is the match and store it -- each result after the first has a leading coma
                    End If
                    
                    If resultsCount = 40 Then Exit For 'start searchinf for next name after 40 results
                
                End If
            Next w
            
        Next q
        






'at this point we have the resultsArr loaded with a string showing each row a result was found on.
'now move the results data over next to the items from the findNamesArr


    For e = LBound(resultsArr) To UBound(resultsArr)
        If resultsArr(e) <> "" Then
            currentPasteRow = findNamesStartRow + e - 1
            currentPasteColumn = .Range(findNamesAddress).Column + 1
            Dim resultRowsArr
            resultRowsArr = Split(resultsArr(e), ",")
            For r = LBound(resultRowsArr) To UBound(resultRowsArr)
                .Cells(currentPasteRow, currentPasteColumn).Resize(1, 30).Value = .Cells(resultRowsArr(r), 1).Resize(1, 30).Value
                currentPasteColumn = currentPasteColumn + 30
            Next r
        End If
    Next e
    
    End With
    
        Debug.Print Format(Now - thetime, "hh:MM:ss")
    'Debug.Print resultsArr(25)
    Application.ScreenUpdating = True
End Sub
 
Upvote 0

Forum statistics

Threads
1,225,754
Messages
6,186,825
Members
453,377
Latest member
JoyousOne

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top