Make sumifs with array faster

ouvay

Board Regular
Joined
Jun 9, 2022
Messages
131
Office Version
  1. 2019
Platform
  1. Windows
Hello all

I've written this bit of code.. it works ... but can anyone help make it faster?

Its basically a sumifs with arrays -problem is that I'm dealing with over 1.2 million rows (600k+ in one table and 600k+ in the sumifs data table) - so it takes a pretty minute

VBA Code:
Sub totals_fromHDFC()

    Dim hdfcwb As Workbook: Set hdfcwb = Workbooks.Item("Payment Control - HDFC Leg.xlsm")
    Dim hdfc As Worksheet: Set hdfc = hdfcwb.Worksheets(1)
    Dim a() As Variant
    Dim data2() As Variant
    Dim r As Long, i As Long, NumRows As Long,  s1 As String
 
 
    With Sheet2 ''assigning main data table to an array
    NumRows = .Cells(.Rows.Count, "B").End(xlUp).Row
    data2 = .Range("D1:J" & NumRows).Value
    For i = LBound(data2) To UBound(data2)
        data2(i, 2) = data2(i, 7)
        data2(i, 3) = data2(i, 5)
    Next i
    ReDim Preserve data2(1 To NumRows, 1 To 3) As Variant
    ReDim sum2(2 To NumRows, 1 To 2) As Variant
    s2 = .Name
    End With
 
    With hdfc 'assigning table where I want to sumif, to array
    NumRows = .Cells(.Rows.Count, "B").End(xlUp).Row
    a = .Range("C1:H" & NumRows).Value
    For i = LBound(a) To UBound(a)  ''feel free to ignore this hot mess :)  its just me rearranging my array to suit my preferences
        a(i, 1) = a(i, 4)
        a(i, 4) = a(i, 2)
        a(i, 2) = a(i, 5)
        a(i, 5) = a(i, 3)
        a(i, 3) = a(i, 6)
    Next i
    ReDim Preserve a(1 To NumRows, 1 To 5) As Variant
    ReDim sum2(1 To NumRows, 1 To 2) As Variant
    s2 = .Name
    End With
 

'actual summing happens in this loop
For r = 2 To UBound(data2)
    Application.StatusBar = "Calculating " & s2 & " row " & r & " of " & UBound(data2) & "... " & Format(r / UBound(data2), "PERCENT") & " Completed"  ' just some status bar ux
    For i = LBound(a) To UBound(a)
        If data2(r, 1) = a(i, 1) And data2(r, 2) = a(i, 2) And data2(r, 3) = a(i, 3) Then
            sum2(r, 1) = sum2(r, 1) + a(i, 4)
            sum2(r, 2) = sum2(r, 2) + a(i, 5)
        End If
    Next i
Next r
 
    Sheet2.Range("AD2").Resize(UBound(sum2), 1).Value = sum2
 
End Sub
 
Last edited:
Interested to know by how much ?

Just to know…

Thanks a lot for feedback, glad it helped
I'll be happy to give you an exact number ( before and after) on Monday when I go into work and don't have to do it remotely :)
 
Upvote 0

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
Here is a dictionary version. See if that makes any difference.
Just check that the 2 wb (workbook) and 2 sht (sheet) references are correct.

VBA Code:
Sub totals_fromHDFC_AB()

    Dim wbHdfc As Workbook, shtHdfc As Worksheet
    Dim rngHdfc As Range, arrHdfc As Variant
    Dim wbChart As Workbook, shtChart As Worksheet
    Dim rngChart As Range, arrChart() As Variant
    Dim lrowHdfc As Long, lrowChart As Long, i As Long, j As Long
    Dim arrSum() As Double
  
    Set wbHdfc = Workbooks.Item("Payment Control - HDFC Leg.xlsm")
    Set shtHdfc = wbHdfc.Worksheets(1)
    lrowHdfc = shtHdfc.Cells(shtHdfc.Rows.Count, "F").End(xlUp).Row
    Set rngHdfc = shtHdfc.Range("A2:H" & lrowHdfc)
    arrHdfc = rngHdfc.Value
    ReDim arrSum(1 To UBound(arrHdfc), 1 To 2)
  
    Set wbChart = ThisWorkbook
    Set shtChart = wbChart.Worksheets(Sheet2.Name)
    lrowChart = shtChart.Cells(shtChart.Rows.Count, "D").End(xlUp).Row
    Set rngChart = shtChart.Range("A2:AH" & lrowChart)
    arrChart = rngChart.Value
  
    Dim dictHdfc As Object, dictKey As String

    Set dictHdfc = CreateObject("Scripting.dictionary")
  
    ' Load hdfc range into Dictionary & Sum rows with matching criteria
    For i = 1 To UBound(arrHdfc)
        dictKey = arrHdfc(i, 6) & "|" & arrHdfc(i, 7) & "|" & arrHdfc(i, 8)
        If Not dictHdfc.exists(dictKey) Then
            j = j + 1
            dictHdfc(dictKey) = j
            arrSum(j, 1) = arrHdfc(i, 4)
            arrSum(j, 2) = arrHdfc(i, 5)
        Else
            arrSum(dictHdfc(dictKey), 1) = arrSum(dictHdfc(dictKey), 1) + arrHdfc(i, 4)
            arrSum(dictHdfc(dictKey), 2) = arrSum(dictHdfc(dictKey), 2) + arrHdfc(i, 5)
        End If
    Next i
  
    ' Update rows with matching criteria
    For i = 1 To UBound(arrChart)
        dictKey = arrChart(i, 4) & "|" & arrChart(i, 10) & "|" & arrChart(i, 8)
        If dictHdfc.exists(dictKey) Then
            arrChart(i, 30) = arrSum(dictHdfc(dictKey), 1)
            arrChart(i, 31) = arrSum(dictHdfc(dictKey), 2)
        Else
            arrChart(i, 30) = 0
            arrChart(i, 31) = 0
        End If
    Next i
  
    ' Write back Totals
    rngChart.Columns(30).Value = Application.Index(arrChart, 0, 30)
    rngChart.Columns(31).Value = Application.Index(arrChart, 0, 31)
 
End Sub
 
Upvote 0
Solution
Here is a dictionary version. See if that makes any difference.
Just check that the 2 wb (workbook) and 2 sht (sheet) references are correct.

VBA Code:
Sub totals_fromHDFC_AB()

    Dim wbHdfc As Workbook, shtHdfc As Worksheet
    Dim rngHdfc As Range, arrHdfc As Variant
    Dim wbChart As Workbook, shtChart As Worksheet
    Dim rngChart As Range, arrChart() As Variant
    Dim lrowHdfc As Long, lrowChart As Long, i As Long, j As Long
    Dim arrSum() As Double
  
    Set wbHdfc = Workbooks.Item("Payment Control - HDFC Leg.xlsm")
    Set shtHdfc = wbHdfc.Worksheets(1)
    lrowHdfc = shtHdfc.Cells(shtHdfc.Rows.Count, "F").End(xlUp).Row
    Set rngHdfc = shtHdfc.Range("A2:H" & lrowHdfc)
    arrHdfc = rngHdfc.Value
    ReDim arrSum(1 To UBound(arrHdfc), 1 To 2)
  
    Set wbChart = ThisWorkbook
    Set shtChart = wbChart.Worksheets(Sheet2.Name)
    lrowChart = shtChart.Cells(shtChart.Rows.Count, "D").End(xlUp).Row
    Set rngChart = shtChart.Range("A2:AH" & lrowChart)
    arrChart = rngChart.Value
  
    Dim dictHdfc As Object, dictKey As String

    Set dictHdfc = CreateObject("Scripting.dictionary")
  
    ' Load hdfc range into Dictionary & Sum rows with matching criteria
    For i = 1 To UBound(arrHdfc)
        dictKey = arrHdfc(i, 6) & "|" & arrHdfc(i, 7) & "|" & arrHdfc(i, 8)
        If Not dictHdfc.exists(dictKey) Then
            j = j + 1
            dictHdfc(dictKey) = j
            arrSum(j, 1) = arrHdfc(i, 4)
            arrSum(j, 2) = arrHdfc(i, 5)
        Else
            arrSum(dictHdfc(dictKey), 1) = arrSum(dictHdfc(dictKey), 1) + arrHdfc(i, 4)
            arrSum(dictHdfc(dictKey), 2) = arrSum(dictHdfc(dictKey), 2) + arrHdfc(i, 5)
        End If
    Next i
  
    ' Update rows with matching criteria
    For i = 1 To UBound(arrChart)
        dictKey = arrChart(i, 4) & "|" & arrChart(i, 10) & "|" & arrChart(i, 8)
        If dictHdfc.exists(dictKey) Then
            arrChart(i, 30) = arrSum(dictHdfc(dictKey), 1)
            arrChart(i, 31) = arrSum(dictHdfc(dictKey), 2)
        Else
            arrChart(i, 30) = 0
            arrChart(i, 31) = 0
        End If
    Next i
  
    ' Write back Totals
    rngChart.Columns(30).Value = Application.Index(arrChart, 0, 30)
    rngChart.Columns(31).Value = Application.Index(arrChart, 0, 31)
 
End Sub
Thank you so much for putting that together! I appreciate the hard work!

I'll give it a go on Monday and get back to you with feedback! :)

Sure thing about the references
 
Upvote 0
Here is a dictionary version. See if that makes any difference.
Just check that the 2 wb (workbook) and 2 sht (sheet) references are correct.

VBA Code:
Sub totals_fromHDFC_AB()

    Dim wbHdfc As Workbook, shtHdfc As Worksheet
    Dim rngHdfc As Range, arrHdfc As Variant
    Dim wbChart As Workbook, shtChart As Worksheet
    Dim rngChart As Range, arrChart() As Variant
    Dim lrowHdfc As Long, lrowChart As Long, i As Long, j As Long
    Dim arrSum() As Double
 
    Set wbHdfc = Workbooks.Item("Payment Control - HDFC Leg.xlsm")
    Set shtHdfc = wbHdfc.Worksheets(1)
    lrowHdfc = shtHdfc.Cells(shtHdfc.Rows.Count, "F").End(xlUp).Row
    Set rngHdfc = shtHdfc.Range("A2:H" & lrowHdfc)
    arrHdfc = rngHdfc.Value
    ReDim arrSum(1 To UBound(arrHdfc), 1 To 2)
 
    Set wbChart = ThisWorkbook
    Set shtChart = wbChart.Worksheets(Sheet2.Name)
    lrowChart = shtChart.Cells(shtChart.Rows.Count, "D").End(xlUp).Row
    Set rngChart = shtChart.Range("A2:AH" & lrowChart)
    arrChart = rngChart.Value
 
    Dim dictHdfc As Object, dictKey As String

    Set dictHdfc = CreateObject("Scripting.dictionary")
 
    ' Load hdfc range into Dictionary & Sum rows with matching criteria
    For i = 1 To UBound(arrHdfc)
        dictKey = arrHdfc(i, 6) & "|" & arrHdfc(i, 7) & "|" & arrHdfc(i, 8)
        If Not dictHdfc.exists(dictKey) Then
            j = j + 1
            dictHdfc(dictKey) = j
            arrSum(j, 1) = arrHdfc(i, 4)
            arrSum(j, 2) = arrHdfc(i, 5)
        Else
            arrSum(dictHdfc(dictKey), 1) = arrSum(dictHdfc(dictKey), 1) + arrHdfc(i, 4)
            arrSum(dictHdfc(dictKey), 2) = arrSum(dictHdfc(dictKey), 2) + arrHdfc(i, 5)
        End If
    Next i
 
    ' Update rows with matching criteria
    For i = 1 To UBound(arrChart)
        dictKey = arrChart(i, 4) & "|" & arrChart(i, 10) & "|" & arrChart(i, 8)
        If dictHdfc.exists(dictKey) Then
            arrChart(i, 30) = arrSum(dictHdfc(dictKey), 1)
            arrChart(i, 31) = arrSum(dictHdfc(dictKey), 2)
        Else
            arrChart(i, 30) = 0
            arrChart(i, 31) = 0
        End If
    Next i
 
    ' Write back Totals
    rngChart.Columns(30).Value = Application.Index(arrChart, 0, 30)
    rngChart.Columns(31).Value = Application.Index(arrChart, 0, 31)
 
End Sub
Very nice Alex - should be considerably faster (y)
 
Upvote 0
Here is a dictionary version. See if that makes any difference.
Just check that the 2 wb (workbook) and 2 sht (sheet) references are correct.

VBA Code:
Sub totals_fromHDFC_AB()

    Dim wbHdfc As Workbook, shtHdfc As Worksheet
    Dim rngHdfc As Range, arrHdfc As Variant
    Dim wbChart As Workbook, shtChart As Worksheet
    Dim rngChart As Range, arrChart() As Variant
    Dim lrowHdfc As Long, lrowChart As Long, i As Long, j As Long
    Dim arrSum() As Double
 
    Set wbHdfc = Workbooks.Item("Payment Control - HDFC Leg.xlsm")
    Set shtHdfc = wbHdfc.Worksheets(1)
    lrowHdfc = shtHdfc.Cells(shtHdfc.Rows.Count, "F").End(xlUp).Row
    Set rngHdfc = shtHdfc.Range("A2:H" & lrowHdfc)
    arrHdfc = rngHdfc.Value
    ReDim arrSum(1 To UBound(arrHdfc), 1 To 2)
 
    Set wbChart = ThisWorkbook
    Set shtChart = wbChart.Worksheets(Sheet2.Name)
    lrowChart = shtChart.Cells(shtChart.Rows.Count, "D").End(xlUp).Row
    Set rngChart = shtChart.Range("A2:AH" & lrowChart)
    arrChart = rngChart.Value
 
    Dim dictHdfc As Object, dictKey As String

    Set dictHdfc = CreateObject("Scripting.dictionary")
 
    ' Load hdfc range into Dictionary & Sum rows with matching criteria
    For i = 1 To UBound(arrHdfc)
        dictKey = arrHdfc(i, 6) & "|" & arrHdfc(i, 7) & "|" & arrHdfc(i, 8)
        If Not dictHdfc.exists(dictKey) Then
            j = j + 1
            dictHdfc(dictKey) = j
            arrSum(j, 1) = arrHdfc(i, 4)
            arrSum(j, 2) = arrHdfc(i, 5)
        Else
            arrSum(dictHdfc(dictKey), 1) = arrSum(dictHdfc(dictKey), 1) + arrHdfc(i, 4)
            arrSum(dictHdfc(dictKey), 2) = arrSum(dictHdfc(dictKey), 2) + arrHdfc(i, 5)
        End If
    Next i
 
    ' Update rows with matching criteria
    For i = 1 To UBound(arrChart)
        dictKey = arrChart(i, 4) & "|" & arrChart(i, 10) & "|" & arrChart(i, 8)
        If dictHdfc.exists(dictKey) Then
            arrChart(i, 30) = arrSum(dictHdfc(dictKey), 1)
            arrChart(i, 31) = arrSum(dictHdfc(dictKey), 2)
        Else
            arrChart(i, 30) = 0
            arrChart(i, 31) = 0
        End If
    Next i
 
    ' Write back Totals
    rngChart.Columns(30).Value = Application.Index(arrChart, 0, 30)
    rngChart.Columns(31).Value = Application.Index(arrChart, 0, 31)
 
End Sub
so I couldn't wait to give this a go, and after some minor changes (I added the second chart - over 600k rows) ... it took 85 seconds! which is brilliant considering my version took well over 4 hours!
thank you so much
I'm going to break it down and learn how to write it myself for future use... I quite like the concatenation in there... I wasn't sure dictionaries could handle a 3rd criteria.. but that's a neat trick


RobP said:
Interested to know by how much ?

Just to know…

Thanks a lot for feedback, glad it helped

so I did do a test... I turned off all visual aides and after 20 minutes, i debugged and checked on how far along the loop had come and we were at 40,000 of 650,000 so I don't see a future there :S

thanks for all your help everyone!l! ! I'll mark this solved :)
 
Upvote 0
We appreciate you having tested and providing feedback on all the options. Glad we could help.

For what we are doing you don't really need the delimiter (I used the pipe symbol "|") in the key but it does make is easier to read in say the watch window and in some applications of the dictionary they do want to split it apart again in which case you do need the delimiter.
 
Upvote 0
We appreciate you having tested and providing feedback on all the options. Glad we could help.

For what we are doing you don't really need the delimiter (I used the pipe symbol "|") in the key but it does make is easier to read in say the watch window and in some applications of the dictionary they do want to split it apart again in which case you do need the delimiter.
It's the least one can do when so many take time to help out :) I really appreciate all the help

Thanks for explaining that :)

I do have one question about how you add values to the array

arrSum(dictHdfc(dictKey), 1) = arrSum(dictHdfc(dictKey), 1) + arrHdfc(i, 4)
Does dictHdfc(dictKey) provide the row number here? Or does it directly provide the key value (xxxxx|yyyyy|1)?

Its quite surprising if it's key value, I wasn't aware arrays worked like that :o
 
Upvote 0
It is providing the row number.

If we were only summing one field we could have used the value field of the dictionary to hold the total but since we are accumulating 2 fields, we are using the arrSum to hold the accumulated values and using the value field of the dictionary to hold which row / position in arrSum
 
Upvote 0
It is providing the row number.

If we were only summing one field we could have used the value field of the dictionary to hold the total but since we are accumulating 2 fields, we are using the arrSum to hold the accumulated values and using the value field of the dictionary to hold which row / position in arrSum
That's really clever!! Thank you for explaining that ☺️

Appreciate it
 
Upvote 0
@ouvay - ha ok, well I suppose 2.5 hours is better than over 4 .. so it made a considerable difference .. (even if still not usable! :-))

As I thought (when I used dictionary on one of my projects, the result was almost instantaneous (well, less than 10 seconds) but I was only using up to 100k lines rather than your 650k+.)_

Looks like @Alex Blakenburg has nailed it for you. Good stuff.
 
Upvote 0

Forum statistics

Threads
1,223,919
Messages
6,175,371
Members
452,638
Latest member
Oluwabukunmi

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top