CountIF alternative for performance improvements

ItalianPlatinum

Well-known Member
Joined
Mar 23, 2017
Messages
880
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
Little background I have an inventory management tool that consolidates inventory for products. There is no reporting for this tool. So before I run it I have no idea how long or how many lots could be eliminated. Therefore I need visibility before executing. My journey has been to find a way to replicate what it does via a macro (VBA) as reporting. The tool combines lots if they match criteria (cost, identifier). I.e. if two occurrences for the same identifier it sells 2 and buys on a combined 1. Trying to capture that.

I am able to extract data from the system but that data is so large it reaches the excel row limit so I have to move the data into sheets to break up the data, for example if identifier starts with G I move it to Sheet G, and so forth. My VBA currently loops through each identifier (extracting from the system). If starts with A extract from system, then transfer to sheet A, then run for B, transfer to sheet B. I then apply the countIF to find out how many lots fit the parameters and that is where my problem resides. My battle is how to handle the data. I created a summary sheet that lists all identifiers but then I had problems getting that to work. I have 3,700 identifiers, among those there are a total of 3 million lots. When I ran the VBA it took at day to run :sick:. Looking for some big time help here and willing to scrape it all if there is a better approach like leveraging the identifier approach on my summary sheet?

VBA Code:
With Worksheets(SheetName)
        .Range("A1").Resize(lastRow - 10 + 1, 12).Value = WsSec.Range("A10:L" & lastRow).Value
lr1 = .Cells(rows.count, "A").End(xlUp).row
        .Range("M1:Q1") = Array("Rounded 2 digit", "Lot Greater than 1yr", "Lot Greater than 3yr", "For formula", "Same Occurrence")
        .Application.Calculation = xlAutomatic
        .Range("P2:P" & lr1).NumberFormat = "General"
        .Range("M2:M" & lr1).Formula = "=Round(K2, 2)"
        .Range("N2:N" & lr1).Formula = "=if(RUN-E2>365,""YES"",""NO"")"
        .Range("O2:O" & lr1).Formula = "=if(RUN-E2>(365*3),""YES"",""NO"")"
If TFLG = "N" Then
        .Range("P2:P" & lr1).Formula = "=D2&M2"
Else
        .Range("P2:P" & lr1).Formula = "=D2&E2&M2"
End If
        .Range("P2:P" & lr1).NumberFormat = "@"
        .Range("Q2").Formula2 = "=COUNTIF($P$2:$P$" & lr1 & ",$P$2:$P$" & lr1 & ")"
        .Range("M2:P" & lr1).Value = .Range("M2:P" & lr1).Value
End With

VBA Code:
.Range("Q2").Formula2 = "=COUNTIF($P$2:$P$" & lr1 & ",$P$2:$P$" & lr1 & ")"
 
I extract the data from my core system by use of a SUB identifier. All the data is in the core system, The SUB identifier gives me the ability to see what's in my core system. The Excel wordbook is a data extracting tool from our web based system. I just added additional code, sheets for my project. Hope that helps answer your question.
This is a very generic description which fits many different systems, so not much help. The devil is in the details - the Identifier code, the "core system" details.
Since you have a system in place which doesn't take a day-off to crunch the data and spit out a report, then it is probably better suited to deal with the data. Use it's capabilities, if possible. Let it do the hard work. But to know what it can give you you have to know it and to be able to get it out of it. In this respect it probably helps to see the code of the Identifier sub which may give a clue as to what this system is and how it works.
I am not sure how to even do this but what if after I set the formulas then I have VBA that delete any row with Column Q equal to 1? Cause in reality if there is no pair or match that information is of no purpose to me. But I don't know if that helps me cause the formulas have to run in order to determine that but maybe it helps the next loop because less data in the workbook?
Maybe it is possible to extract from the system only the values with count > 1. Then you will have less work to do.
 
Upvote 0

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
Yeah I was concerned I was going to be in trouble replicating what it does for such a large data set. The extract is pretty dry. It can only do simple extracts. Like =, <, >, <> but not any logical items. That is where most of my other VBAs fill those voids and was trying to find a way to fill this void as well.

Is it fair to say everything outside of that "identifier sub," which like I said in the opening post is the one item I cannot modify, is relatively optimized in your eyes? i.e. transftering the data to other sheets and the use of the countif formulas? I wasn't sure if there was VBA that could kick out the countif formulas faster. I did notice and I posted in another thread I for some reason cut my speed time by 50% by adding a hyphen in my countif logic. I found that odd....

I posted another thread to test/see if deleting rows with 1 occurrence could reduce the amount of data in the file. Not sure the time usage to perform that analysis and operation. But figured worth a try.
 
Upvote 0
I wasn't sure if there was VBA that could kick out the countif formulas faster.
It depends on the conditions. See some test results below.

1713676115281.png
 
Upvote 0
See this:
Excel Formula:
=UNIQUE(P:P,,TRUE)
Shows all items that appear exactly once, and I would say it's lightning fast
 
Upvote 0
Ok so the thought could be initially I use that formula then use some logic to delete all true rows then come back with my count if?
 
Upvote 0
On a small data set i didn't try my bigger one i was able to gain 9 seconds out of 3.5min runtime. by doing the below. I mean probably could be a more efficient way to do it. I am figured each next loop will recalc off less data from the prior if the sheet has less data in it? Wishful thinking.....

VBA Code:
With Worksheets(SheetName)
lr1 = .Cells(rows.count, "A").End(xlUp).row
        .Range("M1:Q1") = Array("Rounded 2 digit Cost", "Greater than 1yr", "Greater than 3yr", "For formula", "Same Occurrence")
        .Application.Calculation = xlAutomatic
        .Range("P2:P" & lr1).NumberFormat = "General"
        .Range("M2:M" & lr1).Formula = "=Round(K2, 2)"
        .Range("N2:N" & lr1).Formula = "=if(RUN-E2>365,""YES"",""NO"")"
        .Range("O2:O" & lr1).Formula = "=if(RUN-E2>(365*3),""YES"",""NO"")"
If TERMFLG = "L" Then
        .Range("P2:P" & lr1).Formula = "=""_""&D2&M2"
Else
        .Range("P2:P" & lr1).Formula = "=""_""&D2&E2&M2"
End If
        .Range("P2:P" & lr1).NumberFormat = "@"
        .Range("Q2").Formula2 = "=COUNTIF($P$2:$P$" & lr1 & ",$P$2:$P$" & lr1 & ")"
        .Range("M2:P" & lr1).Value = .Range("M2:P" & lr1).Value
        .Range("3:3").AutoFilter Field:=17, Criteria1:="1"
        .Range("Q2").Value = .Range("Q2").Value
        .Application.DisplayAlerts = False
        .Range("A3:Q" & lr1).SpecialCells(xlCellTypeVisible).Delete
        .Application.DisplayAlerts = True
lr2 = .Cells(rows.count, "A").End(xlUp).row
        .Range("Q2").Formula2 = "=COUNTIF($P$2:$P$" & lr2 & ",$P$2:$P$" & lr2 & ")"
        '.Range("P2").Formula2 = "=COUNTIFS($D$2:$D$" & lr1 & ",$D$2:$D$" & lr1 & ",$M$2:$M$" & lr1 & ",$M$2:$M$" & lr1 & ")"
        .Range("1:1").AutoFilter
        .Cells.EntireColumn.AutoFit
        .Activate
End With
 
Upvote 0
Ok did some speed tests. I did one more modification and included this in my loop. Also, ran for my largest dataset. was able to cut it in half. but 1hr44min still stings but such a long improvement from before. I am not sure where i can find any more optimization.

VBA Code:
 .Range("Q2:Q" & lr1).Value = .Range("Q2#").Value

1713744640227.png
 
Upvote 0
For really large datasets in VBA code, there's an object called a "dictionary", used to identify and summarize data based on unique values, similar to the COUNTIF function.
I believe it will help reduce your processing time. I can assist if you can share an image (or upload a link for me to download it). Of course, the number of rows will be reduced, but still ensuring representation.
 
Upvote 0
Dictionary approach looks like something that's worth exploring. I just always assumed simple arrays are faster.
On a sample dataset your original formula took 47 seconds, this crunched it in less than 1.
 
Upvote 0
and based on this, the code below produces the same output as your original Countif formula in about 328.12 miliseconds:
VBA Code:
Sub CountIf_2()
  Dim a As Variant, b As Variant
  Dim dic As Object
  Dim cad As String
  Dim i As Long
 
  Set dic = CreateObject("Scripting.Dictionary")
 
  a = Range("A1").CurrentRegion
  ReDim b(1 To UBound(a, 1), 1 To 1)
  For i = 1 To UBound(a, 1)
      dic(a(i, 1)) = dic(a(i, 1)) + 1
  Next
  For i = 1 To UBound(a, 1)
    b(i, 1) = dic(a(i, 1))
  Next
  Range("C1").Resize(UBound(b, 1)).Value = b
End Sub
so compared with 4o-something seconds there is as significant performance improvement, more than 100x.
 
Upvote 0
Solution

Forum statistics

Threads
1,224,811
Messages
6,181,081
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top