Filtering in VBA

JoeS01

Well-known Member
Joined
Jun 25, 2005
Messages
832
I have a large 2 dimensional array set up in VBA, and wish to count the number of items where the value is below a certain figure, then move on.

Rather than going through each cell, is there any quick way I can filter the array and use the criteria as CellValue<=1 and then count the number of cells meeting this criteria.

All help greatly appreciated
 

Excel Facts

Copy formula down without changing references
If you have =SUM(F2:F49) in F50; type Alt+' in F51 to copy =SUM(F2:F49) to F51, leaving the formula in edit mode. Change SUM to COUNT.
I didn’t really want to come back to this one. But since it’s continued on with some interesting views expressed, I guess there’s some interest in where it leads.

(a) Jindon’s code. I had hoped he’d be around to explain this himself. His approach was to fill (each cell of) an excel range with a random number formula, then convert the formula to values, then to put the values into a range/array (called a) in which he then counted the < =1 using a neat and simple bit of code.
I restate that approach below, and hope my reformulation answers the questions you asked about his meanings.
Also I give the data generation, the Jindon approach, and the CountIf approach recommended by Nate all together, with each part timed by the Excel timer.
You might play around around with the size of the date set to count the <=1 (by changing rws and cls as you like near the top). Currently it’s 2 million random numbers.
I think you’ll find Jindon’s counting method gives the same (correct) result faster for ALL data sets that are of a largish size, right up to the limits of you computer’s RAM, than does the CountIf approach.
Now, why do you think this is so? You might also assess the speed of Nate’s code, for comparison purposes.
CountIf is a worksheet function, suited to work on worksheets and best used there. Array-based approaches are better suited to VBA. You can there handle problems almost as complex as you like, and often, as your current problem shows, handle even the simpler ones faster than can worksheet functions.
Code:
Sub allup()
[a:iv].ClearContents
td = Timer
rws = 50000: cls = 40
Set a = Range([a1], Cells(rws, cls))
    a.Formula = "=rand()*2"
    a = a.Value
MsgBox "Range/array generated and listed in " & Timer - td & " secs"

'VBA counting from array
tm = Timer
For Each e In a
    If e <= 1 Then p = p + 1
Next
x = Timer - tm
MsgBox "VBA counted " & p & " <=1 in " & x & " secs"

'CountIf counting from range on spreadsheet
tf = Timer
q = WorksheetFunction.CountIf([a1].CurrentRegion, "<=1")
y = Timer - tf
MsgBox "CountIf counted " & q & " <=1 in " & y & " secs"

If x > 0 And y > 0 Then _
MsgBox "Formula (CountIf) approach is " _
& Format((y / x - 1) * 100, "0.0") & " percent slower"

End Sub

(b) Nate’s comments are generally good, but his timing method is not. Using “now” to express his current time his time measurements are in days. Not very surprising then that the time taken over this sort of problem barely registers.
The excel timer uses units of 1/64 seconds and gives a much better idea of how long a procedure actually takes.

(c) Arrays and ranges. An array is an ordered collection of data. A range (in excel) is a specified cell or number of cells, which needn’t even contain any data. Excel ranges have at most two dimensions, whereas arrays can have as many as you like, and it’s sometimes useful to use plenty. Most arrays can’t be shown on spreadsheets. and confining oneself to only the ones that can is pretty restrictive.
I used both range and array in my initial post above. That was to find just what you wanted as much as to provide a solution.

(d) Dimensioning or declaring in Excel VBA. This is a in a interesting and debatable topic. Not to say its never necessary, but in most cases appearing on this forum the suggested codes are overdimmed by very large amounts. Keep your codes as straightforward and uncluttered as possible and you’ll have less problems with them, just as you might in other aspects of your life.
Long codes and complex problems are a different issue, and keeping close track of all variables in then a good idea.
I note you tried dimensioning Jindon’s code. Entirely unnecessary complexity in that case and a waste of time and space. If you don’t dim that code it runs perfectly (or does on my machine). If you had problems running it, dimming or not dimming was NOT the problem.
There’s a long established philosophical principle, called Ockam’s Razor, which goes back some 7 centuries to a group of thinkers called the Medieval Scholastics (real interesting bunch, Duns Scotus, Peter Abelard who got castrated because of his disapproved love affair with Heloise, Roger Bacon who invented gunpowder, Desiderius Erasmus, Thomas Acquinas etc. This isn’t the right forum for this sort of stuff and maybe not for me either.) One version is “it is vain to do with more what can be done with fewer”, another is “entities should not be multiplied beyond necessity”. This idea has had significant influence on for example the development of evolutionary theory on biology.
Personally, I also like it when applied to code-writing, and only dim when I see it as useful, which is not that often. Never had a problem that I recall, and I’ve done quite a lot of code-writing.
Of course must one respect Nate’s view that the way he does it is “doing it right”. I guess most of us think that way.
 
Upvote 0
I am always right!! :wink: :LOL:

Well, maybe not, but there's a few flaws in your post worth discussing. The first is that Now is a little higher resolution than what you're stating it is, it returns a Serial Date-time, with resolution in the seconds, the same as Timer, eh. Are you thinking of Date? Try the following:

Code:
Debug.Print Now
And see the following regarding timing:

http://support.microsoft.com/kb/172338

Function Units Resolution
---------------------------------------------------------------------------
Now, Time, Timer seconds 1 second
Note, Timer, is not really part of Excel's Object model, but that of VB[A].

So, if you really want a high-resolution counter, don't use Now or Timer, use QueryPerformanceCounter.

I think you’ll find Jindon’s counting method gives the same (correct) result faster for ALL data sets that are of a largish size, right up to the limits of you computer’s RAM, than does the CountIf approach.
Negative, your test is flawed, so your results are flawed and misleading. One such flaw is to exclude 'a = a.Value' (more on this practice, later) from your time-test. Another flaw is to not handle how you set the size of the Range consistently within your timed code. If we're going to compare this in a meaningful way, we need to be consistent; let's try this again...

Instead of evaluating your Sub Routine line-by-line and attempting to do anything with it, I set up a new, high-resolution comparison, and we're going to do this apples-to-apples; that is, independent Sub Routines to return the Count to a Variable called myCnt.

Fill A1:Z50000 with values before running any of this, now try the following:

Code:
Option Explicit

Public Declare Function QueryPerformanceFrequency _
    Lib "kernel32.dll" ( _
    lpFrequency As Currency) As Long

Public Declare Function QueryPerformanceCounter _
    Lib "kernel32.dll" ( _
    lpPerformanceCount As Currency) As Long

Private Sub TimerTime()
    Dim Ctr1 As Currency, Ctr2 As Currency, Freq As Currency
    Dim Overhead As Currency, i As Long
    QueryPerformanceFrequency Freq
    QueryPerformanceCounter Ctr1
    QueryPerformanceCounter Ctr2
    Overhead = Ctr2 - Ctr1 ' determine API overhead
    QueryPerformanceCounter Ctr1 ' time loop
    
    For i = 1 To 10
        'Call foo
        'Call bar
        'Call VarLoop
        Call verySlow
    Next i
    
    QueryPerformanceCounter Ctr2
    Debug.Print (Ctr2 - Ctr1 - Overhead) / Freq
End Sub

Sub foo()
Dim myCnt As Long
Let myCnt = WorksheetFunction.CountIf(Range("A1:Z50000"), "<1")
End Sub

Sub bar()
Dim varArr() As Variant, i As Long, j As Long
Dim myCnt As Long
Let varArr = Range("A1:Z50000").Value
For i = LBound(varArr, 1) To UBound(varArr, 1)
    For j = LBound(varArr, 2) To UBound(varArr, 2)
        If varArr(i, j)< 1 Then Let myCnt = myCnt + 1
    Next j
Next i
End Sub

Sub VarLoop()
Dim varArr() As Variant, varElem As Variant
Dim myCnt As Long
Let varArr = Range("A1:Z50000").Value
For Each varElem In varArr
    If varElem< 1 Then Let myCnt = myCnt + 1
Next
End Sub

Sub verySlow()
Dim rngData As Range, cl As Range
Dim myCnt As Long
Set rngData = Range("A1:Z50000")
For Each cl In rngData
    If cl.Value< 1 Then Let myCnt = myCnt + 1
Next
Set rngData = Nothing
End Sub
Note, myCnt returns the same value in all 4 approaches... Here were the timed results on my machine:
Book1
ABCD
1CountIf()For i = LoopFor Each Variant LoopRange Iteration
25.40801135313.1640018512.1113297962.57259736
35.4501938112.8395530712.4625456662.1728133
45.41648393912.6118501112.14436863
55.43681808712.6053305712.32084804
65.42215197712.5652352512.20144879
75.41201353812.7125402812.13423438
85.41728040912.6131357512.22174494
95.50733878212.6975934112.03445103
Time Results


All times are in seconds...

I was surprised to see that the Variant Loop (3rd procedure) performed better on average than the For i = loop. But, obviously, CountIf() is blowing the other three out of the water; not even close! And I grew tired of of watching the Range Iteration approach taking its sweet time (your first recommendation in this thread, I believe).

So, allow me to clarify, if you have you data in a Range, use CountIf(), as I stated before. Otherwise, iterate through the array. Hey, if you want short and sweet code, CountIf() certainly gives you this.
Not to say its never necessary, but in most cases appearing on this forum the suggested codes are overdimmed by very large amounts.
Overdimmed? Erm, I've never heard anyone say something like this... Such a curious statement, I just had to Google it; the returns were far and few between:

http://www.google.com/search?q=overdimmed+"visual+basic"

I note you tried dimensioning Jindon’s code.
I didn't 'try' to do anything, I did declare his Variables. ;)

Best practice? There's really no doubt about it, declare your Variables. I can't really believe anyone would seriously argue otherwise...

In fact, it took me a few minutes to figure out what 'a' was at which point in your code... It appears to morph. from a Range Object to a Fundamental Data Type, that being the Variant Array... Very confusing, on what could be a very simple Sub Routine...

Now what I will give you, is that people tend to overuse Variables in general. If you only refer to a Worksheet once in your code, burning it into a Worksheet Object Variable might be a little gratuitous. But if you're going to use a Variable you should declare it.

Variable declaration, this way, you're not using a Variant where you don't want to and your code is self-documented. ;)
 
Upvote 0
Hi Joe,
Code:
Sub test() 
Dim a As variant  ' changed (array)

Dim x As Double 
Dim e As variant  ' changed (element in an array)
Dim n As Long 
Dim y As Double 

With Range("a1:z20000") 
    .Formula = "=rand()*2" 
    .Value = .Value      <<<<< 
    a = .Value              <<<<< 
End With 

x = Timer 
For Each e In a          <<<<<< 
    If e <= 1 Then n = n + 1 
Next 
y = Timer - x 
MsgBox n & " : took " & y 
End Sub
1) .Value = .Value
needs to be converted as values since the formula value will change from its nature (RAND function)

2) a = .Value
is a=range("a1:z20000").value
now a is an array hold the values in that range..

so, the first part is only to generate sample array,,, and it didn't actually need to use the range though.

You know, as I always declare variables clearly, and it gives us following benefits:
1) Faster: when you don't declare variables, Excel need the time to find out the data type for the variable
2) Memory usage: obviously Varinat type comsumes more...
3) Automatic property/method appearance: this helps a lot to find out the precise property/method to use.
 
Upvote 0
With thanks, jindon

Many thanks for taking the time and trouble to explain this, jindon - it certainly is much clearer and is a clever bity of programming
 
Upvote 0
With thanks

What a great discussion you guys have on this topic, and from this we all learn lots - many thanks. :biggrin:

I appreciate the time and effort taken to explain the various bits of code and the discussion on timing. For me it has been a most valuable learning experience.
 
Upvote 0
You're welcome. :)

Just be sure you're not confusing Date and Now, as I believe happened earlier in this thread, somehow, e.g.,

Code:
Sub foo()
Debug.Print Date, Now
End Sub
;)
 
Upvote 0

Forum statistics

Threads
1,221,626
Messages
6,160,909
Members
451,677
Latest member
michellehoddinott

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top