Is There A Faster Way To Do This?

CaliKidd

Board Regular
Joined
Feb 16, 2011
Messages
173
I am using Excel 2007 and the following code to find blank/empty cells in a large column of data. If blanks exist, I use a variable to count the number of them and I highlight the cell using a sub.

The problem is this loop is slow because it requires every cell to be scanned. Is there a faster way, perhaps using a range object, variant array, find, intersect/union, etc? Unfortunately, I am not well-versed in these so I need some assistance.

For reasons I won't elaborate on, I do not want to use a conditional formatting solution.

Code:
For Each Scanned_Cell In Range("A5:A1000000")
[INDENT]Scanned_Cell.Select
If Len(Trim(Scanned_Cell)) = 0 Then
[INDENT]Scanned_Cell.Select
Call Highlight_Cell
Blanks = Blanks + 1
[/INDENT]End If
[/INDENT]Next
Also, instead of finding blank cells, what mods would be required if I wanted to:
1. Assuming the data is numeric, find any cell with a specific value (e.g., 0 (zero))?
2. Assuming the data is numeric, find any cell with a value greater than a certain number (e.g., > 1)?
3. Assuming the data is string, find any cell that is not equal to the letter "A", "B" or "C"?

Thanks in advance.
 
Peter,

"Using many of the concepts of mirabeau's code, but with some slight variations, on my sample data (about 1/3 of the cells are blank) this takes only about half as long."

Just curious about this statement.
so I generated a bit of test data (with about 1/3 blank cells out of about one million) with a test data code
Code:
With Range("A5:A1000000")
    .Cells = "=if(rand()>.33,int(rand()*50),"""")"
    .Value = .Value
End With
and ran the code you provided and the code I provided three times each.
Peter_SSs code timed: 9.03secs, 9.04secs, 9.11secs
mirabeau code timed: 5.74secs, 5.75secs, 5.73secs
Am I wrong somewhere? but would you like to expand on your "... takes only about half as long", since this doesn't seem to be correct?
The special cells approach (unmodified) took me about 8 seconds on the same data.



CaliKidd,

If you like the idea of using autofilter on this sized problem then I wish you good luck!!
 
Upvote 0

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
Peter,
CaliKidd,

If you like the idea of using autofilter on this sized problem then I wish you good luck!!
Well, I like the idea of using Excel's native autofilter technology, but I don't know if there are mean, ugly monsters waiting for me in the code world. I'm too naive at this point to know any better. :biggrin:

My comment was simply based on the fact that I was just playing with the autofilter directly in Excel and it seemed fast and powerful. I like the fact I could filter in multiple ways. Now, how to step through those filtered cells like Rorya described is a mystery to me. :confused:

At the end of the day, I'll be happy with any solution that is faster than my current incremental VBA loop. I'd still be interested in how to accomplish 1, 2, and 3 using your approach. If an autofilter solution arises, then I'd like to evaluate it too, and also to learn from it by studying the code.
 
Upvote 0
hey CaliKidd,

You have to try whatever you feel inclined to do. That's how we all learn things.

I certainly don't want to discourage you from anything at all.

With your 1. 2 and 3 problems and the code I gave, my preliminary thoughts were using the Match function and/or a bit of extension/variation of the sort-based code. Details would really depend on any example data you provided.

But if you get other good answers in the meantime, I'd be very happy with that. I can't do anything more for some time anyway.
 
Upvote 0
Mirabeau,

Thanks for your support. This forum really helps. I've read several 3" thick books on VBA, which have helped me learn the core concepts and basic fundamentals, but now I'm looking for some better solutions to specific problems. In most cases, I've hacked out a brute force solution, but it's slow/inefficient. It's ok with small data sets, but it drags with large ones. Hence, my search for a better way.

One thing I've learned by interacting with people on this forum is there are many different ways to getting a job done. There are a lot of smart people on this forum. When I interact with people more advanced than myself, such as yourself, it really helps in my development. I've learned more in 100 posts than I have reading 5,000+ book pages! When I step through people's code, I'm always impressed at the way they approached the problem. Definitely lots of wisdom gained. I hope one day soon I can help others too.

The data involved in #1 and #2 is purely numeric. I'm either trying to find cells with a specific value or trying to find cells that satisfy a particular condition (<, >, <>, etc.). Think: test scores.

The data behind #3 is string and just one letter (think: school grades). In this case, I am trying to find any cells that do not match a particular condition; i.e., not equal to a certain letter, such as "A", "B", "C", etc.
 
Upvote 0
.. I generated a bit of test data (with about 1/3 blank cells out of about one million) with a test data code
Code:
With Range("A5:A1000000")
    .Cells = "=if(rand()>.33,int(rand()*50),"""")"
    .Value = .Value
End With
and ran the code you provided and the code I provided three times each.
Peter_SSs code timed: 9.03secs, 9.04secs, 9.11secs
mirabeau code timed: 5.74secs, 5.75secs, 5.73secs
Am I wrong somewhere? but would you like to expand on your "... takes only about half as long", since this doesn't seem to be correct?
mirabeau

My original test data was just a mixture of blanks, text and integers like below, manually entered, then copied down to row 1,000,000

Excel Workbook
A
5
6gd
7dfg
8
9g
10fg
11
12
13dfgs
142
156
160
Sheet4


I've just retested that and numerical values generated by your random number formula 4 times using each code. My results were, in seconds:
Peter_SSs code: 3.805, 3.758, 3.789, 3.699
mirabeau code: 7.254, 7.160, 7.051, 7.207

I don't have an explanation for the turn-around of the results between your tests and mine. :confused:
Do you have any ideas?

One thing I didn't mention is that I tested on Excel 2010 whereas the OP is using 2007. What were you using?

I have now been able to test on Excel 2007 and in a test on each set of data with each code I got:
Peter_SSs: 5.477, 6.254
mirabeau: 6.348, 6.645

Interestingly, though mine still tested slightly faster than yours, mine was considerably slower on 2007 v 2010 while yours was faster.



CaliKidd

Would be interested in your speed results.

However, given your confirmation that data is only in column A, this is a marginally faster (for me anyway :)) version of my earlier code.

<font face=Courier New><br><SPAN style="color:#00007F">Sub</SPAN> ColourAndCountBlanks_2()<br>    <SPAN style="color:#00007F">Dim</SPAN> rws <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, c <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, z <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN><br><br>    <SPAN style="color:#00007F">Const</SPAN> fr <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN> = 5<br>    <SPAN style="color:#00007F">Const</SPAN> lr <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN> = 1000000<br>    <br>    rws = lr - fr + 1<br>    Application.ScreenUpdating = <SPAN style="color:#00007F">False</SPAN><br>    <SPAN style="color:#00007F">With</SPAN> Cells(fr, "A").Resize(rws, 2)<br>        .Cells(1, 2).Value = 1<br>        .Offset(, 1).Resize(, 1).DataSeries<br>        .Sort Key1:=.Cells(1, 1), _<br>            Order1:=xlAscending, Header:=xlNo<br>        z = .Cells(1, 1).End(xlDown).Row<br>        c = lr - z<br>        <SPAN style="color:#00007F">If</SPAN> c > 0 <SPAN style="color:#00007F">Then</SPAN><br>            Cells(z + 1, 1).Resize(c).Interior.ColorIndex = 3<br>        <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br>        .Sort Key1:=.Cells(1, 2), _<br>            Order1:=xlAscending, Header:=xlNo<br>        .Offset(, 1).Resize(, 1).ClearContents<br>    <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">With</SPAN><br>    Application.ScreenUpdating = <SPAN style="color:#00007F">True</SPAN><br>    MsgBox "Count = " & c<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Sub</SPAN><br></FONT>


In relation to your other problems, this is slower than the 'blanks' and there may be faster ways, but try this for
1. Find cells with a specific value (e.g., 0 (zero))

<font face=Courier New><br><SPAN style="color:#00007F">Sub</SPAN> ColourAndCount()<br>    <SPAN style="color:#00007F">Dim</SPAN> rws <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, c <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, z <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN>, i <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN><br>    <SPAN style="color:#00007F">Dim</SPAN> rA <SPAN style="color:#00007F">As</SPAN> Range, rB <SPAN style="color:#00007F">As</SPAN> Range, rC <SPAN style="color:#00007F">As</SPAN> Range, rBC <SPAN style="color:#00007F">As</SPAN> Range, rABC <SPAN style="color:#00007F">As</SPAN> Range<br>    <SPAN style="color:#00007F">Dim</SPAN> b<br><br>    <SPAN style="color:#00007F">Const</SPAN> fr <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN> = 5<br>    <SPAN style="color:#00007F">Const</SPAN> lr <SPAN style="color:#00007F">As</SPAN> <SPAN style="color:#00007F">Long</SPAN> = 1000000<br>    <br>    rws = lr - fr + 1<br>    <SPAN style="color:#00007F">Set</SPAN> rA = Cells(fr, "A").Resize(rws)<br>    <SPAN style="color:#00007F">With</SPAN> rA<br>        <SPAN style="color:#00007F">Set</SPAN> rB = .Offset(, 1)<br>        <SPAN style="color:#00007F">Set</SPAN> rC = .Offset(, 2)<br>        <SPAN style="color:#00007F">Set</SPAN> rABC = .Resize(, 3)<br>    <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">With</SPAN><br>    <SPAN style="color:#00007F">Set</SPAN> rBC = rB.Resize(, 2)<br>    b = rA.Value<br>    <SPAN style="color:#00007F">For</SPAN> i = 1 <SPAN style="color:#00007F">To</SPAN> rws<br>        <br>        <SPAN style="color:#007F00">'*******************************</SPAN><br>        <SPAN style="color:#007F00">'Cells = 0</SPAN><br>        <SPAN style="color:#00007F">Select</SPAN> <SPAN style="color:#00007F">Case</SPAN> b(i, 1)<br>            <SPAN style="color:#00007F">Case</SPAN> vbNullString<br>                b(i, 1) = "#"<br>            <SPAN style="color:#00007F">Case</SPAN> 0<br>                b(i, 1) = vbNullString<br>        <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Select</SPAN><br>        <SPAN style="color:#007F00">'*******************************</SPAN><br>    <br>    <SPAN style="color:#00007F">Next</SPAN> i<br>    Application.ScreenUpdating = <SPAN style="color:#00007F">False</SPAN><br>    rB.Value = b<br>    rC.Cells(1).Value = 1<br>    rC.DataSeries<br>    rABC.Sort Key1:=rB.Cells(1), Order1:=xlAscending, Header:=xlNo<br>    z = rB.Cells(1).End(xlDown).Row<br>    c = lr - z<br>    <SPAN style="color:#00007F">If</SPAN> c > 0 <SPAN style="color:#00007F">Then</SPAN><br>        Cells(z + 1, rA.Column).Resize(c).Interior.ColorIndex = 3<br>    <SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">If</SPAN><br>    rABC.Sort Key1:=rC.Cells(1), Order1:=xlAscending, Header:=xlNo<br>    rBC.ClearContents<br>    Application.ScreenUpdating = <SPAN style="color:#00007F">True</SPAN><br>    MsgBox "Count = " & c<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Sub</SPAN><br></FONT>


Then for the other problems, replace the section between the '****** rows with the relevant section from below. If they don't do what you want, post back with more details.


<font face=Courier New><SPAN style="color:#007F00">'Cells with value > 1</SPAN><br><SPAN style="color:#00007F">Select</SPAN> <SPAN style="color:#00007F">Case</SPAN> Val(b(i, 1))<br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">Is</SPAN> > 1<br>        b(i, 1) = vbNullString<br>    <SPAN style="color:#00007F">Case</SPAN> 0<br>        b(i, 1) = "#"<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Select</SPAN><br><br><SPAN style="color:#007F00">'Cells not equal to A or B or C</SPAN><br><SPAN style="color:#00007F">Select</SPAN> <SPAN style="color:#00007F">Case</SPAN> b(i, 1)<br>    <SPAN style="color:#00007F">Case</SPAN> "A", "B", "C"<br><br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">Else</SPAN><br>        b(i, 1) = vbNullString<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Select</SPAN><br><br><SPAN style="color:#007F00">'Numeric cells</SPAN><br><SPAN style="color:#00007F">Select</SPAN> <SPAN style="color:#00007F">Case</SPAN> b(i, 1) <> vbNullString And IsNumeric(b(i, 1))<br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">True</SPAN><br>        b(i, 1) = vbNullString<br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">False</SPAN><br>        b(i, 1) = "#"<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Select</SPAN><br>        <br><SPAN style="color:#007F00">'Text cells</SPAN><br><SPAN style="color:#00007F">Select</SPAN> <SPAN style="color:#00007F">Case</SPAN> IsNumeric(b(i, 1))<br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">True</SPAN><br>        b(i, 1) = "#"<br>    <SPAN style="color:#00007F">Case</SPAN> <SPAN style="color:#00007F">False</SPAN><br>        b(i, 1) = vbNullString<br><SPAN style="color:#00007F">End</SPAN> <SPAN style="color:#00007F">Select</SPAN></FONT>
 
Upvote 0
Actaully, not sure why I was using a text value.
Changing
b(i, 1) = "#"
to
b(i, 1) = 0
in the codes in my previous post saves approximately 1/3 to 1/2 second for me.
 
Last edited:
Upvote 0
... mainly for Peter_SSs


Hi,

I did recheck that stuff on how long it takes using your suggested data (generated via the following code)
Code:
Sub pssstestdata()
Dim nul, c(), j&
Dim u&, e, arr
ReDim c(1 To 10 ^ 6 + 8, 1 To 1)
nul = vbNullString
arr = Array(nul, "gd", "dfg", nul, "g", "fg", _
        nul, nul, "dfgs", 2, 6, 0)
For j = 1 To 10 ^ 6 Step UBound(arr) + 1
    For Each e In arr
        u = u + 1
        c(u, 1) = e
    Next e
Next j
[a5].Resize(10 ^ 6 - 4) = c
End Sub
The results were essentially the same as in my Post#21. As would be expected, the specifics of the data (with about 1/3 blanks) made little difference difference to the respective rankings and the times.

Yes, it is interesting and remarkable that you and I should get pretty much opposite results from exactly the same data.

However, in this case, it's also good that any divergence can be easily and objectively assessed by any third party with an interest, because the requisite data is readily and immediately available.

I have Excel 2007, HP portable computer with I7 processor and 4gb ram. I don't have access to a usable version of Excel 2010.
 
Upvote 0
hey Calikid,

You were also interested in highlighting specific values. Say all values of zero in a list about 1 million long.

There's a code in the previous post which will relatively quickly list Peter's suggested test data down to row 1miilion in ColA.

One in every 12 is a zero.

To highlight all the zeros in ColA, try the following code, which took under two seconds on my computer to do the job.
Code:
Sub hilitezeros() 
Dim t#
t = Timer
Dim e As Range, a As Range, ang As String
Dim stg, x&, k&, p&, nr&
Set e = Range("A1")
nr = e(Rows.Count).End(3).Row
Set a = Range(e, e(nr))
stg = 0

Do
x = Application.Match(stg, a, 0)
If IsError(x) Then Exit Do
k = k + x
ang = ang & ",A" & k
If Len(ang) > 245 Or k >= nr Then
    Range(Right(ang, Len(ang) - 1)).Interior.Color = vbGreen
    ang = vbNullString
End If
If k >= nr Then Exit Do
p = p + 1
Set a = Range(e(k + 1), e(nr))
Loop

MsgBox "Code took " & Format(Timer - t, "0.00 secs") & Chr(10) & _
    Chr(10) & "Count = " & p
Cells(Rows.Count, 1).End(3).Select
End Sub
Of course easily extended to any value or string, not just zero.
 
Last edited:
Upvote 0
FWIW, using 2007 on WinXP I ran both sets of code 3 times each on the sample data. Output:
Mirabeau:
Total: 25.13683
Average: 8.37893
Max: 10.14229

Peter:
Total: 25.59975
Average: 8.53323
Max: 9.02622

so both pretty much the same. :)
 
Upvote 0
FWIW, using 2007 on WinXP I ran both sets of code 3 times each on the sample data. Output:
Mirabeau:
Total: 25.13683
Average: 8.37893
Max: 10.14229

Peter:
Total: 25.59975
Average: 8.53323
Max: 9.02622

so both pretty much the same. :)
Thanks for doing that Rory. I'm still struggling with how mirabeau & I could be getting such opposite results. Your results are really quite different again.

My tests were on Excel 2007 & 2010 (32 bit). OS is Win7 professional (64 bit) SP1. Don't know what it all means but processor is Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz. 8 GB RAM.
 
Upvote 0

Forum statistics

Threads
1,221,444
Messages
6,159,914
Members
451,603
Latest member
SWahl

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top