Find a string in a range

bbrother

New Member
Joined
Nov 23, 2015
Messages
41
I need the fastest way to look across a range of cells where each cell in the range contains a long string to see if any words in a list occurs in any of those cells. The first thing I did of course was to read the range of cells into an array, then I loop through my list (for each item in listarray) and do a simple instr on each element of the array. All I need is a yes/no - does any list item occur in the text, but I need to repeat this for about 50,000 ranges so I need to do it quickly. For example, suppose you had a worksheet where each cell contained a paragraph of text, and you needed to check each row in the worksheet for a list of words the FCC doesn't allow (that often slip out when finding bugs). You could concatenate all of the cell values in the row into a large string, then check that large string using instr across each item in your list. However, I think it is faster to check each paragraph independently so INSTR doesn't have to slog through giant strings - because I can bail out of the check as soon as a hit is found (but then I have another nested loop to do this)
 

Excel Facts

Show numbers in thousands?
Use a custom number format of #,##0,K. Each comma after the final 0 will divide the displayed number by another thousand
bbrother,

You could use the Timer function to time your different approaches to determine which is fastest with your data.

Cheers,

tonyyy
 
Upvote 0
Well according to my coffee cup timer, working with big strings required almost 1/2 cup of time, whereas using a select case to test substrings only required a few sips. Not very sophisticated I know, but the former was really testing my patience.I did find another bottleneck in another section. It turns out that working with collections to compare a collection item with an array item from a big matrix is very slow compared with creating two 1 dimensional arrays of fixed length strings and comparing those representative arrays.For string performance I found two articles that contained many performance tips:http://www.aivosto.com/vbtips/stringopt.html and http://www.shamrock-software.eu/vb.htm
 
Last edited:
Upvote 0
Are the cells in each of your ranges all in a single row or a single column (as opposed to a mult-row/multi-column range of cells)? If yes, which direction is the single line of cells in?
 
Upvote 0
The cells I need to look through are in a single row of non continuous cells. I have to decide yes/no for each row, and repeat for about 20-50k rows. To add to the fun, I just spotted a few people decided to type in Kanji...
 
Upvote 0
Hi there

In my experience VBA is very slow when compared with worksheet functions. Why not just use the inbuild Excel find function from within the VBA to loop through the bad list? sometihing like Application.WorksheetFunction.Find

If it is only one column you are examining you can also use datafilters.

Cheers DavidB
 
Upvote 0
I need the fastest way to look across a range of cells where each cell in the range contains a long string to see if any words in a list occurs in any of those cells. The first thing I did of course was to read the range of cells into an array
You said in Message #5 that the cells you needed to examine were "non continuous" (I presume you meant non-contiguous). That would mean your array contains values that you are not interested in examining, correct? How are you distinguishing between cells you want to examine from those you don't want to examine?
 
Upvote 0
You're right - pardon my grimmer grammer. I've been staring at spreadsheets for too long today. The cells are non-contiguous.Since I have to do many different things with this data, I read the entire sheet into a giant array called "snapshot" which is arranged exactly as the sheet. So now when I refer to an element of the array it can be mapped back to a specific cell. I conduct all my operations on this giant array, clear the sheet, then dump the array (in blocks) back to the sheet.So, with this array, I know the columns that I need to check and can simply loop through the rows. I use "select case" to check each of the specific columns in the array so that if I have a hit I don't have to waste time checking the other columns in the row - I know the whole row gets flagged.If I weren't seeing double I would create a sample code and post that, but I am having a devil of a time getting things formatted on this board! All my spacing and NL's seem to be deleted and my messages look like heck.
 
Upvote 0
So, with this array, I know the columns that I need to check and can simply loop through the rows.
I have an ideas I want to try... which columns are they?

Also, where does the result go at?
 
Last edited:
Upvote 0
The range of interest is row A5:AY27500. There are different data types from currency to strings in the different columns, but the columns that I need to search through are A, I, J, K, AE, AF, and AG. All the columns I am searching contain strings. Each row must be searched independently so I can make a yes/no judgement on every row. (Does one of the strings I am looking for exist in any one of the cells in this row? yes/no Go on to the next row.)


My list of search terms varies. I put each list of search terms on a separate worksheet use that list to see which rows should be copied over to the new sheet. Obviously it takes too long to do this using find, so I load the entire data set into an array and complete all of the searching operations without accessing cells. Once I have my subset (based on looping through the search terms) I dump the new array on to the same worksheet that contains the search terms. I found that I can't dump the whole array at once - it is too large - so I have to write the array to the sheet in large blocks. (I also have to format some of the columns as text first since some of the strings contained in the array are too long for the general format).
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,820
Messages
6,181,157
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top