Formula or VBA to see if a cell contains text from large list and if it does, remove it.

VZPRO

New Member
Joined
Aug 14, 2017
Messages
2
First time posting so sorry if I make a mistake.

I'm trying to go through a list of keywords I have built over the last few months. The list is of 46,849 keywords (column A). I want to check each of the keywords (really keyword phrases) to see if it contains any of the text from a different list I have of Cities, Counties and States in the US which I've put in column B ( a list of 15,624)

So what I want to happen is that you look at column A (my keyword phrases) and then check all the items in column B to see if any of that text is contained inside of the cell in column A.

I've done =SUMPRODUCT(--ISNUMBER(SEARCH($B1:B$15462,A1)))>0 Everything comes back TRUE

I've done =IF(MAX(IFERROR(SEARCH($B$1:$B$15642,$A1),0))>0,"yes","no") Everything comes back yes

Not sure if this is the right way to do this but I'm lost. After the data comes back, I need to delete the cells (or rows, doesn't matter to me) that contain the selected data. I can do this by a filter which is fine But the most important part is to see if cell A1 contains ANY of the text from the city, county, state list.

[TABLE="width: 800"]
<tbody>[TR]
[TD]Keywords[/TD]
[TD]Cities[/TD]
[TD]Results[/TD]
[/TR]
[TR]
[TD]Roofer[/TD]
[TD]Birmingham[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]Roofer in Birmingham AL[/TD]
[TD]Mobile[/TD]
[TD]Yes
[/TD]
[/TR]
[TR]
[TD]Roofing Contractor[/TD]
[TD]Atlanta[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]Roofing Contractor in Mobile AL[/TD]
[TD]Kentucky[/TD]
[TD]Yes[/TD]
[/TR]
[TR]
[TD]Roofing Shingles[/TD]
[TD]Wyoming[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]Roofing Gun[/TD]
[TD]Florida[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]Birmingham Roofing Company[/TD]
[TD]Washington DC[/TD]
[TD]Yes[/TD]
[/TR]
[TR]
[TD]Local Roofer[/TD]
[TD]Denver[/TD]
[TD]No[/TD]
[/TR]
[TR]
[TD]Mobile Roofing Contractors[/TD]
[TD]Buffalo[/TD]
[TD]Yes[/TD]
[/TR]
</tbody>[/TABLE]
 

Excel Facts

Return population for a City
If you have a list of cities in A2:A100, use Data, Geography. Then =A2.Population and copy down.
@Steve
By my reading of it you may have the columns in your formulas around the wrong way. I think the OP wants to know if the text in column A contains any of the texts in column B, whereas your formulas are marking the column B values that appear somewhere in column A.
@VZPRO
You have asked to check if the text in column A contains the text of any value(s) in column B. Is that really what you want?
For example, suppose that column A contains a cell "Roofing for tomorrow" and that somewhere in column B you have a cell containing "Morrow" (a County in Ohio) do you expect that the column A cell should be marked as a "Yes" because of
"Roofing for tomorrow"? Or are you looking for a match of complete words?

I have assumed that you are looking for complete word matches and suggest this macro as one to test.
Your data is very big and any solution may take a while to produce the required results. I tested with data approximately the same size as yours and on my (old) machine it took roughly 30 seconds to process.
Anyway, you could give it a try on a copy of your data. Currently my code simply puts a 1 in column C next to any column A data that contains a word (or group of words) matching at least one entry in column B.

Code:
Sub CheckWords()
  Dim a As Variant, c As Variant
  Dim i As Long
  Dim s As String
  
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Value
  ReDim c(1 To UBound(a), 1 To 1)
  With CreateObject("VBScript.RegExp")
    .Global = True
    .IgnoreCase = True
    s = "\b(" & Join(Application.Transpose(Range("B2", Range("B" & Rows.Count).End(xlUp))), "|") & ")\b"
    .Pattern = "\|{2,}"
    .Pattern = .Replace(s, "|")
    For i = 1 To UBound(a)
      If .test(a(i, 1)) Then c(i, 1) = 1
    Next i
  End With
  Range("C2").Resize(UBound(c)).Value = c
End Sub
 
Upvote 0
@Peter_SSs

First off, thank you for taking the time to write this up. I really appreciate it. I just ran the macro and it does run exactly as needed. It does pull some unwanted data but that's because of the lists and not the macro. For example, there is a city called "Worth" and one of the phrases is "is it worth it" so it identified it. But hey, it's doing exactly what we told it to do. And this list doesn't need to be perfect like that anyway. roughly 20k of the 46k are city modified searches so any sort of clean up is perfect. Again, thank you for taking the time to write this up. I've been beating my head against a wall for about a week trying to find a viable solution.

As far as your question, yes, I would prefer the solution you suggested which only gave me results with full phrases. So thank you for that. Also, just an fyi, it only took the macro about 60 or so seconds to run my full data through.
 
Upvote 0
Glad it pretty much worked for you.

For example, there is a city called "Worth" and one of the phrases is "is it worth it" so it identified it.
Try removing the following line from the code

Code:
.IgnoreCase = True
 
Upvote 0
This macro is exactly what I need.... Is there any way to identify the word that was a match instead of just putting 1?

Thanks,
 
Upvote 0
Is there any way to identify the word that was a match instead of just putting 1?
What if there was more than one match? The code with the suggested change below returns the first word in column A that is matched in B (see row 11 example)

Just change this line in the code
Rich (BB code):
If .test(a(i, 1)) Then c(i, 1) = .Execute(a(i, 1))(0)


Book1
ABC
1KeywordsCitiesResults
2RooferBirmingham
3Roofer in Birmingham ALMobileBirmingham
4Roofing Contractor[COLOR="#FF0000"]Atlanta[/COLOR]
5Roofing Contractor in Mobile ALKentuckyMobile
6Roofing ShinglesWyoming
7Roofing GunFlorida
8Birmingham Roofing CompanyWashington DCBirmingham
9Local RooferDenver
10Mobile Roofing Contractors[COLOR="#0000FF"]Buffalo[/COLOR]Mobile
11[COLOR="#0000FF"]Buffalo[/COLOR] & [COLOR="#FF0000"]Atlanta[/COLOR] Flooring[COLOR="#0000FF"]Buffalo[/COLOR]
Sheet1
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,897
Messages
6,175,270
Members
452,628
Latest member
dd2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top