Hi,
I have existing VBA code which scans a list of keywords on a webpage, and returns a TRUE or FALSE based on whether it exists. Id like the code to scan the ENTIRE site (internal links only), as this code only scans the entered home page. Secondly id like to ensure only the readable content is scanned (ignore any background html code etc). This is what I have currently
I have existing VBA code which scans a list of keywords on a webpage, and returns a TRUE or FALSE based on whether it exists. Id like the code to scan the ENTIRE site (internal links only), as this code only scans the entered home page. Secondly id like to ensure only the readable content is scanned (ignore any background html code etc). This is what I have currently
Sub CheckKeywordsInInternalLinks()
Dim domainName As String
Dim keywordRange As Range
Dim keywordCell As Range
Dim keyword As String
Dim foundKeyword As Boolean
Dim html As Object
Dim xhr As Object
Dim pageContent As String
Dim bodyText As String
Dim allLinks As Object
Dim link As Object
' Ask the user for a domain name
domainName = InputBox("Enter the domain name (e.g., Example Domain):", "Domain Name")
' Check if "Keywords" sheet exists
On Error Resume Next
Set ws = ThisWorkbook.Sheets("Keywords")
On Error GoTo 0
If ws Is Nothing Then
MsgBox "The 'Keywords' sheet does not exist. Please check the sheet name.", vbCritical
Exit Sub
End If
' Set the range of keywords from the Keywords sheet
Set keywordRange = ws.Range("A1:A" & ws.Cells(Rows.Count, 1).End(xlUp).Row)
' Initialize XMLHttpRequest to get the page content
Set xhr = CreateObject("MSXML2.XMLHTTP")
' Loop through each keyword
For Each keywordCell In keywordRange
keyword = keywordCell.Value
foundKeyword = False
' Fetch the webpage content
xhr.Open "GET", domainName, False
xhr.Send
' Get the page content as a string
pageContent = xhr.responseText
' Load the content into an HTMLDocument object
Set html = CreateObject("HTMLfile")
html.body.innerHTML = pageContent
' Get all the body text (including text outside links)
bodyText = html.body.innerText
' Check if the keyword exists anywhere in the body text
If InStr(1, bodyText, keyword, vbTextCompare) > 0 Then
foundKeyword = True
End If
' Output "True" or "False" in the adjacent column
If foundKeyword Then
keywordCell.Offset(0, 1).Value = "True"
Else
keywordCell.Offset(0, 1).Value = "False"
End If
Next keywordCell
End Sub