How to use selectNodes to get elements from web page?

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
328
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I'm trying to get a elements from a web page with MSXML library using Xpath but I'm not able to make it work.

For the code below that I have so far, I'm getting "Run-time error 13" in the line Set elem = doc.SelectNodes("//a[contains(.,'Message Board')]")

May someon help me with this. Thanks

VBA Code:
Sub Test()
Dim URL As String
Dim oHttp As New MSXML2.XMLHTTP60
Dim elem As MSXML2.IXMLDOMNode


    URL = "https://www.mrexcel.com/"
    oHttp.Open "GET", URL, False
    oHttp.send
    
    Dim html As New HTMLDocument
    html.body.innerHTML = oHttp.responseText
    
    Set doc = New MSXML2.DOMDocument60
    doc.SetProperty "SelectionLanguage", "XPath"
    doc.Load oHttp.responseText
    
    Set elem = doc.SelectNodes("//a[contains(.,'Message Board')]")
    
End Sub
 

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
You only need MSXML2.XMLHTTP60 to make your http request, and MSHTML.HTMLDocument to access the elements with the html document...

VBA Code:
Option Explicit

Sub test()

    'Set a reference (VBE > Tools > References) to the following libraries:
    '   1) Microsoft XML, v6.0
    '   2) Microsoft HTML Object Library

    Dim xmlReq As MSXML2.XMLHTTP60
    Dim htmlDoc As MSHTML.HTMLDocument
    Dim htmlLinks As MSHTML.IHTMLDOMChildrenCollection
    Dim url As String
    Dim resp As String
    Dim linkIndex As Long
    
    url = "https://www.mrexcel.com"
    
    Set xmlReq = New MSXML2.XMLHTTP60
    
    Set htmlDoc = New MSHTML.HTMLDocument
    
    With xmlReq
        .Open "GET", url, False
        .send
        Do While .readyState <> 4
            DoEvents
        Loop
        If .Status <> 200 Then
            MsgBox "Error " & .Status & ":  " & .statusText
            Exit Sub
        End If
        resp = .responseText
    End With
    
    htmlDoc.body.innerHTML = resp
    
    Set htmlLinks = htmlDoc.querySelectorAll("a.nav-link")  ' use the querySelector to search by tag and class
    
    With htmlLinks
        For linkIndex = 0 To .Length - 1
            If Trim$(.Item(linkIndex).innerText) = "Message Board" Then
                Debug.Print .Item(linkIndex).href
                Exit For
            End If
        Next linkIndex
    End With
    
    Set xmlReq = Nothing
    Set htmlDoc = Nothing
    Set htmlLinks = Nothing

End Sub

Hope this helps!
 
Last edited:
Upvote 0
Thanks @Domenic for the clarification about the only libraries I need. Your code works, but I was wondering if there is a way to get elements by Xpath (without the need to install Selenium driver), since I need a way to search the elements based on text since classes, IDs change from time to time in the website.
 
Upvote 0
In Excel, XPath is used for XML documents. However, if you want to use XPath for HTML documents, I think you'll need to use a third party driver, such as Selenium.
 
Upvote 1

Forum statistics

Threads
1,225,734
Messages
6,186,715
Members
453,369
Latest member
positivemind

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top