VBA HTTP request to scrape webpage links for a URL

MoshiM

Active Member
Joined
Jan 31, 2018
Messages
439
Office Version
  1. 2016
Platform
  1. Windows
I'm attempting to use HTTP requests using https://codingislove.com/parse-html-in-excel-vba/ as a guide in order to bypass having to use internet explorer. When I use the following macro it doesn't assign T_Object_S to do anything and skips over the for loop.
Desired url: https://www.theice.com/publicdocs/futures/COTHist2019.csv

Code:
SUB  Get_File_URL() 
Dim oStrm As Object, WinHttpReq As Object, T_Object_S As Object, html As New HTMLDocument, Extension As String, File_Name As String, TitleEM As Object, T_Object As HTMLHtmlElement  

  File= "https://www.theice.com/marketdata/reports/122"


Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
    WinHttpReq.Open "GET", File, False 
    WinHttpReq.send


        html.body.innerHTML = WinHttpReq.responseText: Set T_Object_S = html.getElementsByClassName("document-list")

        For Each T_Object In T_Object_S
            
        TitleEM = T_Object.getElementsByTagName("li")(0)
            
        MsgBox TitleEM.getElementsByTagName("a")(0).innerText

        Next T_Object

End SUB

The code below does the same thing and with the correct result but uses internet explorer.

Code:
Sub Scrape_Web_ICE() 

    Dim internet As Object
    Dim internetdata As Object
    Dim div_result As Object
    Dim header_links As Object
    Dim link As Object
    Dim URL As String


    Set internet = CreateObject("InternetExplorer.Application")
    internet.Visible = True


    URL = "https://www.theice.com/marketdata/reports/122"
    internet.Navigate URL


    Do Until internet.ReadyState >= 4
        DoEvents
    Loop


    Application.Wait Now + TimeSerial(0, 0, 5)


    Set internetdata = internet.document
    Set div_result = internetdata.getElementsByClassName("true-grid-3")


msgbox div_result.Item(1).all.Item(1).FirstChild.NextSibling.all.Item(0).href
 

internet.Quit


End Sub
The targeted link:
https://www.dropbox.com/s/8fy4rytokppfozs/Untitled.png?dl=0
 
Last edited:

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
You need to call this url, rather than the parent:
https://www.theice.com/CommitmentOfTradersReports.shtml?rcMode=1&reportId=122

The page you are calling is calling the url above, so in your first code, the elements you are looking for do not exist.

You've also missed a "Set"
Code:
Sub Get_File_URL()
Dim oStrm As Object, WinHttpReq As Object, T_Object_S As Object, html As New HTMLDocument, Extension As String, File_Name As String, TitleEM As Object, T_Object As HTMLHtmlElement

  File = "https://www.theice.com/CommitmentOfTradersReports.shtml?rcMode=1&reportId=122"


Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
    WinHttpReq.Open "GET", File, False
    WinHttpReq.send


        html.body.innerHTML = WinHttpReq.responseText:
        
        Set T_Object_S = html.getElementsByClassName("document-list")

        For Each T_Object In T_Object_S
            
        Set TitleEM = T_Object.getElementsByTagName("li")(0)
            
        MsgBox TitleEM.getElementsByTagName("a")(0).innerText

        Next T_Object

End Sub

Though I don't know why you need to jump through these hoops as the url of the CSV doesn't seem to change, why not call it directly?
 
Last edited:
Upvote 0
You could simplify the selection to the below

Code:
    Set WinHttpReq = CreateObject("MSXML2.XMLHTTP")
    WinHttpReq.Open "GET", File, False
    WinHttpReq.send

    html.body.innerHTML = WinHttpReq.responseText
    
    Set T_Object_S = html.querySelector(".true-grid-3 > .document-list > li > a")
    MsgBox T_Object_S.innerText & " : " & "https://www.theice.com/" & T_Object_S.pathname
 
Upvote 0
Thank you.Your solution worked.

What if I wanted all the items like that or a specific one and not just the first? I googled your solution and if I understand correctly, I can use the following to get all of them
Code:
Set T_Object_S = html.querySelectorAll(".true-grid-3 > .document-list > li > a")
but when I try and view the contents with the Locals Window it crashes the Excel file.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,904
Messages
6,175,295
Members
452,631
Latest member
a_potato

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top