Webscraping only returning one table, multiple available

René R

New Member
Joined
Jul 1, 2019
Messages
5
I´m trying to scrape a page for a specific table.
there´s three tables available but I only get the first table returned.


Url: https://www.dhl.dk/exp-da/express/shipping/forsendelsesraadgivning/express_olietillaeg.html




I can get the details for the first table. This is also expanded by default.
I´m pretty sure I need to somehow expand the other two tables to get acces through internet control. But I haven´t succeded yet..


Code:
Option Explicit


Dim HTMLDoc As MSHTML.HTMLDocument
Dim MyBrowser As SHDocVw.InternetExplorer


Sub GetAdditionalCharges()


Dim MyHTML_Element, item As IHTMLElement
Dim MyURL As String
Dim tSt, sElapsed As Double
tSt = Timer


'MyURL = ThisWorkbook.Sheets("DHL Extractions").Range("D2").Value
MyURL = "https://www.dhl.dk/exp-da/express/shipping/forsendelsesraadgivning/express_olietillaeg.html"


Set MyBrowser = New SHDocVw.InternetExplorer


MyBrowser.Silent = True
MyBrowser.navigate MyURL
MyBrowser.Visible = True


While MyBrowser.Busy
DoEvents
Wend


Set HTMLDoc = MyBrowser.document


    Dim hTable As Object
    Dim hBody As Object
    Dim hHead As Object
    Dim tb As Object ' table
    Dim hr As Object ' header row
    Dim hrA As Object ' header row array
    Dim hrV As Object ' header row value
    Dim bb As Object ' table row
    Dim tr As Object ' table row array
    Dim hTr As Object ' table row value
    Dim hTD As Object
    Dim td As Object
    Dim y As Long
    Dim z As Long
    Dim wb As Excel.Workbook
    Dim ws As Excel.Worksheet


    Set wb = Excel.ActiveWorkbook
    Set ws = wb.ActiveSheet
    y = 3   'Return column to Excel
    z = 3   'Return Row in Excel
    Sheets("DHL Extractions").Activate
    Range("C3:M5000").ClearContents
    
    Set hTable = HTMLDoc.getElementsByTagName("table")


        For Each tb In hTable
            Set hBody = tb.getElementsByTagName("tbody")
            Set hHead = tb.getElementsByTagName("thead")
                For Each hr In hHead
                    Set hrA = hr.getElementsByTagName("th")
                        For Each hrV In hrA
                            ws.Cells(3, y).Value = hrV.innerText
                            y = y + 1
                        Next hrV
                    Exit For
                Next hr
                y = 3
                z = 3
                For Each bb In hBody
                    Set hTr = bb.getElementsByTagName("tr")
                    For Each tr In hTr
                        Set hTD = tr.getElementsByTagName("td")
                        y = 3 'Resetting the column
                        For Each td In hTD
                            ws.Cells(z, y).Value = td.innerText
                            y = y + 1
                        Next td
                        DoEvents
                        z = z + 1
                    Next tr
                    Exit For
                Next bb
            Exit For
        Next tb


MyBrowser.Quit


'Returning time elapsed
sElapsed = Round(Timer - tSt, 0)


Sheets("DHL Extractions").Range("I1") = sElapsed


End Sub

Hoping for some assistance.
 

Excel Facts

How to fill five years of quarters?
Type 1Q-2023 in a cell. Grab the fill handle and drag down or right. After 4Q-2023, Excel will jump to 1Q-2024. Dash can be any character.
See if you can incorporate this into your code. Uses early binding, so you must set references to MS Internet Controls and MS HTML Object Library, via Tools-References in the VBA editor.

Code:
Public Sub IE_Test()

    Dim URL As String
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim table As HTMLTable
    Dim tRow As HTMLTableRow, tCell As HTMLTableCell
   
    URL = "https://www.dhl.dk/exp-da/express/shipping/forsendelsesraadgivning/express_olietillaeg.html"
    
    Set IE = New InternetExplorer
    IE.Visible = True
    IE.navigate URL
    While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
    Set HTMLdoc = IE.document
        
    Do
        Set table = HTMLdoc.getElementsByTagName("TABLE")(2)
        DoEvents
    Loop While table Is Nothing
    For Each tRow In table.Rows
        For Each tCell In tRow.Cells
            Debug.Print tCell.innerText; " ";
        Next
        Debug.Print
    Next

    'IE.Quit
    Set HTMLdoc = Nothing
    Set IE = Nothing
    
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,898
Messages
6,175,272
Members
452,628
Latest member
dd2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top