Fast ways to grab table from HTML using VBA ?

JumboCactuar

Well-known Member
Joined
Nov 16, 2016
Messages
788
Office Version
  1. 365
Platform
  1. Windows
Hi,

i found this vba script which works super fast, however it only works if there is only 1 table on the website.

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://ffrk.kongbakpao.com/character-usable-abilities/", False
        .Send
        oDom.body.innerHtml = .responseText
    End With
    
    With oDom.getelementsbytagname("table")(0)
        ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
        For Each oRow In .Rows
            For Each oCell In oRow.Cells
                data(x, y) = oCell.innerText
                y = y + 1
            Next oCell
            y = 1
            x = x + 1
        Next oRow
    End With
    
    Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
End Sub

Now im looking for something similar which works for specific tables, i.e when you "get data from web" and it displays table 0, table 1 etc....

Does anyone know of a way? get data from web is pretty slow as it works differently to the above i think

Appreciate any help
 
I've modified your code so that it loops through each table, and checks for the specified caption, if one exists...

Code:
Sub test()
    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim oTable As Object
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data
    
    y = 1: x = 1
    
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "https://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_winners", False
        .send
        oDom.body.innerHTML = .responseText
    End With
    
    For Each oTable In oDom.getElementsByTagName("table")
        With oTable.getElementsByTagName("caption")
            If .Length > 0 Then
                If RTrim(.Item(0).innerText) = "Nations that won the World Cup" Then
                    Exit For
                End If
            End If
        End With
    Next oTable
    
    If oTable Is Nothing Then
        MsgBox "Table not found!", vbExclamation
    Else
        With oTable
            ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
            For Each oRow In .Rows
                For Each oCell In oRow.Cells
                    data(x, y) = oCell.innerText
                    y = y + 1
                Next oCell
                y = 1
                x = x + 1
            Next oRow
        End With
        Sheets(1).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data
    End If
    
    Set oDom = Nothing
    Set oTable = Nothing
    Set oRow = Nothing
    Set oCell = Nothing
End Sub

Hope this helps!

You'll need to look at the HTML behind the page to see how the table is structured and then change the code so that it handles that rather than only grabbing the innerText of each cell in the table.

PS Why aren't you using Get data from web... or perhaps PowerQuery (Get & Transform Data)?

Get data from Web seems a lot slower, though I could go with that if I can output values only.

Currently with this option it creates a table which causes problems with VBA after deleting the range
 
Upvote 0

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
You can always convert the table to a 'standard' range.
 
Upvote 0

Forum statistics

Threads
1,223,240
Messages
6,170,951
Members
452,368
Latest member
jayp2104

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top