VBA getElementsByTagName fail to catch "td" tag elements

Aurelius M

New Member
Joined
May 24, 2020
Messages
2
Office Version
  1. 2016
Platform
  1. Windows
I need some help from the smart guys here on the site. The code snippet below don't give any errors and it looks that it works fine, but the TD element collection stays empty and the innerText is empty too. That while the data are showing in a visual browser as You can see in the image below. getElementsByTagName don't catch TDelements from the web table, and I can't figure out why ....

Code:
Sub Extract_TD_Text()
    Dim URL As String
    Dim wMatrix As String
    Dim Search As String
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim myTable As HTMLObjectElement
    Dim TDelements As IHTMLElementCollection
    Dim TDelement As HTMLTableCell
    Dim rij As Long
    Dim bGetNext As Boolean
    
    URL = "https://www.nasdaq.com/market-activity/stocks/arch"
    wMatrix = "sector;industry;1 year target;forward p/e 1 yr.;earnings per share(eps);annualized dividend;ex dividend date;beta"
    
    Set IE = New InternetExplorer
    With IE
        .Navigate2 URL
        .Visible = False 'or True it don't matter
        'Wait for page to load
        While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
        rij = 0
        bGetNext = False
        Set HTMLdoc = .Document
        If HTMLdoc.getElementsByTagName("table")(0).getAttribute("class") = "summary-data__table" Then
            Set myTable = HTMLdoc.getElementsByTagName("table")(0)
            Set TDelements = myTable.getElementsByTagName("td")
            For Each TDelement In TDelements
                Search = LCase(Trim(TDelement.innerText))
                If Search = "" Then
                    Search = "Nothing Found"
                Else
                    Search = "*" & Search & "*"
                End If
                If bGetNext = True Then
                    Debug.Print TDelement.innerText 'Catch the data from the TDelement
                    bGetNext = False
                End If
                'Debug.Print TDelement.className '- useful for inspecting the HTML info
                If wMatrix Like Search Then
                    Debug.Print TDelement.innerText 'Show the string where searching for
                    bGetNext = True 'Trigger to catch the data from the next TDelement
                End If
                rij = rij + 1
            Next
        End If
    End With
    IE.Quit
End Sub

Thanks in advance.

Table Data.jpg
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Theoretically, this code should work, but it is not.

I will try to find out what is happening…

VBA Code:
Sub Web_Table()
Dim HTMLDoc As New HTMLDocument, objTable As Object, lRow As Long, tabc%
Dim rowc%, colv%, ActRw%, objIE As InternetExplorer, bod As Object, i%
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "https://www.nasdaq.com/market-activity/stocks/arch"
Application.Wait (Now + TimeValue("0:00:08"))
HTMLDoc.body.innerHTML = objIE.Document.body.innerHTML
With HTMLDoc.body
    Set objTable = .getElementsByTagName("table")
    For tabc = 0 To objTable.Length - 1             ' all tables
        MsgBox objTable(tabc).ClassName & vbLf & objTable(tabc).Rows.Length, , "row count"
        For rowc = 0 To objTable(tabc).Rows.Length - 1
            For colv = 0 To objTable(tabc).Rows(rowc).Cells.Length - 1
                ThisWorkbook.Sheets("Sheet1").Cells(ActRw + rowc + 1, colv + 1) = _
                objTable(tabc).Rows(rowc).Cells(colv).innerText
            Next
        Next
        ActRw = ActRw + objTable(tabc).Rows.Length + 1
    Next
End With
Set bod = objTable(0).getElementsByTagName("tbody")
For i = 0 To bod.Length - 1
    MsgBox bod(i).getElementsByTagName("tr").Length, , "TR count"
Next
objIE.Quit
End Sub
 
Upvote 0
If the web page is already loaded in an IE window, this code correctly identifies the desired table.

I will be back tomorrow to continue…

VBA Code:
Sub demo()
Dim ie As Object, doc As New HTMLDocument, tb As Object, i%
Set ie = GetIE
doc.body.innerHTML = ie.Document.body.innerHTML
With doc.body
    Set tb = .getElementsByTagName("table")
    MsgBox tb.Length, , "Number of tables"
    For i = 0 To tb.Length - 1
        MsgBox tb(i).Rows.Length, , "Number of rows"
    Next
End With
End Sub

Function GetIE() As Object
For Each GetIE In CreateObject("Shell.Application").Windows()
    If (Not GetIE Is Nothing) And _
    GetIE.Name = "Internet Explorer" Then Exit For  ' found
Next
If GetIE Is Nothing Then _
Set GetIE = CreateObject("InternetExplorer.Application")
GetIE.Visible = True
End Function
 
Upvote 0
I did it with Chrome:

VBA Code:
Public d As ChromeDriver

Sub AuM()
Dim col As Object
Set d = New ChromeDriver
With d
    .Timeouts.PageLoad = 30000:    .Timeouts.Server = 30000
    .get "https://www.nasdaq.com/market-activity/stocks/arch"
    .Wait 2300
    While .ExecuteScript("return document.readyState") <> "complete"
        .Wait (5000)
    Wend
    MsgBox .ExecuteScript("return document.readyState")
    Application.Wait Now + TimeValue("0:00:02")
    Set col = .FindElementByXPath _
    ("/html/body/div[2]/div/main/div/div[5]/div/div[1]/div/div[2]/table")
    col.ScrollIntoView                                  ' force table to load
    .Wait 2000
    col.AsTable.ToExcel ThisWorkbook.Worksheets("Sheet1").[a1]
End With
End Sub
 
Upvote 0
I did it with Chrome:

VBA Code:
Public d As ChromeDriver

Sub AuM()
Dim col As Object
Set d = New ChromeDriver
With d
    .Timeouts.PageLoad = 30000:    .Timeouts.Server = 30000
    .get "https://www.nasdaq.com/market-activity/stocks/arch"
    .Wait 2300
    While .ExecuteScript("return document.readyState") <> "complete"
        .Wait (5000)
    Wend
    MsgBox .ExecuteScript("return document.readyState")
    Application.Wait Now + TimeValue("0:00:02")
    Set col = .FindElementByXPath _
    ("/html/body/div[2]/div/main/div/div[5]/div/div[1]/div/div[2]/table")
    col.ScrollIntoView                                  ' force table to load
    .Wait 2000
    col.AsTable.ToExcel ThisWorkbook.Worksheets("Sheet1").[a1]
End With
End Sub
If the web page is already loaded in an IE window, this code correctly identifies the desired table.

I will be back tomorrow to continue…

VBA Code:
Sub demo()
Dim ie As Object, doc As New HTMLDocument, tb As Object, i%
Set ie = GetIE
doc.body.innerHTML = ie.Document.body.innerHTML
With doc.body
    Set tb = .getElementsByTagName("table")
    MsgBox tb.Length, , "Number of tables"
    For i = 0 To tb.Length - 1
        MsgBox tb(i).Rows.Length, , "Number of rows"
    Next
End With
End Sub

Function GetIE() As Object
For Each GetIE In CreateObject("Shell.Application").Windows()
    If (Not GetIE Is Nothing) And _
    GetIE.Name = "Internet Explorer" Then Exit For  ' found
Next
If GetIE Is Nothing Then _
Set GetIE = CreateObject("InternetExplorer.Application")
GetIE.Visible = True
End Function
I have the exact same problem -- Code works if I use IE but does not pick up tags if I use MSXML2.XMLHTTP60

The one thing I have noticed is that the HTML code has MISSING Tags are between <ng-transclude> .... and <\ng-transclude>

Does anyone have any suggestions on how to fix this?
 
Upvote 0

Forum statistics

Threads
1,224,813
Messages
6,181,107
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top