Not GetElementsByClassName String a Total Nightmare

MisterZ

New Member
Joined
Apr 18, 2016
Messages
6
Goodday the forum. This a question regarding web data grab. GetElementByClassName will not work. Not much good at programming full stop but can loop in Pascal that is about it so I rely heavily on what is on the net. To make sure that it is not a typing error I copied and pasted a vba macro from this site automate web scraping using class with vba | Excel VBA Training Videos
It works for them but not for me.

I have tried
With CreateObject("msxml2.xmlhttp") .Open "GET", Web_URL, False
.send
HTML_Content.body.innerHTML = .responseText
End With
but can only copy the HTML_Content.body.innerHTML into a string variable. The data i am chasing is in the string because i used instring function to count the number of times the class name was used. The string looks like a Total Nightmare and extracting data out of it looks even worse!!! Freaking Fantastic!
I am using Excel 2007 with SP3, I have enabled in the VBA editor References, Microsoft HTML Object Library and Microsoft Internet Controls.
Could it be that I need to enable some further capability within the excel program.
Thanks for your time
MisterZ
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
Hello MisterZ,

This macro will return the complete page source for the URL as a string in the global variable PageSrc. A second global variable holds the HTML Document Object called HTMLdoc.

Code:
' Written: December 31, 2013
' Author:  Leith Ross
' Summary: Returns the HTML DOM from a given URL.

Global HTMLdoc  As Object
Global PageSrc  As String

Function GetHTMLdocument(ByVal URL As String) As Object
      
      ' Intialize Global variables.
        PageSrc = ""
        Set HTMLdoc = Nothing
        
      ' Retrieve the web page's HTML code (page source) from the server.
        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", URL, True
            .Send
        
            While .readyState <> 4: DoEvents: Wend
        
          ' Check for any connection errors.
            If .statusText <> "OK" Then
                MsgBox "ERROR: " & .Status & " - " & .statusText, vbExclamation
                Exit Function
            End If
        
            PageSrc = .ResponseText
        End With
            
      ' Create an empty HTML Document.
        Set HTMLdoc = CreateObject("htmlfile")
        
      ' Convert the Page Source into an HTML document.
        HTMLdoc.body.innerHTML = PageSrc
        
      ' Close the HTML file.
        HTMLdoc.Close
        
      ' Return the HTML Document Object.
        Set GetHTMLdocument = HTMLdoc
        
End Function
 
Upvote 0
Hi Leith,

thanks for the reply. The Document Object HTMLdoc holds the web page source as a html file that can be searched for html tags and Class Name etc. is this correct? I believe I have, but not sure, a html file of some sort but the problem is searching by using GetelementsbyClassName.

In one attempt variable types were declared as:
Dim element As IHTMLElementDim elements As IHTMLElementCollection

and the line results in an error when searching for a class name, class='results'
Set elements = html.getElementsByClassName("results")
and
Set element = html.getElementsByClassName("results")

I forget what error it said not supported or wrong way of going about it

I appreciate your help.

MisterZ

Hello MisterZ,

This macro will return the complete page source for the URL as a string in the global variable PageSrc. A second global variable holds the HTML Document Object called HTMLdoc.

Code:
' Written: December 31, 2013
' Author:  Leith Ross
' Summary: Returns the HTML DOM from a given URL.

Global HTMLdoc  As Object
Global PageSrc  As String

Function GetHTMLdocument(ByVal URL As String) As Object
      
      ' Intialize Global variables.
        PageSrc = ""
        Set HTMLdoc = Nothing
        
      ' Retrieve the web page's HTML code (page source) from the server.
        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", URL, True
            .Send
        
            While .readyState <> 4: DoEvents: Wend
        
          ' Check for any connection errors.
            If .statusText <> "OK" Then
                MsgBox "ERROR: " & .Status & " - " & .statusText, vbExclamation
                Exit Function
            End If
        
            PageSrc = .ResponseText
        End With
            
      ' Create an empty HTML Document.
        Set HTMLdoc = CreateObject("htmlfile")
        
      ' Convert the Page Source into an HTML document.
        HTMLdoc.body.innerHTML = PageSrc
        
      ' Close the HTML file.
        HTMLdoc.Close
        
      ' Return the HTML Document Object.
        Set GetHTMLdocument = HTMLdoc
        
End Function
 
Upvote 0
To use getElementsByClassName you need to use early binding - it doesn't work with late binding. You need to add a reference to Microsoft HTML object library and then declare your variable as HTMLDocument.

You'll also need at least ie9 installed on the machine
 
Upvote 0

Forum statistics

Threads
1,223,723
Messages
6,174,123
Members
452,546
Latest member
Rafafa

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top