Can't find META tags in HTML when using WinHTTP Request

Julesdude

Hi all,

I am trying to locate Meta tags from the following URL:

https://england.shelter.org.uk/prof...ary_folder/2020_group_-_fiscal_stimulus_paper

However, after I pass the result string to HTMLDoc, only part of the actual HTML is there: the META tags at the beginning of the document are absent, so they cannot be found. How do I overcome this?

My code is as follows:

Code:
Dim Http2 As New WinHttpRequest
Dim HTMLDoc As MSHTML.HTMLDocument
Dim Elements As Object
Dim singleElement As Object
Dim result As String
Dim i As Long

Set HTMLDoc = New MSHTML.HTMLDocument
i = 1    ' first output row; Cells(0, ...) would raise an error

    ' send request (url holds the address above)
    Http2.Open "GET", url, False
    Http2.send
    result = Http2.responseText

    ' pass text of HTML document returned
    HTMLDoc.body.innerHTML = result

    Set Elements = HTMLDoc.all.tags("META")

    For Each singleElement In Elements
        ActiveSheet.Cells(i, "A") = url
        ActiveSheet.Cells(i, "B") = "META " & singleElement.Name
        ActiveSheet.Cells(i, "D").NumberFormat = "@" ' text format for date
        ActiveSheet.Cells(i, "D") = singleElement.Content
        i = i + 1
    Next singleElement

    Set Elements = Nothing
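
One possible workaround, offered as a minimal sketch rather than a tested fix for this particular page: assigning the response to the body discards whatever belongs in the head, so the idea is to write the whole response into a late-bound "htmlfile" document instead, so that the head is parsed as well. The names GetMetaTags and ListMetaTags are hypothetical, and "WinHttp.WinHttpRequest.5.1" is assumed as the late-bound equivalent of the WinHttpRequest reference used above.

```vba
' Hypothetical helper: returns one "name=content" string per META tag in html.
Function GetMetaTags(ByVal html As String) As Collection
    Dim doc As Object, metas As Object
    Dim found As New Collection
    Dim i As Long

    Set doc = CreateObject("htmlfile")   ' late-bound MSHTML document
    doc.write html                       ' parses the whole document, head included
    doc.Close

    Set metas = doc.getElementsByTagName("meta")
    For i = 0 To metas.Length - 1
        ' Name is empty for tags that only carry http-equiv or property attributes
        found.Add metas(i).Name & "=" & metas(i).Content
    Next i

    Set GetMetaTags = found
End Function

' Hypothetical caller: fetches url and prints its META tags to the Immediate window.
Sub ListMetaTags(ByVal url As String)
    Dim req As Object, entry As Variant

    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    req.Open "GET", url, False
    req.send

    For Each entry In GetMetaTags(req.responseText)
        Debug.Print entry
    Next entry
End Sub
```

Keeping the parsing in its own function means it can be tried on a short hard-coded HTML string before pointing it at the live page.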
 
Hi John, thanks again for all your help. The first example in your last post worked, so I can now retrieve the author text from the page.
The second example, which I added just after the first, didn't work for me, however. I Dim HTMLdoc As New HTMLDocument and then assign the response to it: HTMLDoc.body.innerHTML = .responseText
The first example works and the author is extracted, but Debug.Print elements(0).src raises the error 'Object variable or With block variable not set'.

Ultimately, I want to find any image links in the body of the article and extract their references. There's usually only one, at the beginning of the article, but there may be a second further down. How do I capture these?
 
As I remember, and as I said in my previous post, with the WinHttpRequest method HTMLDocument.getElementsByTagName("IMG").Length is zero for some reason, even though the IMG tag is present in the .responseText. I therefore think you'll need to request the page with IE to extract the image links. To extract all image links:

Code:
    Dim HTMLdoc As HTMLDocument
    Dim elements As IHTMLElementCollection
    Dim imgElement As HTMLImg
    Set HTMLdoc = IE.document  'IE is InternetExplorer object with page loaded and complete
    Set elements = HTMLdoc.getElementsByTagName("IMG")
    For Each imgElement In elements
        Debug.Print imgElement.src
    Next
 
They aren't IMG tags. On the AMP version of the page the images are amp-img elements:
Code:
    Dim req As Object: Set req = CreateObject("msxml2.xmlhttp")
    req.Open "GET", "https://blog.shelter.org.uk/2018/08/flatlining-wages-surging-rents-and-a-national-affordability-crisis/amp/", False
    req.send
    With CreateObject("htmlfile")
        .body.innerHTML = req.responseText
        Debug.Print .getElementsByTagName("amp-img")(0).src
    End With
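
To pick up a second image further down the article as well, the same idea can be extended to loop over every amp-img element instead of reading only index 0. This is a rough sketch under the same assumptions as the snippet above (late-bound "htmlfile", and .src being readable on amp-img elements); the names GetAmpImageSources and ListAmpImages are hypothetical.

```vba
' Hypothetical helper: returns the src of every amp-img element found in html.
Function GetAmpImageSources(ByVal html As String) As Collection
    Dim doc As Object, imgs As Object
    Dim found As New Collection
    Dim i As Long

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = html

    Set imgs = doc.getElementsByTagName("amp-img")
    For i = 0 To imgs.Length - 1
        found.Add imgs(i).src            ' same property the snippet above reads
    Next i

    Set GetAmpImageSources = found
End Function

' Hypothetical caller: prints every image reference on the AMP page.
Sub ListAmpImages()
    Dim req As Object, src As Variant

    Set req = CreateObject("MSXML2.XMLHTTP")
    req.Open "GET", "https://blog.shelter.org.uk/2018/08/flatlining-wages-surging-rents-and-a-national-affordability-crisis/amp/", False
    req.send

    For Each src In GetAmpImageSources(req.responseText)
        Debug.Print src
    Next src
End Sub
```

If the collection comes back empty, it is worth checking the raw .responseText for "amp-img" first, to see whether the page uses a different tag.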
 