Pull Data From Website

mikeypsooner

New Member
Joined
May 16, 2017
Messages
33
I would like to pull the data from the tables on the webpage https://finance.yahoo.com/quote/VTI/holdings?p=VTI

Looking to get the Overall Portfolio Composition (%), Sector Weightings (%), Equity Holdings, Bond Ratings, Top 10 Holdings (14.91% of Total Assets), and Fund Overview in tables similar the one below.

Not to familiar with
getElementsByClassName, getElementsByTagName, etc. I general know how to locate the table information but not sure how to write the code to get getElementsByClassName, getElementsByTagName, etc. When I look at the HTML code I am not sure which one to grab for my specific data to get to a table. Maybe some useful vba and html documents to look at?

Any help is much appreciated.


Option Explicit

Sub GetStockData()

Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim HTMLRows As MSHTML.IHTMLElementCollection
Dim HTMLRow As MSHTML.IHTMLElement
Dim wksDest As Worksheet
Dim r As Long

Set wksDest = Sheet2
wksDest.Cells.Clear

Application.ScreenUpdating = False

IE.navigate "https://finance.yahoo.com/quote/GOOGL?p=GOOGL"
IE.Visible = False

With IE
Do While .Busy Or .readyState <> READYSTATE_COMPLETE
DoEvents
Loop
End With

Set HTMLDoc = IE.document

With HTMLDoc.getElementsByClassName("D(ib) Fw(200) Mend(20px)")(0)
wksDest.Range("A1").Value = .Children(0).innerText
wksDest.Range("B1").Value = .Children(1).innerText
End With

Set HTMLRows = HTMLDoc.getElementsByClassName("D(ib) W(1/2) Bxz(bb)")(0).getElementsByTagName("tr")

r = 3
For Each HTMLRow In HTMLRows
wksDest.Cells(r, "A").Value = HTMLRow.Cells(0).innerText
wksDest.Cells(r, "B").Value = HTMLRow.Cells(1).innerText
r = r + 1
Next HTMLRow

Set HTMLRows = HTMLDoc.getElementsByClassName("D(ib) W(1/2) Bxz(bb)")(1).getElementsByTagName("tr")

r = 3
For Each HTMLRow In HTMLRows
wksDest.Cells(r, "D").Value = HTMLRow.Cells(0).innerText
wksDest.Cells(r, "E").Value = HTMLRow.Cells(1).innerText
r = r + 1
Next HTMLRow

Sheet2.Activate

Application.ScreenUpdating = True

MsgBox "Completed...", vbInformation

Set IE = Nothing
Set HTMLDoc = Nothing
Set HTMLRows = Nothing
Set HTMLRow = Nothing
Set wksDest = Nothing

End Sub
 
Expression error as I can not use MsgBox Call ParseYahooFinanceTable(MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI"))

When I use Msgbox MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI") the message box is blank
 
Upvote 0

Excel Facts

Last used cell?
Press Ctrl+End to move to what Excel thinks is the last used cell.
Expression error as I can not use MsgBox Call ParseYahooFinanceTable(MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI"))

When I use Msgbox MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI") the message box is blank

this is invalid syntax...

Code:
MsgBox Call ParseYahooFinanceTable(MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI"))

this is valid...

Code:
Call ParseYahooFinanceTable(MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI"))

Did you do this??

<code class="prettyprint prettyprinted" style=""> ' make sure to include the Microsoft WinHTTP Services in the project
' tools -> references -> Microsoft WinHTTP Services, version 5.1</code>
 
Last edited:
Upvote 0
the source code is being returned from the MakeGetRequest function... within that function it is pulled from the request object after you did your send call
 
Upvote 0
On the link you provided here: VBScript WinHttp.WinHttpRequest.5.1 example with error handling

How do I get that code to work. Can you show an example of writing the html code in cell A1? Code is below
Code:
Option Explicit
'This is my sub to write the html string to cell A1 but it does not work. 
Sub test()
Range("A1").Value = (GetDataFromURL("http://www.808.dk/", "GET", ""))
 '
End Sub




Function GetDataFromURL(strURL, strMethod, strPostData)
  Dim lngTimeout
  Dim strUserAgentString
  Dim intSslErrorIgnoreFlags
  Dim blnEnableRedirects
  Dim blnEnableHttpsToHttpRedirects
  Dim strHostOverride
  Dim strLogin
  Dim strPassword
  Dim strResponseText
  Dim objWinHttp
  lngTimeout = 59000
  strUserAgentString = "http_requester/0.1"
  intSslErrorIgnoreFlags = 13056 ' 13056: ignore all err, 0: accept no err
  blnEnableRedirects = True
  blnEnableHttpsToHttpRedirects = True
  strHostOverride = ""
  strLogin = ""
  strPassword = ""
  Set objWinHttp = CreateObject("WinHttp.WinHttpRequest.5.1")
  objWinHttp.SetTimeouts lngTimeout, lngTimeout, lngTimeout, lngTimeout
  objWinHttp.Open strMethod, strURL
  If strMethod = "POST" Then
    objWinHttp.setRequestHeader "Content-type", _
      "application/x-www-form-urlencoded"
  End If
  If strHostOverride <> "" Then
    objWinHttp.setRequestHeader "Host", strHostOverride
  End If
  objWinHttp.Option(0) = strUserAgentString
  objWinHttp.Option(4) = intSslErrorIgnoreFlags
  objWinHttp.Option(6) = blnEnableRedirects
  objWinHttp.Option(12) = blnEnableHttpsToHttpRedirects
  If (strLogin <> "") And (strPassword <> "") Then
    objWinHttp.SetCredentials strLogin, strPassword, 0
  End If
  On Error Resume Next
  objWinHttp.send (strPostData)
  If Err.Number = 0 Then
    If objWinHttp.Status = "200" Then
      GetDataFromURL = objWinHttp.responseText
    Else
      GetDataFromURL = "HTTP " & objWinHttp.Status & " " & _
        objWinHttp.StatusText
    End If
  Else
    GetDataFromURL = "Error " & Err.Number & " " & Err.Source & " " & _
      Err.Description
  End If
  On Error GoTo 0
  Set objWinHttp = Nothing
End Function
 
Upvote 0
i can try to test it later... not sure you mentioned but you added the reference to the project?
 
Upvote 0
well i was able to grab the html...

gphOOin.png


the problem is actuaklly a typo from me lol

MakeWebRequest is not the name of the function oops :(

Code:
Function MakeGetRequest(url As String) As String
    ' make sure to include the Microsoft WinHTTP Services in the project
    ' tools -> references -> Microsoft WinHTTP Services, version 5.1
    ' http://www.808.dk/?code-simplewinhttprequest
    ' http://msdn.microsoft.com/en-us/library/windows/desktop/aa384106(v=vs.85).aspx
    ' http://www.neilstuff.com/winhttp/


    ' create the request object
    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    
    ' make the request, http verb (method), url, false to force syncronous
    ' open(http method, absolute uri to request, async (true: async, false: sync)
    req.Open "GET", url, False


    req.SetRequestHeader "accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
    'req.SetRequestHeader "accept-encoding", "gzip, deflate, sdch, br"
    req.SetRequestHeader "accept-language", "en-US,en;q=0.8"
    req.SetRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"


    ' send request
    ' send post data, should be blank for a get request
    req.Send


    ' read response and return
    MakeGetRequest = req.responseText


End Function

my code has a bug in determining the correct cell (as you can see) but it works

TgCbIQV.png


Code:
Sub DoIt()
    Dim responseText As String
    
    responseText = MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI")


    Call ParseYahooFinanceTable(responseText)
End Sub
 
Last edited:
Upvote 0
OK HERE IS ALL THE CODE NO BUGS AND WORKING:

Code:
Sub DoIt()
    Dim responseText As String
    
    responseText = MakeGetRequest("https://finance.yahoo.com/quote/VTI/profile?p=VTI")


    Call ParseYahooFinanceTable(responseText)
End Sub


Sub ParseYahooFinanceTable(html As String)


    Dim index, tableIndex, removeIndex, row, col As Long
    Dim searchValue, parseValue As String
    Dim switch As Boolean


    row = 1
    col = 1
    switch = False
    index = 1
    tableIndex = 29


    Do
        searchValue = [COLOR=#ff0000]see image below[/COLOR]
        index = InStr(index, html, searchValue)
        
        If index > 0 Then
            index = index + Len(searchValue)


            removeIndex = [COLOR=#ff0000]see image below[/COLOR]
            Cells(row, col).Value2 = Mid(html, index, removeIndex - index)


            If switch Then
                row = row + 1
                col = col - 1
                tableIndex = tableIndex + 4
                switch = False
            Else
                col = col + 1
                tableIndex = tableIndex + 1
                switch = True
            End If
        Else
            MsgBox "Update your macro"
            Exit Sub
        End If
    Loop While tableIndex < 56
    
    Columns.AutoFit
End Sub


Function MakeGetRequest(url As String) As String
    ' make sure to include the Microsoft WinHTTP Services in the project
    ' tools -> references -> Microsoft WinHTTP Services, version 5.1
    ' http://www.808.dk/?code-simplewinhttprequest
    ' http://msdn.microsoft.com/en-us/library/windows/desktop/aa384106(v=vs.85).aspx
    ' http://www.neilstuff.com/winhttp/


    ' create the request object
    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    
    ' make the request, http verb (method), url, false to force syncronous
    ' open(http method, absolute uri to request, async (true: async, false: sync)
    req.Open "GET", url, False


    req.SetRequestHeader "accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
    'req.SetRequestHeader "accept-encoding", "gzip, deflate, sdch, br"
    req.SetRequestHeader "accept-language", "en-US,en;q=0.8"
    req.SetRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"


    ' send request
    ' send post data, should be blank for a get request
    req.Send


    ' read response and return
    MakeGetRequest = req.responseText


End Function

COPY MISSING LINES FROM THIS IMAGE (MR EXCEL DOES NOT LIKE HTML FORMATTED TEXT IN POSTS):

lUArwsJ.png


RESULT:

TYQgBCd.png
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,819
Messages
6,181,153
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top