Pull Data From Website

mikeypsooner

New Member
Joined
May 16, 2017
Messages
33
I would like to pull the data from the tables on the webpage https://finance.yahoo.com/quote/VTI/holdings?p=VTI

Looking to get the Overall Portfolio Composition (%), Sector Weightings (%), Equity Holdings, Bond Ratings, Top 10 Holdings (14.91% of Total Assets), and Fund Overview in tables similar the one below.

Not to familiar with
getElementsByClassName, getElementsByTagName, etc. I general know how to locate the table information but not sure how to write the code to get getElementsByClassName, getElementsByTagName, etc. When I look at the HTML code I am not sure which one to grab for my specific data to get to a table. Maybe some useful vba and html documents to look at?

Any help is much appreciated.


Option Explicit

Sub GetStockData()

Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim HTMLRows As MSHTML.IHTMLElementCollection
Dim HTMLRow As MSHTML.IHTMLElement
Dim wksDest As Worksheet
Dim r As Long

Set wksDest = Sheet2
wksDest.Cells.Clear

Application.ScreenUpdating = False

IE.navigate "https://finance.yahoo.com/quote/GOOGL?p=GOOGL"
IE.Visible = False

With IE
Do While .Busy Or .readyState <> READYSTATE_COMPLETE
DoEvents
Loop
End With

Set HTMLDoc = IE.document

With HTMLDoc.getElementsByClassName("D(ib) Fw(200) Mend(20px)")(0)
wksDest.Range("A1").Value = .Children(0).innerText
wksDest.Range("B1").Value = .Children(1).innerText
End With

Set HTMLRows = HTMLDoc.getElementsByClassName("D(ib) W(1/2) Bxz(bb)")(0).getElementsByTagName("tr")

r = 3
For Each HTMLRow In HTMLRows
wksDest.Cells(r, "A").Value = HTMLRow.Cells(0).innerText
wksDest.Cells(r, "B").Value = HTMLRow.Cells(1).innerText
r = r + 1
Next HTMLRow

Set HTMLRows = HTMLDoc.getElementsByClassName("D(ib) W(1/2) Bxz(bb)")(1).getElementsByTagName("tr")

r = 3
For Each HTMLRow In HTMLRows
wksDest.Cells(r, "D").Value = HTMLRow.Cells(0).innerText
wksDest.Cells(r, "E").Value = HTMLRow.Cells(1).innerText
r = r + 1
Next HTMLRow

Sheet2.Activate

Application.ScreenUpdating = True

MsgBox "Completed...", vbInformation

Set IE = Nothing
Set HTMLDoc = Nothing
Set HTMLRows = Nothing
Set HTMLRow = Nothing
Set wksDest = Nothing

End Sub
 
I tried the code below but it has an error on the HTMLDoc.outerhtml line. I just tried to paste the source code in cell A1.


Code:
Sub GetHoldings()


    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim HTMLEles As MSHTML.IHTMLElementCollection
    Dim HTMLEle As MSHTML.IHTMLElement
    Dim wksDest As Worksheet
    Dim r As Long
    Dim c As Long
    Dim i As Long
    Dim j As Long
    
    Set wksDest = Sheet4
    wksDest.Cells.Clear
    
    Application.ScreenUpdating = False
    
    IE.navigate "https://finance.yahoo.com/quote/VTI/holdings?p=VTI"
    IE.Visible = True
    
    With IE
        Do While .Busy Or .readyState <> READYSTATE_COMPLETE
            DoEvents
        Loop
    End With
    
    Set HTMLDoc = IE.document
    
    wksDest.Range("A1").Value = HTMLDoc.outerHTML
    IE.Quit
End Sub
 
Upvote 0

Excel Facts

Can you sort left to right?
To sort left-to-right, use the Sort dialog box. Click Options. Choose "Sort left to right"
so i was looking at the table...

hTQxGNn.png


this value is unique and precedes the word Category...

<span data-reactid="29">


ZhOptb1.png

Seu2EnQ.png

hdHocQb.png


program the pattern into the loop
 
Last edited:
Upvote 0
Code:
Dim index, tableIndex, removeIndex As Long
Dim searchValue, parseValue As String
Dim switch as Boolean

switch = False
index  = 1
tableIndex = 29

Do
    searchValue = Iif(switch,"data-reactid=""" & tableIndex & """>", "<span data-reactid=""" & tableIndex & """>")
    index = InStr(index, HTMLDoc.outerHTML, searchValue)
    If index > 0 Then
        index = index + Len(searchValue)

        [COLOR=#00ff00]'now index is at category for the first loop when tableIndex = 29[/COLOR]

        removeIndex = InStr(index, HTMLDoc.outerHTML, " 
        parseValue = Mid(HTMLDoc.outerHTML, index, removeIndex - index)

        [COLOR=#00ff00]'the value "Category" is now saved in parseValue, so save it somewhere  and let your loop keep grabbing stuff...
        'you have to code it so you know  what to do with everything this loop gets... this loop ends when that  first InStr
        'function fails to find the search value[/COLOR]
    End If

    tableIndex = tableIndex + Iif(switch, 4, 1)
    switch = Not switch [COLOR=#00ff00]'flip the switch[/COLOR]
Loop While index > 0 [COLOR=#00ff00]'if the index is 1 based which i think strings are[/COLOR]
 
Last edited:
Upvote 0
mr excel could not handle the html code and wouldnt display properly so that is a screenshot of the complete code, posting this is driving me crazy xD
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,182
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top