How to pull listed data from several web pages into excel

smccull

New Member
Joined
Jul 26, 2017
Messages
2
First of all I have read ALOT of different answers online in relation to this topic but I have to admit I am really struggling to adapt them to what I need so please any help is very much appreciated!

I need to extract the data listed on the following webpage (Pages 1-7) i.e. Fund Name, Price, Currency etc +Digital – Fund Centre and pull this data through to excel.

I have the below code that will open the IE page (which is working) :

Code:
' return the document containg the DOM of the page strWebAddress
' returns Nothing if the timeout lngTimeoutInSeconds was reached

Public Function GetIEDocument(ByVal strWebAddress As String, Optional ByVal lngTimeoutInSeconds As Long = 15) As MSHTML.HTMLDocument
Dim IE As SHDocVw.InternetExplorer
Dim IEDocument As MSHTML.HTMLDocument
Dim dateNow As Date

' create an IE application, representing a tab
Set IE = New SHDocVw.InternetExplorer


' optionally make the application visible, though it will work perfectly fine in the background otherwise
IE.Visible = True

' open a webpage in the tab represented by IE and wait until the main request successfully finished'
 times out after lngTimeoutInSeconds with a warning
IE.Navigate strWebAddress
dateNow = Now
Do While IE.Busy    
If Now > DateAdd("s", lngTimeoutInSeconds, dateNow) Then Exit Function
Loop

' retrieve the webpage's content (that is, the HTML DOM) and wait until everything is loaded (images, etc.)
' times out after lngTimeoutInSeconds with a warning

Set IEDocument = IE.Document
dateNow = Now
Do While IEDocument.ReadyState <> "complete"    
If Now > DateAdd("s", lngTimeoutInSeconds, dateNow) Then Exit Function
LoopSet GetIEDocument = IEDocument

End Function


However I cannot find the table tag that contains all other tags that I am interested to allow the rest of the code to pull through the data, The below code is what I have so far:

Code:
Public Sub GetTeamData()
    Dim strWebAddress As String
    Dim strH2AnchorContent As String
    Dim IEDocument As MSHTML.HTMLDocument
    Dim objH2 As MSHTML.HTMLHeaderElement
    Dim objTable As MSHTML.HTMLTable
    Dim objRow As MSHTML.HTMLTableRow
    Dim objCell As MSHTML.HTMLTableCell
    Dim lngRow As Long
    Dim lngColumn As Long
    ' initialize some variables that should probably better be passed as paramaters or defined as constants
    strWebAddress = "[url=https://toolkit.financialexpress.net/santanderam]+Digital – Fund Centre[/url]"
    strH2AnchorContent = "Contact"
    ' open page
    Set IEDocument = GetIEDocument(strWebAddress)
    If IEDocument Is Nothing Then
        MsgBox "Timeout reached opening this address:" & vbNewLine & strWebAddress, vbCritical
        Exit Sub
    End If
    ' retrieve anchor element
    For Each objH2 In IEDocument.getElementsByTagName("h2")
        If objH2.innerText = strH2AnchorContent Then Exit For
    Next objH2
    If objH2 Is Nothing Then
        MsgBox "Could not find """ & strH2AnchorContent & """ in DOM!", vbCritical
        Exit Sub
    End If
     ' traverse HTML tree to desired table element
    ' * move up one element in the hierarchy
    ' * skip two elements to proceed to the third (interjected each time with whitespace that is interpreted as an element of its own)
    ' * move down two elements n the hierarchy
    Set objTable = objH2.parentElement _
                        .NextSibling.NextSibling _
                        .NextSibling.NextSibling _
                        .NextSibling.NextSibling _
                        .Children(0) _
                        .Children(0)
    ' iterate over the table and output its contents
    lngRow = 1
    For Each objRow In objTable.Rows
        lngColumn = 1
        For Each objCell In objRow.Cells
            Cells(lngRow, lngColumn) = objCell.innerText
            lngColumn = lngColumn + 1
        Next objCell
        lngRow = lngRow + 1
    Next
End Sub

am assuming if I can locate the correct table tag to enter in the line below:

Code:
<code> strH2AnchorContent = "  "</code>
Then the above will work? If so can anyone help with finding the correct tag or advise where I am going wrong with the above?
Again any help would be REALLY appreciated!
Thanks
 

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.

Forum statistics

Threads
1,224,826
Messages
6,181,192
Members
453,021
Latest member
pingpong7117

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top