First of all, I have read a lot of different answers online on this topic, but I have to admit I am really struggling to adapt them to what I need, so any help is very much appreciated!
In short: can anyone help me find the correct table tag for the code below, or advise where I am going wrong?
I need to extract the data listed on the following webpage (pages 1-7), i.e. Fund Name, Price, Currency etc., and pull it through to Excel: +Digital – Fund Centre (https://toolkit.financialexpress.net/santanderam)
I have the code below, which opens the page in IE (this part is working):
Code:
' requires references (Tools > References) to "Microsoft Internet Controls" (SHDocVw)
' and "Microsoft HTML Object Library" (MSHTML)

' return the document containing the DOM of the page strWebAddress
' returns Nothing if the timeout lngTimeoutInSeconds was reached
Public Function GetIEDocument(ByVal strWebAddress As String, Optional ByVal lngTimeoutInSeconds As Long = 15) As MSHTML.HTMLDocument
    Dim IE As SHDocVw.InternetExplorer
    Dim IEDocument As MSHTML.HTMLDocument
    Dim dateNow As Date

    ' create an IE application, representing a tab
    Set IE = New SHDocVw.InternetExplorer
    ' optionally make the application visible, though it will work perfectly fine in the background otherwise
    IE.Visible = True

    ' open a webpage in the tab represented by IE and wait until the main request has finished
    ' gives up after lngTimeoutInSeconds (the function then returns Nothing)
    IE.Navigate strWebAddress
    dateNow = Now
    Do While IE.Busy
        If Now > DateAdd("s", lngTimeoutInSeconds, dateNow) Then Exit Function
        DoEvents ' keep Excel responsive while waiting
    Loop

    ' retrieve the webpage's content (that is, the HTML DOM) and wait until everything is loaded (images, etc.)
    ' gives up after lngTimeoutInSeconds (the function then returns Nothing)
    Set IEDocument = IE.Document
    dateNow = Now
    Do While IEDocument.ReadyState <> "complete"
        If Now > DateAdd("s", lngTimeoutInSeconds, dateNow) Then Exit Function
        DoEvents
    Loop

    Set GetIEDocument = IEDocument
End Function
However, I cannot find the table tag that contains all the other tags I am interested in, which the rest of the code needs in order to pull the data through. This is what I have so far:
Code:
Public Sub GetTeamData()
    Dim strWebAddress As String
    Dim strH2AnchorContent As String
    Dim IEDocument As MSHTML.HTMLDocument
    Dim objH2 As MSHTML.HTMLHeaderElement
    Dim objTable As MSHTML.HTMLTable
    Dim objRow As MSHTML.HTMLTableRow
    Dim objCell As MSHTML.HTMLTableCell
    Dim lngRow As Long
    Dim lngColumn As Long

    ' initialize some variables that should probably be passed as parameters or defined as constants
    strWebAddress = "https://toolkit.financialexpress.net/santanderam"
    strH2AnchorContent = "Contact"

    ' open page
    Set IEDocument = GetIEDocument(strWebAddress)
    If IEDocument Is Nothing Then
        MsgBox "Timeout reached opening this address:" & vbNewLine & strWebAddress, vbCritical
        Exit Sub
    End If

    ' retrieve anchor element (objH2 is Nothing after the loop if no h2 matched)
    For Each objH2 In IEDocument.getElementsByTagName("h2")
        If objH2.innerText = strH2AnchorContent Then Exit For
    Next objH2
    If objH2 Is Nothing Then
        MsgBox "Could not find """ & strH2AnchorContent & """ in DOM!", vbCritical
        Exit Sub
    End If

    ' traverse HTML tree to desired table element
    ' * move up one element in the hierarchy
    ' * skip two elements to proceed to the third (interjected each time with whitespace that is interpreted as an element of its own)
    ' * move down two elements in the hierarchy
    Set objTable = objH2.parentElement _
        .NextSibling.NextSibling _
        .NextSibling.NextSibling _
        .NextSibling.NextSibling _
        .Children(0) _
        .Children(0)

    ' iterate over the table and output its contents, one worksheet cell per table cell
    lngRow = 1
    For Each objRow In objTable.Rows
        lngColumn = 1
        For Each objCell In objRow.Cells
            Cells(lngRow, lngColumn).Value = objCell.innerText
            lngColumn = lngColumn + 1
        Next objCell
        lngRow = lngRow + 1
    Next objRow
End Sub
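To work out which table actually holds the fund data, one rough idea is to list every table in the document with its id, class and row count, and pick the right one from that list. This is only a sketch: ListTables is a name I made up, and it assumes the data sits in an ordinary table element of the main document (not inside a frame, and not loaded later by script), which I have not confirmed for this page.

```vba
' Hypothetical helper (not part of the code above): returns one line per <table>
' in the document, showing its index, id, class name and number of rows.
Public Function ListTables(ByVal doc As MSHTML.HTMLDocument) As String
    Dim objTables As Object
    Dim objTable As MSHTML.HTMLTable
    Dim lngIndex As Long
    Dim strResult As String

    Set objTables = doc.getElementsByTagName("table")
    For lngIndex = 0 To objTables.Length - 1
        Set objTable = objTables.Item(lngIndex)
        strResult = strResult & lngIndex & ": id=""" & objTable.ID & """" & _
                    " class=""" & objTable.className & """" & _
                    " rows=" & objTable.Rows.Length & vbNewLine
    Next lngIndex

    ListTables = strResult
End Function
```

Calling Debug.Print ListTables(IEDocument) right after the "open page" check in GetTeamData would print the list to the Immediate window. If the list comes back empty, the data is probably not in a plain table in the main document at all.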
I am assuming the above will work if I can locate the correct table tag to enter in the line below:
Code:
strH2AnchorContent = " "
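Alternatively, instead of anchoring on an h2 heading, the table might be picked out by the text in its first row. A minimal sketch, assuming the fund table has a header row containing the text "Fund Name" (a guess based on the column names shown on the page); FindTableByHeader is a hypothetical name, not something from the code above:

```vba
' Hypothetical helper: returns the first table whose first row contains strHeaderText,
' or Nothing if no table matches.
Public Function FindTableByHeader(ByVal doc As MSHTML.HTMLDocument, ByVal strHeaderText As String) As MSHTML.HTMLTable
    Dim objTables As Object
    Dim objTable As MSHTML.HTMLTable
    Dim lngIndex As Long

    Set objTables = doc.getElementsByTagName("table")
    For lngIndex = 0 To objTables.Length - 1
        Set objTable = objTables.Item(lngIndex)
        If objTable.Rows.Length > 0 Then
            ' vbTextCompare makes the match case-insensitive
            If InStr(1, objTable.Rows.Item(0).innerText, strHeaderText, vbTextCompare) > 0 Then
                Set FindTableByHeader = objTable
                Exit Function
            End If
        End If
    Next lngIndex
End Function
```

In GetTeamData this would replace the h2 lookup and the NextSibling chain with Set objTable = FindTableByHeader(IEDocument, "Fund Name"), followed by a check for Nothing. If tables are nested, an outer layout table may match first, so the result would still need checking.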
Again, any help would be really appreciated!
Thanks