Hi, I'm trying to figure out how to get HTML data from websites into an Excel sheet using VBA. I've been looking around for hours, specifically trying to figure out how to get the pagename and pageid values (that's just to start; after that I want to get a lot more, like website images, etc.).
There seems to be a lot out there about getting tag info like <div> etc., but I can't seem to find anything on extracting pagename and pageid. Both occur in the header of the HTML code, and I can clearly see what they are if I copy the code using right-click > Inspect Element on any given webpage. But I can't seem to get at them from VBA.
This particular thread from this forum looks promising, but I'm having a problem with the code in it and I'm hoping someone can help.
Link:
http://www.mrexcel.com/forum/excel-questions/391035-trying-extract-data-html-header.html
Here's the code from john_w:
Code:
Sub Test()
    PopValues "http://www.mrexcel.com", 1, "Sheet1"
End Sub

Sub PopValues(ByVal HyperString As String, ByVal RowCounter As Integer, ByVal SheetName As String)
    Dim DetailPageHTMLStringDocument As HTMLDocument, DetailPageTEXTStringDocument As Object
    Dim DetailPageHTMLString As String, DetailPageTEXTString As String
    Dim StartLink1, StartLink2, StartLink3, StartLink4, StartLink5, StartLink6 As Integer
    Dim EndLink1, EndLink2, EndLink3, EndLink4, EndLink5, EndLink6 As Integer
    Dim TmpRng As String
    Dim Cutstring1, Cutstring2, Cutstring3, Cutstring4, Cutstring5, Cutstring6 As String
    Dim Counter As Long
    Dim IEforContractDetail As Object

    Set IEforContractDetail = CreateObject("InternetExplorer.Application")
    Set DetailPageHTMLStringDocument = .Document
    With IEforContractDetail
        .Visible = True
        .Navigate HyperString ' should work for any URL
        Do Until .ReadyState = 4: DoEvents: Loop
        Do Until .Document.ReadyState = "complete": DoEvents: Loop
        Set DetailPageHTMLStringDocument = .Document
    End With
    Get_and_Print_Head_Element DetailPageHTMLStringDocument
End Sub
I'm getting this error: "Invalid or unqualified reference"
on this line: Set DetailPageHTMLStringDocument = .Document
".Document" is the portion highlighted in the error.
This happens whether I Dim the variable DetailPageHTMLStringDocument As Object or As HTMLDocument.
I have the references for the Microsoft HTML Object Library and Microsoft Internet Controls turned on, as well as several others.
I'm not sure how to fix this and would appreciate any help or direction.
Every time I think I'm starting to know VBA pretty well, I am humbled by some little thing like this. This website has been so great for me over the years. I have contributed back a little in the way of answers, but not nearly enough compared to what I have received. I need to make more of an effort in this area, and I will.
Again, any assistance is much appreciated.