Brickhouse
New Member
- Joined
- Sep 5, 2016
- Messages
- 29
Hey, so essentially I wrote some code to do a bit of web data scraping, and although the code is not pretty, it "works" for the most part.
The major issue I am running into though is error handling. Initially I didn’t have any error handling, and because some of the html variables can be absent (for example: "li_detail_params size"), i was getting errors. I tried to temporarily solve this with the "on error/next" error handling, but this doesnt actually solve my issue which is: output data structure i.e. prices/sizes etc are getting matched to the wrong locations when printed to cells. What is needed is an error handling method that adds a blank or a dummy value into the cell(s) which have adsent html variables and then continue on with my loop.
Hope this post clearly outlines the issue i am having, and thanks in advance for the feedback!
The major issue I am running into though is error handling. Initially I didn’t have any error handling, and because some of the html variables can be absent (for example: "li_detail_params size"), i was getting errors. I tried to temporarily solve this with the "on error/next" error handling, but this doesnt actually solve my issue which is: output data structure i.e. prices/sizes etc are getting matched to the wrong locations when printed to cells. What is needed is an error handling method that adds a blank or a dummy value into the cell(s) which have adsent html variables and then continue on with my loop.
Hope this post clearly outlines the issue i am having, and thanks in advance for the feedback!
Code:
sub DataDump()
'Declaring Variables
Dim element As IHTMLElement
Dim elements As IHTMLElementCollection
Dim ie As InternetExplorer
Dim html As HTMLDocument
'open Internet Explorer, and go to website
Set ie = New InternetExplorer
ie.Visible = True
ie.navigate "https://www.Website.com" 'Scrubbed
'Wait until IE has loaded the web page
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set html = ie.document
Set elements = html.getElementsByClassName("media-heading")
Dim count As Long
Dim erow As Long
count = 0
On Error Resume Next 'error handling
For Each element In elements
If element.className = "media-heading" Then
'print html values to cells
erow = Blad1.Cells(Rows.count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1) = html.getElementsByTagName("h4")(count).innerText
Cells(erow, 2) = html.getElementsByClassName("li_detail_params first price")(count).innerText
Cells(erow, 3) = html.getElementsByClassName("li_detail_params rooms")(count).innerText
Cells(erow, 4) = html.getElementsByClassName("li_detail_params size")(count).innerText
count = count + 1
End If
Next element
If Err.Number <> 0 Then Debug.Print Err.Description
On Error GoTo 0
MsgBox ("Done")