VBA to Identify Webpart. Need IE HTML Expert.

Barklie

Board Regular
Joined
Jul 4, 2013
Messages
86
Hello,

I am trying to scrape data from a web query using VBA. The problem I have run into is that I can't find a way to copy the data I actually want. I have tried everything I could find on the forums (document.body document.innertext, document.outerHTML, etc.), but nothing is capturing the data I'm after. Below is my code as well as a screenshot of the data I would like to copy. I only want the first result in the query, but I am happy with pulling extraneous information (the whole page and all the results if necessary) as long as the data I want is included. I have tried to read through the HTML to find the location, but at this point I think it will take someone more talented with reading HTML than me.

Code:
Sub QueryScrape()

'Dimensions IE
Dim IE As Object

'Starts loop
Set IE = CreateObject("InternetExplorer.Application")
For i = 2 To 500
    strLink = "http://209.183.255.146/arcgis/rest/services/Solar/Buildings/MapServer/find?searchText=" & i - 1 & "&contains=false&searchFields=&sr=&layers=0&layerDefs=&returnGeometry=true&maxAllowableOffset=&geometryPrecision=&dynamicLayers=&returnZ=true&returnM=true&gdbVersion=&f=html"

'Opens IE
With IE
    .Visible = True
    .Navigate strLink
Do While IE.ReadyState <> 4
            DoEvents
Loop

'Moves data to worksheet
'Range ("A" & i) = WHERE I NEED A WAY TO REFERENCE THE DATA

'Closes IE
IE.Quit

End With

'Separates data into columns
'This is code I will write to separate the data from column A into columns A through L

Next i

End Sub

Data%20Screenshot_zpsaed8xeqw.jpg
[/URL][/IMG]
 

Excel Facts

Difference between two dates
Secret function! Use =DATEDIF(A2,B2,"Y")&" years"&=DATEDIF(A2,B2,"YM")&" months"&=DATEDIF(A2,B2,"MD")&" days"
If I add this to the code just before IE.quit the HTML from the page is returned to strHTML.
Code:
strHTML = IE.document.body.innerhtml
By the way, can't you use the same instance of IE each time rather than creating a new one for every URL?
 
Upvote 0
Thanks Norie. I have updated my code for both your suggestions. However IE.document.body.innerhtml does not have the data on that page either. The code runs fine now, however, the desired data is missing.

Code:
Sub QueryScrape()

'Dimensions IE
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")

'Starts loop
For i = 2 To 3
    strLink = "http://209.183.255.146/arcgis/rest/services/Solar/Buildings/MapServer/find?searchText=" & i - 1 & "&contains=false&searchFields=&sr=&layers=0&layerDefs=&returnGeometry=true&maxAllowableOffset=&geometryPrecision=&dynamicLayers=&returnZ=true&returnM=true&gdbVersion=&f=html"

'Opens IE
With IE
    .Visible = True
    .Navigate strLink
Do While IE.ReadyState <> 4
            DoEvents
Loop

'Moves data to worksheet
Range("A" & i) = IE.document.body.innerhtml

End With

'Separates data into columns
'This is code I will write to separate the data from column A into columns A through L

Next i

End Sub
 
Last edited:
Upvote 0
It's working for me, with that specific code I end up with 2 big chunks of HTML in A2 and A3.
 
Upvote 0
I do as well, but nowhere in those chunks of html is the query results. It's just the HTML code for the query form. If you go to the website Find (Solar/Buildings) (the one being queried in A2), you'll notice query results at the bottom of the page. Those results are not in the HTML chunk that comes up in A2. Thanks!
 
Upvote 0
When I run this code, which splits the HTML at 'results' I'm getting, well, the results in A2 and A3.
Code:
Sub QueryScrape()

'Dimensions IE
Dim IE As Object
Dim I As Long
Dim strLInk As String

    Set IE = CreateObject("InternetExplorer.Application")

    'Starts loop
    For I = 2 To 3
        strLInk = "http://209.183.255.146/arcgis/rest/services/Solar/Buildings/MapServer/find?searchText=" & I - 1 & "&contains=false&searchFields=&sr=&layers=0&layerDefs=&returnGeometry=true&maxAllowableOffset=&geometryPrecision=&dynamicLayers=&returnZ=true&returnM=true&gdbVersion=&f=html"

        'Opens IE
        With IE
            .Visible = True
            .Navigate strLInk
            Do While IE.ReadyState <> 4
                DoEvents
            Loop

            'Moves data to worksheet
            Range("A" & I) = Split(IE.document.body.innerhtml, "results")(1)

        End With

        'Separates data into columns
        'This is code I will write to separate the data from column A into columns A through L

    Next I

End Sub
 
Upvote 0
That's weird. The code I posted was not getting to the results. I just used yours and I am though. Thank you so much. Is there a way to split the results at the end as well. In other words, so that I only get what is between "results" and "] ,"?
 
Upvote 0
Not sure I follow, there are ']' characters throughout the code, specifically in the Polygon 'Field'.

For example.

Polygon:
[312199.98649999965, 4306153.988600001] , [312191.60759999976, 4306149.0605999995] , [312188.1150000002, 4306154.999299999]

What is it you actually want to get from the data?

PS If you wanted to split out each 'object' you could split on the HTML ul tag.
 
Upvote 0
I want pretty much all the variable from the first query results. So Column A will be Object ID, Column B will be Shape, etc., Column L will be the first polygon coordinates, so like [312199.98649999965, 4306153.988600001]. Everything after the first pair of coordinates from the first query result, I don't care about. Basically, I am interested everything from the Object ID from the first results to the first pair of coordinates from the first results.
 
Upvote 0
Again, not sure I follow - do you just want the 1st result?

For example for the first search you would just want this.


<ul>
<i>OBJECTID: </i> 1<br>
<i>Shape: </i> Polygon<br>
<i>Address: </i> 3650 COMMONWEALTH AV<br>
<i>Area: </i> 1362.513125<br>
<i>UsblArea: </i> 363.281625<br>
<i>kWh: </i> 50452.219844<br>
<i>PctArea: </i> 26.662615<br>
<i>SysSize: </i> 5.397327<br>
<i>Savings: </i> 593.70597<br>
<i>Shape_Length: </i> 45.39259<br>
<i>Shape_Area: </i> 126.582118<br>
<i>Polygon:</i><br>
[321510.2087000003, 4300914.736500001]
, [321498.1944000004, 4300919.0973000005]
, [321501.50349999964, 4300928.365499999]
more... <br>
<i>Spatial Reference: </i> 26918
 (26918)
<br>
 
Upvote 0

Forum statistics

Threads
1,221,310
Messages
6,159,176
Members
451,543
Latest member
cesymcox

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top