Retrieving a table from a site

thesimile

New Member
Joined
Feb 27, 2020
Messages
25
Office Version
  1. 365
Platform
  1. Windows
Hi Excel Gurus,
I am trying to retrieve all the links from site The Finder - Individual County Boundary Data
After going into URL, I want to click each alphabet
1706187697572.png

When clicking 'A' , I want to get the table list that is coming below.
1706187740601.png

VBA Code:
I am trying something like
   Dim browser As InternetExplorer
   Dim page As HTMLDocument
   Dim htmlDoc As HTMLDocument
   Dim quotes As Object
   Dim authors As Object
   Set browser = New InternetExplorer
   browser.Visible = True
   browser.Navigate ("https://thefinder.tax.ohio.gov/StreamlineSalesTaxWeb/Download/IndividualSchoolDistricts.aspx")
   Do While browser.Busy: Loop
   Set page = browser.Document
   Set quotes = page.getElementById("Alphabet1_A")
   quotes.Click
But after this im not able to get the list of <td> using QuerySelectorAll() as there is no reliable identifier. Is there anyway to use xpaths to get the list ?

Thanks,
Arun
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
Here is the answer if anyone is looking for it
VBA Code:
Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, _
    ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
Sub OH_SD_Retrieval()
   Dim browser As InternetExplorer
   Dim page As HTMLDocument
   Dim htmlDoc As HTMLDocument
   Dim OH As Object
   Dim OHSiteAlph As Object
   Dim Item As IXMLDOMNodeList
   Dim tbody As Object
   Dim i As Long
   Dim sLetter As String
   Set browser = New InternetExplorer
   browser.Visible = True
   browser.Navigate ("https://thefinder.tax.ohio.gov/StreamlineSalesTaxWeb/Download/IndividualSchoolDistricts.aspx")
   Do While browser.Busy: Loop
   Set page = browser.Document
   Call ClearContent("School District List", "SD_Table")
   Set SD_Range = [SD_Table]
   Rownum = 1
   For j = 65 To 90
    sLetter = Chr(j)
    Set OHSite = page.getElementById("Alphabet1_" + sLetter)
    OHSite.Click
    While browser.Busy Or browser.ReadyState <> 4: DoEvents: Wend
     Start = False
    For i = 0 To page.getElementsByTagName("table").Length - 1
'    MsgBox page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Length
     If Start = True Then
'      MsgBox page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Item(0).outerText
      If page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Item(0).outerText = "" Or page.getElementsByTagName("tbody").Item(i).getElementsByTagName("td").Item(0).outerText = " " Then
       Start = False
      Else
       SD_Range(Rownum, 1).Value = page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Item(0).outerText
       SD_Range(Rownum, 2).Value = page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Item(1).outerText
       Rownum = Rownum + 1
      End If
     Else
      If InStr(page.getElementsByTagName("table").Item(i).innerText, "Contact Us") > 0 Then
       Exit For
      Else
      If page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Length > 5 Then
       If page.getElementsByTagName("table").Item(i).getElementsByTagName("td").Item(6).outerText = "School District Name" Then
        Start = True
       End If
      End If
      End If
     End If
     While browser.Busy Or browser.ReadyState <> 4: DoEvents: Wend
    Next i
   Next j
   browser.Quit
   Set SD_Range = [SD_Table]
'   For i = 1 To SD_Range.Rows.Count
'    imgsrc = SD_Range(i, 3).Value
'    dlpath = [DOWNLOAD_DIR].Value
'    URLDownloadToFile 0, imgsrc, dlpath & SD_Range(i, 2).Value & ".txt", 0, 0
'   Next i
End Sub
Sub ClearContent(SheetName As String, TableName As String)

Set Worksheet = ThisWorkbook.Sheets(SheetName)

With Worksheet.ListObjects(TableName)
        If Not .DataBodyRange Is Nothing Then
            .DataBodyRange.Delete
        End If
    End With

End Sub
Sub scrape_quotes1()
Dim XMLPage As New MSXML2.ServerXMLHTTP60
Dim htmlDoc As New MSHTML.HTMLDocument
Dim URL As String
'   Set browser = New InternetExplorer
'   browser.Visible = True
URL = "https://thefinder.tax.ohio.gov/StreamlineSalesTaxWeb/Download/IndividualSchoolDistricts.aspx"
XMLPage.Open "GET", URL, False
'    XMLPage.send
    XMLPage.send "Alphabet1_A"
    htmlDoc.body.innerHTML = XMLPage.responseText

    'Do something with the information
     Cells(1, 1).Value = htmlDoc.body.innerHTML
End Sub
 
Upvote 0
Solution

Forum statistics

Threads
1,223,987
Messages
6,175,794
Members
452,670
Latest member
nogarth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top