Scraping HTML Table with VBA

montecarlo2012

Well-known Member
Joined: Jan 26, 2011
Messages: 986
Office Version: 2010
Platform: Windows
Hello.
I have tried several approaches on my own, but I still need help with the code below.

VBA Code:
Sub test()
   Dim ie As Object
   Dim my_url As String
   Dim y As Long, z As Long
   Dim wb As Excel.Workbook, ws As Excel.Worksheet
   Dim tbody As Object, datarow As Object, thlist As Object
   Dim datarowtdlist As Object
   Dim ii As Long, jj As Long, hh As Long, x As Long

   Sheets("Sheet2").Select
   Set wb = Excel.ActiveWorkbook
   Set ws = wb.ActiveSheet

   'URL without the stray leading and trailing spaces
   my_url = "https://www.flalottery.com/exptkt/ff.htm"

   Set ie = CreateObject("InternetExplorer.Application")
   With ie
      .Visible = True
      .navigate my_url
      .Top = 50
      .Left = 530
      .Height = 800
      .Width = 800
      'wait until the page has finished loading
      Do Until Not .busy And .readyState = 4
         DoEvents
      Loop
   End With
   Application.Wait Now + TimeValue("00:00:02")

   'note: only the first table on the page is read here
   Set tbody = ie.document.getElementsByTagName("table")(0).getElementsByTagName("tbody")(0)

   'find the header cells
   Set thlist = tbody.getElementsByTagName("tr")(0).getElementsByTagName("th")

   'init the header row and column before writing; Cells(0, 0) is not a valid cell
   z = 3
   y = 2
   'loop through the header columns and capture each value
   For ii = 0 To thlist.Length - 1
      ws.Cells(z, y).Value = thlist(ii).innerText
      y = y + 1
   Next ii

   'get all data rows
   Set datarow = tbody.getElementsByTagName("tr")
   'init the data row index and column index
   y = 2
   z = 4
   'loop through the data rows, get every td, and capture the value
   For jj = 1 To datarow.Length - 1
      Set datarowtdlist = datarow(jj).getElementsByTagName("td")
      'x tracks the column index within the current row
      x = y
      For hh = 0 To datarowtdlist.Length - 1
         ws.Cells(z, x).Value = datarowtdlist(hh).innerText
         x = x + 1
      Next hh
      z = z + 1
   Next jj

   'close the browser window before releasing the reference
   ie.Quit
   Set ie = Nothing
End Sub
This code gives me only partial information: the HTML has 113 pages, and I get maybe 3 of them,
plus a lot of blank space and lines that I do not need at all.
Here is what I don't want:
1665963253934.png

What I really need is this:

1665963316049.png

So the question is: how can I eliminate the blank spaces and lines, and the headers such as "Florida ...", "Winning ...", "16-Oct-22", "Please ...", and so on?
Could somebody spare some time to help me, please?

Thank you for kindly reading this.
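
One possible way to approach the filtering, as a minimal untested sketch rather than a confirmed solution: walk every table in the document instead of only the first one, and write a row only when it is not blank and does not contain an unwanted header phrase. The names KeepRow and WriteAllTables are hypothetical, and the skip words ("florida", "winning", "please") are guesses taken from the wording of the question, not from the actual page, so they would need adjusting against the real HTML.

VBA Code:
```vba
'Returns True when a row's text looks like data worth keeping.
'ASSUMPTION: the skip words are guesses based on the question, not on the page itself.
Function KeepRow(ByVal rowText As String) As Boolean
    Dim cleaned As String, skipWords As Variant, w As Variant
    'treat non-breaking spaces, tabs and line breaks as ordinary blanks
    cleaned = Replace(rowText, ChrW(160), " ")
    cleaned = Replace(cleaned, vbCr, " ")
    cleaned = Replace(cleaned, vbLf, " ")
    cleaned = Replace(cleaned, vbTab, " ")
    cleaned = Trim$(cleaned)
    If Len(cleaned) = 0 Then Exit Function      'blank row: result stays False
    skipWords = Array("florida", "winning", "please")
    For Each w In skipWords
        'case-insensitive search for an unwanted header phrase
        If InStr(1, cleaned, w, vbTextCompare) > 0 Then Exit Function
    Next w
    KeepRow = True
End Function

'Writes every kept row of every table in doc to ws, starting at row 4, column 2.
Sub WriteAllTables(ByVal doc As Object, ByVal ws As Worksheet)
    Dim tbl As Object, r As Object, c As Object
    Dim outRow As Long, outCol As Long
    outRow = 4
    For Each tbl In doc.getElementsByTagName("table")
        For Each r In tbl.Rows
            'skip layout rows that only wrap a nested table, to avoid duplicates
            If r.getElementsByTagName("table").Length = 0 Then
                If KeepRow(r.innerText) Then
                    outCol = 2
                    For Each c In r.Cells
                        ws.Cells(outRow, outCol).Value = _
                            Trim$(Replace(c.innerText, ChrW(160), " "))
                        outCol = outCol + 1
                    Next c
                    outRow = outRow + 1
                End If
            End If
        Next r
    Next tbl
End Sub
```

Under these assumptions, a call such as WriteAllTables ie.document, ws, made after the page has finished loading, would take the place of the header and data loops in the macro above. A plain substring test will also drop any real data row that happens to contain one of the skip words, so a stricter test (for example, checking that the first cell looks like a number) may be needed once the page structure is known.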
 
