VBA to automate getting table from website to excel daily

lea147

New Member
Joined
Dec 23, 2022
Messages
7
Office Version
  1. 365
Platform
  1. Windows
Hello,

Everyday, I have to grab a table from a website and paste it into excel. Sometimes this works by simple copy/paste and sometimes the table pastes all in one cell. When this is the case, I have to use the Get Data > From website tool. However, this becomes very time consuming for me. I was researching possible automatic solutions using VBA to help me out. However, all the sample codes I can find are all a bit different and I cannot figure out a way to get this working on my own.

The best I have been able to do so far is being able to open the website:

Sub GetDailyRates ()
Dim browser As InternetExplorer
Dim page as HTML Document
Set browser = New InternetExplorer
Browser.Visible = True
Browser.navigate ("https://... .com")
End sub

What I am ultimately hoping for is the ability to grab the table from the website (2nd image attached) and add it to the table I am using in Excel already (1st image attached). The website has two tables on it - but I just need to get the first one. The second is the legend and is not needed.

My current Excel table has filters to sort the extra data from the website table as well as some formatting (so the incoming data needs to take on the pre-established table formatting).
The website table only has the institution name in select cells. If possible, the cells below each need to be copied down until the next institution name. It is okay if the institution name is not filled into each cell as long as the cell comes in as actually blank (currently when it is grabbed via Get Data > website, it comes in as a non-blank cell (even though it looks blank).
The date column needs to be populated daily as well, however this is okay to do manually, especially if it causes difficulty doing it automatically.
The data from the website table needs to start in column B (through M) underneath of the current table entries and cannot include the headings.
Once the VBA code is created and working, I just need a simple way to run it daily by command (unless it is possible to have it run at say, 8:30am, every morning).

I thought this would be simpler than it is and I appreciate the help. Thank you so much!

(I have had to remove the names and cannot give out the website due to privacy, but I have done my best to, hopefully, illustrate with images.)
 

Attachments

  • GICSnip.JPG
    GICSnip.JPG
    161.2 KB · Views: 53
  • GICwebsite.JPG
    GICwebsite.JPG
    170.8 KB · Views: 40

Excel Facts

What is the shortcut key for Format Selection?
Ctrl+1 (the number one) will open the Format dialog for whatever is selected.
Can use importXML...
Ex. =IMPORTXML("Team Season Totals - Natural Stat Trick","//tr")

Inspect the page to find what the table is under... In my case //tr. This was tricky and kind of trail and error for me though.
Hi Sleven,
Thanks for the reply.
Your method doesn't work for me. Firstly, I can only get that formula to work in Google Sheets, not Excel. Also it just puts it all into one cell (see attached image). However, this result using the =IMPORTXML formula in Google Sheets is the same result that I get when I copy/paste the table into Excel. It all pastes into one cell, instead of in the table format. This is my initial problem that I need to solve. I was hoping to just automate it at the same time.
 
Upvote 0
Can use importXML...
Ex. =IMPORTXML("Team Season Totals - Natural Stat Trick","//tr")

Inspect the page to find what the table is under... In my case //tr. This was tricky and kind of trail and error for me though.
I think I may have got some sort of this actually working now. However, is there a way to start the table grab from the expanded <tr> line (in the image below) to the end? If I take all <tr> (like yours) the headings are not in the correct places (and I don't want the headings at all, to be honest).
Thanks again!
 

Attachments

  • html.JPG
    html.JPG
    43.6 KB · Views: 23
Upvote 0
The best I have been able to do so far is being able to open the website:

Sub GetDailyRates ()
Dim browser As InternetExplorer
Dim page as HTML Document
Set browser = New InternetExplorer
Browser.Visible = True
Browser.navigate ("https://... .com")
End sub

What I am ultimately hoping for is the ability to grab the table from the website (2nd image attached) and add it to the table I am using in Excel already (1st image attached). The website has two tables on it - but I just need to get the first one. The second is the legend and is not needed.

start the table grab from the expanded <tr> line (in the image below) to the end?
Using your IE automation method, try this for starters:

VBA Code:
Public Sub IE_Import_Table_Data()

    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim table As HTMLTable, tCell As HTMLTableCell, r As Long
    Dim destCell As Range, n As Long
    
    With ActiveSheet
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1) 'next empty cell in column A
        n = 0
    End With
    
    Set IE = New InternetExplorer
    With IE
        .navigate "https://yoursite.com"  'CHANGE THIS
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        .Visible = True
        Set HTMLdoc = .document
    End With

    Set table = HTMLdoc.getElementsByTagName("table")(0)  '1st table
    
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        For Each tCell In table.Rows(r).Cells
            destCell.Offset(n, tCell.cellIndex).Value = tCell.innerText
        Next
        n = n + 1
    Next
    
End Sub
Requires references to MS Internet Controls and HTML Object Library.
 
Upvote 0
Solution
Using your IE automation method, try this for starters:

VBA Code:
Public Sub IE_Import_Table_Data()

    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim table As HTMLTable, tCell As HTMLTableCell, r As Long
    Dim destCell As Range, n As Long
   
    With ActiveSheet
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1) 'next empty cell in column A
        n = 0
    End With
   
    Set IE = New InternetExplorer
    With IE
        .navigate "https://yoursite.com"  'CHANGE THIS
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        .Visible = True
        Set HTMLdoc = .document
    End With

    Set table = HTMLdoc.getElementsByTagName("table")(0)  '1st table
   
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        For Each tCell In table.Rows(r).Cells
            destCell.Offset(n, tCell.cellIndex).Value = tCell.innerText
        Next
        n = n + 1
    Next
   
End Sub
Requires references to MS Internet Controls and HTML Object Library.
Thank you very much, John! This is exactly what I asked for.

I'm wondering if you might be able to help me tweak it a little as it is better than I expected.

Is there anyway to have the data come in starting in column B instead of A? (My date column is A in my current table and the data coming from the website needs to go beside it in column B.)

Also, the values that this code is bringing in are retaining the blank-but-not-blank cells in column A. The website has a name listed then blanks until the next name and those "blank" cells are being imported as not blank. Is there any way to make those cells be actually blank so that when I do an F5 (Go To > Special > Blanks) search for blank cells that they will be found (so I can fill them with the data from above? Currently, they are not found because they are not blank even though they appear blank. Perhaps this step could be automated as well? So that each blank cell in the first column only (column 1 on the website, column (hopefully) B in Excel) is filled with the value above it?

Thanks again, this is amazing!
 
Upvote 0
Is there anyway to have the data come in starting in column B instead of A? (My date column is A in my current table and the data coming from the website needs to go beside it in column B.)
Replace the original line with:
VBA Code:
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1, 1) 'next empty cell in column A, offset to column B

Also, the values that this code is bringing in are retaining the blank-but-not-blank cells in column A.
Debug the code and look at tCell.innerText. Typing ?">"&tcell.innerText&"<" in the Immediate window might be helpful to show any spaces. Sometimes what look like spaces in HTML are actually ASCII 160 (hex) characters.

The website has a name listed then blanks until the next name and those "blank" cells are being imported as not blank. Is there any way to make those cells be actually blank
Try this, using the Trim function to remove leading and trailing spaces in the web page table cell of column 1. This replaces the original code:
VBA Code:
    Dim c As Long

    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B only if column 1 in this web page row is not blank
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        End If
        'Web page columns 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next

So that each blank cell in the first column only (column 1 on the website, column (hopefully) B in Excel) is filled with the value above it?
A refinement of the above code (Else clause added):
VBA Code:
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B from web page only if column 1 in this web page row is not blank, otherwise use column B value from row above
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        Else
            destCell.Offset(n, 0).Value = destCell.Offset(n - 1, 0).Value
        End If
        'Web page column 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next
Hope that helps. I'm afraid you're largely on your own with tweaking the code to make it extract the data in exactly the layout you want, but the above should give you some ideas.
 
Upvote 0
Replace the original line with:
VBA Code:
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1, 1) 'next empty cell in column A, offset to column B


Debug the code and look at tCell.innerText. Typing ?">"&tcell.innerText&"<" in the Immediate window might be helpful to show any spaces. Sometimes what look like spaces in HTML are actually ASCII 160 (hex) characters.


Try this, using the Trim function to remove leading and trailing spaces in the web page table cell of column 1. This replaces the original code:
VBA Code:
    Dim c As Long

    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B only if column 1 in this web page row is not blank
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        End If
        'Web page columns 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next


A refinement of the above code (Else clause added):
VBA Code:
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B from web page only if column 1 in this web page row is not blank, otherwise use column B value from row above
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        Else
            destCell.Offset(n, 0).Value = destCell.Offset(n - 1, 0).Value
        End If
        'Web page column 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next
Hope that helps. I'm afraid you're largely on your own with tweaking the code to make it extract the data in exactly the layout you want, but the above should give you some ideas.
Thanks again, John.
This is appreciated more than you know!
I hope you have a fantastic weekend!
 
Upvote 0
Replace the original line with:
VBA Code:
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1, 1) 'next empty cell in column A, offset to column B


Debug the code and look at tCell.innerText. Typing ?">"&tcell.innerText&"<" in the Immediate window might be helpful to show any spaces. Sometimes what look like spaces in HTML are actually ASCII 160 (hex) characters.


Try this, using the Trim function to remove leading and trailing spaces in the web page table cell of column 1. This replaces the original code:
VBA Code:
    Dim c As Long

    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B only if column 1 in this web page row is not blank
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        End If
        'Web page columns 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next


A refinement of the above code (Else clause added):
VBA Code:
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B from web page only if column 1 in this web page row is not blank, otherwise use column B value from row above
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        Else
            destCell.Offset(n, 0).Value = destCell.Offset(n - 1, 0).Value
        End If
        'Web page column 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next
Hope that helps. I'm afraid you're largely on your own with tweaking the code to make it extract the data in exactly the layout you want, but the above should give you some ideas.

Hi John,
I was hoping you might be able to help me out one more time. I've been trying to get the date to populate as well (been working on it since the end of Dec with no luck).
The code that is working, thank you again, I have as posted below.
I am wondering how I might be able to have the date that the data was pulled populate in column A. Right now the code pulls the information from the website perfectly starting with column B. I need column A to be the date. The date needs to not change from the date that the data was grabbed from the website.
(The date is also in the footer of the website, just not in the table. I'm not sure if its easier to just populate the date or to pull it from a footer? I'm at a loss either way.)
I hope that makes sense.
Thank you so much, in advance!


VBA Code:
Public Sub IE_Import_Table_Data()

    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim table As HTMLTable, tCell As HTMLTableCell, r As Long
    Dim destCell As Range, n As Long
    Dim c As Long
 
    With ActiveSheet
        Set destCell = .Cells(.Rows.Count, "A").End(xlUp).Offset(1, 1) 'next empty cell in column A, offset to column B
        n = 0
    End With
 
    Set IE = New InternetExplorer
    With IE
        .navigate "http://www.websiteGoesHere.html"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        .Visible = True
        Set HTMLdoc = .document
    End With

    Set table = HTMLdoc.getElementsByTagName("table")(0)  '1st table
 
    For r = 3 To table.Rows.Length - 1  'start at 4th row
        'Populate column B from web page only if column 1 in this web page row is not blank, otherwise use column B value from row above
        Set tCell = table.Rows(r).Cells(0)
        If Trim(tCell.innerText) <> "" Then
            destCell.Offset(n, 0).Value = tCell.innerText
        Else
            destCell.Offset(n, 0).Value = destCell.Offset(n - 1, 0).Value
        End If
        'Web page column 2 to last in this row
        For c = 1 To table.Rows(r).Cells.Length - 1
            Set tCell = table.Rows(r).Cells(c)
            destCell.Offset(n, c).Value = tCell.innerText
        Next
        n = n + 1
    Next
 
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,961
Messages
6,175,652
Members
452,664
Latest member
alpserbetli

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top