How can I edit this code to import local HTML files rather than an online Web Query fetch

d0rian

Active Member
Joined
May 30, 2015
Messages
313
Office Version
  1. 365
The code below -- created w/ help from this forum -- does a great job of cycling through a list of ~300 URLs (listed starting in cell A2 of sheet "URLs") and importing the table data from each into sheet "MXquote", pasting each chunk of table data on the first available row, such that the end result is a master sheet of table data from all 300 URLs.

But it takes ~6 minutes to get through all 300 URLs, because the code is essentially running 300 consecutive Web Queries and loading each page to fetch the data. In the hopes of speeding this up, I've found a way to instead download all 300 sites as local HTML pages (to C:\table_downloads\), which takes < 30 seconds, so I want to adapt the code so that it cycles through and imports tables from local HTML files instead of cycling through a list of URLs.

One constraint is that the # of HTML files in \table_downloads\ may vary because sometimes not all 300 HTML get downloaded properly...so I can't supply a static list of file names to import, but rather I want to instruct Excel to simply import the table data from whatever HTML files are in the \table_downloads\ folder.


Code:
Sub MX_Quote_loop_through()    Dim h1 As Worksheet, h2 As Worksheet
    Dim u1 As Long, u2 As Long
    Dim MyUrl As String
    '
    Application.ScreenUpdating = False
    Application.StatusBar = False
    Set h1 = Sheets("URLs")     'origin
    Set h2 = Sheets("MXquote")   'destiny
    '
    u1 = h1.Range("A" & Rows.Count).End(xlUp).Row
    For i = 2 To u1
        MyUrl = h1.Cells(i, "A").Value
        Application.StatusBar = "import data : " & i - 1 & " of : " & u1 - 1
        u2 = h2.Range("A" & Rows.Count).End(xlUp).Row + 1
        With h2.QueryTables.Add(Connection:="URL;" & MyUrl, Destination:=h2.Range("A" & u2))
            .Name = "negoBHC"
            .FieldNames = True
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlOverwriteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = False
            .RefreshPeriod = 0
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "1"
            .WebPreFormattedTextToColumns = True
            .WebConsecutiveDelimitersAsOne = True
            .WebSingleBlockTextImport = False
            .WebDisableDateRecognition = False
            .WebDisableRedirections = False
            .Refresh BackgroundQuery:=False
        End With
        u3 = h2.Range("A" & Rows.Count).End(xlUp).Row
        h2.Range("P" & u2 & ":P" & u3).Value = MyUrl
    Next
    Application.StatusBar = False
    Application.ScreenUpdating = True
    MsgBox "End"
End Sub
 

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
Can you share the table download or give the code to download as example. It will make it easy to get the code attached to work
 
Upvote 0
Try this

Code:
Sub MX_Quote_loop_through()
    Dim h1 As Worksheet, h2 As Worksheet
    Dim u1 As Long, u2 As Long, i As Long, u3 As Long
    Dim MyUrl As String, [COLOR=#0000ff]wPath [/COLOR]As String
    '
    Application.ScreenUpdating = False
    Application.StatusBar = False
    Set h1 = Sheets("URLs")     'origin
    Set h2 = Sheets("MXquote")   'destiny
    '
[COLOR=#0000ff]    i = 2[/COLOR]
[COLOR=#0000ff]    wPath = "C:\table_downloads\"[/COLOR]
[COLOR=#0000ff]    MyUrl = Dir(wPath & "*.html*")[/COLOR]
[COLOR=#0000ff]    Do While MyUrl <> ""[/COLOR]
        Application.StatusBar = "import data : " & i - 1 & " of : " & u1 - 1
        u2 = h2.Range("A" & Rows.Count).End(xlUp).Row + 1
        With h2.QueryTables.Add(Connection:="URL;" [COLOR=#0000ff]& wPath & "\" & MyUrl[/COLOR], Destination:=h2.Range("A" & u2))
            .Name = "negoBHC"
            .FieldNames = True
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlOverwriteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = False
            .RefreshPeriod = 0
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "1"
            .WebPreFormattedTextToColumns = True
            .WebConsecutiveDelimitersAsOne = True
            .WebSingleBlockTextImport = False
            .WebDisableDateRecognition = False
            .WebDisableRedirections = False
            .Refresh BackgroundQuery:=False
        End With
        u3 = h2.Range("A" & Rows.Count).End(xlUp).Row
        h2.Range("P" & u2 & ":P" & u3).Value = MyUrl
        i = i + 1
        MyUrl = Dir()
    Loop
    Application.StatusBar = False
    Application.ScreenUpdating = True
    MsgBox "End"
End Sub
 
Upvote 0
Awesome, thanks for the help, it works! Only follow-up: there's the one line of code beneath the Do While that previously showed text in the Excel StatusBar that read "import data : x of y" showing how many of the files are left to go...but the 'y' value (in x of y) no longer displays...how can I have it set to be the # of HTML files in the folder of source html files?
 
Upvote 0
Try this

Code:
Sub MX_Quote_loop_through()
    Dim h1 As Worksheet, h2 As Worksheet, u2 As Long, i As Long, u3 As Long
    Dim MyUrl As String, wPath As String, fso As Object, n As Long, wfile As Object
    '
    Application.ScreenUpdating = False
    Application.StatusBar = False
    '
    Set h1 = Sheets("URLs")       'origin
    Set h2 = Sheets("MXquote")    'destiny
    '
    i = 1
    wPath = "C:\table_downloads\"
    MyUrl = Dir(wPath & "*.html*")
    '
    Set fso = CreateObject("scripting.filesystemobject")
    For Each wfile In fso.getfolder(wPath).Files
      If LCase(Right(wfile, 4)) = "html" Then n = n + 1
    Next
    '
    Do While MyUrl <> ""
        Application.StatusBar = "import data : " & i & " of : " & n
        u2 = h2.Range("A" & Rows.Count).End(xlUp).Row + 1
        With h2.QueryTables.Add(Connection:="URL;" & wPath & "\" & MyUrl, Destination:=h2.Range("A" & u2))
            .Name = "negoBHC"
            .FieldNames = True
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlOverwriteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = False
            .RefreshPeriod = 0
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "1"
            .WebPreFormattedTextToColumns = True
            .WebConsecutiveDelimitersAsOne = True
            .WebSingleBlockTextImport = False
            .WebDisableDateRecognition = False
            .WebDisableRedirections = False
            .Refresh BackgroundQuery:=False
        End With
        u3 = h2.Range("A" & Rows.Count).End(xlUp).Row
        h2.Range("P" & u2 & ":P" & u3).Value = MyUrl
        i = i + 1
        MyUrl = Dir()
    Loop
    Application.StatusBar = False
    Application.ScreenUpdating = True
    MsgBox "End"
End Sub
 
Upvote 0

Forum statistics

Threads
1,225,739
Messages
6,186,746
Members
453,370
Latest member
juliewar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top