VBA to extract content from HTML tables

Peatawn

Hello community!

I've been stuck for the past few days trying to use VBA to extract text from web pages. I know there are threads about this, and they have been helpful for the most part. Nonetheless, I'm having a hard time adapting the solutions I found to my situation.

I have enabled the Microsoft Internet Controls and Microsoft HTML Object Library references.

Here is an example of a page from which I'm trying to extract information: http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270

I can extract the information I need from section 1 ("1 - Identification de la résidence") using the following code (this example code extracts "Chartwell Résidence Principale" and "339, rue Principale"):

Code:
Sub ExtractFromWeb()

    ' Requires references: Microsoft Internet Controls, Microsoft HTML Object Library
    Dim IE As InternetExplorer
    Dim HTML As HTMLDocument
    Dim td As Object
    Dim Results As Variant
    Dim i As Long

    Set IE = New InternetExplorerMedium
    IE.Visible = False
    IE.Navigate2 "http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270"

    ' Wait until the page has finished loading
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        Application.Wait DateAdd("s", 1, Now)
    Loop

    Set HTML = IE.document
    Set td = HTML.getElementsByTagName("td")

    ' On this page, td cells 17 and 18 hold the residence name and address
    For i = 17 To 18
        Results = Split(td(i).textContent, vbLf)
        Range("A" & i).Value = Application.WorksheetFunction.Clean(Results(2))
    Next i

    IE.Quit
    Set IE = Nothing

    MsgBox "Macro Complete"

End Sub

The problem I'm having is with the sections where the data is inside tables, because the number of rows can vary from one page to another. That also shifts the index of every tag further down the page, so I can't rely on fixed tag numbers there either. With more research I found the following code, which partially works (I've been using section 3, "3 - Autres résidences privées pour aînés", for my tests):

Code:
Sub TestWebTable()

    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim data

    y = 1: x = 1

    ' Download the page and load its HTML into an in-memory document
    With CreateObject("msxml2.xmlhttp")
        .Open "GET", "http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270", False
        .send
        oDom.body.innerHTML = .responseText
    End With

    ' Copy every cell of table index 10 into a 2-D array
    With oDom.getElementsByTagName("table")(10)
        ReDim data(1 To .Rows.Length, 1 To .Rows(1).Cells.Length)
        For Each oRow In .Rows
            For Each oCell In oRow.Cells
                data(x, y) = oCell.innerText
                y = y + 1
            Next oCell
            y = 1
            x = x + 1
        Next oRow
    End With

    ' Write the whole array to the sheet in one go
    Sheets(5).Cells(1, 1).Resize(UBound(data), UBound(data, 2)).Value = data

    MsgBox "Table Macro complete"

End Sub

The issues I'm having with this code are:
1 - It copies the header row.
2 - It's hard to tell how to place the data in specific cells without getting undefined errors.
3 - French accented letters, along with the 2 characters that follow them, are replaced by "�" symbols, so I lose a total of 3 characters every time there is an accented letter. This doesn't happen with the first code, which extracts from the first section.
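
For anyone landing on this thread, here is a minimal sketch of how issues 1 to 3 might be approached. It rests on assumptions that are not confirmed in the thread. The loss of an accented letter plus the next two characters is what typically happens when single-byte text (such as windows-1252) is decoded as UTF-8, so the sketch reads the raw `responseBody` and decodes it with an explicit charset through `ADODB.Stream`. The charset name, the helper `BytesToText` and the macro name `TestWebTableNoHeader` are illustrative choices. Table index 10 and `Sheets(5)` come from the code above.

```vba
Option Explicit

' Hypothetical helper: decode raw response bytes with an explicit charset.
Function BytesToText(ByVal body As Variant, ByVal charsetName As String) As String
    Const adTypeBinary As Long = 1, adTypeText As Long = 2
    With CreateObject("ADODB.Stream")
        .Type = adTypeBinary
        .Open
        .Write body
        .Position = 0
        .Type = adTypeText
        .Charset = charsetName
        BytesToText = .ReadText
        .Close
    End With
End Function

Sub TestWebTableNoHeader()
    Const PAGE_URL As String = _
        "http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270"
    Dim oDom As Object, oTable As Object
    Dim data() As Variant
    Dim r As Long, c As Long, nRows As Long, nCols As Long

    Set oDom = CreateObject("htmlfile")
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", PAGE_URL, False
        .send
        ' Assumption: the page is windows-1252; change the charset if it is not
        oDom.body.innerHTML = BytesToText(.responseBody, "windows-1252")
    End With

    Set oTable = oDom.getElementsByTagName("table")(10)
    nRows = oTable.Rows.Length
    If nRows < 2 Then Exit Sub              ' header only, nothing to copy
    nCols = oTable.Rows(1).Cells.Length     ' width of the first data row

    ' Rows is 0-based, so starting at 1 skips the header row (issue 1)
    ReDim data(1 To nRows - 1, 1 To nCols)
    For r = 1 To nRows - 1
        For c = 0 To oTable.Rows(r).Cells.Length - 1
            ' Ignore extra cells so the array bounds are never exceeded (issue 2)
            If c < nCols Then data(r, c + 1) = oTable.Rows(r).Cells(c).innerText
        Next c
    Next r

    Sheets(5).Cells(1, 1).Resize(nRows - 1, nCols).Value = data
End Sub
```

If the accents still come out wrong, the page is probably using a different encoding, and the charset string passed to `BytesToText` is the only thing that should need changing.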

The ultimate goal is to extract the data from nearly 800 pages like this one and populate a list in a worksheet. Once I can get the data from section 3, I think I'll be able to extrapolate to the other sections and then loop over the almost 800 pages. But for the life of me, I've been stuck at this point for the past few days. I've tested many, many things and just can't seem to get it right. My brain is mush and my spirit is low. That's why I'm seeking help from the community. Any solution or a push in the right direction would be greatly appreciated.

Thanks in advance!
-A discouraged Peatawn
 
Hi, to import section 3: "3 - Autres résidences privées pour aînés"

You can use
Code:
Sub ResidenceAines()
    ActiveWorkbook.Queries.Add Name:="Table 5", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270#lien_3""))," & Chr(13) & "" & Chr(10) & "    Data5 = Source{5}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data5,{{"""", type text}, {""Nom de la résidence"", type text}, {""NEQ"", type number}, {""Adresse"", type text}, {""Municipalité"", type text}, {""Région " & _
        "administrative"", type text}, {""Code postal"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 5"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 5]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_5"
        .Refresh BackgroundQuery:=False
    End With

End Sub
 
This macro downloads every table on the page to its own sheet. I still have to look into how to make it dynamic.

Code:
    ActiveWorkbook.Queries.Add Name:="1 - Identification de la résidence", _
        Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data1 = Source{1}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data1, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Nom et coordonnées de la résidence"", type text}, {" & _
        """Nom et coordonnées de la résidence_1"", type any}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="2 - Identification des exploitants", _
        Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data3 = Source{3}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data3, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Société"", type any}, {""Société_1"", type any}})" & Chr(13) & "" & Chr(10) & "i" & _
        "n" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 2", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data2 = Source{2}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data2,{{""Column1"", type text}, {""Column2"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="6 - Portraits", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data8 = Source{8}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data8, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Portrait de la clientèle"", type text}, {""Portrait " & _
        "de la clientèle_1"", type any}, {""Portrait de la clientèle_2"", type any}, {""Portrait de la clientèle_3"", type any}, {""Portrait de la clientèle_4"", type any}, {""Portrait de la clientèle_5"", type any}, {""Portrait de la clientèle_6"", type text}, {""Portrait de la clientèle_7"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="7 - Les services offerts par la résidence" _
        , Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data14 = Source{14}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data14, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Parmi les services suivants, lesquels sont offert" & _
        "s par votre résidence?"", type number}, {""Parmi les services suivants, lesquels sont offerts par votre résidence?_1"", type any}, {""Parmi les services suivants, lesquels sont offerts par votre résidence?_2"", type any}, {""Parmi les services suivants, lesquels sont offerts par votre résidence?_3"", type any}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="8 - Reconnaissance de la résidence", _
        Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data15 = Source{15}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data15, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Reconnaissance"", type number}, {""Reconnaissance" & _
        "_1"", type any}, {""Reconnaissance_2"", type any}, {""Reconnaissance_3"", type text}, {""Reconnaissance_4"", type text}, {""Reconnaissance_5"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="9 - Caractéristiques de la résidence", _
        Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data17 = Source{17}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data17, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Caractéristiques"", type text}, {""Caractéristiqu" & _
        "es_1"", type any}, {""Caractéristiques_2"", type any}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 10", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data10 = Source{10}[Data]," & Chr(13) & "" & Chr(10) & "    #""Promoted Headers"" = Table.PromoteHeaders(Data10, [PromoteAllScalars=true])," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(#""Promoted Headers"",{{""Dans cet immeuble combien d'unités locatives (cha" & _
        "mbres et logements) sont#(cr)#(lf)#(lf)#(tab)#(tab)#(tab)#(tab)#(tab)#(tab) exploitées en vertu:"", type text}, {""Chambres simples"", type text}, {""Chambres doubles"", type text}, {""Logements"", type text}, {""Total"", Int64.Type}, {""Clientèle de#(cr)#(lf) personnes âgées"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 12", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data12 = Source{12}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data12,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}, {""Column4"", type text}, {""Column5"", type text}, {""Column6"", type tex" & _
        "t}, {""Column7"", type text}, {""Column8"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 11", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data11 = Source{11}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data11,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}, {""Column4"", type text}, {""Column5"", type text}, {""Column6"", type tex" & _
        "t}, {""Column7"", type text}, {""Column8"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 13", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data13 = Source{13}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data13,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}, {""Column4"", type text}, {""Column5"", type text}, {""Column6"", type tex" & _
        "t}, {""Column7"", type text}, {""Column8"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 16", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data16 = Source{16}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data16,{{""Column1"", type text}, {""Column2"", type text}, {""Column3"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 18", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data18 = Source{18}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data18,{{""Column1"", type text}, {""Column2"", Int64.Type}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 4", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data4 = Source{4}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data4,{{"""", type text}, {""Nom et prénom"", type text}, {""Nom et prénom2"", type text}, {""Nom et prénom3"", type text}, {""Nom et prénom4"", type text}, {""No" & _
        "m et prénom5"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 5", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data5 = Source{5}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data5,{{"""", type text}, {""Nom de la résidence"", type text}, {""NEQ"", type number}, {""Adresse"", type text}, {""Municipalité"", type text}, {""Région adminis" & _
        "trative"", type text}, {""Code postal"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 6", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data6 = Source{6}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data6,{{"""", type text}, {""Nom et prénom"", type text}, {""Nom et prénom2"", type text}, {""Nom et prénom3"", type text}, {""Nom et prénom4"", type text}, {""No" & _
        "m et prénom5"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 7", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data7 = Source{7}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data7,{{"""", type text}, {""Nom et prénom"", type text}, {""Occupation"", type text}, {""Fonction"", type text}, {""Fonction2"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Queries.Add Name:="Table 9", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270""))," & Chr(13) & "" & Chr(10) & "    Data9 = Source{9}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data9,{{""Column1"", type text}, {""Column2"", Int64.Type}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""1 - Identification de la résidence"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [1 - Identification de la résidence]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_1___Identification_de_la_résidence"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""2 - Identification des exploitants"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [2 - Identification des exploitants]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_2___Identification_des_exploitants"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 2"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 2]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_2"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""6 - Portraits"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [6 - Portraits]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_6___Portraits"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:=Array( _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""7 - Les services offerts par la résidence"";Extended Propertie" _
        , "s="""""), Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array( _
        "SELECT * FROM [7 - Les services offerts par la résidence]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_7___Les_services_offerts_par_la_résidence"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""8 - Reconnaissance de la résidence"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [8 - Reconnaissance de la résidence]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_8___Reconnaissance_de_la_résidence"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""9 - Caractéristiques de la résidence"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [9 - Caractéristiques de la résidence]" _
        )
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "_9___Caractéristiques_de_la_résidence"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 10"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 10]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_10"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 12"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 12]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_12"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 11"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 11]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_11"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 13"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 13]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_13"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 16"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 16]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_16"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 18"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 18]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_18"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 4"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 4]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_4"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 5"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 5]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_5"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 6"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 6]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_6"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 7"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 7]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_7"
        .Refresh BackgroundQuery:=False
    End With
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 9"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 9]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_9"
        .Refresh BackgroundQuery:=False
    End With
 
Is there a specific library I should add to the references? I get "Run-time error '5': Invalid procedure call or argument". I saw OLEDB mentioned in your code, so I enabled the "Microsoft OLE DB Service Component 1.0 Type Library" and "Microsoft OLE DB Simple Provider 1.5 Library" references, but to no avail. That said, I only just got around to looking at your code and can't pretend to understand at first sight what is happening in there... yet. ;P
 
I basically recorded the import of the table with the macro recorder:

- At the bottom left, next to "Ready" in the status bar, there is an icon to record a macro.
- On the Data tab, I chose "From Web".
- I entered the URL.
- I selected Table 5, which gave me the first macro.

There is a way to do it for multiple tables (the second macro), but I got an error. It works if I use "Load To..." (the arrow on the Load button), choose Table instead of a connection only, and do not add the data to the Data Model.
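
The recorded blocks for each table differ only in the query name, so they could be folded into one helper. This is a minimal sketch, not tested against this page. It assumes the workbook queries "Table 5", "Table 6", "Table 7" and "Table 9" already exist (the recorder normally creates them in an earlier step). The Sub names LoadQueryToNewSheet and LoadAllRecordedTables are made up for illustration.

```vba
Option Explicit

' Hypothetical helper: loads one existing workbook query into a table
' on a new worksheet, using the same calls the macro recorder produced.
Private Sub LoadQueryToNewSheet(ByVal queryName As String)
    Dim ws As Worksheet
    Set ws = ActiveWorkbook.Worksheets.Add

    With ws.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;" & _
        "Location=""" & queryName & """;Extended Properties=""""", _
        Destination:=ws.Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [" & queryName & "]")
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .AdjustColumnWidth = True
        .PreserveColumnInfo = True
        ' "Table 5" becomes "Table_5", as in the recorded code
        .ListObject.DisplayName = Replace(queryName, " ", "_")
        .Refresh BackgroundQuery:=False
    End With
End Sub

' Hypothetical caller: adjust the list to the queries in your own workbook.
Public Sub LoadAllRecordedTables()
    Dim queryName As Variant
    For Each queryName In Array("Table 5", "Table 6", "Table 7", "Table 9")
        LoadQueryToNewSheet CStr(queryName)
    Next queryName
End Sub
```

If a table with the same display name already exists (for example, when the macro is run a second time), setting DisplayName will probably fail, so the old sheets may need to be deleted first.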
 
Thanks Kamolga! Although your code didn't work when I copied it, you pointed me in the right direction, and I was able to record my own macro and make the adjustments I needed.
 
Great! I guess it comes down to a difference in Excel versions.
 
The issues I'm having with this code are:
1 - It copies the header row.
2 - It's hard to tell how to place the data in specific cells without getting undefined errors.
3 - The French accented letters, together with the two characters that follow them, are replaced by "�" symbols, so I lose three characters every time there is an accented letter. This doesn't happen with the first code, which extracts from the first section.
1. To avoid the header row, use a For loop starting at row 1 (row 0 is the header row) instead of a For Each loop.
2. I don't understand what you mean.
3. Convert the XMLhttp.responseBody (not responseText) to a Unicode string and put that in the HTMLDocument for parsing.

Try this macro:
Code:
Public Sub XMLhttp_Extract_Data()

    'Requires a reference to Microsoft HTML Object Library (HTMLDocument, HTMLTable, HTMLTableRow)
    Dim httpReq As Object
    Dim URL As String
    Dim HTMLdoc As Object
    Dim table As HTMLTable, tRow As HTMLTableRow
    Dim tableData() As String
    Dim r As Long, c As Long
    Dim destCell As Range
    
    URL = "http://k10.pub.msss.rtss.qc.ca/public/formulaire/K10FormCons.asp?noForm=2270"
    
    'Clear the first worksheet and write the column headings
    With Worksheets(1)
        .Cells.ClearContents
        .Range("A1:F1").Value = Array("Nom de la résidence", "NEQ", "Adresse", "Municipalité", "Région administrative", "Code postal")
        Set destCell = .Range("A2")
    End With
    
    Set httpReq = CreateObject("MSXML2.XMLhttp")
    
    With httpReq
        .Open "GET", URL, False     'False = synchronous request
        .send
        Set HTMLdoc = New HTMLDocument
        'responseBody is a byte array; StrConv converts it to a VBA string using the system code page
        HTMLdoc.body.innerHTML = StrConv(.responseBody, vbUnicode)
    End With
    
    '< table class="formSousTableau" id="tableauResid" width="92%" cellspacing="0" cellpadding="5" >

    Set table = HTMLdoc.getElementById("tableauResid")
    If table Is Nothing Then
        MsgBox "Table 'tableauResid' not found in the page"
        Exit Sub
    End If
    
    'Start at row 1 to skip the header row (row 0); the last row and cell 0 of each row are skipped too
    ReDim tableData(1 To table.Rows.Length - 2, 1 To 6)
    For r = 1 To table.Rows.Length - 2
        Set tRow = table.Rows(r)
        For c = 1 To tRow.Cells.Length - 1
            tableData(r, c) = tRow.Cells(c).innerText
        Next
    Next
    
    'Write the whole array to the sheet in one operation
    destCell.Resize(UBound(tableData), UBound(tableData, 2)).Value = tableData
        
End Sub
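
StrConv(..., vbUnicode) decodes the bytes with the system's default code page, so the macro above may still garble accented letters on a PC with a different locale. As a hedged alternative, the sketch below decodes the response with an explicitly named character set through ADODB.Stream. The function name BytesToText is made up for illustration, and "iso-8859-1" in the usage line is only my guess at the page's encoding; check the page's Content-Type header or meta tag.

```vba
' Hypothetical helper: decodes a byte array (e.g. XMLHTTP.responseBody)
' to a VBA string using a named character set rather than the system code page.
Public Function BytesToText(ByVal body As Variant, ByVal charsetName As String) As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream")
    With stm
        .Type = 1                   ' 1 = adTypeBinary
        .Open
        .Write body                 ' write the raw bytes
        .Position = 0               ' rewind before switching to text mode
        .Type = 2                   ' 2 = adTypeText
        .Charset = charsetName      ' e.g. "iso-8859-1" or "utf-8"
        BytesToText = .ReadText     ' read the whole stream as text
        .Close
    End With
End Function
```

In the macro above, the StrConv line would then become: HTMLdoc.body.innerHTML = BytesToText(.responseBody, "iso-8859-1")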
 