Hello Alan,
.........I put together a workbook that would download the web tables to a worksheet.........
……..The macro that opens the web page uses the MSXML2 library. This is faster that using Internet Explorer and also runs synchronously with VBA. The data from the tables is collected into a 2-D array. This allows all the data to be written to the worksheet in s single operation.
………….. I want to keep it simple at this point so you can see what is going on and understand the code before expanding it. I am sure you will have questions………..
In the above code ...............
For IE object it looks like this:
Set objTable = IE.Document.All.Item("container2").getElementsByTagName("table") '(0)
where "container2" comes from the site's HTML code of the required table: div id="container2"
[color=darkblue]Set[/color] objTable = appIE.Document.All.Item("container2").getElementsByTagName("table") [color=green]'(0)[/color]
another alterative way with two library refernces
Rich (BB code):Option Explicit 'reference to Microsoft Internet Controls 'reference to Microsoft HTML Object Library Sub Web_Table_Option_One() Dim xml As Object Dim html As Object Dim objTable As Object Dim result As String Dim lRow As Long Dim lngTable As Long Dim lngRow As Long Dim lngCol As Long Dim ActRw As Long
……another alterative way with two library referencesRich (BB code):Option Explicit 'reference to Microsoft Internet Controls 'reference to Microsoft HTML Object Library Sub Web_Table_Option_One() …… . …………. ……………………
Hi pike … I have successfully “destroyed” your code with ‘green comments etc. to the point that I have a very basic understanding of it all. . I had a couple of minor follow up questions: . You are using Late Binding for the external libraries , which I believe is the sensible thing to do when sharing files? . As always one of the things I do to help in my understanding of these types of codes is to try and get the code running with the equivalent Early Binding. This helps me get clear which libraries are used where, and allows me to use intellisense to aid in understanding and developing the code further. In doing so in the case of your code I hit on a couple of things I do not quite understand. Maybe you can help me.: .1 Late binding with __________.6.0 : You use Set xml = CreateObject("MSXML2.XMLHTTP.6.0") Rather than the more usual Set xml = CreateObject("MSXML2.XMLHTTP") . I note that both work in Late binding, but in early Binding only this works Dim xml As MSXML2.XMLHTTP . This for example does not work Dim xml As MSXML2.XMLHTTP.6.0 In Early Binding . - Could you explain your reasoning behind using MSXML2.XMLHTTP.6.0? . – is the answer that you specifically reference version 6 – I note that Early binding worked for me for versions 3, 4, 5, and 6 ( and crashes if I try to use a nonexistent number like MSXML2.XMLHTTP.2.0 ! ) .2 Libraries used. . You suggested you are using these two libraries 1) Microsoft HTML Object Library and 2) Microsoft Internet Controls. . My playing around with Early Binding suggests you are using 1) Microsoft HTML Object Library and 2) Microsoft XML (versions 3, 4 ,5 or 6). Can that be the case, or have I missed something more subtle..? (For example when late binding for the entire program I did not need to reference Microsoft Internet Controls) .3 Set ______ Nothing…. . I expect the answer to this is “yes, but you just left it out for simplicity..” : . Would it be a good idea to include this at the end of your code to be on the safe side?………. Pike . Thanks so much for your code. It was an unexpected bonus to have a solution using an extra library. I cannot wait to get my teeth into that code, stepping through it and “destroying it” with ‘green comment graffiti until I understand…………..…………………………………. . The last three questions are minor points to help me in understanding the code. I have Early and late binding versions successfully running in Xl 2003, 2007, 2010 on Vista and XP. So thanks again for the code Alan …………………………………….. . . . . P.s. More as amusement than anything else I include my destroyed version of your code that despite my best efforts still miraculously works!!!Code:[color=darkblue]Set[/color] xml = [color=darkblue]Nothing[/color]
Code:[color=green]' Pike: "another alterative way with two library refernces".......[/color] [color=green]'reference to Microsoft Internet Controls--WRONG maybe???!! soll Microsoft HTML Object Library http://stackoverflow.com/questions/20495035/vba-what-to-reference-for-htmlfile[/color] [color=green]'reference to Microsoft HTML Object Library>>>> Tick >> Microsoft XML, v6.0' For late binding ( or v4.0 or v5.0 or v3.0 )[/color] [color=darkblue]Sub[/color] Web_Table_Option_OnepikeAlan1() [color=green]'version destroyed with comments etc. by Alan[/color] [color=green]'Dim lRow As Long'?? Not used at all (I think??)[/color] Cells.Clear [color=green]'clear cells and give me time to see...[/color] Application.Wait Now + TimeValue("00:00:02") [color=green]'....if code actually does anything[/color] [color=green]'[/color] [color=darkblue]Dim[/color] xml [color=darkblue]As[/color] [color=darkblue]Object[/color] ' Late Bin...d.... [color=green]'Set xml = CreateObject("MSXML2.XMLHTTP.6.0") ' ... ing --- original pike form (.3.0 .4.0 and .5.0 works too)[/color] [color=darkblue]Set[/color] xml = CreateObject("MSXML2.XMLHTTP") [color=green]' --- More usual form works as well[/color] [color=green]'Dim xml As MSXML2.XMLHTTP: Set xml = New MSXML2.XMLHTTP 'Early Binding ( .6.0 bit don't work?)TOOLS >>> REFERENCES >> tick Microsoft XML, v6.0 (or v3.0 or v4.0 or v5.0) >> "if you dont have version 6.0 for Microsoft XML any previous version should do" > http://www.mrexcel.com/forum/excel-questions/759592-help-createobject-msxml2-xmlhttp-macro.html[/color] [color=green]'[/color] xml.Open "GET", "http://www.ernaehrung.de/lebensmittel/de/F110000/Apfel.php", [color=darkblue]False[/color] 'Probably getting / opening somehow... xml.send [color=green]' to allow a highway in use like that in VBA "open for output as 1 etc.."[/color] [color=darkblue]Dim[/color] result [color=darkblue]As[/color] String: [color=darkblue]Let[/color] result = xml.responseText: Debug.Print result [color=green]'The response - pushed out of the highway. Looks Like HTML info to the site but not the source code[/color] [color=green]'Range("H1").Value = result'For me for a print out copy to text file etc.[/color] [color=green]'Dim html As Object 'late Binding[/color] [color=green]'Set html = CreateObject("htmlfile")[/color] [color=darkblue]Dim[/color] html [color=darkblue]As[/color] HTMLDocument: [color=darkblue]Set[/color] html = [color=darkblue]New[/color] HTMLDocument [color=green]'Early binding TOOLS >>> REFERENCES >>Microsoft HTML Object Library[/color] [color=darkblue]Let[/color] html.body.innerHTML = result [color=green]'Somehow this gives enough info html so that it can be further used to get anything from the full HTML Soruce file[/color] [color=darkblue]Dim[/color] objTable [color=darkblue]As[/color] [color=darkblue]Object[/color] [color=green]'An object to be set that has enough info in that we need to poppel out wot I want[/color] [color=darkblue]Set[/color] objTable = html.getElementsByTagName("table") [color=green]'seems to get 8 items tying up with the 8 tables outputed. This is the one object[/color] [color=darkblue]Dim[/color] ActRw [color=darkblue]As[/color] [color=darkblue]Long[/color] [color=green]'marker row incremented after writing out each table to just after last table[/color] [color=darkblue]Dim[/color] lngRow [color=darkblue]As[/color] Long: [color=darkblue]Dim[/color] lngCol [color=darkblue]As[/color] [color=darkblue]Long[/color] [color=green]'Variables for each row and column in each table[/color] [color=green]'Strategy is to go through each row in the objectTable then along the "cells" (columns) it has in that row and put it sequentially in the spreadsheet[/color] [color=darkblue]Dim[/color] lngTable [color=darkblue]As[/color] Long, TableCount [color=darkblue]As[/color] Long: [color=darkblue]Let[/color] TableCount = objTable.Length [color=green]'Gives 8 length is effectivelly Number/ Count[/color] [color=green]'For lngTable = 0 To objTable.Length - 1 'Going for each table .. -1 is probably the array start at 0 thing.[/color] [color=darkblue]For[/color] lngTable = 1 [color=darkblue]To[/color] objTable.Length - 1 [color=green]'The 1 mod over pike's original takes out something in <table> tags </table> thar is not really a table (that I want)[/color] [color=darkblue]For[/color] lngRow = 0 [color=darkblue]To[/color] objTable(lngTable).Rows.Length - 1 [color=green]'For each object this " objTable(lngTable).Rows.Length - 1 " gives the number of rows in that table[/color] [color=darkblue]For[/color] lngCol = 0 [color=darkblue]To[/color] objTable(lngTable).Rows(lngRow).Cells.Length - 1 [color=green]'This bit " objTable(lngTable).Rows(lngRow).Cells.Length " gives the columns ("Length" again really means Nzumber / Count)[/color] [color=darkblue]Let[/color] ThisWorkbook.Sheets("pike ").Cells(ActRw + lngRow + 1, lngCol + 1).Value = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText [color=green]'This " objTable(items 1 - 8) thing " has a massive amount of stuff in it. In the watch window i have not pinned down yet the bit I want... but This " objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText " has!!!![/color] [color=darkblue]Next[/color] lngCol [color=green]'go to next "Cell" in that table row (=Column in the table)[/color] [color=darkblue]Next[/color] lngRow [color=green]'Go to next row in this table[/color] [color=darkblue]Let[/color] ActRw = ActRw + objTable(lngTable).Rows.Length + 1 [color=green]'make sure the next table starts at somewhere around the end of the last one[/color] [color=darkblue]Next[/color] lngTable [color=green]'Go to the next table[/color] [color=darkblue]Set[/color] xml = [color=darkblue]Nothing[/color] [color=green]'Probably (no longer??) necerssary, but does no harm... who knows - compatibility with earliear versions etc..? that do not automatically shut things off??[/color] [color=darkblue]End[/color] [color=darkblue]Sub[/color] [color=green]''Sub Web_Table_Option_OnepikeAlan1() 'version destroyed with comments etc. by Alan' Usually I put a stop here so I do not loose Watch Window info[/color]
… I am sure you will have questions…..
Experience is great teacher but takes a lot of time. Any questions you have about the code, just ask and I will answer them.
Request.Open "GET", URL, [color=darkblue]True[/color]
Request.send
[color=darkblue]Open[/color] Request [color=darkblue]For[/color] [color=darkblue]Output[/color] [color=darkblue]As[/color] 1 [color=green]' or 2 or 3 etc.....[/color]
[color=green]' Convert the HTML code into an HTML Document Object.[/color]
HTMLdoc.write PageSrc
HTMLdoc.Close
HTMLdoc.body.innerHTML = PageSrc [color=green]'pike equivalent[/color]
[color=darkblue]Next[/color] n
[color=green]' Output the tables and format the columns.[/color]
Rng.Resize(ColumnSize:=colCnt).EntireColumn.AutoFit
Request.Open "GET", URL, True
Open Request For Output As #1
Set html = CreateObject("htmlfile")
html.body.innerHTML = result