VBA and HTML Tables

zayd_88

New Member
Joined
Jun 1, 2012
Messages
11
Hey There,

Need help passing html tables from multiple pages from the same website into custom coloumns in an Excel sheet. I attempted using VBA to do this but I failed... Heres the Site:http://www.asx.com.au/asx/research/listedCompanies.do... Each company table has a unique code linked to a page with its details in an html page. I need to grab the company's name, address, phone number and a point of contact which I noticed is located on a different table... Anyway this is my current attempt.. Note this isnt completed...

Code:
Sub GetComp()
Dim ie As Object, allTables As Object, compTable As Object, posTable As Object, tblRow As Object, tblCell As Object
Dim myArr() As Variant, code As Range, y As Worksheet, cc As String
Dim compArrayTable() As String, posArrayTable() As String
Dim compRef As Long, posRef As Long, sheetRef As Long
Dim comp As String


Set ie = CreateObject("InternetExplorer.Application")
With Application
    .ScreenUpdating = False
    .DisplayAlerts = False
End With
Sheet1.Range("b1:h1").Value = Array( _
    "Company", "Address", "Suburb", "State", "Postcode", "Name", "Postion")

With ie
    For Each code In Range([q1], [q1027].End(3))
                    
                    
         cc = code.Value
         .navigate "http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode=" & cc
         Do While .Busy And .readyState <> 4
         Loop
        With .document
          
          Set compTable = .all.tags("table").Item(3)
          Set posTable = .all.tags("table").Item(4)
         
        
         
         ReDim compArrayTable(1 To compTable.Rows.Length - 2, 1 To 5)
         ReDim posArrayTable(1 To posTable.Rows.Length - 2, 1 To 2)
         
        End With
    Next
       
      
End With
        
End Sub

Any help will be greatly appreciated.

Thanks
Zayd :)
 

Excel Facts

Square and cube roots
The =SQRT(25) is a square root. For a cube root, use =125^(1/3). For a fourth root, use =625^(1/4).
How's something like this:
Code:
Sub getTd(Code As String)

    Dim doc As HTMLDocument
    Dim htmTable As HTMLTable
    
    Set doc = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode" & Code
        .send
        Do: DoEvents: Loop Until .readyState = 4
        doc.body.innerHTML = .responseText
        .abort
    End With
    
    Set htmTable = doc.getElementsByClassName("company-details")(0)
    
    With htmTable
        Debug.Print .Rows(0).Cells(1).innerText
        Debug.Print .Rows(6).Cells(1).innerText
        Debug.Print .Rows(7).Cells(1).innerText
    End With
    
     Set htmTable = doc.getElementsByClassName("company-details")(1)
    
    With htmTable
        Debug.Print .Rows(0).Cells(1).innerText
    End With

End Sub
 
Upvote 0
Wow you did that so fast :eeek:.. Ill give it a shot.. but just looking at the Sub do I need to pass the string variable in via a different sub?

Oh and thanks heaps Kyle :)
 
Upvote 0
Yes, you'll also need to set a reference to Microsoft HTML Object Library

Called with something like getTd "ACB"

No probs, it should be faster than automating IE :) Hope it helps, it's only a starter for ten since I don't know what you want to do with the returned data
 
Upvote 0
Ahh so in my case I should make this code:
Code:
 For Each code In Range([q1], [q1027].End(3)) cc = code.Value
put in a seperate fiunction, put your function in that block of code so I can pass the string of arrays in? Sorry I feel like im asking very obvious questions here :/
 
Upvote 0
Yes, so a function would be something like the below which returns an array of values
Code:
For Each Code In Range([q1], [q1027].End(3))
    With Code
        .Offset(, 1).Resize(,4).Value = getTd(.Value)
    End With
Next Code


Public Function getTd(Code As String) As Variant
    
    Dim arr(1 To 4) As String
    
    Dim doc As HTMLDocument
    Dim htmTable As HTMLTable
    
    Set doc = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "http://www.asx.com.au/asx/research/companyInfo.do?by=asxCode&asxCode=" & Code
        .send
        Do: DoEvents: Loop Until .readyState = 4
        doc.body.innerHTML = .responseText
        .abort
    End With
    
    Set htmTable = doc.getElementsByClassName("company-details")(0)
    

    With htmTable
        arr(1) = .Rows(0).Cells(1).innerText
        arr(2) = .Rows(6).Cells(1).innerText
        arr(3) = .Rows(7).Cells(1).innerText
    End With

     Set htmTable = doc.getElementsByClassName("company-details")(1)

    With htmTable
        arr(4) = .Rows(0).Cells(1).innerText
    End With

    getTd = arr
End Function
 
Last edited:
Upvote 0
Okay kyle I hope youre still awake, I think I just about got this... Well almost :confused:...I mus be really fustrating lol.. So the code I mentioned above should be in a void function/method right? I'm gonna use this code to set up the coloumn headers for each of the tables I pull
Code:
Sheet1.Range("b1:h1").Value = Array( _
    "Company", "Address", "Suburb", "State", "Postcode", "Name", "Postion")

This code will be at the begin at the begining of the void function/method/sub "whatever its called in VBA"

Now the most crucial part... Your code pulls all the values into an array called arr.. I assume this value is stored into getTD which can be referenced in a sub outside the function getTD.. So in the end I think my code for the sub should sorta look like this?

Code:
Sub setTD()
Sheet1.Range("b1:h1").Value = Array( _
    "Company", "Address", "Suburb", "State", "Postcode", "Name", "Postion")
 For Each Code In Range([q1], [q1027].End(3))
        With Code
            .Offset(, 1).Resize(, 4).Value = getTd(.Value)
            'So here after all the values are pulled I should reference the array  getTD into the worksheet
            '*********Some code*****
        End With
    Next Code
End Sub

Almost there buddy, hope youre still awake :)
 
Upvote 0
Oh and this line:
Code:
 Set htmTable = doc.getElementsByClassName("company-details")(0)
Causes an error when run.. "Object doesnt support property or method :confused:
 
Upvote 0
Ah right, that would be because you don't have ie9. That function was only added then, if you can't upgrade, I'll have a look and come back to you

On the previous question, yes basically although this line:
Code:
.Offset(, 1).Resize(, 4).Value = getTd(.Value)
Puts the data directly into columns R to U and loops through the rows, you'll need to change the offset to a minus number to get it back to column B
 
Upvote 0
Thats not a problem, I can get IE 9... Infact I have it on my laptop which is currently being used by my partner cause its new and faster.. Ill download it to this one now... But as for getting the data into the sheet. Do I need to declare a variable to count the rows of the sheet and add one to it within the for-loop in the sub? oh and How do I get array values into the right spots?

Btw I cant thank you enough for your help.. After all this I wouldnt mind giving you a lil something for your time... and by something I mean cash or a beer lol

:)
 
Upvote 0

Forum statistics

Threads
1,223,231
Messages
6,170,884
Members
452,364
Latest member
springate

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top