Scraping data from the web when it's not in the HTML

chufo_q09 (New Member, joined Dec 18, 2017, 2 messages)
Hey y'all!

So, regarding an "old" thread I saw in this forum: one of the users asked whether it was possible to pull data from the tables of the webpage below. The tricky part is that the HTML source of the page doesn't contain the table data, so we need a different approach.


So, my expert friends, what are your proposals? I have no clue right now, and it is a really interesting and useful topic for anyone who wants to dig deeper into web scraping!

Thank you, link below:

http://quote.morningstar.ca/Quicktakes/stock/keyratios.aspx?t=GNTX&region=USA&culture=en-CA&ops=clear


 

Which data isn't in a table? I've only scraped a dozen or so pages, but this looks to be a bunch of tables to me.

Plan B would be to copy everything across and get the information through Formulas.
 
Upvote 0
Code:
Sub scrapeDATA()

    ' Requires a reference to "Microsoft HTML Object Library" (Tools > References)
    Dim appIE As Object
    Dim HTMLdoc As MSHTML.HTMLDocument

    Application.ScreenUpdating = False
    Sheets("ChargeData").Cells.ClearContents

    Set appIE = CreateObject("InternetExplorer.Application")

    With appIE
        .Visible = True
        .Navigate "http://quote.morningstar.ca/Quicktakes/stock/keyratios.aspx?t=GNTX&region=USA&culture=en-CA&ops=clear"

        ' 4 = READYSTATE_COMPLETE (literal used because appIE is late-bound)
        While .Busy Or .ReadyState <> 4: DoEvents: Wend
    End With

    Set HTMLdoc = appIE.Document
    ProcessHTMLPage HTMLdoc
    appIE.Quit

    Application.ScreenUpdating = True

End Sub


Sub ProcessHTMLPage(HTMLPage As MSHTML.HTMLDocument)

    ' Prints the text of every table cell to the Immediate window (Ctrl+G)
    Dim HTMLTable As MSHTML.IHTMLTable
    Dim HTMLRow As MSHTML.IHTMLTableRow
    Dim HTMLCell As MSHTML.IHTMLElement

    For Each HTMLTable In HTMLPage.getElementsByTagName("table")

        For Each HTMLRow In HTMLTable.Rows

            ' Print each cell's own text, not the whole table's text
            For Each HTMLCell In HTMLRow.Cells
                Debug.Print HTMLCell.innerText
            Next HTMLCell

        Next HTMLRow

    Next HTMLTable

End Sub
 
Upvote 0
The code above loops through the table rows; the code below copies the whole page text to the "ChargeData" sheet:


Code:
Sub scrapeDATA()

    ' Writes the rendered text of the page to column A of "ChargeData", one line per row
    Dim appIE As Object
    Dim pageText As String
    Dim textLines As Variant

    Application.ScreenUpdating = False
    Sheets("ChargeData").Cells.ClearContents

    Set appIE = CreateObject("InternetExplorer.Application")

    With appIE
        .Visible = True
        .Navigate "http://quote.morningstar.ca/Quicktakes/stock/keyratios.aspx?t=GNTX&region=USA&culture=en-CA&ops=clear"

        ' 4 = READYSTATE_COMPLETE (literal used because appIE is late-bound)
        While .Busy Or .ReadyState <> 4: DoEvents: Wend

        ' Normalize line endings to a single line feed, then split into lines
        pageText = .Document.body.innerText
        pageText = Replace(pageText, vbCrLf, vbLf)
        pageText = Replace(pageText, vbCr, vbLf)
        textLines = Split(pageText, vbLf)

        ' Split returns a zero-based array, so it holds UBound(textLines) + 1 items
        Sheets("ChargeData").Range("A1").Resize(UBound(textLines) + 1) = Application.Transpose(textLines)
    End With

    appIE.Quit

    Application.ScreenUpdating = True

End Sub
 
Upvote 0
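Both macros above wait only for IE's ReadyState before reading the page. If the table data is filled in by a script after the initial load (an assumption, but it would explain why it is missing from the HTML source), the document may be read too early. A rough sketch of a polling helper follows; the name WaitForElements, the "td" tag, the minimum count and the timeout are all hypothetical choices, not something from the thread.

```vba
' Hypothetical helper: polls the document until at least minCount elements
' with the given tag exist, or until timeoutSeconds have passed.
' Returns True if the elements showed up in time, False otherwise.
' Note: Timer resets at midnight, so this sketch is not safe across midnight.
Function WaitForElements(doc As Object, tagName As String, _
                         minCount As Long, _
                         Optional timeoutSeconds As Double = 15) As Boolean

    Dim startTime As Double
    startTime = Timer

    Do
        DoEvents    ' let IE keep running its scripts
        If doc.getElementsByTagName(tagName).Length >= minCount Then
            WaitForElements = True
            Exit Function
        End If
    Loop While Timer - startTime < timeoutSeconds

End Function
```

It could be called right after the ReadyState loop, for example `If Not WaitForElements(appIE.Document, "td", 50) Then MsgBox "Tables did not load"`. The threshold of 50 cells is a guess and would need adjusting for the actual page.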
Yeah!! I get your solution! Amazing! But in that case, recovering the data with formulas looks really difficult, doesn't it?

Isn't it possible to spread the numbers and values across columns as well as rows?

Thank you!
 
Upvote 0
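One possible way to get the values into separate columns is to walk the tables row by row and cell by cell instead of dumping the page text. This is only a sketch: the name WriteTablesToSheet is hypothetical, it assumes a reference to "Microsoft HTML Object Library" is set, and it assumes the tables are already present in the loaded document. Merged cells (colspan/rowspan) are not handled.

```vba
' Hypothetical helper: writes every table of an already-loaded page to a sheet,
' one worksheet row per table row and one worksheet column per table cell.
Sub WriteTablesToSheet(HTMLPage As MSHTML.HTMLDocument, ByVal ws As Worksheet)

    Dim tbl As MSHTML.IHTMLTable
    Dim rw As MSHTML.IHTMLTableRow
    Dim cell As MSHTML.IHTMLElement
    Dim r As Long, c As Long

    r = 1
    For Each tbl In HTMLPage.getElementsByTagName("table")
        For Each rw In tbl.Rows
            c = 1
            For Each cell In rw.Cells
                ws.Cells(r, c).Value = Trim$(cell.innerText)
                c = c + 1
            Next cell
            r = r + 1
        Next rw
        r = r + 1   ' leave a blank row between tables
    Next tbl

End Sub
```

In the first macro it could be called in place of `ProcessHTMLPage HTMLdoc`, as `WriteTablesToSheet HTMLdoc, Worksheets("ChargeData")`.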
