VBA to fetch Whole webpage data from list of webpages excel

satish78

Board Regular
Joined
Aug 31, 2014
Messages
218
Hi Friends,

I tried finding VBA code on MRexcel forum. But did not find suitable one.

A VBA Macro to extract whole webpage data including hyperlinks for example url: https://www.dice.com/company/07960AP
[TABLE="width: 64"]
<colgroup><col width="64"></colgroup><tbody>[TR]
[TD="width: 64"][/TD]
[/TR]
</tbody>[/TABLE]
Not only from this website, I want the macro to extract any website page on internet.
Just data and hyperlinks would work.
Later, I will fetch data according to my requirement on different website pages.
Help me.

Satish
 

Excel Facts

Enter current date or time
Ctrl+: enters current time. Ctrl+; enters current date. Use Ctrl+: Ctrl+; Enter for current date & time.
Have a look at this, i'm also stuck on this one, see my last post on the thread, it has the best code so far https://www.mrexcel.com/forum/excel-questions/1040144-url-search.html

The bit it won't do for now is, go to next URL and also paste the data in the next blank row. I'm still working on it, if i get there before you I will post you the code, could you please do the same

Thanks
 
Last edited:
Upvote 0
I been through all replies by others on that post.
Non of those codes are completed. They are in conversation in resolving the errors.
 
Upvote 0
Heres the code,

it was on page 2

Private Sub CommandButton4_Click()

Dim i, j, k, l As Integer
i = 2
k = 2
l = 2
'SHEET2 as sheet with URL
Dim wsSheet As Worksheet, Rows As Long, links As Variant, IE As Object, link As Variant
Set wb = ThisWorkbook
Set wsSheet = wb.Sheets("Sheet2")

'Set IE = InternetExplorer
Set IE = CreateObject("InternetExplorer.Application")
Rows = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row
links = wsSheet.Range("A1:A" & Rows)

'IE Open Time per page 5sec and check links on Sheet2 Column A
With IE
.Visible = True
Application.Wait (Now + TimeValue("00:00:5"))

For Each link In links
.navigate (link)
While .Busy Or .ReadyState <> 4: DoEvents: Wend

For i = 0 To 500
On Error Resume Next

'Paste in sheet and column
Dim rw As Long

rw = Sheet1.Cells(Sheet1.Rows.Count, 1).End(xlUp).Row + 1
Sheets("Sheet1").Range("A" & rw).Value = IE.Document.getElementsByTagName("a").Item(i).innerText

Next i

'Deletes duplicates in column A Sheet1
Columns(1).RemoveDuplicates Columns:=Array(1)

'navigate links
Next link


Set sendAnEmail = IE.Document.getElementsByClassName("a-link-normal pr-email").Item(0)

sendAnEmail.Click

Application.Wait Now + TimeSerial(0, 0, 2)

retrievedEmail = sendAnEmail.innerText
End With

'Close IE Browser
IE.Quit
Set IE = Nothing

End Sub
 
Upvote 0
Its not doing the task exactly. It copies each sentence to each row instead off, copying into single cell.
 
Upvote 0
Anyway to loop through list of URLS instead of single website

Hi

I found some code which extracts source code of webpage. It extracts single page which is mentioned in code
I looking to extract whole webpage content instead of source code and want to loop through list of websites from sheet2

Private Sub HTML_VBA_Excel()
Dim oXMLHTTP As Object
Dim sPageHTML As String
Dim sURL As String

'Change the URL before executing the code
sURL = "http://WWW.medtronic.com"

'Extract data from website to Excel using VBA
Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
oXMLHTTP.Open "GET", sURL, False
oXMLHTTP.send
sPageHTML = oXMLHTTP.responseText

'Get webpage data into Excel
ThisWorkbook.Sheets(1).Cells(1, 1) = sPageHTML

MsgBox "XMLHTML Fetch Completed"

End Sub
 
Upvote 0
Re: Anyway to loop through list of URLS instead of single website

I'm sure i sent you a thread of what you want, although you did not like how it was extracting the data, the code did allow to use a list ouf url from sheet 2 and paste data into sheet 1. I am currently using it to search 1000 url,

the link to the code I sent you does 500

The bulk of what you want is in that code, just change with the code you have and it might work.
 
Upvote 0
Re: Anyway to loop through list of URLS instead of single website

Here is the code again, see if you can add your data to it, if it works let me know as i might also use it. See the bit in red, change 500 to a larger number to do more url

The bit in green will need changing, the rest of the code will do what you want.

The orange is optional

Code:
Private Sub CommandButton4_Click()

    Dim i, j, k, l As Integer
    i = 2
    k = 2
    l = 2
'SHEET2 as sheet with URL
    Dim wsSheet As Worksheet, Rows As Long, links As Variant, IE As Object, link As Variant
    Set wb = ThisWorkbook
    Set wsSheet = wb.Sheets("Sheet2")

'Set IE = InternetExplorer
    Set IE = CreateObject("InternetExplorer.Application")
    Rows = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row
    links = wsSheet.Range("A1:A" & Rows)

'IE Open Time per page 5sec and check links on Sheet2 Column A
    With IE
        .Visible = True
        Application.Wait (Now + TimeValue("00:00:5"))

        For Each link In links
            .navigate (link)
            While .Busy Or .ReadyState <> 4: DoEvents: Wend

            [COLOR=#ff0000]For i = 1 To 500[/COLOR]
                On Error Resume Next

[COLOR=#008000]'Paste in sheet and column
                Dim rw As Long
                
                rw = Sheet1.Cells(Sheet1.Rows.Count, 1).End(xlUp).Row + 1
                Sheets("Sheet1").Range("A" & rw).Value = IE.Document.getElementsByTagName("a").Item(i).innerText

            Next i[/COLOR]

'Deletes duplicates in column A Sheet1
            Columns(1).RemoveDuplicates Columns:=Array(1)

'navigate links
        Next link


[COLOR=#ff8c00]Set sendAnEmail = IE.Document.getElementsByClassName("a-link-normal pr-email").Item(0)

sendAnEmail.Click

Application.Wait Now + TimeSerial(0, 0, 2)

retrievedEmail = sendAnEmail.innerText
    End With[/COLOR]

'Close IE Browser
    IE.Quit
    Set IE = Nothing

End Sub
 
Upvote 0

Forum statistics

Threads
1,223,889
Messages
6,175,223
Members
452,620
Latest member
dsubash

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top