Scraping Data from Webpage document into Excel

Corried

Board Regular
Joined
Dec 19, 2019
Messages
217
Office Version
  1. 2019
Platform
  1. Windows
  2. Web
Hello Excel Professionals and everyone who volunteer their support.

I am having a problem. I am not sure if "Mr. Worf" or anyone out there can help me…

I am asking for your time and patience and your knowledge, as this may sound complicated to understand…

I have 18 weblinks which are connected into one website, which I would like to scrape all data “text summary” into "sheet1", which is named “January”.

Now each link data scraping is copied from website and pasted using one macro button. If possible…

The data is pasted in order beginning in Cell B3. Here are the links below:

Now on each link or webpage, there is a text summary at the top of each page. Please view the 1st link example. Australia Manufacturing PMI | 1992-2020 Data | 2021-2022 Forecast | Calendar Here is a snapshot highlighted in blue:
Australia Manufacturing PMI.JPG

The problem is, how can someone point me to the right direction? And how can I do this?
Is there a easily or effective way I can use to get the end results. And If so, what do you suggest I do?

Please note. as new data should be updated every time...

Thanks in advance


Cheers.
 
Last edited:

Excel Facts

Round to nearest half hour?
Use =MROUND(A2,"0:30") to round to nearest half hour. Use =CEILING(A2,"0:30") to round to next half hour.
Hi Corried,

Kindly use below macro to extract the required description in specified cells.
(see images)
1. Name of Sheet having URL of all sites - "Sites"
2. Name of Sheet where data extracted - "January"


Sub scrap_Website_className()
On Error Resume Next

Dim HTMLDoc As New HTMLDocument
Dim ieBrowser As New InternetExplorer
Dim lastRow As Byte, counter As Byte

Dim trow As Object

lastRow = WorksheetFunction.CountA(Worksheets("Sites").Range("A:A"))

For counter = 1 To lastRow

'To Open website in Internet Explorer
ieBrowser.navigate Sheets("Sites").Range("A" & counter)

Do
' Wait till the Browser is loaded
Loop Until ieBrowser.readyState = READYSTATE_COMPLETE
Application.Wait (Now() + TimeValue("00:00:03"))

Set HTMLDoc = ieBrowser.document

Sheets("January").Cells(counter + 2, 2) = HTMLDoc.getElementsByTagName("h2")(0).innerText
Next
MsgBox "Content Copied"
End Sub
 

Attachments

  • sites.PNG
    sites.PNG
    28.8 KB · Views: 24
  • text description.PNG
    text description.PNG
    32.8 KB · Views: 27
Upvote 0
Solution
Hi Corried,

Kindly use below macro to extract the required description in specified cells.
(see images)
1. Name of Sheet having URL of all sites - "Sites"
2. Name of Sheet where data extracted - "January"


Sub scrap_Website_className()
On Error Resume Next

Dim HTMLDoc As New HTMLDocument
Dim ieBrowser As New InternetExplorer
Dim lastRow As Byte, counter As Byte

Dim trow As Object

lastRow = WorksheetFunction.CountA(Worksheets("Sites").Range("A:A"))

For counter = 1 To lastRow

'To Open website in Internet Explorer
ieBrowser.navigate Sheets("Sites").Range("A" & counter)

Do
' Wait till the Browser is loaded
Loop Until ieBrowser.readyState = READYSTATE_COMPLETE
Application.Wait (Now() + TimeValue("00:00:03"))

Set HTMLDoc = ieBrowser.document

Sheets("January").Cells(counter + 2, 2) = HTMLDoc.getElementsByTagName("h2")(0).innerText
Next
MsgBox "Content Copied"
End Sub
Hi Saurabhj

The code you have provide for me have worked wonderfully. I appreciate the extra step that you have given to me by creating the steps so simple.

At first. I had a couple of errors. As I did not have Microsoft HTML Object Library & Microsoft Internet Control check box tick. Once this were rectified, a matter of seconds. I got the info.

I just want to say.

Thank you very much.

 
Upvote 0

Forum statistics

Threads
1,224,820
Messages
6,181,159
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top