Web Scrape Error : getElementsByClassName

Excel_Novice_123

New Member
Joined
Apr 2, 2021
Messages
2
Office Version
  1. 365
Platform
  1. Windows
Hi all,

Really grateful if someone can help me with the below. I am trying to get some news links from the website in the below code. It previously used to work fine and recently is causing issues, but I can't seem to figure out why.

In terms of what the code should be doing is extracting the timestamp for the fist 5 news articles in column A. Thereafter in columns b and c extracting the headline and the reference website for each of the first 5 articles.

When i run the code the first For Next loop gives an error "Object variable or with block variable not set".

If I skip over this the second For Next loop gives blank output and doesn't retrieve anything from the website.


VBA Code:
Sub Get_Web_Data()

Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim timeStamp As String

website = "https://www.forexfactory.com/news"
Set request = CreateObject("MSXML2.XMLHTTP")
request.Open "GET", website, False
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
request.send
response = StrConv(request.responseBody, vbUnicode)
html.body.innerHTML = response

For i = 1 To 5
timeStamp = html.getElementsByClassName("flexposts__nowrap flexposts__time nowrap").Item(i - 1).innerText
ActiveSheet.Cells(i + 1, 1).Value = Trim(timeStamp)
Next

Set my_data = html.getElementsByClassName("flexposts__title title")
Dim link
i = 1
For Each elem In my_data
 Set link = elem.getElementsByTagName("a")(0)
 i = i + 1
 If i > 6 Then
 Exit For
 End If
 ActiveSheet.Cells(i, 3).Value = link.href
 ActiveSheet.Cells(i, 2).Value = link.innerText
Next

End Sub
 

Excel Facts

How to fill five years of quarters?
Type 1Q-2023 in a cell. Grab the fill handle and drag down or right. After 4Q-2023, Excel will jump to 1Q-2024. Dash can be any character.
Have you looked at the HTML returned by your request? It includes JavaScript code and data which prevents non-browser requests from seeing the web data. The HTML doesn't contain elements with the class name "flexposts__nowrap flexposts__time nowrap", hence the error.

The easiest solution is to use IE or Chrome to scrape the data. Slower than XMLhttp requests, but they automatically decode the JavaScript, giving you the web data.
 
Upvote 0
the site has request cookies which are not managed by XMLHTTP. it must have been recoded recently if it used to work :(
 
Upvote 0
Thanks very much John_w and Diddi for the responses above, that's very helpful.

Please could i ask if you would have an example of the code that shows how to scrape with IE or Chrome? Thanks
 
Upvote 0
do a search for posts concerning 'selenium' it is the chrome automation tool. not something i have experience in tho
 
Upvote 0
A Selenium thread:

 
Upvote 0
@Worf. is there a thread on installing selenium for win10. i had a try and i cant seem to see it in excel
 
Upvote 0
When I came across Selenium, I used a helpful guide that I can now no longer find - the good news is that Wise Owl recently produced a video on getting started with Selenium (Youtube).
Basically, you need to:

1. Download the executable (see Releases)
2. Install SeleniumBasic
3. Download ChromeDriver (or whatever driver you need for whichever browser you want to use) - you need to make sure that the version of ChromeDriver matches your version of Chrome (v89 = v89).
4. Once the ChromeDriver.EXE file has downloaded, you need to help Selenium find it. According to Wise Owl (See here), you need to copy/move the ChromeDriver to the folder containing SeleniumBasic. I'm don't think that's the only way of doing it - according to the website, you just need to include the ChromeDriver location in your PATH environment variable (instructions).
5. You will need to add Selenium Basic to your project by adding it as a reference in the VBE.

I'd recommend looking at the VBScript examples provided in the package. Note that some of the examples use different drivers (for Firefox etc), so don't be surprised if they don't work - you will need to adjust the code to makeit work with your particular driver.
 
Upvote 0
thx. i got the example running under chrome
 
Upvote 0

Forum statistics

Threads
1,223,532
Messages
6,172,875
Members
452,486
Latest member
standw01

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top