I have been scraping data from Yahoo Finance pages for years with VBA code in Excel (now in Excel 2010) by opening an IE window and working with the data from the document HTML. Aside from adjusting for Y.F. making changes often I've been able to get everything I need either from stock quote pages or from portfolios I have created. That is until earlier this month. Suddenly this message began appearing on many pages opened in Internet Explorer:
"Some parts of this page is not supported on your current browser version. Please upgrade the browser to latest."
I'm working on a Win 10 PC with Excel 2010 and IE 11.
the message doesn't appear on Edge, Firefox, Opera, or on my iphone. One reason is that IE is no longer supported by Microsoft and Finance Yahoo has, I guess, decided to follow their encouragement to stop development on web pages for IE. I'm sure there are other reasons but it has laid bare the vulnerability of anyone's VBA code that relies on opening an IE window for any scraping. Trouble is, I can't seem to find another approach that avoids the error message.
I found some great APIs for a lot of the data but there are other pages (such as the portfolio pages) and other sites that I also scrape data from that I either can't find an API for or have confirmed that the company doesn't offer an API for the data. So I need a way to scrape data.
VBA in Excel (and I guess other Office products) only supports Internet Explorer. There is no native way to open a different browser. I sure would welcome some help and discussion on this issue. I understand that Office 365 is working on a new webview for Edge but that is of little help to those of us who don't subscribe.
Here is a list of the ways I have tried to get around this and the results I have had so far (all failures). Please comment, offer suggestions for further research, or laugh and tell me to give up (hopefully not that). Also please tell me how to format this better if it looks bad; I'm new at this. To decide if an approach is successful I looked for the error message above and for specific data for stocks such as AAPL, ABT, ADBE in my portfolios.
1) VBA in an Excel macro:
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The response includes the error message and is missing the chart. When I use the URL for one of my portfolios, the message is included but the stock data is missing.
2) also VBA in Excel Macro:
Same as above except xmlhttp is defined as New WinHttpRequest and the .setRequestHeader line is removed.
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The results are the same.
3) I tried using Selenium (both basic and wrapper) and figured out it doesn't work with Firefox any more. I couldn't get it to work with Opera either. I was able to get some response from PhantomJS but haven't been able to get the data back in a way that I can examine it.
I don't know enough of the Selenium commands to look more closely at the response but the set objnode1 statement returned an error.
4) I gave up and wrote a Python script with BeautifulSoup to try to get the data:
The beginning of this response includes: html id="atomic" class="NoJs featurephone" lang="en-US"
This also returned the error message and when I used the URL for one of my portfolios it did not return any stock data.
Sorry for the long post. I've been searching and trying as much as I can but I'm stumped. I'm hoping I have overlooked something simple (I can hope) or that I will have at least saved someone else some time so they don't have to try the same things again that I've tried. I find it hard to believe that all of these approaches are being caught by Finance Yahoo and are being treated the same way as they treat IE but that's what my testing is showing. Any help or suggestions would be most appreciated. By the way the first line of the HTML (as seen from the developers tools) in various browsers looks like this:
Firefox: html id="atomic" class="firefox desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport HideNavrail hasScrolled scrollDown" lang="en-US"
IE: html class=" lightweight ie-11 desktop JsEnabled" id=atomic" lang="en-US"
Opera (surprisingly): html id="atomic" class="chrome desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport hasScrolled scrollDown ShowNavrail" lang="en-US"
Could the JSEnabled and the NoJS things be a clue to the problem here? There seems to be a pattern but I have no clue what it means or how to do anything about it if it is a clue.
Thanks
Phil
"Some parts of this page is not supported on your current browser version. Please upgrade the browser to latest."
I'm working on a Win 10 PC with Excel 2010 and IE 11.
the message doesn't appear on Edge, Firefox, Opera, or on my iphone. One reason is that IE is no longer supported by Microsoft and Finance Yahoo has, I guess, decided to follow their encouragement to stop development on web pages for IE. I'm sure there are other reasons but it has laid bare the vulnerability of anyone's VBA code that relies on opening an IE window for any scraping. Trouble is, I can't seem to find another approach that avoids the error message.
I found some great APIs for a lot of the data but there are other pages (such as the portfolio pages) and other sites that I also scrape data from that I either can't find an API for or have confirmed that the company doesn't offer an API for the data. So I need a way to scrape data.
VBA in Excel (and I guess other Office products) only supports Internet Explorer. There is no native way to open a different browser. I sure would welcome some help and discussion on this issue. I understand that Office 365 is working on a new webview for Edge but that is of little help to those of us who don't subscribe.
Here is a list of the ways I have tried to get around this and the results I have had so far (all failures). Please comment, offer suggestions for further research, or laugh and tell me to give up (hopefully not that). Also please tell me how to format this better if it looks bad; I'm new at this. To decide if an approach is successful I looked for the error message above and for specific data for stocks such as AAPL, ABT, ADBE in my portfolios.
1) VBA in an Excel macro:
Code:
Sub newget()
Dim resp As String
Dim xmlhttp As New MSXML2.XMLHTTP60
With xmlhttp
.Open "GET", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL", False
.setRequestHeader "User-Agent", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"
.send
resp = .ResponseText
End With
End Sub
The response includes the error message and is missing the chart. When I use the URL for one of my portfolios, the message is included but the stock data is missing.
2) also VBA in Excel Macro:
Same as above except xmlhttp is defined as New WinHttpRequest and the .setRequestHeader line is removed.
The beginning of this response includes: html id="atomic" class="NoJs netscape desktop" lang="en-US"
The results are the same.
3) I tried using Selenium (both basic and wrapper) and figured out it doesn't work with Firefox any more. I couldn't get it to work with Opera either. I was able to get some response from PhantomJS but haven't been able to get the data back in a way that I can examine it.
Code:
Sub newerget()
Dim selWD As SeleniumWrapper.WebDriver
Dim objnode1 As Object
Dim objNode2 As Object
Set selWD = New SeleniumWrapper.WebDriver
selWD.Start "phantomjs", "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL"
selWD.setTimeout ("120000")
selWD.setImplicitWait (5000)
Set objnode1 = selWD.findElementById("atomic")
End Sub
4) I gave up and wrote a Python script with BeautifulSoup to try to get the data:
Code:
from bs4 import BeautifulSoup
from urllib.request import urlopen
html=urlopen('https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL')
print(html.read())
This also returned the error message and when I used the URL for one of my portfolios it did not return any stock data.
Sorry for the long post. I've been searching and trying as much as I can but I'm stumped. I'm hoping I have overlooked something simple (I can hope) or that I will have at least saved someone else some time so they don't have to try the same things again that I've tried. I find it hard to believe that all of these approaches are being caught by Finance Yahoo and are being treated the same way as they treat IE but that's what my testing is showing. Any help or suggestions would be most appreciated. By the way the first line of the HTML (as seen from the developers tools) in various browsers looks like this:
Firefox: html id="atomic" class="firefox desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport HideNavrail hasScrolled scrollDown" lang="en-US"
IE: html class=" lightweight ie-11 desktop JsEnabled" id=atomic" lang="en-US"
Opera (surprisingly): html id="atomic" class="chrome desktop JsEnabled layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport hasScrolled scrollDown ShowNavrail" lang="en-US"
Could the JSEnabled and the NoJS things be a clue to the problem here? There seems to be a pattern but I have no clue what it means or how to do anything about it if it is a clue.
Thanks
Phil