Web Query -- Web Scraping Stock info from Thousands of Pages

arthur72j

New Member
Joined
Apr 2, 2021
Messages
5
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
  2. MacOS
HI,

I am attempting to pull data from MarketWatch, and am proving unsuccessful thus far, any help would be much appreciated.

I have a list of all Tickers on NASDAQ (over 3000 of them), and am attempting to scrape from using a table with a list of all URLs (="https://www.marketwatch.com/investing/stock/"&B2&"/analystestimates?mod=mw_quote_tab" -> where "&B2&" links to the ticker in that row [B2 is 'A']). I am trying to pull "High", "Low", and "Average" analyst estimates from the table within the webpages (uninterested in "Median" and "Current Price" -> I removed them when linking the webpage to the worksheet.

For individual, and a small number of rows, it works, but when trying to get it to work for even just a list of the S&P500 tickers (500 of them), and for the over 3000 rows I need it to work, it just comes up with with an error and tells me to quit Excel. I have attached photos of my worksheet, and hidden irrelevant columns.

I am working on Excel through a VM (Parallels) on an M1 Mac which I believed may have effected it, however I had a CS friend working on a Windows PC try it and it came out with the same error.

I apologise in advance if there was a thread for this, I couldn't find it.

I tried to upload more photos of the Power Query Editor, but it wouldn't let me.

I would greatly appreciate any help.

1617397363354.png
Screenshot 2021-04-02 at 21.36.19 copy.jpg
 
@ *arthur72j

Does the following help?

VBA Code:
Range("H2").Value = Doc.getElementsByClassName("table__cell w25 ")(11)    ' Get High Price Target
Range("I2").Value = Doc.getElementsByClassName("table__cell w25 ")(13)    ' Get Low Price Target
Range("J2").Value = Doc.getElementsByClassName("table__cell w25 ")(14)    ' Get Average Price Target

JohnnyL,

I haven't used VBA before, so I am not too sure what I am doing, but I have gone in, added a new module, pasted your code (thank you) but I am unsure as to what to set the first line to (I am assuming 'sub' or 'var' followed by get 'ElementsByClassName"). It arises a "Run-time error '424'" message with "Object Required." I apologise for my lack of understanding, but any further clarity would be much appreciated.
 
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
JohnnyL,

I haven't used VBA before, so I am not too sure what I am doing, but I have gone in, added a new module, pasted your code (thank you) but I am unsure as to what to set the first line to (I am assuming 'sub' or 'var' followed by get 'ElementsByClassName"). It arises a "Run-time error '424'" message with "Object Required." I apologise for my lack of understanding, but any further clarity would be much appreciated.
@ arthur72j It appears that you are using Power Query, unfortunately, I have no experience with that. I did a quick google search and it appears that you can intermix the Power Query and VBA code. How you would do that, I am unsure though. I can whip up some vba code, assuming that all of links to the stocks are located in the 'R' column, but it would be up to you to find out how to get the Power Query to work with VBA code because as I said, I have no experience with Power Query. You may get lucky and another member here may assist you in that matter.
 
Last edited:
Upvote 0
Hi !​
Thank you for your suggestion, but I am still looking to scrape the web personally.
Why not as this website is easy to scrape and as the post #5 code raises an error on my side​
I have done a VBA procedure well working but I'm under PC / Windows obviously so no idea if it can work on your Windows session on Mac …​
I just register to this forum and I did not know workbook attachment is not possible here so you will have​
to start from a brand new workbook and if you are a good enough reader you will be able to test this VBA procedure.​
This test is just to check if this procedure works on your side, not your final solution as this procedure is just the spare​
- like a spare wheel, often uggly but it does the job when necessary ! - of some main 'parallel tasks' procedure,​
if the spare don't fit the main neither …​
Headers on your new test workbook :​
MarketPlace Spare Headers .jpg
• Cell D1 must contain the text Stock Price Targets without merging cells and without any typo like in headers D2:F2
as the VBA procedure read these headers to scrape the webpages …​
• First ticker starts in row #3 in column C. Just fill 20 tickers, no need to fill columns A & B …​
• Once the VBA procedure is located where it must be, you can launch the test just double clicking on cell E1 !​
During the execution the headers D2:F2 and cells below become grey (meaning the procedure is running)​
and you can follow the progress in the status bar …​
The spare VBA procedure to paste to the worksheet module :​

VBA Code:
Private Sub Spare()
      Const W1 = "https://www.marketwatch.com/investing/stock/", W2 = "/analystestimates?mod=mw_quote_tab"
        Dim T$, H, M, V, W, R&, N$(1), S, C%
    With [A1].CurrentRegion.Rows
            If .Count < 3 Then Beep: Exit Sub
            T = ">" & .Cells(4).Text & "</span>"
        With .Item("3:" & .Count).Columns("D:F")
            H = Evaluate(""">""&" & .Rows(0).Address & "&""</td>""")
            M = Evaluate("(" & .Item(-2).Address & "="""")*1+(" & .Item(-1).Address & "="""")*1>0")
            V = .Value2
            W = .Item(0).Value2
            Union(.Rows(0), .Cells).Font.ColorIndex = 15
    With CreateObject("WinHttp.WinHttpRequest.5.1")
            On Error Resume Next
        For R = 1 To UBound(V)
            Application.StatusBar = "          Requesting row #" & R + 2
            If R Mod 6 = 0 Then DoEvents
            Err.Clear
           .Open "GET", W1 & W(R, 1) & W2, False
           .setRequestHeader "DNT", "1"
           .send
            If Err.Number = 0 Then
                If M(R, 1) Then
                    N(0) = Split(Split(.responseText, "class=""company__name"">")(1), "<")(0)
                    N(1) = Split(Split(Split(.responseText, "class=""company__market"">")(1), "<")(0), ": ")(1)
                    [A:B].Rows(R + 2).Value2 = N
                    Erase N
                End If
                    S = Split(.responseText, T)
                    If UBound(S) > 0 Then S = Split(Split(S(1), "</tbody>")(0), "<tbody>")
            End If
        For C = 1 To UBound(V, 2)
            If Err.Number Then
                V(R, C) = "¤"
            ElseIf .Status = 200 Then
                If UBound(S) > 0 Then
                    S = Split(S(1), H(C))
                    If UBound(S) > 0 Then V(R, C) = Split(Split(S(1), ">$")(1), "<")(0) Else V(R, C) = "?"
                Else
                    V(R, C) = "!"
                End If
            Else
                V(R, C) = "§"
            End If
        Next C, R
    End With
           .Value2 = V
            Application.Speech.Speak IIf(Application.CountA(.Cells) - Application.Count(.Cells), _
                                         "Some request failed !", "Done"), True
            On Error GoTo 0
            Application.StatusBar = False
            Union(.Rows(0), .Cells).Font.ColorIndex = xlAutomatic
        End With
    End With
End Sub

Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
    If Target.Address = "$E$1" Then Cancel = True: Spare
End Sub
 
Upvote 0
Working on my side on two computers, different Excel & Windows versions …​
First, what are the signs (characters) returned ?​
Windows & Excel version ?​
Check for any typo in cells D1, D2, E2 & F2 …​
 
Upvote 0
Working on my side on two computers, different Excel & Windows versions …​
First, what are the signs (characters) returned ?​
Windows & Excel version ?​
Check for any typo in cells D1, D2, E2 & F2 …​
If Err.Number Then
V(R, C) = "¤"

Windows 7 / Excel2013

No Typos
 
Upvote 0
@ arthur72j It appears that you are using Power Query, unfortunately, I have no experience with that. I did a quick google search and it appears that you can intermix the Power Query and VBA code. How you would do that, I am unsure though. I can whip up some vba code, assuming that all of links to the stocks are located in the 'R' column, but it would be up to you to find out how to get the Power Query to work with VBA code because as I said, I have no experience with Power Query. You may get lucky and another member here may assist you in that matter.

Does the attachment look close to what you are shooting for?
 

Attachments

  • Test.PNG
    Test.PNG
    113.8 KB · Views: 12
Upvote 0
V(R, C) = "¤"
So it seems the error is at request level but to be sure put this block in comment and retry :​
VBA Code:
                If M(R, 1) Then
                    N(0) = Split(Split(.responseText, "class=""company__name"">")(1), "<")(0)
                    N(1) = Split(Split(Split(.responseText, "class=""company__market"">")(1), "<")(0), ": ")(1)
                    [A:B].Rows(R + 2).Value2 = N
                    Erase N
                End If
 
Upvote 0

Forum statistics

Threads
1,225,749
Messages
6,186,802
Members
453,373
Latest member
Ereha

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top