Automatically extract website data to a spreadsheet

Corried

Board Regular
Joined
Dec 19, 2019
Messages
217
Office Version
  1. 2019
Platform
  1. Windows
  2. Web
Hello, Thank you for sharing..

I am looking for courses on how to automatically extract data from a website to an excel spreadsheet.

The excel file contains fifty links. Each link has the same process, to create a stock market report for fifty companies. Use macro and chrome to generate data.

Can anyone show me a step by step process on how I can do this?

Please watch the video and Excel workbook below to learn more.



I hope you all understand.

Cheers
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
I have a question: are these two columns scraping the same measure?

52-Week Low & High5 Year High & Low
 
Upvote 0
Uood night Worf. I hope I understand your question. Both measurements may look the same but they are not. The 52 week high and low price may or may not have the same price for the 5 year high and low... Hope I answered the questions. I wish I saw your message earlier...
 
Upvote 0
All right. It is 11 PM and I am going offline.
I will be back in a couple of days...
 
Upvote 0
Here is what I have so far, the code starts with column C:

VBA Code:
Public driver As New ChromeDriver
Sub Money()
Dim pt As WebElement, key As New Selenium.Keys, s$
driver.get "https://www.msn.com/en-us/money/stockdetails/fi-bjqejc?duration=1D"
Application.Wait Now + TimeValue("0:00:06")
Set pt = driver.FindElementByXPath("//*[@id=""searchBox""]/input")
pt.Click
pt.SendKeys ActiveSheet.[c2]
Application.Wait Now + TimeValue("0:00:06")
DoEvents
Set pt = driver.FindElementByCss("#searchBox > span > span > svg")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
DoEvents
[a2] = driver.URL
[b2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[2]/div[1]/div[2]/span[1]").Text
[d2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[8]/div[2]").Text
[e2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[1]/div/div[1]").Text
s = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[2]").Text
s = s & " / " & driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[1]").Text
[h2] = s
[j2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[5]/div[2]").Text
' 5-year
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[1]/div[2]/div/div/button[8]").Click
Application.Wait Now + TimeValue("0:00:06")
[i2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div/table/tbody/tr[2]/td[12]/div").Text
' analysis
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[2]/div/div/button[5]/span").Click
Application.Wait Now + TimeValue("0:00:10")
driver.SwitchToNextWindow
s = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[2]/li[2]/p").Text
s = s & " / " & driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[3]/li[2]/p").Text
[g2] = s
[L2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[6]/li[2]/p").Text
[k2] = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[4]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' growth
Application.Wait Now + TimeValue("0:00:12")
'MsgBox driver.URL
Set pt = driver.FindElementByCss _
("#main > div.content-div.fullwidth.loaded > div.main-region.maincontainer.fullwidth.stckdtl" _
& " > div.dynaloadable > div:nth-child(4) > div > div > div.key-stats-area > ul > li:nth-child(2)")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
[F2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[2]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' company
driver.FindElementByXPath("//*[@id=""profile""]/a").Click
Application.Wait Now + TimeValue("0:00:06")
driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[2]/div/button[1]").Click
Application.Wait Now + TimeValue("0:00:06")
[m2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[1]").Text
End Sub
 
Upvote 0
To clarify, the code above reads the information at column C and populates the other columns.
This preliminary version works with a single row, only one company is scanned.
 
Upvote 0
To clarify, the code above reads the information at column C and populates the other columns.
This preliminary version works with a single row, only one company is scanned.
Okay. So it sounds like, once I paste the ticker symbol is into column C and code extracted the other data automatically.

And it is so, how many rows the code can function with out breaking or freezing the same during the process...

Thanks in advance
 
Upvote 0
Here is what I have so far, the code starts with column C:

VBA Code:
Public driver As New ChromeDriver
Sub Money()
Dim pt As WebElement, key As New Selenium.Keys, s$
driver.get "https://www.msn.com/en-us/money/stockdetails/fi-bjqejc?duration=1D"
Application.Wait Now + TimeValue("0:00:06")
Set pt = driver.FindElementByXPath("//*[@id=""searchBox""]/input")
pt.Click
pt.SendKeys ActiveSheet.[c2]
Application.Wait Now + TimeValue("0:00:06")
DoEvents
Set pt = driver.FindElementByCss("#searchBox > span > span > svg")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
DoEvents
[a2] = driver.URL
[b2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[2]/div[1]/div[2]/span[1]").Text
[d2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[8]/div[2]").Text
[e2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[1]/div/div[1]").Text
s = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[2]").Text
s = s & " / " & driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[1]").Text
[h2] = s
[j2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[5]/div[2]").Text
' 5-year
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[1]/div[2]/div/div/button[8]").Click
Application.Wait Now + TimeValue("0:00:06")
[i2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div/table/tbody/tr[2]/td[12]/div").Text
' analysis
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[2]/div/div/button[5]/span").Click
Application.Wait Now + TimeValue("0:00:10")
driver.SwitchToNextWindow
s = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[2]/li[2]/p").Text
s = s & " / " & driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[3]/li[2]/p").Text
[g2] = s
[L2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[6]/li[2]/p").Text
[k2] = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[4]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' growth
Application.Wait Now + TimeValue("0:00:12")
'MsgBox driver.URL
Set pt = driver.FindElementByCss _
("#main > div.content-div.fullwidth.loaded > div.main-region.maincontainer.fullwidth.stckdtl" _
& " > div.dynaloadable > div:nth-child(4) > div > div > div.key-stats-area > ul > li:nth-child(2)")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
[F2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[2]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' company
driver.FindElementByXPath("//*[@id=""profile""]/a").Click
Application.Wait Now + TimeValue("0:00:06")
driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[2]/div/button[1]").Click
Application.Wait Now + TimeValue("0:00:06")
[m2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[1]").Text
End Sub

Here is what I have so far, the code starts with column C:

VBA Code:
Public driver As New ChromeDriver
Sub Money()
Dim pt As WebElement, key As New Selenium.Keys, s$
driver.get "https://www.msn.com/en-us/money/stockdetails/fi-bjqejc?duration=1D"
Application.Wait Now + TimeValue("0:00:06")
Set pt = driver.FindElementByXPath("//*[@id=""searchBox""]/input")
pt.Click
pt.SendKeys ActiveSheet.[c2]
Application.Wait Now + TimeValue("0:00:06")
DoEvents
Set pt = driver.FindElementByCss("#searchBox > span > span > svg")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
DoEvents
[a2] = driver.URL
[b2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[2]/div[1]/div[2]/span[1]").Text
[d2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[8]/div[2]").Text
[e2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[1]/div/div[1]").Text
s = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[2]").Text
s = s & " / " & driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[1]/div[2]/div/div[3]/span[1]").Text
[h2] = s
[j2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div[2]/div[5]/div[2]").Text
' 5-year
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[1]/div[2]/div/div/button[8]").Click
Application.Wait Now + TimeValue("0:00:06")
[i2] = driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[2]/div/div[2]/div/table/tbody/tr[2]/td[12]/div").Text
' analysis
driver.FindElementByXPath("//*[@id=""root""]/div[1]/div/div[5]/div/div/div[1]/div/div[3]/div[2]/div/div/button[5]/span").Click
Application.Wait Now + TimeValue("0:00:10")
driver.SwitchToNextWindow
s = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[2]/li[2]/p").Text
s = s & " / " & driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[3]/li[2]/p").Text
[g2] = s
[L2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div/div[2]/div/ul[6]/li[2]/p").Text
[k2] = driver.FindElementByXPath _
("//*[@id=""main""]/div[2]/div[2]/div[2]/div[4]/div/div/div[5]/div[1]/div[2]/div[1]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' growth
Application.Wait Now + TimeValue("0:00:12")
'MsgBox driver.URL
Set pt = driver.FindElementByCss _
("#main > div.content-div.fullwidth.loaded > div.main-region.maincontainer.fullwidth.stckdtl" _
& " > div.dynaloadable > div:nth-child(4) > div > div > div.key-stats-area > ul > li:nth-child(2)")
pt.Click
Application.Wait Now + TimeValue("0:00:06")
[F2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div[3]/" & _
"div/div/div[5]/div[1]/div[2]/div[2]/div[1]/div/div/div/ul[1]/li[2]/span[1]/p").Text
' company
driver.FindElementByXPath("//*[@id=""profile""]/a").Click
Application.Wait Now + TimeValue("0:00:06")
driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[2]/div/button[1]").Click
Application.Wait Now + TimeValue("0:00:06")
[m2] = driver.FindElementByXPath("//*[@id=""main""]/div[2]/div[2]/div[2]/div/div[3]/div/div[2]/div[1]").Text
End Sub
Good day Worf,

I while waiting for your response. I am getting an error 13. I am not clear. Everything seems to be updated.

Please image of error below.

Thanks in advance
 

Attachments

  • Error 13.JPG
    Error 13.JPG
    21.3 KB · Views: 19
Upvote 0
Maybe you have an outdated executable Chrome driver at the Selenium Basic folder.

What is the date of that file? This is the first thing to check so we can be sure it is the correct one.
 
Upvote 0

Forum statistics

Threads
1,223,630
Messages
6,173,453
Members
452,514
Latest member
cjkelly15

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top