Scaping data from specific URL into Excel

MalanieT

New Member
Joined
Mar 15, 2021
Messages
1
Office Version
  1. 2016
Platform
  1. Windows
Good day All,

I am new on this forum. I hope someone will be able to assist me.
I have an excel document with a few Companies which I need to update pricing on products of a certain brand. To do this manually is a nightmare.
Is there a way that I can put a URL for that certain brand in a cell and then excel can provide me with the item names/codes of the product with the pricing?
Below is a sample url I will use:

Example of what I will need is taken from the first product:

Column A (Product Name) Column B (Retail Price)(Not always there) Column C (Current Price)
Brother Ink Tank DCP-T510W 3in1 Printer with WiFi R 3,699 R 3,199

It then need to look for all products listed under that URL and load it onto Excel
I'd like to have the product names all in Column A, Price A in Column B and Price B in Column C

I hope this is possible to be done.
I tried the web query function, but it does not pick the link up as a table, so I can't import the info through web query.

Would this be possible?
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
the site will be difficult to scrape. it has complex cookies which have to be sent and received so you cannot use techniques like ie automation or xmlhttp. it may be possible using selenium, but i dont use that myself.
 
Upvote 0
Is there a way that I can put a URL for that certain brand in a cell and then excel can provide me with the item names/codes of the product with the pricing?
Hi,​
are you still in need Malanie ?​
As your sample webpage is very not difficult to scrape under Ms IE - but slow - and the code will work only for this website (Takealot.com)​
If you are still in need, give me another sample url on the same website …​
 
Upvote 0
Hello, try this
VBA Code:
Const myURL As String = "https://www.takealot.com/all?sort=Relevance&custom=brother-printers"

Sub Scraping()
    Dim IE As Object
    Dim Doc As Object
    Dim elements As Object
    Dim element As Object
    Dim r As Long

    Set IE = CreateObject("InternetExplorer.Application")
    
    With IE
        .navigate myURL
        .Visible = True
        Do While .Busy: DoEvents: Loop
        Do While .readyState <> 4: DoEvents: Loop
    End With

    Set Doc = IE.document
    
    Cells.Clear
    
    Application.Wait (Now + TimeValue("00:00:07"))
    
    Set elements = Doc.getElementsByClassName("product-card-module_title-wrapper_1sj9D")
    
    For Each element In elements
        r = r + 1
        Cells(r, 1) = Replace(Split(element.innertext, Chr(13))(3), Chr(10), "")
        Cells(r, 2) = Doc.getElementsByClassName("currency plus currency-module_currency_29IIm")(r - 1).innertext
    Next element

    IE.Quit
    Set IE = Nothing
    Set Doc = Nothing
    Set elements = Nothing
End Sub
 
Upvote 0
EDIT: I have now seen that there are two prices
Try this
VBA Code:
Const myURL As String = "https://www.takealot.com/all?sort=Relevance&custom=brother-printers"

Sub Scraping()
    Dim IE As Object
    Dim Doc As Object
    Dim elements As Object
    Dim element As Object
    Dim price As Object
    Dim r As Long

    Set IE = CreateObject("InternetExplorer.Application")
    
    With IE
        .navigate myURL
        .Visible = True
        Do While .Busy: DoEvents: Loop
        Do While .readyState <> 4: DoEvents: Loop
    End With

    Set Doc = IE.document
    
    Cells.Clear
    
    Application.Wait (Now + TimeValue("00:00:07"))
    
    Set elements = Doc.getElementsByClassName("product-card-module_title-wrapper_1sj9D")
    Set price = Doc.getElementsByClassName("currency plus currency-module_currency_29IIm")
    
    For Each element In elements
        r = r + 1
        Cells(r, 1) = Replace(Split(element.innertext, Chr(13))(3), Chr(10), "")
        Cells(r, 2) = price(r - 1).innertext
        Cells(r, 3) = price(r).innertext
    Next element

    IE.Quit
    Set IE = Nothing
    Set Doc = Nothing
    Set elements = Nothing
End Sub
 
Upvote 0
Hi MrGes !​
At the first run, no data uploaded but it well works with the second launch, maybe 'cause of your Wait codeline (often a trap) …​
As just well observing how the webpage works any Wait statement is necessary …​
Another point : the full product names can be loaded rather than the ones finishing by 3 dots.​
Anyway I will wait until any answer of Malanie …​
 
Upvote 0
Try this code
VBA Code:
Const myURL As String = "https://www.takealot.com/all?sort=Relevance&custom=brother-printers"

Sub Scraping()
    Dim IE As Object
    Dim Doc As Object
    Dim elements As Object
    Dim element As Object
    Dim price As Object
    Dim r As Long

    Set IE = CreateObject("InternetExplorer.Application")
    
    With IE
        .navigate myURL
        .Visible = True
        Do While .Busy: DoEvents: Loop
        Do While .readyState <> 4: DoEvents: Loop
    End With

    Set Doc = IE.document
    
    Cells.Clear
    
    Application.Wait (Now + TimeValue("00:00:8"))
    
    Set elements = Doc.getElementsByClassName("product-card-module_title-wrapper_1sj9D")
    Set price = Doc.getElementsByClassName("currency plus currency-module_currency_29IIm")
 
    For Each element In elements
        r = r + 1
        Cells(r, 1) = Replace(Split(element.innertext, Chr(13))(5), Chr(10), "")
        Cells(r, 2) = price(r - 1).innertext
        Cells(r, 3) = price(r).innertext
    Next element

    IE.Quit
    Set IE = Nothing
    Set Doc = Nothing
    Set elements = Nothing
End Sub
 
Upvote 0
As it can - must - be done without any Wait codeline just well observing how works the webpage …​
 
Upvote 0

Forum statistics

Threads
1,223,952
Messages
6,175,595
Members
452,657
Latest member
giadungthienduyen

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top