Excel 2016 VBA - Scrape data from web HTML

alpha2007

New Member
Joined
Jun 20, 2021
Messages
24
Office Version
  1. 2016
Platform
  1. Windows
Hi guys,
I have some VBA coding experience, but I am still at an intermediate level.

I face the following problem:

The HTML code of the data I want to scrape cannot be accessed using getElementById, getElementsByName, getElementsByClassName or getElementsByTagName.
Does anyone know of another way to scrape the needed data?

The respective HTML code is:


HTML Code:
<div class="row-fluid b-margin financials clearfix">
    <div class="span12">
        <div class="row-fluid lgFinancials">
            <div class="span6 specs">
                <p class="price asking help  odd">
                    <span class="title">Asking Price:</span>
                    <b>$99,900
                        <span class="help" title="The total asking price of the business for sale."><i class="fa fa-question-circle"></i></span></b>
                </p>
            </div><div class="span6 specs">
                <p class="price help  ">
                    <span class="title">Cash Flow:</span>
                    <b>$131,138
                        <span class="help" title="Arrived at by &quot;starting with your net (before tax) profit. Then, add back in any payments made to the owner, interest and any depreciation of assets.&quot; For example, if the net profit before taxes was $100,000 and the owner was paid $70,000 then the cash flow is $170,000."><i class="fa fa-question-circle"></i></span></b>
                </p>
            </div>
        </div>
        <div class="row-fluid">
            
            <div class="span6 specs">
                <p class="help  odd">
                    <span class="title">Gross Revenue:</span>
                    <b>$244,969
                        <span class="help" title="All income the business received before any cost-of-sales or expenses have been deducted."><i class="fa fa-question-circle"></i></span></b>
                </p><p class="notDisclosed help  ">
                    <span class="title">EBITDA:</span>
                    <b>N/A
                        <span class="help" title="Earnings Before Interest, Taxes, Depreciation and Amortization"><i class="fa fa-question-circle"></i></span></b>
                </p><p class="help  odd">
                    <span class="title">FF&amp;E:</span>
                    <b>$16,000
                        <span class="help" title="Furniture, fixtures and equipment that will remain with the business, such as desks, office cubicles, decor elements of a restaurant or showroom, computers and office machines, pots and pans, dishes, display cases, manufacturing equipment, etc., depending on the type of business."><i class="fa fa-question-circle"></i></span></b>
                </p>
            </div>

It would be great if I could get a solution here on the forum.

Thanks!
Tony
 

Hi Tony, welcome to the forum.

If you specify which information you would like to retrieve, someone here should be able to provide you with a solution.

Cheers!
 
Thanks Domenic!

I would like to retrieve the amount of the Asking Price (in this example, $99,900):

<span class="title">Asking Price:</span>
<b>$99,900


Thanks Marc L!

Could you please give an example of how to use querySelector in this case?

Thanks!
 
Try...

VBA Code:
htmlDoc.querySelector("p.price.asking > b").innerText

...where htmlDoc is a variable assigned the HTML document. The CSS selector string p.price.asking > b translates as the first <b> element where the parent is a <p> element having the classes price and asking.

Hope this helps!
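
One caveat worth a sketch: querySelector returns Nothing when no element matches, so calling .innerText on the result directly raises a run-time error on pages that lack the element. The helper below is only an illustration; the name SelectorText is hypothetical, and it assumes the document object passed in exposes querySelector (as an IE document does in a modern document mode).

```vba
' Hypothetical helper (not from the thread): returns the innerText of the
' first element matching cssSelector, with leading/trailing spaces removed,
' or an empty string when nothing matches.
Function SelectorText(ByVal doc As Object, ByVal cssSelector As String) As String
    Dim el As Object

    Set el = doc.querySelector(cssSelector)

    ' querySelector yields Nothing when there is no match
    If Not el Is Nothing Then
        SelectorText = Trim$(el.innerText)
    End If
End Function
```

It could then be called as askingPrice = SelectorText(htmlDoc, "p.price.asking > b").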
 
Hello, I was too fast!

I have added more data to be scraped from that HTML code, and I received the same result three times (columns D, E and F), although the values should be different.

The fourth extraction (column G) produced no result at all, although it should return "N/A".

The HTML is the same as above.

Here is my VBA code

VBA Code:
Sub webElementsWithoutID() 's()
    
    Dim IE As Object
    Dim doc As HTMLDocument


    ThisWorkbook.Sheets("data").Range("AZ1").Value = "=CountA(A:A)"
    intRows = ThisWorkbook.Sheets("data").Range("AZ1").Value


    Set IE = CreateObject("InternetExplorer.application")
    IE.Visible = True


        For rowNo = 1 To intRows
        
        strUrl = ThisWorkbook.Sheets("data").Range("A" & rowNo).Text
        IE.navigate strUrl
        
            Do While IE.Busy Or IE.ReadyState <> 4
                Application.Wait DateAdd("s", 1, Now)
            Loop
        
            Set doc = IE.document
            
        
            strTitle = doc.getElementsByClassName("bfsTitle")(0).innerText
            ThisWorkbook.Sheets("data").Range("B" & rowNo).Value = strTitle
        
            strSubTitle = doc.getElementsByClassName("span8")(0).innerText
            ThisWorkbook.Sheets("data").Range("C" & rowNo).Value = strSubTitle
        
            askingprice = doc.querySelector("p.price.asking.help > b").innerText
            ThisWorkbook.Sheets("data").Range("D" & rowNo).Value = askingprice
            
          
            cashflow = doc.querySelector("p.price > b").innerText
            ThisWorkbook.Sheets("data").Range("E" & rowNo).Value = cashflow
        
            grossrevenue = doc.querySelector("p.help > b").innerText
            ThisWorkbook.Sheets("data").Range("F" & rowNo).Value = grossrevenue
        
            ebitda = doc.querySelector("p.notDisclosed > b").innerText
            ThisWorkbook.Sheets("data").Range("G" & rowNo).Value = ebitda1
        
        
        Next
 
    IE.Quit
    Set IE = Nothing
    MsgBox "done"
    


End Sub


Please help!

Thank you
 
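How about:

doc.querySelector("p.price.asking > b").innerText
doc.querySelector("p.price:not(.asking) > b").innerText
doc.querySelector("p.help.odd:not(.price) > b").innerText
doc.querySelector("p.notDisclosed.help > b").innerText

querySelector returns only the first match, and the Asking Price paragraph also carries the classes price, help and odd, so a plain p.price, p.help or p.help.odd selector lands on it again. The :not() parts exclude it; this assumes the page renders in a document mode that supports :not() (IE9 or later).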
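
As a rough sketch of how four such selectors could feed columns D to G, the routine below loops over an array instead of using one variable per figure. That also sidesteps the slip in the posted code, where ebitda is assigned but ebitda1 is written to column G (Option Explicit would have flagged it). The name WriteFinancials and its parameters are hypothetical, the :not() selectors are my assumption for skipping the Asking Price paragraph, and p.notDisclosed presumably marks any undisclosed figure, so this is only reliable for pages shaped like the sample HTML.

```vba
' Sketch only: writes the four figures for one page to columns D:G of rowNo.
' doc is the loaded HTML document, ws the target worksheet (both assumed).
Sub WriteFinancials(ByVal doc As Object, ByVal ws As Object, ByVal rowNo As Long)
    Dim selectors As Variant
    Dim el As Object
    Dim k As Long
    Dim col As Long

    ' Order matches columns D (asking price), E (cash flow),
    ' F (gross revenue) and G (EBITDA)
    selectors = Array( _
        "p.price.asking > b", _
        "p.price:not(.asking) > b", _
        "p.help.odd:not(.price) > b", _
        "p.notDisclosed > b")

    For k = LBound(selectors) To UBound(selectors)
        col = 4 + k - LBound(selectors)          ' 4 = column D
        Set el = doc.querySelector(CStr(selectors(k)))

        If el Is Nothing Then
            ws.Cells(rowNo, col).Value = vbNullString   ' no match on this page
        Else
            ws.Cells(rowNo, col).Value = Trim$(el.innerText)
        End If
    Next k
End Sub
```

Inside the posted loop it would be called as WriteFinancials doc, ThisWorkbook.Sheets("data"), rowNo.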
 
The following code loops through each element that has the class name "title", checks whether its paragraph starts with "Asking Price", "Cash Flow", etc., and assigns the value to the corresponding variable. (Note that I tried this same solution with querySelectorAll, but Excel kept crashing, so I went with getElementsByClassName instead.)

VBA Code:
    Dim ele As Object
    Dim txt As String
    Dim askingPrice As String
    Dim cashFlow As String
    Dim grossRevenue As String
    Dim ebitda As String
    Dim ffe As String
    
    For Each ele In doc.getElementsByClassName("title")
        txt = ele.parentElement.innerText
        If Left(txt, 12) = "Asking Price" Then
            askingPrice = Trim(Mid(txt, InStrRev(txt, ":") + 1))
        ElseIf Left(txt, 9) = "Cash Flow" Then
            cashFlow = Trim(Mid(txt, InStrRev(txt, ":") + 1))
        ElseIf Left(txt, 13) = "Gross Revenue" Then
            grossRevenue = Trim(Mid(txt, InStrRev(txt, ":") + 1))
        ElseIf Left(txt, 6) = "EBITDA" Then
            ebitda = Trim(Mid(txt, InStrRev(txt, ":") + 1))
        ElseIf Left(txt, 4) = "FF&E" Then
            ffe = Trim(Mid(txt, InStrRev(txt, ":") + 1))
        End If
    Next ele

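However, if you want to omit "ffe", you can use the following code instead. It stops one short of the last "title" element, so it relies on FF&E being the last entry in the list...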

VBA Code:
    Dim txt As String
    Dim askingPrice As String
    Dim cashFlow As String
    Dim grossRevenue As String
    Dim ebitda As String
    Dim i As Long
    
    With doc.getElementsByClassName("title")
        i = 0
        While i < .Length - 1
            txt = .Item(i).parentElement.innerText
            If Left(txt, 12) = "Asking Price" Then
                askingPrice = Trim(Mid(txt, InStrRev(txt, ":") + 1))
            ElseIf Left(txt, 9) = "Cash Flow" Then
                cashFlow = Trim(Mid(txt, InStrRev(txt, ":") + 1))
            ElseIf Left(txt, 13) = "Gross Revenue" Then
                grossRevenue = Trim(Mid(txt, InStrRev(txt, ":") + 1))
            ElseIf Left(txt, 6) = "EBITDA" Then
                ebitda = Trim(Mid(txt, InStrRev(txt, ":") + 1))
            End If
            i = i + 1
        Wend
    End With
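
If more labels turn up later, the If/ElseIf chain could be replaced by a lookup keyed on the label text. The following is a minimal sketch under a few assumptions: the function name ReadFinancials is hypothetical, every paragraph has the "Label: value" shape of the sample HTML, and a late-bound Scripting.Dictionary is available (Windows only).

```vba
' Sketch only: collects every "Label: value" pair found next to an element
' with class "title" into a Dictionary keyed by the label text.
Function ReadFinancials(ByVal doc As Object) As Object
    Dim result As Object
    Dim titles As Object
    Dim txt As String
    Dim pos As Long
    Dim i As Long

    Set result = CreateObject("Scripting.Dictionary")
    Set titles = doc.getElementsByClassName("title")

    ' Indexed loop, as in the second block above
    For i = 0 To titles.Length - 1
        txt = titles.Item(i).parentElement.innerText
        pos = InStr(txt, ":")
        If pos > 0 Then
            ' Label is the text before the first colon, value the text after it
            result(Trim$(Left$(txt, pos - 1))) = Trim$(Mid$(txt, pos + 1))
        End If
    Next i

    Set ReadFinancials = result
End Function
```

A caller could then check figures.Exists("Asking Price") on the returned object before writing figures("Asking Price") to the sheet, which also covers pages where a figure is missing.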
 
