carolinetoynbee
New Member | Joined: Feb 22, 2022 | Messages: 1 | Office Version: 2013 | Platform: Windows
Hi All
I am using XMLHTTP to scrape a website that holds information about football matches into an Excel sheet. Most of the data scrapes easily, but I cannot access the data held within a button.
The page I'm scraping is Latest Football Fixtures & Betting Odds | Soccer Base
Here's my VBA to scrape the table cells. Each row of values has an id of the form "tgc123456", where "123456" is a six-digit value unique to that football match.
VBA Code:
Dim td As Object, tr As Object, th As Object, HTMLElement As Object, r As Long, c As Long
Set hTable = html.querySelector(".soccerGrid") 'soccerGrid is the class name of the table
For Each tr In hTable.getElementsByTagName("tr")
    r = r + 1: c = 1
    'header cells first, then data cells, written left to right
    For Each th In tr.getElementsByTagName("th")
        ws.Cells(r, c) = th.innerText
        c = c + 1
    Next th
    For Each td In tr.getElementsByTagName("td")
        ws.Cells(r, c) = td.innerText
        c = c + 1
    Next td
Next tr
All pretty straightforward and it works fine.
At the bottom of the table HTML is the following. Each cell has the same id as the row but suffixed with "h", "d" or "a":
<td id="tgc836466h">
<button type="button" class="btn btnBet btnLight" data-price-decimal="6" data-price-fraction="5" title="6 with bet365 - Click to bet">6</button>
</td>
<td id="tgc836466d">
<button type="button" class="btn btnBet btnLight" data-price-decimal="4" data-price-fraction="3" title="4 with bet365 - Click to bet">4</button>
</td>
<td id="tgc836466a">
<button type="button" class="btn btnBet btnLight" data-price-decimal="1.55" data-price-fraction="11/20" title="1.55 with bet365 - Click to bet">1.55</button>
</td>
I've added the following to my code to pull in these three values:
VBA Code:
'pull in data for "tgc123456h", "tgc123456d" and "tgc123456a" from buttons
tagName = tr.ID
If tagName <> "" Then
    ws.Cells(r, c + 1) = html.getElementById(tagName & "h").innerText
    ws.Cells(r, c + 2) = html.getElementById(tagName & "d").innerText
    ws.Cells(r, c + 3) = html.getElementById(tagName & "a").innerText
End If
It doesn't pull in any data; it just leaves the Excel cells blank. I've tried looking at the attributes of the tag, without success. Does anyone know how I can grab the data held in the button? It's the inner text I'm looking for, but I could equally grab the value of the "data-price-decimal" attribute if that is easier.
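For reference, here is a minimal sketch of the kind of thing I am trying to do: read the button inside the cell directly rather than the cell itself. It is only a sketch under assumptions, not a confirmed fix. ButtonPrice, SampleDoc and DemoButtonPrice are hypothetical names that are not in my workbook, and the demo parses a cut-down copy of the HTML fragment above with a late-bound "htmlfile" object instead of the real XMLHTTP response.

```vba
Option Explicit

' Hypothetical helper (not part of the original code): returns the decimal price
' from the first <button> inside the cell with the given id, or "" if the cell,
' the button, or the value is missing.
Function ButtonPrice(ByVal doc As Object, ByVal cellId As String) As String
    Dim cell As Object, buttons As Object, price As Variant

    Set cell = doc.getElementById(cellId)
    If cell Is Nothing Then Exit Function            ' id not present in the parsed HTML

    Set buttons = cell.getElementsByTagName("button")
    If buttons.Length = 0 Then Exit Function         ' cell found, but no button inside it

    price = buttons.Item(0).getAttribute("data-price-decimal")
    If IsNull(price) Then price = ""
    If Len(price) = 0 Then price = buttons.Item(0).innerText   ' fall back to the visible text
    ButtonPrice = Trim$(CStr(price))
End Function

' Builds a small late-bound document from a cut-down copy of the fragment above.
Function SampleDoc() As Object
    Dim doc As Object
    Set doc = CreateObject("htmlfile")
    doc.Open
    doc.write "<table><tr id=""tgc836466"">" & _
        "<td id=""tgc836466h""><button data-price-decimal=""6"">6</button></td>" & _
        "<td id=""tgc836466d""><button data-price-decimal=""4"">4</button></td>" & _
        "<td id=""tgc836466a""><button data-price-decimal=""1.55"">1.55</button></td>" & _
        "</tr></table>"
    doc.Close
    Set SampleDoc = doc
End Function

' Prints the three prices for the sample row to the Immediate window.
Sub DemoButtonPrice()
    Dim doc As Object, suffix As Variant
    Set doc = SampleDoc()
    For Each suffix In Array("h", "d", "a")
        Debug.Print suffix, ButtonPrice(doc, "tgc836466" & suffix)
    Next suffix
End Sub
```

In the scraping loop this would be called as something like ws.Cells(r, c + 1) = ButtonPrice(html, tr.ID & "h"). If it returns an empty string for the real page, it may be worth checking whether the cell ids appear in the XMLHTTP response text at all (for example with InStr), since anything a page adds by script after loading would not be in that response.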
Thanks in advance for any help.