VBA code to extract a table

Abdmujib

Board Regular
Joined
May 15, 2022
Messages
123
Office Version
  1. 2021
Platform
  1. Windows
I have a website where I want to extract 3 tables but, the tables are not defined or structure in a way excel or google sheets could extract it.

The website is "GRC - BIL | Gran Canaria - Bilbao"


The tables consist of the result of matches that have been played.

First table is the result for last five matches for the home team. The second is for the Away team.
The third table is for the previous encounter between the two teams.

There is "see more" after each table, to show more games, I want to extract more upto 10 matches for each table. So it has to be able to get the matches in the see more where applicable. NB: The link, is just for the purpose of building the working template, I can put other similar link afterwards, and not all the link would have upto five matches or has "see more". Please put it in consideration.

So, I want the excel workbook to have two sheets, first sheet to be named "Data", I want all the 3 tables to be fetched into the "Data" sheet, not stacked on top each other. I.e not sharing the same colums. E.g, Table 1 might run from Column B - G, and Table 2 from Column I - N etc.
For the second sheet to be named "Result" and Cell B3 of this sheet should have the link for the VBA, so it will be easily to change the link for another match. If it's impossible, the link can be in the code.

Thank you so much.

I hope someone can help me out
Thanks in advance
 

Excel Facts

Ambidextrous Undo
Undo last command with Ctrl+Z or Alt+Backspace. If you use the Undo icon in the QAT, open the drop-down arrow to undo up to 100 steps.
@Abdmujib , I only work with VBA. I cannot extract data from Websites. Sorry.
I meant like using a macro code to extract it.
VBA Code:
Sub SeleniumSportradarManualB409()
'>>> Sep 26 2024
Dim TbColl As Object, tArr, myItm As Object
Dim I As Long, J As Long, myTim As Single
Dim WPage As Object
Dim myUrl As String
Dim TCColl As Object, PosPos As Long
'
'CONFIGURATION:
myUrl = "https://s5.sir.sportradar.com/sportradar/en/23/season/123235/fixtures"     '<<< The web page
''Sheets("Tables").Select                     '<<< The sheet to use    REMOVED, WILL WORK WITH THE ACTIVESHEET
Range("A:T").ClearContents                  '<<< Clear the sheet?
'
myTim = Timer
Set WPage = CreateObject("Selenium.WebDriver")
WPage.Start "Chrome", myUrl                         ' <11
''WPage.Start "edge", myUrl                          ' <22
WPage.Get "/"
'
On Error Resume Next
J = 0
J = Range("A:T").Find(What:="*", After:=Range("A1"), _
              SearchOrder:=xlByRows, _
              SearchDirection:=xlPrevious).Row
On Error GoTo 0
J = J + 2

Stop                                                                          '<<< SELECT THE TAB YOU WISH, THEN "CONTINUE" THE MACRO

Debug.Print vbCrLf & "Start", "J= " & J, WPage.Url, Format(Timer - myTim, "0.00")
Set TbColl = WPage.FindElementsByTag("table")
Debug.Print "Tables found: " & TbColl.Count, Format(Timer - myTim, "0.00")
For I = 1 To TbColl.Count
    Cells(J, 2).Value = "Table #" & I
    tArr = TbColl(I).AsTable.Data
    '>>>
    Dim iI As Long, jJ As Long
    For iI = 1 To UBound(tArr)
        For jJ = 1 To UBound(tArr, 2)
            If (Len(tArr(iI, jJ)) Mod 2 = 0) And (Len(tArr(iI, jJ)) > 3) Then
                PosPos = InStr(1, tArr(iI, jJ), ":", vbTextCompare)
                If PosPos > 0 And PosPos < 4 And InStr(3, tArr(iI, jJ), Left(tArr(iI, jJ), 3), vbTextCompare) > 0 Then
                    tArr(iI, jJ) = "'" & Left(tArr(iI, jJ), Len(tArr(iI, jJ)) / 2)
                End If
            End If
        Next jJ
    Next iI
    '<<<
    '>>> Block Sep 26 2024
    Set TCColl = TbColl(I).FindElementsByClass("col-xs-12")
    jJ = 2
    For iI = 1 To UBound(tArr)
        If jJ > TCColl.Count Then Exit For
        If Len(tArr(iI, 2)) > 3 Then
             PosPos = InStr(1, tArr(iI, 2), "-" & TCColl(jJ).Text, vbTextCompare)
             If PosPos > 0 Then
                tArr(iI, 2) = Replace(tArr(iI, 2), "-" & TCColl(jJ).Text, " # " & TCColl(jJ).Text, , , vbTextCompare)
                jJ = jJ + 2
             End If
        End If
    Next iI
'<<<
    Cells(J + 1, 1).Resize(UBound(tArr), UBound(tArr, 2)).Value = tArr
    Debug.Print "Table #" & I, UBound(tArr) & " * " & UBound(tArr, 2)
    J = J + UBound(tArr) + 3
Next I
'
Debug.Print "END", I, J, Format(Timer - myTim, "0.00")
'WPage.Quit
MsgBox ("Collected...")
WPage.Quit
End Sub
something similar to that, someone helped me to write it then, to extract tables from a website that does not bave a well defined structure table
 
Upvote 0
As I always say, these exercises take time and patience.
So I had to wait until I had excess time and patience....

Importing from that site could be done for example using the following macro:
Code:
Sub Fromflashscore()
'Selenium
Dim WPage As Object, myUrl As String, GetC
Dim CollA As Object, HtHColl As Object, HtHTr As Object, HtHTd As Object
Dim I As Long, NextA As Long, J As Long, K As Long
'
'Crea Driver:
Set WPage = CreateObject("Selenium.WebDriver")
myUrl = "https://www.flashscore.com/match/4Gqs8Wqn/#/h2h/overall"         'Your url
WPage.Start "chrome", myUrl     'Either A
'WPage.Start "edge", myUrl      '... or B
'
WPage.Get "/"
WPage.Wait 200
'
Set CollA = WPage.FindElementsById("onetrust-accept-btn-handler")
If CollA.Count > 0 Then             'Eventuale pulsante Accept
    CollA(1).Click
    WPage.Wait 500
End If
'
'Elements to be imported by class:
GetC = Array("h2h__date", "h2h__event", "h2h__homeParticipant", "h2h__awayParticipant", "h2h__result", "h2h__icon")
'
Set HtHColl = WPage.FindElementsByClass("h2h__section")
For I = 1 To HtHColl.Count
    NextA = Cells(Rows.Count, "A").End(xlUp).Row + 2
    Cells(NextA, 1) = HtHColl(I).FindElementsByTag("div")(1).Text
    Set HtHTr = HtHColl(I).FindElementsByClass("h2h__row")
    For J = 1 To HtHTr.Count
        NextA = NextA + 1
        For K = 0 To UBound(GetC)
            Cells(NextA, 1 + K).Value = Replace(HtHTr(J).FindElementsByClass(GetC(K))(1).Text, Chr(10), " - ", , , vbTextCompare)
        Next K
    Next J
Next I
WPage.Quit
Set WPage = Nothing
With Range("C:E")
    .EntireColumn.AutoFit
    .WrapText = False
End With
MsgBox ("Imported...")
End Sub
This is valid for H2H tab only and for the shown info only
 
Upvote 0
Solution
This version of code responds to some of your questions:
1)the data are accumulated into the sheet "data", split to 3 adjacent blocks
2)each table is expanded once for "Show more matches"
3)the old Sub Fromflashscore has been modified for 1 and 2 above and now is called Sub FromflashscoreParam; this need to be "Called" by a higher level Sub that passes it a Url
4)a sample Sub Caller is included, that calls one url only; it should be easy to add a loop that takes several URLs from another sheet and sequencially be passed to Sub FromflashscoreParam

The new code:
VBA Code:
Dim WPage As Object

Sub FromflashscoreParam(ByVal tUrl As String)
'Selenium
Dim myUrl As String, GetC, Last As Long
Dim CollA As Object, HtHColl As Object, HtHTr As Object, HtHTd As Object
Dim I As Long, NextA As Long, J As Long, K As Long
'
'Crea Driver:
If WPage Is Nothing Then
    Set WPage = CreateObject("Selenium.WebDriver")
End If
'myUrl = "https://www.flashscore.com/match/4Gqs8Wqn/#/h2h/overall"         'Your url
myUrl = tUrl
WPage.Start "chrome", myUrl     'Either A
'WPage.Start "edge", myUrl      '... or B
'
WPage.Get "/"
WPage.Wait 200
'
Set CollA = WPage.FindElementsById("onetrust-accept-btn-handler")
If CollA.Count > 0 Then             'Eventuale pulsante Accept
    CollA(1).Click
    WPage.Wait 500
End If
'
'Elements to be imported by class:
GetC = Array("h2h__date", "h2h__event", "h2h__homeParticipant", "h2h__awayParticipant", "h2h__result", "h2h__icon")
'
'Show more matches:
Set CollA = WPage.FindElementsByClass("showMore")
For I = 1 To 3
    If I > CollA.Count Then Exit For
    CollA(I).Click
    WPage.Wait 500
Next I
'
Last = LastR(Range("A:Z"))
Set HtHColl = WPage.FindElementsByClass("h2h__section")
For I = 1 To HtHColl.Count
    NextA = Last + 2
    Cells(NextA, 1 + (I - 1) * (UBound(GetC) + 2)) = HtHColl(I).FindElementsByTag("div")(1).Text
    Set HtHTr = HtHColl(I).FindElementsByClass("h2h__row")
    For J = 1 To HtHTr.Count
        NextA = NextA + 1
        For K = 0 To UBound(GetC)
            Cells(NextA, 1 + K + (I - 1) * (UBound(GetC) + 2)).Value = Replace(HtHTr(J).FindElementsByClass(GetC(K))(1).Text, Chr(10), " - ", , , vbTextCompare)
        Next K
    Next J
Next I
End Sub



Sub Caller()
Sheets("data").Select
'Loop for urls here >>>
Call FromflashscoreParam("https://www.flashscore.com/match/4Gqs8Wqn/#/h2h/overall")
'<<<<
'
'Ending:
WPage.Quit
Set WPage = Nothing
With Range("C:E, I:L, P:R")
    .EntireColumn.AutoFit
    .WrapText = False
End With
MsgBox ("Imported...")

End Sub




Function LastR(ByRef ckRan As Range) As Long

On Error Resume Next
LastR = ckRan.Find(What:="*", After:=ckRan.Cells(1, 1), _
              SearchOrder:=xlByRows, _
              SearchDirection:=xlPrevious).Row
On Error GoTo 0

End Function
Copy this into an empty standard module, so that the line Dim WPage As Object is on top of the module; then start Sub Caller
It should not be difficult to tune to your need...
 
Upvote 0

Forum statistics

Threads
1,224,809
Messages
6,181,076
Members
453,020
Latest member
mattg2448

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top