Extracting data from the Racing Post Website

white_lightning

New Member
Joined
Jun 7, 2011
Messages
22
Hi Guys

I am looking to extract basic info from the Racing Post Website. Stuff like horse form, opening prices etc

Just wondering if anyone could give me some pointers with either doing it my self (I have some programming knowledge but its basic) or getting someone else to program it for me.

Is this going to be ultra difficult to code myself?

Thanks
 
white lightning

Could you give some more specifics?

There are a fair few cards, race and horses.

Also there's a lot of statistics.

If you can just pick a particular meeting and race it would give me a start.:)

PS I know I could do that myself but it really would be better if you pointed me in the right direction.

Mike

I'm not sure what you mean by 'derived'.

The page for the horse is not 'derived' form javascript - it's not even opened using script.

All the javascript seems to do set the size of the popup window for the horse's page.
 
Upvote 0

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
@Norie - Thanks in advance

I have loads of systems/strategies that I would love to try out once i know how to extract some data. I have simplified one system just for example purposes.

1) I want to go thru the days cards and pick out only races that have 'handicap' in the race title.
2) From those selected races I would want to only select races that have between 3 and 13 runners only.
3) From those races I would want to select any horse that had won its last race.

So you would start by going to the days racing card -

http://www.racingpost.com/horses2/cards/home.sd

This should give you a list of all the races where the word handicap can be seen in the race title. For our example the first race today that would qualify would be 'Cork 3.05 Kerry Group Handicap 6f'

You would then click on the race name link to be taken to the following race card -

http://www.racingpost.com/horses2/cards/card.sd?race_id=534379&r_date=2011-06-12

On this screen, 'no of runners' is near the top and the horses form is in the table under 'DRAW' and horses that won there last race will have a number '1' at the far RIGHT of the group of numbers.

In this example, this race would not qualify as it exceeds the no of runners, 14 where the max is 13.

The next race that would qualify in our example would be:

CORK 5.15 Charleville Cheese Handicap and if you where to click on that race you would see that it has 11 runners which would make it a qualifying race and then you would look at the recent form for each horse in the race to see if any horse had a number 1 in the far right column of digits under DRAW which would indicate that it won its last race. In our example, Solent Ridge would be our Selection for this race. We would then move onto another race etc.

I hope the above info is easy enough for you to understand and i really do appreciate any help that you can offer.
 
Upvote 0
I've done something for the page with all the cards, ie the first link.

It simply gets all the meetings and their races.

I also did something for Bath that got all the races and runners.

That included the form, weight, etc for each horse.

Mind you the headings for that were slightly out, but that can easily be fixed.

So it's basically all the data unfiltered.


I know that isn't exactly what you want.

What I was thinking is that once you've got all the data you can do any filtering or searching in Excel.

Doing that would probably be easier than searching the pages.

I'll try and post some examples later of what I've been able to do so far.

Please don't laughter at it, I didn't have any time to clean the data up.:)
 
Upvote 0
@Norie

I cant thank you enough for the help, i certainly wont be laughing.

Im i right in thinking that it is better to scrap a load of data to excel and then do the filtering/calculations there rather than only scraping the data required?
 
Upvote 0
Well I don't quite know but to me it makes more sense to get all the data, perhaps massage it a bit to so it's in a workable format.

Then do any filtering/searching needed.

Obviously you might not get all the data, for example the criteria for which races you are interested in might not be too hard to use to at least filter the relevant
meetings.

eg if there a races you are interested in in Bath, Bangor and Ayr just get the race cards for those 3 meetings.

Anyway here's an example of a card from the first link, obviously it's today so not much to choose from.

<TABLE style="WIDTH: 429pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=571><COLGROUP><COL style="WIDTH: 52pt; mso-width-source: userset; mso-width-alt: 2523" width=69><COL style="WIDTH: 307pt; mso-width-source: userset; mso-width-alt: 14957" width=409><COL style="WIDTH: 29pt; mso-width-source: userset; mso-width-alt: 1389" width=38><COL style="WIDTH: 41pt; mso-width-source: userset; mso-width-alt: 2011" width=55><TBODY><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; WIDTH: 52pt; HEIGHT: 15pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 height=20 width=69>CORK (IRE)</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 307pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 width=409> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 29pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 width=38> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; WIDTH: 41pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 width=55> </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 height=20>http://www.racingpost.com/horses/course_home.sd?crs_id=596</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 height=20> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63 height=20>GOING: Round course - GOOD TO FIRM (Light watering to maintain); Sprint course - GOOD (Good to firm in places). </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63> </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>02:35</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Irish Stallion Farms European Breeders Fund Auction Maiden 6f </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>03:05</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Kerry Group Handicap 6f </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>03:35</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Galtee Midsummer Sprint Stakes (Listed Race) 5f </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>04:10</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Kerry Group Noblesse Stakes (Group 3) (Fillies & Mares) 1m4f </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>04:40</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Golden Olive Maiden 1m150y </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>05:15</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Charleville Cheese Handicap 1m150y </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR><TR style="HEIGHT: 15pt" height=20><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 15pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl64 height=20 align=right>05:45</TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Low Low Handicap 7f </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Card </TD><TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl63>Betting </TD></TR></TBODY></TABLE>

I've done some simple formatting but only manually.

Would something liket this be of any use, perhaps just as a starting point?:)

PS I know there might be some extra unneeded data but to exclude it and ony return the exact data you want would be pretty complicated.
 
Upvote 0
@Norie

yeah that looks ok. It will be a starting point and something for me to work with. If you could send me the code i could have a look and get back to you with any questions. il PM you my email address.

thanks
 
Upvote 0
This is th every messy code I used to get the cards for all the meetings on the main page.

I also try code for getting the cards from each meeting's page.

It's quite similar to the first code but getting to the pages themselves is a little more complicated.

I've not come up with anything for that yet, but there are a couple of things I'm thinking of.

Anyway, here's the code.

Run in it in an empty workbook, and please remember it's far from perfect.

Oh and it adds a worksheet for each card.

Tried to name the worksheets with the meeting name but ran into some trouble with one of them at the weekend so left it.:)

Main Code
Rich (BB code):
Option Explicit
 
Sub RaceMeetingCard()
Dim IE As Object
Dim doc As Object
Dim divRaces
Dim divsCol As Object
Dim divCard As Object
Dim elmt As Object
Dim lnk As Object
Dim strURL As String
Dim ws As Worksheet
Dim rng As Range

    strURL = "http://www.racingpost.com/horses2/cards/home.sd"
 
    ' Bath 11 June 2011
    'strURL = "http://www.racingpost.com/horses2/cards/meeting_of_cards.sd?crs_id=5&r_date=2011-06-11"
 
    Set IE = CreateObject("InternetExplorer.Application")

    With IE
        .navigate strURL

        Do While .Busy: DoEvents: Loop
        Do While .ReadyState <> 4: DoEvents: Loop

        '.Visible = True - this is optional
 
        Set doc = IE.Document
 
        Set divRaces = doc.getElementById("races_result")
 
        Set divsCol = divRaces.getelementsbytagname("DIV")
 
        For Each divCard In divsCol
 
            If divCard.CLASSNAME = "crBlock" Then   
 '
                Set ws = Worksheets.Add(After:=Worksheets(Worksheets.Count))
 
                Set rng = ws.Range("A1")
 
                For Each elmt In divCard.all

                    Select Case elmt.tagname

                        Case "TABLE"
                            Select Case elmt.CLASSNAME

                                Case "raceHead"
                                    rng.Value = elmt.innerText
                                    'ws.Name = rng.Value
                                    Set lnk = elmt.getelementsbytagname("A")(0)
                                    rng.Offset(2).Value = lnk
                                    Set rng = rng.Offset(6)
 
                                Case "cardsGrid"
                                    GetTableData elmt, rng
                                    Set rng = rng.Offset(elmt.Rows.Length + 1)
 
                            End Select
 
                        Case "P"

                            Select Case elmt.CLASSNAME

                                Case "border"
                                    rng.Value = elmt.innerText
                                    Set rng = rng.Offset(1)

                                Case "bull show"
                                    Set lnk = elmt.getelementsbytagname("A")(0)
                                    ws.Range("A5").Value = lnk
 
                            End Select

                    End Select
 
                Next elmt
            End If
 
            ws.Cells.WrapText = False
            ws.Range("B1:D1").EntireColumn.AutoFit

        Next divCard
 
        IE.Quit
 
        Set IE = Nothing
 
    End With
 
    Application.Goto Worksheets(1).Range("A1"), scroll
 
End Sub

Sub to get data from a table.
Rich (BB code):
Sub GetTableData(ByRef tbl, rng As Range)
Dim cl As Object
Dim rw As Object
Dim I As Long

    For Each rw In tbl.Rows
 
        For Each cl In rw.Cells

            rng.Value = cl.outerText

            Set rng = rng.Offset(, 1)

        Next cl

        Set rng = Cells(rng.Row + 1, 1)

    Next rw

End Sub

I've run that a couple of times for today and it works for me.

The only card it doesn't seem to get is the 'Televised Races' one.
 
Upvote 0
@Norie

Thanks for that, il give it a go. It will give me something to go on and perhaps when i have learned more vba with this training dvd i have it will make perfet sense.

cheers.
 
Upvote 0
Hi, First off, thanks for your kindness in spending time to help. I tried this today but got errors: Sub or Function not defined. The debugger points to this line: GetTableData elmt, rng
Pls check.
Regards.

Run in it in an empty workbook, and please remember it's far from perfect.
 
Upvote 0
Could you please check you copied all the code?:)
 
Upvote 0

Forum statistics

Threads
1,224,560
Messages
6,179,520
Members
452,923
Latest member
JackiG

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top