VBA Data Scrape Macro Using Frequency Analysis

htmagic

New Member
Joined
Dec 2, 2009
Messages
40
Office Version
  1. 2019
Platform
  1. Windows
Hello,

Although I have lurked on this forum for some time now, this is my first technical post to the group asking for programming advice.
I have some knowledge of VBA, learning it as I go, and some of the advanced statistical functions of Excel.
Some of the posts and replies of this group have gotten me this far as a result.

I have written a program to analyze the winning Powerball Lottery numbers. The first five groups of numbers use 1-59 whereas the Powerball number is 1-35. I am using the statistical function frequency to analyze the numbers and provide me the number of times a certain number has been "pulled" for the winning number.

My question is, I am doing a data scrape using a VBA macro (called Draw) of a website. I paste this into a worksheet called WebScrape (Sheet 1) which contains this raw data.
I would like to add a button to CHECK RESULTS which updates the data from the webpage. The data results come in as one string and then I use a macro (called Parse) to parse the data into a series of two digit numbers.

VBA Code looks like this:

HTML:
Sub Draw()
'
' Draw Macro
' Scrape data from website
'

'
    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;http://www.tnlottery.com/winningnumbers/default.aspx#pwrball", _
        Destination:=Range("$A$1"))
        .Name = "default.aspx#pwrball_1"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = True
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = """dgPowerBallWinners"""
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
End Sub
Sub Parse()
'
' Parse Macro
' Parse data
'
' Keyboard Shortcut: Ctrl+t
'
    Application.Goto Reference:="R2C3"
    Range(Selection, Selection.End(xlDown)).Select
    Selection.TextToColumns Destination:=ActiveCell.Offset(0, 3).Range("A1"), _
        DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter _
        :=True, Tab:=False, Semicolon:=False, Comma:=False, Space:=True, _
        Other:=True, OtherChar:="-", FieldInfo:=Array(Array(1, 1), Array(2, 1), Array _
        (3, 1), Array(4, 1), Array(5, 1), Array(6, 1)), TrailingMinusNumbers:=True
End Sub

OK, now the WebScrape page looks like this:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Draw Date[/TD]
[TD]
Draw Result
[/TD]
[TD]Details[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]PB[/TD]
[/TR]
[TR]
[TD]7/11/2012[/TD]
[TD]05-22-36-49-55 23[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]3[/TD]
[TD]5[/TD]
[TD]29[/TD]
[TD]39[/TD]
[TD]59[/TD]
[TD]29[/TD]
[/TR]
[TR]
[TD]7/7/2012[/TD]
[TD]03-05-29-39-59 29[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]14[/TD]
[TD]19[/TD]
[TD]35[/TD]
[TD]39[/TD]
[TD]56[/TD]
[TD]33[/TD]
[/TR]
[TR]
[TD]7/4/2012[/TD]
[TD]14-19-35-39-56 33[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]7[/TD]
[TD]15[/TD]
[TD]20[/TD]
[TD]41[/TD]
[TD]44[/TD]
[TD]22[/TD]
[/TR]
[TR]
[TD]6/30/2012[/TD]
[TD]07-15-20-41-44 22[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]6[/TD]
[TD]34[/TD]
[TD]40[/TD]
[TD]46[/TD]
[TD]58[/TD]
[TD]6[/TD]
[/TR]
[TR]
[TD]6/27/2012[/TD]
[TD]06-34-40-46-58 06[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]1[/TD]
[TD]3[/TD]
[TD]41[/TD]
[TD]44[/TD]
[TD]53[/TD]
[TD]30[/TD]
[/TR]
[TR]
[TD]6/23/2012[/TD]
[TD]01-03-41-44-53 30[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]11[/TD]
[TD]17[/TD]
[TD]29[/TD]
[TD]56[/TD]
[TD]57[/TD]
[TD]14[/TD]
[/TR]
[TR]
[TD]6/20/2012[/TD]
[TD]11-17-29-56-57 14[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]8[/TD]
[TD]14[/TD]
[TD]15[/TD]
[TD]16[/TD]
[TD]27[/TD]
[TD]26[/TD]
[/TR]
</tbody>[/TABLE]

The numbers to the right of the details are the parsed numbers. Can I do this automatically when I data scrape the webpage?

Then I have a worksheet (called Data) with the date and the winning numbers, with the Powerball (PB) number at the end. A third worksheet (called Calculations) analyzes the data table using the FREQUENCY function. On this page, I analyze the winning numbers (1-59) with a BIN (1-59) and then the Frequency (FREQ) using the following formula:

=FREQUENCY(Data!$C$2:$G$43,$B$2:$B$60)

I do a similar PB BIN (1-35) and PB FREQ for the Poweball (PB) numbers. The array works well and produces values. Then I have a column for the SORT BIN and SORT FREQ and SORT PB BIN and SORT PB FREQ. I copy and then paste the array values then sort according to FREQ, highest to lowest values.

My question here is can I append to this array, making it grow as I hit the button CHECK RESULTS, pulling the latest lottery numbers from the webpage and transferring them into the
Data worksheet? I would like to add to the array and let it grow larger as it adds more dates and numbers to the Data worksheet. Then the array would automatically update as more data is added.

I am sorry for the long post and complicated description but I cannot add my Excel file here as HTML Makeer apparently is no longer available. I also cannot post attachments.
Thank you in advance for your help on this problem.

MagicBill
 

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
here is some code that will take the data from the "WebScrape" Worksheet and place it in the "Data" worksheet... You don't have to ruitn the "Parse" SubRoutine, this code will handle it for you.

Code:
Sub Process()
    Dim WsWeb As Worksheet
    Dim WsData As Worksheet
    
    Dim WsWebRowNo As Long
    Dim WsDataRowNo As Long
    
    Dim Draw As String
    Dim N As Variant
    Dim I As Integer
    
    Set WsWeb = ThisWorkbook.Worksheets("WebScrape")
    Set WsData = ThisWorkbook.Worksheets("Data")
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row + 1
    
    For WsWebRowNo = 2 To WsWeb.Cells(WsWeb.Rows.Count, "A").End(xlUp).Row
        WsData.Cells(WsDataRowNo, "A") = WsWeb.Cells(WsWebRowNo, "B")
        
        Draw = Replace(WsWeb.Cells(WsWebRowNo, "C"), " ", "-")
        N = Split(Draw, "-")
        For I = 0 To UBound(N)
            WsData.Cells(WsDataRowNo, 2 + I) = N(I)
        Next I
        
        WsDataRowNo = WsDataRowNo + 1
    Next
End Sub

If you can provide some screen shots of the "Data" and "Calculation worksheets, I am sure that more automation can be achieved
 
Last edited:
Upvote 0
I made a couple of changes to the code... There is a SUB called "Process". If you execute the "Process" macro, it will extract the data from the web site and put it on the "WebScrape" tab. It will then transpose the data to the "Data" tab as you describe.

It will create two Named Ranges. One called "DrawRng" for the Draw Numbers and another called "PbRng" for the PowerBal range. These range names should be used in the Frequency function

Here is the code

Code:
Option Explicit
Sub Process()
    Dim WsWeb As Worksheet
    Dim WsData As Worksheet
    Dim qt As QueryTable
    
    Dim WsWebRowNo As Long
    Dim WsDataRowNo As Long
    
    Dim Draw As String
    Dim N As Variant
    Dim I As Integer
    
    Dim MaxDateData As Date
    
    Set WsWeb = ThisWorkbook.Worksheets("WebScrape")
    Set WsData = ThisWorkbook.Worksheets("Data")
    
    'Extract the informationn from the website
    Call DeleteQT
    WsWeb.Cells.Clear
    Call DrawNumbers
    Set qt = WsWeb.QueryTables(1)
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row + 1
    MaxDateData = Application.WorksheetFunction.Max(WsData.Columns("A:A"))
    For WsWebRowNo = 2 To WsWeb.Cells(WsWeb.Rows.Count, "A").End(xlUp).Row
        If WsWeb.Cells(WsWebRowNo, "B") > MaxDateData Then
            WsData.Cells(WsDataRowNo, "A") = WsWeb.Cells(WsWebRowNo, "B")
            
            Draw = Replace(WsWeb.Cells(WsWebRowNo, "C"), " ", "-")
            N = Split(Draw, "-")
            For I = 0 To UBound(N)
                WsData.Cells(WsDataRowNo, 2 + I) = N(I)
            Next I
            
            WsDataRowNo = WsDataRowNo + 1
        End If
    Next
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row
    
    ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C2:R" & WsDataRowNo & "C6"
    ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C7:R" & WsDataRowNo & "C7"
    
End Sub
Function DrawNumbers()
    Dim Ws As Worksheet
    
    Set Ws = ThisWorkbook.Worksheets("WebScrape")
    With Ws.QueryTables.Add(Connection:= _
        "URL;http://www.tnlottery.com/winningnumbers/default.aspx#pwrball", Destination:=Ws.Range("$A$1"))
        .Name = "pwrball"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "dgPowerBallWinners"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
End Function
Function DeleteQT()
    Dim qt As QueryTable
    For Each qt In ThisWorkbook.Worksheets("WebScrape").QueryTables
        qt.Delete
    Next
End Function


Here is the example for the Fequency function
Code:
=FREQUENCY(DrawRng,A2:A60)
 
Upvote 0
Thank you. But I need the dates transferred as well. Right now, the data from WebScrape pastes over the date column in cell B2 in the Data sheet. When I extract the data from the website, it picked up the following in Column A which I hid.
In WebScrape, we have:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Game Name
[/TD]
[TD]Draw Date
[/TD]
[TD][TABLE="width: 115"]
<colgroup><col width="115"></colgroup><tbody>[TR]
[TD="width: 115"]Draw Result
[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[TD]Details
[/TD]
[/TR]
[TR]
[TD]Powerball
[/TD]
[TD]7/14/2012
[/TD]
[TD]04-16-32-37-46 13
[/TD]
[TD]details
[/TD]
[/TR]
[TR]
[TD]Powerball
[/TD]
[TD]7/11/2012
[/TD]
[TD]05-22-36-49-55 23
[/TD]
[TD]details
[/TD]
[/TR]
</tbody>[/TABLE]


And in the Data worksheet, the dates start in Column B. So we need to start pasting dates into column B vs. A. So would I change this line to B?

MaxDateData = Application.WorksheetFunction.Max(WsData.Columns("A:A"))

And please provide advice on naming the tables as you did, both for the draw results and the powerball number.

Thank you,

MagicBill

</pre>
 
Upvote 0
The Code provided already names the tables... You just need to use the table names in your "frequency" function call. Once you have the Frequency call set up per the example I provided, you should not need to change it again because the code I provided updates the Range of the "named" table and the Frequency Call references the "Named" Table (or range)

As to the other issue... The Date is in Column "A" on the Data Tab. It may should up as a number (41104). You just need to reformat it a date. I was going by the example you provided and was not aware that there was a hidden column. I will revise the code and resubmit.
 
Last edited:
Upvote 0
The following code will place the information (i.e. PowerBall and Date) in the columns specified ("A" and "B") on the "Data" Tab

Code:
Option Explicit
Sub Process()
    Dim WsWeb As Worksheet
    Dim WsData As Worksheet
    Dim qt As QueryTable
    
    Dim WsWebRowNo As Long
    Dim WsDataRowNo As Long
    
    Dim Draw As String
    Dim N As Variant
    Dim I As Integer
    
    Dim MaxDateData As Date
    
    Set WsWeb = ThisWorkbook.Worksheets("WebScrape")
    Set WsData = ThisWorkbook.Worksheets("Data")
    
    'Extract the informationn from the website
    Call DeleteQT
    WsWeb.Cells.Clear
    Call DrawNumbers
    Set qt = WsWeb.QueryTables(1)
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row + 1
    MaxDateData = Application.WorksheetFunction.Max(WsData.Columns("B:B"))
    For WsWebRowNo = 2 To WsWeb.Cells(WsWeb.Rows.Count, "A").End(xlUp).Row
        If WsWeb.Cells(WsWebRowNo, "B") > MaxDateData Then
            WsData.Cells(WsDataRowNo, "A") = WsWeb.Cells(WsWebRowNo, "A")
            WsData.Cells(WsDataRowNo, "B") = WsWeb.Cells(WsWebRowNo, "B")
            
            Draw = Replace(WsWeb.Cells(WsWebRowNo, "C"), " ", "-")
            N = Split(Draw, "-")
            For I = 0 To UBound(N)
                WsData.Cells(WsDataRowNo, 3 + I) = N(I)
            Next I
            
            WsDataRowNo = WsDataRowNo + 1
        End If
    Next
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row
    
    ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C2:R" & WsDataRowNo & "C6"
    ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C7:R" & WsDataRowNo & "C7"
    
End Sub
Function DrawNumbers()
    Dim Ws As Worksheet
    
    Set Ws = ThisWorkbook.Worksheets("WebScrape")
    With Ws.QueryTables.Add(Connection:= _
        "URL;http://www.tnlottery.com/winningnumbers/default.aspx#pwrball", Destination:=Ws.Range("$A$1"))
        .Name = "pwrball"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "dgPowerBallWinners"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
End Function
Function DeleteQT()
    Dim qt As QueryTable
    For Each qt In ThisWorkbook.Worksheets("WebScrape").QueryTables
        qt.Delete
    Next
End Function
 
Upvote 0
Thank you. But it still appears that the draw results and powerball numbers from the WEbScrape worksheet are shifted when pasted to the Data page. I'm trying to understand the coding but am not smart enough to know what is going on yet.

So this line of code,

WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row

ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C2:R" & WsDataRowNo & "C6"
ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C7:R" & WsDataRowNo & "C7"


DrawRng refers to cell B2 of Worksheet Data? The dates start in cell B2 whereas the (parsed) data starts in C2. The first PB number starts in cell H2.

So I still think this is off by a column. Should we have counted from "B" instead of "A"? or just adjust the R2C2 to R2C3 and R2C7 to R2C8?

Thank you for your assistance and your patience.
It is getting closer. And I'm learning more subtleties in VBA.

MagicBill
 
Upvote 0
You are correct. The Range Names are off by one.

Here is what you need.

Code:
 ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C3:R" & WsDataRowNo & "C7"
    ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C8:R" & WsDataRowNo & "C8"
 
Last edited:
Upvote 0
You are correct. The Range Names are off by one.

Here is what you need.

Code:
 ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C3:R" & WsDataRowNo & "C7"
    ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C8:R" & WsDataRowNo & "C8"

Thank you, Mr. Downey! Now I showed this code to a more advanced VBA programmer and he was impressed with your split solution.
I'm still having trouble with the DrawRng name.

I failed to mention when I analyze the frequency, I also start in Column B for the BIN (in the Ws"Calculations"). A is a blank column.
When I use the following code:

Code:
=FREQUENCY(DrawRng,B2:B60)

I get different values. Is there another part of the code I should modify?
I so wish I could attach the file here but I can't and the HTML Maker doesn't seem to work for me.
Thanks again for your patience. And I finally think I'm understanding this portion of the code:

Code:
[COLOR=#0000ff]For WsWebRowNo = 2 To WsWeb.Cells(WsWeb.Rows.Count, "A").End(xlUp).Row
        If WsWeb.Cells(WsWebRowNo, "B") > MaxDateData Then
            WsData.Cells(WsDataRowNo, "A") = WsWeb.Cells(WsWebRowNo, "A")
            WsData.Cells(WsDataRowNo, "B") = WsWeb.Cells(WsWebRowNo, "B")
            
            Draw = Replace(WsWeb.Cells(WsWebRowNo, "C"), " ", "-")
            N = Split(Draw, "-")
            For I = 0 To UBound(N)
                WsData.Cells(WsDataRowNo, 3 + I) = N(I)
            Next I[/COLOR]

This is the paste of the parsed data. When you say
Code:
[COLOR=#0000ff]WsData.Cells(WsDataRowNo, 3 + I) = N(I)[/COLOR]
, it is really adding columns from the parsed data, right?
Clever. Thank you for this elegant code. :)

MagicBill
 
Upvote 0

Forum statistics

Threads
1,223,888
Messages
6,175,217
Members
452,619
Latest member
Shiv1198

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top