VBA Data Scrape Macro Using Frequency Analysis

htmagic

New Member
Joined
Dec 2, 2009
Messages
40
Office Version
  1. 2019
Platform
  1. Windows
Hello,

Although I have lurked on this forum for some time now, this is my first technical post to the group asking for programming advice.
I have some knowledge of VBA, learning it as I go, and some of the advanced statistical functions of Excel.
Some of the posts and replies of this group have gotten me this far as a result.

I have written a program to analyze the winning Powerball Lottery numbers. The first five groups of numbers use 1-59 whereas the Powerball number is 1-35. I am using the statistical function frequency to analyze the numbers and provide me the number of times a certain number has been "pulled" for the winning number.

My question is, I am doing a data scrape using a VBA macro (called Draw) of a website. I paste this into a worksheet called WebScrape (Sheet 1) which contains this raw data.
I would like to add a button to CHECK RESULTS which updates the data from the webpage. The data results come in as one string and then I use a macro (called Parse) to parse the data into a series of two digit numbers.

VBA Code looks like this:

HTML:
Sub Draw()
'
' Draw Macro
' Scrape data from website
'

'
    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;http://www.tnlottery.com/winningnumbers/default.aspx#pwrball", _
        Destination:=Range("$A$1"))
        .Name = "default.aspx#pwrball_1"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = True
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = """dgPowerBallWinners"""
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
End Sub
Sub Parse()
'
' Parse Macro
' Parse data
'
' Keyboard Shortcut: Ctrl+t
'
    Application.Goto Reference:="R2C3"
    Range(Selection, Selection.End(xlDown)).Select
    Selection.TextToColumns Destination:=ActiveCell.Offset(0, 3).Range("A1"), _
        DataType:=xlDelimited, TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter _
        :=True, Tab:=False, Semicolon:=False, Comma:=False, Space:=True, _
        Other:=True, OtherChar:="-", FieldInfo:=Array(Array(1, 1), Array(2, 1), Array _
        (3, 1), Array(4, 1), Array(5, 1), Array(6, 1)), TrailingMinusNumbers:=True
End Sub

OK, now the WebScrape page looks like this:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Draw Date[/TD]
[TD]
Draw Result
[/TD]
[TD]Details[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]PB[/TD]
[/TR]
[TR]
[TD]7/11/2012[/TD]
[TD]05-22-36-49-55 23[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]3[/TD]
[TD]5[/TD]
[TD]29[/TD]
[TD]39[/TD]
[TD]59[/TD]
[TD]29[/TD]
[/TR]
[TR]
[TD]7/7/2012[/TD]
[TD]03-05-29-39-59 29[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]14[/TD]
[TD]19[/TD]
[TD]35[/TD]
[TD]39[/TD]
[TD]56[/TD]
[TD]33[/TD]
[/TR]
[TR]
[TD]7/4/2012[/TD]
[TD]14-19-35-39-56 33[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]7[/TD]
[TD]15[/TD]
[TD]20[/TD]
[TD]41[/TD]
[TD]44[/TD]
[TD]22[/TD]
[/TR]
[TR]
[TD]6/30/2012[/TD]
[TD]07-15-20-41-44 22[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]6[/TD]
[TD]34[/TD]
[TD]40[/TD]
[TD]46[/TD]
[TD]58[/TD]
[TD]6[/TD]
[/TR]
[TR]
[TD]6/27/2012[/TD]
[TD]06-34-40-46-58 06[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]1[/TD]
[TD]3[/TD]
[TD]41[/TD]
[TD]44[/TD]
[TD]53[/TD]
[TD]30[/TD]
[/TR]
[TR]
[TD]6/23/2012[/TD]
[TD]01-03-41-44-53 30[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]11[/TD]
[TD]17[/TD]
[TD]29[/TD]
[TD]56[/TD]
[TD]57[/TD]
[TD]14[/TD]
[/TR]
[TR]
[TD]6/20/2012[/TD]
[TD]11-17-29-56-57 14[/TD]
[TD]details[/TD]
[TD][/TD]
[TD]8[/TD]
[TD]14[/TD]
[TD]15[/TD]
[TD]16[/TD]
[TD]27[/TD]
[TD]26[/TD]
[/TR]
</tbody>[/TABLE]

The numbers to the right of the details are the parsed numbers. Can I do this automatically when I data scrape the webpage?

Then I have a worksheet (called Data) with the date and the winning numbers, with the Powerball (PB) number at the end. A third worksheet (called Calculations) analyzes the data table using the FREQUENCY function. On this page, I analyze the winning numbers (1-59) with a BIN (1-59) and then the Frequency (FREQ) using the following formula:

=FREQUENCY(Data!$C$2:$G$43,$B$2:$B$60)

I do a similar PB BIN (1-35) and PB FREQ for the Poweball (PB) numbers. The array works well and produces values. Then I have a column for the SORT BIN and SORT FREQ and SORT PB BIN and SORT PB FREQ. I copy and then paste the array values then sort according to FREQ, highest to lowest values.

My question here is can I append to this array, making it grow as I hit the button CHECK RESULTS, pulling the latest lottery numbers from the webpage and transferring them into the
Data worksheet? I would like to add to the array and let it grow larger as it adds more dates and numbers to the Data worksheet. Then the array would automatically update as more data is added.

I am sorry for the long post and complicated description but I cannot add my Excel file here as HTML Makeer apparently is no longer available. I also cannot post attachments.
Thank you in advance for your help on this problem.

MagicBill
 
Ok... The issue is that Column "A" in the "DATA" tab does not have information filled in to coinside with the dates. Fill the "PowerBall" work down to row 43 and you should get what you expect.


When the Maco is executed and it finds new data from the Web, it will transpose that information over to the "DATA" tab and Place the Word PowerBall in Column "A"
 
Upvote 0

Excel Facts

Can a formula spear through sheets?
Use =SUM(January:December!E7) to sum E7 on all of the sheets from January through December
It works! Thank you! I had to read and reread your instructions and once I copied Powerball all the way down, then it works.

Thank you.

MagicBill
 
Upvote 0
Yeah... The code assumes that Column "A" is populate. It determines the last "Used Row" based on column "A"


Here is the code statement that determine that last row in the "DATA" tab.

WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row

The code then uses this information to set the Two Ranges. Where is the relative code


ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C3:R" & WsDataRowNo & "C7"
ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C8:R" & WsDataRowNo & "C8"

Seeing that just A1 was filled by "Powerball", Named Ranges where not what we wanted.

In the future, as you run the Macro, It will automatically fill in Column "A" when it fills in the Date and other numbers!
 
Upvote 0
Now for the Sort macro.
I used the record macro feature and came up with this macro.

Code:
[COLOR=#0000ff]Sub Sort()
'
' Sort Macro
'

'
    Range("B2").Select
    Range(Selection, Selection.End(xlDown)).Select
    Range("B2:C60").Select
    Selection.Copy
    ActiveWindow.SmallScroll Down:=-24
    Range("E2").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=False
    Range("B2").Select
    ActiveWindow.SmallScroll Down:=3
    Range("B2:B36").Select
    Application.CutCopyMode = False
    Selection.Copy
    ActiveWindow.SmallScroll Down:=-18
    Range("G2").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=False
    Range("D2").Select
    Range(Selection, Selection.End(xlDown)).Select
    Application.CutCopyMode = False
    Selection.Copy
    Range("H2").Select
    Selection.PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks _
        :=False, Transpose:=False
    Range("E1").Select
    Range(Selection, Selection.End(xlDown)).Select
    Range("E1:F60").Select
    Application.CutCopyMode = False
    ActiveWorkbook.Worksheets("Calculations").Sort.SortFields.Clear
    ActiveWorkbook.Worksheets("Calculations").Sort.SortFields.Add Key:=Range( _
        "F2:F60"), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:= _
        xlSortNormal
    With ActiveWorkbook.Worksheets("Calculations").Sort
        .SetRange Range("E1:F60")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
    Range("G1").Select
    Range(Selection, Selection.End(xlDown)).Select
    Range("G1:H36").Select
    ActiveWorkbook.Worksheets("Calculations").Sort.SortFields.Clear
    ActiveWorkbook.Worksheets("Calculations").Sort.SortFields.Add Key:=Range( _
        "H2:H36"), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:= _
        xlSortNormal
    With ActiveWorkbook.Worksheets("Calculations").Sort
        .SetRange Range("G1:H36")
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
    Range("D2").Select
End Sub[/COLOR]

It seems to work but it isn't elegant. Any smoother ways to do this? Also, this code pasted itself into Module2.
Can I pin this into Module 1 before the Function runs? It didn't work there.
Do I need another button to run the Sort subroutine or can I make this entire macro automatic?

Thanks,

MagicBill
 
Upvote 0
Try this:

Code:
Sub Sort()
    With Worksheets("Calculations")
        .Range("E2:F60").Value = Range("B2:C60").Value
        .Range("G2:G36").Value = Range("B2:B36").Value

        With .Range("D2", .Range("D2").End(xlDown))
            .Range("H2").Resize(.Rows.Count).Value = .Value
        End With

        .Range("E1:F60").Sort Key1:=.Range("F1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
        .Range("G1:H36").Sort Key1:=.Range("H1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
    End With
End Sub
 
Upvote 0
Try this:

Code:
[COLOR=#0000ff]Sub Sort()
    With Worksheets("Calculations")
        .Range("E2:F60").Value = Range("B2:C60").Value
        .Range("G2:G36").Value = Range("B2:B36").Value

        With .Range("D2", .Range("D2").End(xlDown))
            .Range("H2").Resize(.Rows.Count).Value = .Value
        End With

        .Range("E1:F60").Sort Key1:=.Range("F1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
        .Range("G1:H36").Sort Key1:=.Range("H1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
    End With
End Sub[/COLOR]
Dear shg,

Thank you! I tried it and it only did part of it. You missed the PB FREQ. So I saw what you did and modified the code to this:
Code:
[COLOR=#0000ff]Sub Sort()
    With Worksheets("Calculations")
        .Range("E2:F60").Value = Range("B2:C60").Value
        .Range("G2:G36").Value = Range("B2:B36").Value
       [/COLOR][COLOR=#008080] .Range("H2:H36").Value = Range("D2:D36").Value
[/COLOR][COLOR=#0000FF]

        With .Range("D2", .Range("D2").End(xlDown))
            .Range("H2").Resize(.Rows.Count).Value = .Value
        End With

        .Range("E1:F60").Sort Key1:=.Range("F1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
        .Range("G1:H36").Sort Key1:=.Range("H1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
    End With
End Sub[/COLOR]
With the extra line of code in green, it works great! Thank you!

MagicBill
 
Upvote 0
Dear shg,

Thank you! I tried it and it only did part of it. You missed the PB FREQ. So I saw what you did and modified the code to this:
Code:
[COLOR=#0000ff]Sub Sort()
    With Worksheets("Calculations")
        .Range("E2:F60").Value = Range("B2:C60").Value
        .Range("G2:G36").Value = Range("B2:B36").Value
       [/COLOR][COLOR=#008080] .Range("H2:H36").Value = Range("D2:D36").Value
[/COLOR][COLOR=#0000FF]

        With .Range("D2", .Range("D2").End(xlDown))
            .Range("H2").Resize(.Rows.Count).Value = .Value
        End With

        .Range("E1:F60").Sort Key1:=.Range("F1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
        .Range("G1:H36").Sort Key1:=.Range("H1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
    End With
End Sub[/COLOR]
With the extra line of code in green, it works great! Thank you!

MagicBill

Now how do I automate this so it runs in the Process module? Do I set the Sort code as a function and call this function right after the line of code Call UpdateCalcWS(WsCalc)?

Thanks in advance.

MagicBill
 
Upvote 0
Here is the wole thing together.... (Thanks for the code SHG!)

Code:
Option Explicit
Type typeRec
    Bin As Integer
    Freq As Integer
End Type
Sub Process()
    Dim WsWeb As Worksheet
    Dim WsData As Worksheet
    Dim WsCalc As Worksheet
    Dim qt As QueryTable
    
    Dim WsWebRowNo As Long
    Dim WsDataRowNo As Long
    Dim Draw As String
    Dim N As Variant
    Dim I As Integer
    
    Dim MaxDateData As Date
    
    Set WsWeb = ThisWorkbook.Worksheets("WebScrape")
    Set WsData = ThisWorkbook.Worksheets("Data")
    Set WsCalc = ThisWorkbook.Worksheets("Calculations")
    
    'Extract the informationn from the website
    Call DeleteQT
    WsWeb.Cells.Clear
    Call DrawNumbers
    Set qt = WsWeb.QueryTables(1)
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row + 1
    MaxDateData = Application.WorksheetFunction.Max(WsData.Columns("B:B"))
    For WsWebRowNo = 2 To WsWeb.Cells(WsWeb.Rows.Count, "A").End(xlUp).Row
        If WsWeb.Cells(WsWebRowNo, "B") > MaxDateData Then
            WsData.Cells(WsDataRowNo, "A") = WsWeb.Cells(WsWebRowNo, "A")
            WsData.Cells(WsDataRowNo, "B") = WsWeb.Cells(WsWebRowNo, "B")
            
            Draw = Replace(WsWeb.Cells(WsWebRowNo, "C"), " ", "-")
            N = Split(Draw, "-")
            For I = 0 To UBound(N)
                WsData.Cells(WsDataRowNo, 3 + I) = N(I)
            Next I
            
            WsDataRowNo = WsDataRowNo + 1
        End If
    Next
    
    WsDataRowNo = WsData.Cells(WsData.Rows.Count, "A").End(xlUp).Row
    
    ThisWorkbook.Names.Add Name:="DrawRng", RefersToR1C1:="=Data!R2C3:R" & WsDataRowNo & "C7"
    ThisWorkbook.Names.Add Name:="PbRng", RefersToR1C1:="=Data!R2C8:R" & WsDataRowNo & "C8"
    
    
    Call UpdateCalcWS(WsCalc)
End Sub
Function UpdateCalcWS(WsCalc As Worksheet)
    'Add Formula Arrays to Caculations WorkSheet
    WsCalc.Range("C2:C60").FormulaArray = "=FREQUENCY(DrawRng,RC[-1]:R[58]C[-1])"
    WsCalc.Range("D2:D36").FormulaArray = "=FREQUENCY(pbRng,RC[-2]:R[34]C[-2])"
    
    'Copy the Draw Freqs to a new range
    WsCalc.Range("E2:F60").Value = Range("B2:C60").Value
    WsCalc.Range("G2:G36").Value = Range("B2:B36").Value
    WsCalc.Range("H2:H36").Value = Range("D2:D36").Value
    WsCalc.Range("E1:F60").Sort Key1:=WsCalc.Range("F1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
    WsCalc.Range("G1:H36").Sort Key1:=WsCalc.Range("H1"), Order1:=xlDescending, _
                              Header:=xlYes, _
                              MatchCase:=False, _
                              Orientation:=xlTopToBottom
End Function
Function SortIt(WsCalc As Worksheet, strRng As String, SortCol As String)
    Application.CutCopyMode = False
    
    WsCalc.Sort.SortFields.Clear
    WsCalc.Sort.SortFields.Add Key:=WsCalc.Columns(SortCol), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
    
    With WsCalc.Sort
        .SetRange Range(strRng)
        .Header = xlYes
        .MatchCase = False
        .Orientation = xlTopToBottom
        .SortMethod = xlPinYin
        .Apply
    End With
End Function
Function DrawNumbers()
    Dim WS As Worksheet
    
    Set WS = ThisWorkbook.Worksheets("WebScrape")
    With WS.QueryTables.Add(Connection:= _
        "URL;http://www.tnlottery.com/winningnumbers/default.aspx#pwrball", Destination:=WS.Range("$A$1"))
        .Name = "pwrball"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "dgPowerBallWinners"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
End Function
Function DeleteQT()
    Dim qt As QueryTable
    For Each qt In ThisWorkbook.Worksheets("WebScrape").QueryTables
        qt.Delete
    Next
End Function
 
Upvote 0

Forum statistics

Threads
1,223,898
Messages
6,175,274
Members
452,628
Latest member
dd2

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top