Hi,
I wanted to extract some information from a web page. I had a few questions about it.
My query is similar to this.
I read one of the answers and could understand the code up to the querying part:
Code:
Sub QueryWeb()
    Dim i As Integer
    Dim firstRow As Integer
    Dim lastRow As Integer
    Dim nextRow As Integer
    Dim URLstart As String
    Dim URLend As String
    Dim shStats As Worksheet
    Dim shQuery As Worksheet
    Dim rgQuery As Range
    Dim found As Range
    Dim TimeOutWebQuery
    Dim TimeOutTime
    Dim objIE As Object

    Application.ScreenUpdating = False

    ' The page number gets spliced in between these two halves of the URL
    URLstart = "http://stats.espncricinfo.com/ci/engine/stats/index.html?class=2;filter=advanced;orderby=start;page="
    URLend = ";size=200;spanmax1=12+Jul+2012;spanmin1=13+Jul+2009;spanval1=span;template=results;type=batting;view=innings;wrappertype=print"

    ' Delete any old Stats sheet and create a fresh one
    Application.DisplayAlerts = False
    On Error Resume Next
    Sheets("Stats").Delete
    On Error GoTo 0
    Application.DisplayAlerts = True
    Sheets.Add after:=Sheets(Sheets.Count)
    ActiveSheet.Name = "Stats"
    Set shStats = Sheets("Stats")

    ' One results page per pass of the loop
    For i = 1 To 47
        Sheets.Add after:=Sheets(Sheets.Count)
        Set shQuery = ActiveSheet

        ' Open the page in a hidden Internet Explorer instance
        Set objIE = CreateObject("InternetExplorer.Application")
        With objIE
            .Visible = False
            .Navigate CStr(URLstart & i & URLend)
        End With

        ' Wait for the page to finish loading, giving up after 10 seconds
        TimeOutWebQuery = 10
        TimeOutTime = DateAdd("s", TimeOutWebQuery, Now)
        Do Until objIE.ReadyState = 4
            DoEvents
            If Now > TimeOutTime Then
                objIE.stop
                GoTo ErrorTimeOut
            End If
        Loop
However, I couldn't understand the actual scraping part, so here are my questions (I have put a rough sketch of what I am imagining after the list):
1) Do I write code to scrape data from specific parts of the website? Can I specify the HTML tags so the macro searches for those every time it goes through a page?
2) The site I want to import data from probably has a lot of blank pages too, with the same layout as every other page (i.e. the same tags but no data). How do I make the code skip those pages?
3) The website requires the user to log in to access the data. Can I somehow log in once manually and then let the macro browse and import all the similar pages?
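To make the questions a bit more concrete, here is a rough, untested sketch of what I have in mind, based on the code above. The URLs are placeholders, the page count is just copied from the quoted code, and I am only guessing that the data sits in the first <table> on each page and that the "Stats" sheet already exists:

Code:
Sub ScrapeSketch()
    Dim objIE As Object
    Dim htmlDoc As Object
    Dim tbl As Object
    Dim rw As Object
    Dim cel As Object
    Dim i As Integer
    Dim c As Integer
    Dim nextRow As Long
    Dim shStats As Worksheet

    Set shStats = Sheets("Stats")   ' assumes the Stats sheet from the code above exists
    nextRow = 1

    ' Question 3: open one IE window, log in by hand, then let the macro
    ' carry on reusing that same (logged-in) browser session
    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Visible = True
    objIE.Navigate "http://example.com/login"                 ' placeholder URL
    MsgBox "Log in in the IE window, then click OK to start importing."

    For i = 1 To 47
        objIE.Navigate "http://example.com/stats?page=" & i   ' placeholder URL
        Do While objIE.Busy Or objIE.ReadyState <> 4
            DoEvents
        Loop

        Set htmlDoc = objIE.Document

        ' Question 1: pick elements out by their HTML tag (here the first <table>)
        If htmlDoc.getElementsByTagName("table").Length = 0 Then GoTo NextPage
        Set tbl = htmlDoc.getElementsByTagName("table")(0)

        ' Question 2: skip "blank" pages that have the layout but no data rows
        If tbl.getElementsByTagName("tr").Length <= 1 Then GoTo NextPage

        ' Copy each cell's text into the Stats sheet (this also copies header rows)
        For Each rw In tbl.getElementsByTagName("tr")
            c = 1
            For Each cel In rw.getElementsByTagName("td")
                shStats.Cells(nextRow, c).Value = cel.innerText
                c = c + 1
            Next cel
            nextRow = nextRow + 1
        Next rw
NextPage:
    Next i

    objIE.Quit
End Sub

Is that roughly the right approach, or is there a better way to detect the blank pages and reuse the login?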
I am pretty new to VBA, so I am hoping to get some help.
Thanks!