I've been having problems running a crude VBA web scraper that loops through a relatively large number of pages to consolidate the data I'm looking for. Essentially, it works for a set number of loops and then just freezes - I get the spinning cursor, but nothing happens. The number varies with each scraper/website I've tried it on, but it's generally between 10 and 30 loops. I've had to build a workbook save into the procedure, because the loop never runs through to completion - I have to Alt+F4 Excel each time, reopen, reset the start number on the loop, and continue.
The code I'm using can be seen below:
Code:
Sub PullStats() ' the Sub line was missing from the post - this name is just a placeholder
Application.ScreenUpdating = False
Application.DisplayAlerts = False
' Declarations moved out of the loop - Dim is procedure-level in VBA anyway
Dim n As Long
Dim runrow As Long
Dim linkrow As Long
Dim URLstats As String
Dim site As String
Dim name As String
Dim tryrow As Long
Dim lastrow As Long
Dim advrow As Long
Dim pasterow As Long
Dim Links As Worksheet
Dim Cleaner As Worksheet
Dim Advanced As Worksheet
Set Cleaner = Sheets("Cleaner")
Set Links = Sheets("Links")
Set Advanced = Sheets("Advanced")
runrow = Range("a2").End(xlDown).Row ' last used row in column A of the active sheet
For n = 2 To runrow
On Error Resume Next ' masks every error inside the loop, including a failed Find below
linkrow = Links.Range("a2").End(xlDown).Row ' computed each pass but never used
site = Links.Range("b" & n).Value ' URL to query this pass
name = Links.Range("c" & n).Value ' label for the imported block
With Cleaner.QueryTables.Add(Connection:=site, _
Destination:=Cleaner.Range("$A$1"))
.name = name
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = False
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False ' synchronous refresh - the macro blocks here until the page comes back
End With
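' Note: each pass through the loop adds a fresh QueryTable to the Cleaner sheet at this point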
' Copy the cells I want from the import over to the Advanced sheet
pasterow = Advanced.Range("b1").End(xlDown).Row ' last filled row in column B of Advanced
Cleaner.Range("a1").Copy
Advanced.Range("b" & pasterow + 1).PasteSpecial
Cleaner.Range("a3").Copy
Advanced.Range("c" & pasterow + 1).PasteSpecial
Cleaner.Range("a4").Copy
Advanced.Range("d" & pasterow + 1).PasteSpecial
With Cleaner.Range("a:a")
' If "Season Totals" isn't on the page, Find returns Nothing and .Row raises an error,
' which On Error Resume Next swallows - tryrow then keeps its value from the last pass
tryrow = .Find(What:="Season Totals", After:=.Cells(.Rows.Count), LookIn:=xlValues, LookAt:=xlWhole, _
SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False).Row
End With
Cleaner.Range("a" & tryrow - 1).Copy
Advanced.Range("e" & pasterow + 1).PasteSpecial
Cleaner.Range("b" & tryrow - 1).Copy
Advanced.Range("f" & pasterow + 1).PasteSpecial
Advanced.Range("a" & pasterow + 1).Value = name
Cleaner.Columns("A:DZ").Delete Shift:=xlToLeft ' clear the import sheet for the next pass
ActiveWorkbook.Save ' save each pass so the data survives when Excel freezes
Next n
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub
I basically import the data from the web page/query onto one sheet (Cleaner), paste the pieces I want into another sheet (Advanced), delete any columns on the import sheet that could still hold data from the query, and move on to the next pass. Each pass picks up the proper name and URL from a list on a third sheet (Links). I've done some research and have tried setting BackgroundQuery to False, putting in a wait time of anywhere from 1 to 10 seconds before each loop runs, etc., but my attempts have all been frustrated.
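For reference, the wait I tried between loops looked roughly like this (a minimal sketch - the variable name is just for illustration, and the 5 seconds shown is one value from the 1-10 second range I experimented with):
Code:
' Placed at the start of each pass through the For loop:
Dim waitUntil As Date ' illustrative name
waitUntil = Now + TimeSerial(0, 0, 5) ' pause 5 seconds; I tried 1 through 10
Do While Now < waitUntil
DoEvents ' yield so Excel can keep processing messages while waiting
Loop
Using DoEvents in the wait (rather than Application.Wait) at least keeps Excel responsive during the pause, though it hasn't stopped the freeze.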
Is this (possibly) just me getting locked out by the site(s) I'm querying - meaning there's nothing I can really do from my end - or is there something I can improve in my code?
Thanks for any help you can give me.