Need help to write this VBA code better

Sharid

Well-known Member
Joined
Apr 22, 2007
Messages
1,066
Office Version
  1. 2016
Platform
  1. Windows
I need help in writing this much better. Currently it is two codes I have put together to make one. It runs the first half first and then the second half after the first is done. I know this can be written much better, but it is out of my depth.

What it does,
Process 1
1) Search urls in sheet1 row2 down
2) uses Regxp to get emails and phone numbers,
Process 2
Searches for social media links

In short the code runs once looking for email and phone number and the a second time through the same urls to look for social media links. Therefore it will go through 100 urls 200 times as it runs twice.

It should get all the data once, and also be able to skip and dead urls,

Please could someone take a look and save my life. I had a better code, but could not get that to work either so went back to scratch and I am trying a more basic approach Stackoverflow

VBA Code:
Private Sub CommandButton1_Click()
    Dim IE As Object: Set IE = CreateObject("InternetExplorer.Application")
    Dim Rxp As Object: Set Rxp = CreateObject("VBScript.RegExp")
    Dim emailMatch As Object, phoneMatch  As Object, S$, cel As Range
    Dim Html As htmlDocument

    For Each cel In Sheets("Sheet1").Range("A2:A" & Sheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).row)
        With IE
            .Visible = False
            .navigate cel
            While .Busy Or .readyState <> 4: DoEvents: Wend
            Set Html = .document
        End With
        
        With Rxp
            .Pattern = "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
            Set emailMatch = .Execute(Html.body.innerHTML)
            .Pattern = "(?:\+1)?(?:\+[0-9])?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})"
            Set phoneMatch = .Execute(Html.body.innerHTML)
        End With
        
        If emailMatch.Count >= 1 Then
            cel(1, 2) = emailMatch(0)
        Else:
            cel(1, 2) = "Not Found"
        End If
        
        If phoneMatch.Count >= 1 Then
            cel(1, 3) = phoneMatch(0)
        Else:
            cel(1, 3) = "Not Found"
        End If
    Next cel
    
'##################################################################################################
'######################################### SECOND PROCESSS ########################################
'##################################################################################################

Dim counter As Long
Dim website As Range
Dim row As Long
Dim continue As Boolean
Dim respHead As String

''''The row where website addresses start
row = 2
    continue = True

Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    Do While continue
'''Could set this to first cell with URL then OFFSET columns to get next web site
Set website = ThisWorkbook.Worksheets("Sheet1").Range("A" & row)
        If Len(website.Value) < 1 Then
            continue = False
        Exit Sub
        End If

        If website Is Nothing Then
            continue = False
        End If

'''Debug.Print website
    With http
        On Error Resume Next
        .Open "GET", website.Value, False
        .send

'''If the website sent a valid response to our request, URLS ARE IN COLUMN A
    If Err.Number = 0 Then
        If .Status = 200 Then
            Html.body.innerHTML = http.responseText
                Set links = Html.getElementsByTagName("a")
'''COLUMN D = TWITTER
   For Each link In links
      If InStr(UCase(link.outerHTML), "FACEBOOK") Then
      website.Offset(0, 3).Value = link.href
   End If
'''COLUMN E = TWITTER
    If InStr(UCase(link.outerHTML), "TWITTER") Then
        website.Offset(0, 4).Value = link.href
    End If

Next
    End If
    Set website = Nothing
Else
'''Debug.Print "Error loading page IN COLUMN H"
    website.Offset(0, 8).Value = "Error with website address"
    End If
On Error GoTo 0
 End With
row = row + 1
Loop

Complete.Show '#### THIS FORM DOES NOT SHOW AT THE END ####
''' CLOSE BROWSER
IE.Quit
Set IE = Nothing
Set ElementCol = Nothing

End Sub
 

Excel Facts

Do you hate GETPIVOTDATA?
Prevent GETPIVOTDATA. Select inside a PivotTable. In the Analyze tab of the ribbon, open the dropown next to Options and turn it off
Hi

I am still stuck on this, If anyone can help then that would be super
 
Upvote 0
I have changed it to this which seems to work a bit faster and does not get stuck on URLS that are dead.

VBA Code:
Private Sub CommandButton1_Click()
Dim wb As Workbook
Dim wsSheet As Worksheet, links As Variant, link As Variant
Dim rw As Long
Dim regxp As New RegExp, post As Object, phone_list As Object, email_list As Object
Dim Html As New HTMLDocument

''''SHEET1 as sheet with URL
    Set wb = ThisWorkbook
    Set wsSheet = wb.Sheets("Sheet1")
   
    rw = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row
    links = wsSheet.Range("A2:A" & rw)

For Each link In links
    'Set doc = NewHTMLDocument(CStr(link))
      Set Html = NewHTMLDocument(CStr(link))
With regxp
''' ########## Phone Numbers Pattern ###########
        .Pattern = "(?:\+1)?(?:\+[0-9])?\(?([0-9]{4})\)?[-. ]?([0-9]{4})[-. ]?([0-9]{3}?)" '"(?:\+1)?(?:\+[0-9])?\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{3}?)"
        .Global = False
        .IgnoreCase = True
        Set phone_list = .Execute(Html.body.innerHtml)
''' ########## Email Pattern ###########
        .Pattern = "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)"
        .Global = False
       .IgnoreCase = True
        Set email_list = .Execute(Html.body.innerHtml)
    

'''########## PHONE LIST ############# ADD TO SHEET
On Error Resume Next
    If phone_list(0) Is Nothing Then
        On Error Resume Next
            Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = "-"
    Else
        On Error Resume Next
            Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = phone_list(0)
    End If
'''########## EMAIL LIST ############# ADD TO SHEET
On Error Resume Next
      If email_list(0) Is Nothing Then
        On Error Resume Next
            Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "C").End(xlUp).Row + 1, "C").Value = "-"
    Else
        On Error Resume Next
            Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "C").End(xlUp).Row + 1, "C").Value = email_list(0)
    End If
End With
''''navigate links
     Next link

End Sub

Public Function NewHTMLDocument(strURL As String) As Object
    Dim objHTTP As Object, objHTML As Object, strTemp As String
    Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
    objHTTP.Open "GET", strURL, False
    objHTTP.send
    If objHTTP.Status = 200 Then
        strTemp = objHTTP.responseText
    Set objHTML = CreateObject("htmlfile")
    objHTML.body.innerHtml = strTemp
    Set NewHTMLDocument = objHTML
Else
   'There has been an error
End If
End Function

I still need to include PARTS of this code in to it, but can not work it out

VBA Code:
'##################################################################################################
'######################################### SECOND PROCESSS ########################################
'##################################################################################################

Dim counter As Long
Dim website As Range
Dim row As Long
Dim continue As Boolean
Dim respHead As String

''''The row where website addresses start
row = 2
    continue = True

Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    Do While continue
'''Could set this to first cell with URL then OFFSET columns to get next web site
Set website = ThisWorkbook.Worksheets("Sheet1").Range("A" & row)
        If Len(website.Value) < 1 Then
            continue = False
        Exit Sub
        End If

        If website Is Nothing Then
            continue = False
        End If

'''Debug.Print website
    With http
        On Error Resume Next
        .Open "GET", website.Value, False
        .send

'''If the website sent a valid response to our request, URLS ARE IN COLUMN A
    If Err.Number = 0 Then
        If .Status = 200 Then
            Html.body.innerHTML = http.responseText
                Set links = Html.getElementsByTagName("a")
'''COLUMN D = TWITTER
   For Each link In links
      If InStr(UCase(link.outerHTML), "FACEBOOK") Then
      website.Offset(0, 3).Value = link.href
   End If
'''COLUMN E = TWITTER
    If InStr(UCase(link.outerHTML), "TWITTER") Then
        website.Offset(0, 4).Value = link.href
    End If

Next
    End If
    Set website = Nothing
Else
'''Debug.Print "Error loading page IN COLUMN H"
    website.Offset(0, 8).Value = "Error with website address"
    End If
On Error GoTo 0
 End With
row = row + 1
Loop

Complete.Show '#### THIS FORM DOES NOT SHOW AT THE END ####
''' CLOSE BROWSER
IE.Quit
Set IE = Nothing
Set ElementCol = Nothing

End Sub

Main Part that I need is this
VBA Code:
            Html.body.innerHTML = http.responseText
                Set links = Html.getElementsByTagName("a")
'''COLUMN D = TWITTER
   For Each link In links
      If InStr(UCase(link.outerHTML), "FACEBOOK") Then
      website.Offset(0, 3).Value = link.href
   End If
'''COLUMN E = TWITTER
    If InStr(UCase(link.outerHTML), "TWITTER") Then
        website.Offset(0, 4).Value = link.href
    End If

Next
 
Upvote 0

Forum statistics

Threads
1,223,214
Messages
6,170,771
Members
452,353
Latest member
strainu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top