Extracting Hyperlinks

SAM2203

New Member
Joined
Apr 28, 2011
Messages
11
Dear All,

I have a problem and I hope that you guys will be a great help. I have a column of hyperlinks (around 500 entries) now I want to extract hyperlinks of all entries in to a adjacent column.

What is the easiest way of doing it besides manually copying the links in to the destination cells


Thanks
 
Hello SAM2203,

Sorry for the delay, I was having some internet issues. Here is the macro. It looks in column "A" starting with row and extracts the URL from the hyperlink. The URL is place in column "B" of the same row. You can change "A1" (marked in bold) to whichever cell you want to start with. If you want to place the URL in any column other than the column to the right of the hyperlink, let me know.
Rich (BB code):
Sub ExtractURL()

  Dim Cell As Range
  Dim Rng As Range
  Dim RngEnd As Range
  Dim Wks As Worksheet
  
    Set Wks = ActiveSheet
    
    Set Rng = Wks.Range("A1")
    Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
    If RngEnd.Row < Rng.Row Then Exit Sub Else Set Rng = Wks.Range(Rng, RngEnd)
    
      For Each Cell In Rng
        If Cell.Hyperlinks.Count <> 0 Then
           Cell.Offset(0, 1) = Cell.Hyperlinks(1).Address
        End If
      Next Cell
    
End Sub
Sincerely,
Leith Ross

Thanks a lot ! Saved a lot of time !!!...

Please guide on the following as well

Let say for the following URL <table style="width: 507px; height: 36px;" border="0" cellpadding="0" cellspacing="0"><col width="328"><tr height="20"> <td style="height: 15pt; width: 246pt;" width="328" height="20">http://www.conferencealerts.com/seeconf.mv?q=ca16sa83
I want to extract certain fields like website name, organization name etc in to excel sheet. All hyperlinks we just extracted have the same format.
How can

1. Website & "organized by" can be extracted

2. and guide me on the start up tutorial on writing macros

Thanks again
SAM
</td> </tr></table>
 
Upvote 0

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.
Hello Sam,

I think I follow you. Can you give me a couple of examples to be sure?

Sincerely,
Leith Ross
 
Upvote 0
Hello Sam,

I think I follow you. Can you give me a couple of examples to be sure?

Sincerely,
Leith Ross

Example 1

URL: http://www.conferencealerts.com/seeconf.mv?q=ca1ih6h3

Things to be extracted

Website: http://www.insightinfo.com/chroniccare
Organized by:
Insight Information

Example 2

URL: http://www.conferencealerts.com/seeconf.mv?q=ca16sa83
Website: http://www.isquareit.ac.in/biotechworkshop
Organized by: School of Biotechnology, International Institute of Information Technology, Pune



I have the URL (All thanks to you) now on that URL there are two things "Website" and "Organized by" that I am interested in. I want these two things to be in separate columns in excel
 
Upvote 0
Hello Sam,

Thanks for the information. I can extract that information for you. Where do you want the information copied to?

Sincerely,
Leith Ross
 
Upvote 0
Hello Sam,

Thanks for the information. I can extract that information for you. Where do you want the information copied to?

Sincerely,
Leith Ross

in the same excel sheet. Can you mail me you email address so I can send you the the spreadsheet. IF that is okay with you !
 
Upvote 0
Hello Sam,

That would be a big help. I will send you email address by private message.
 
Upvote 0
Hello Sami,

For the benefit of all others, here is the macro code...
Rich (BB code):
'Written: March 15, 2011
'Author:  Leith Ross

Public PageSource As String
Public httpRequest As Object

Function GetURLStatus(ByVal URL As String, Optional AllowRedirects As Boolean)

  Const WinHttpRequestOption_UserAgentString = 0
  Const WinHttpRequestOption_EnableRedirects = 6
  
  
    On Error Resume Next
       Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
         If httpRequest Is Nothing Then
            Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5")
         End If
       Err.Clear
    On Error GoTo 0

    httpRequest.Option(WinHttpRequestOption_UserAgentString) = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
    httpRequest.Option(WinHttpRequestOption_EnableRedirects) = AllowRedirects


  'Clear any pervious web page source information
    PageSource = ""
    
        'Add protocol if missing
          If InStr(1, URL, "://") = 0 Then
             URL = "http://" & URL
          End If

          'Launch the HTTP httpRequest synchronously
            On Error Resume Next
              httpRequest.Open "GET", URL, False
              If Err.Number <> 0 Then
                'Handle connection errors
                  GetURLStatus = Err.Description
                  Err.Clear
                  Exit Function
              End If
            On Error GoTo 0
           
          'Send the http httpRequest for server status
            On Error Resume Next
              httpRequest.Send
              httpRequest.WaitForResponse
              If Err.Number <> 0 Then
                ' Handle server errors
                  PageSource = "Error"
                  GetURLStatus = Err.Description
                  Err.Clear
              Else
                'Show HTTP response info
                  GetURLStatus = httpRequest.Status & " - " & httpRequest.StatusText
                'Save the web page text
                  PageSource = httpRequest.responsetext
              End If
            On Error GoTo 0
            
End Function
Rich (BB code):
Sub GetWebsiteAndOrganizer()

  Dim Cell As Range
  Dim I As Long, N As Long
  Dim Organizer As String
  Dim Rng As Range
  Dim RngEnd As Range
  Dim Status As String
  Dim Wks As Worksheet
  Dim Website As String
  
    Set Wks = Worksheets("Sheet1")
    
    Set Rng = Wks.Range("B2")
    Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
    If RngEnd.Row < Rng.Row Then Exit Sub Else Set Rng = Wks.Range(Rng, RngEnd)
    
      For Each Cell In Rng
        Status = GetURLStatus(Cell.Text, True)
        If UCase(Status) Like "*OK*" Then
           I = InStr(1, PageSource, "Website: ")
           Website = Mid(PageSource, I, N - I)
           
           I = InStr(N, PageSource, "Contact name: ") + 18
           N = InStr(I, PageSource, "<")
           Organizer = Mid(PageSource, I, N - I)
           
           Cell.Offset(0, 1).Resize(1, 2).Value = Array(Website, Organizer)
        Else
           Cell.Offset(0, 1) = Status
        End If
      Next Cell
      
End Sub
http://www.mrexcel.com/forum/) + 21<br />            N = InStr(I, PageSource,
 
Last edited:
Upvote 0
Hello Sami,

For the benefit of all others, here is the macro code...
Rich (BB code):
'Written: March 15, 2011
'Author:  Leith Ross

Public PageSource As String
Public httpRequest As Object

Function GetURLStatus(ByVal URL As String, Optional AllowRedirects As Boolean)

  Const WinHttpRequestOption_UserAgentString = 0
  Const WinHttpRequestOption_EnableRedirects = 6
  
  
    On Error Resume Next
       Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
         If httpRequest Is Nothing Then
            Set httpRequest = CreateObject("WinHttp.WinHttpRequest.5")
         End If
       Err.Clear
    On Error GoTo 0

    httpRequest.Option(WinHttpRequestOption_UserAgentString) = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
    httpRequest.Option(WinHttpRequestOption_EnableRedirects) = AllowRedirects


  'Clear any pervious web page source information
    PageSource = ""
    
        'Add protocol if missing
          If InStr(1, URL, "://") = 0 Then
             URL = "http://" & URL
          End If

          'Launch the HTTP httpRequest synchronously
            On Error Resume Next
              httpRequest.Open "GET", URL, False
              If Err.Number <> 0 Then
                'Handle connection errors
                  GetURLStatus = Err.Description
                  Err.Clear
                  Exit Function
              End If
            On Error GoTo 0
           
          'Send the http httpRequest for server status
            On Error Resume Next
              httpRequest.Send
              httpRequest.WaitForResponse
              If Err.Number <> 0 Then
                ' Handle server errors
                  PageSource = "Error"
                  GetURLStatus = Err.Description
                  Err.Clear
              Else
                'Show HTTP response info
                  GetURLStatus = httpRequest.Status & " - " & httpRequest.StatusText
                'Save the web page text
                  PageSource = httpRequest.responsetext
              End If
            On Error GoTo 0
            
End Function
Rich (BB code):
Sub GetWebsiteAndOrganizer()

  Dim Cell As Range
  Dim I As Long, N As Long
  Dim Organizer As String
  Dim Rng As Range
  Dim RngEnd As Range
  Dim Status As String
  Dim Wks As Worksheet
  Dim Website As String
  
    Set Wks = Worksheets("Sheet1")
    
    Set Rng = Wks.Range("B2")
    Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
    If RngEnd.Row < Rng.Row Then Exit Sub Else Set Rng = Wks.Range(Rng, RngEnd)
    
      For Each Cell In Rng
        Status = GetURLStatus(Cell.Text, True)
        If UCase(Status) Like "*OK*" Then
           I = InStr(1, PageSource, "Website: ")
           Website = Mid(PageSource, I, N - I)
           
           I = InStr(N, PageSource, "Contact name: ") + 18
           N = InStr(I, PageSource, "<")
           Organizer = Mid(PageSource, I, N - I)
           
           Cell.Offset(0, 1).Resize(1, 2).Value = Array(Website, Organizer)
        Else
           Cell.Offset(0, 1) = Status
        End If
      Next Cell
      
End Sub

Thanks loads ! It is such a pleasure to see such help forums and their all available members. Just one thing, unfortunately the macro extracts the wrong field. I requested "Organized by" but Contact name is being extracted. I guess it some little editing would be required.

Example:
for URL: http://www.conferencealerts.com/seeconf.mv?q=ca16sa83

The macro extracts: "Contact name: Prof. Dr. Sheo Mohan Singh"

My requirement was "
Organized by: School of Biotechnology, International Institute of Information Technology, Pune"
 
Upvote 0
Hello Sami,

Can't believe i did that. I even named the variable "Organizer". Here is the updated code...
Rich (BB code):
'Written: April 28, 2011
'Author:  Leith Ross

Sub GetWebsiteAndOrganizer()
 
 'Updated version
 
  Dim Cell As Range
  Dim Cnt As Long
  Dim Data() As Variant
  Dim I As Long, N As Long
  Dim Organizer As String
  Dim Rng As Range
  Dim RngEnd As Range
  Dim Status As String
  Dim Wks As Worksheet
  Dim Website As String
  
    Set Wks = Worksheets("Sheet1")
    
    Set Rng = Wks.Range("B2")
    Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
    If RngEnd.Row < Rng.Row Then Exit Sub Else Set Rng = Wks.Range(Rng, RngEnd)
    
      ReDim Data(Rng.Rows.Count - 1, 1)
      
      For Each Cell In Rng
        Status = GetURLStatus(Cell.Text, True)
        If UCase(Status) Like "*OK*" Then
           I = InStr(1, PageSource, "Website: ")
           Website = Mid(PageSource, I, N - I)
           
           I = InStr(N, PageSource, "Organized by: ") + 14
           N = InStr(I, PageSource, "<")
           Organizer = Mid(PageSource, I, N - I)
           
           Data(Cnt, 0) = Website
           Data(Cnt, 1) = Organizer
        Else
           Data(Cnt, 0) = Status
           Data(Cnt, 1) = ""
        End If
        Cnt = Cnt + 1
      Next Cell
      
    Rng.Offset(0, 1).Resize(Cnt, 2).Value = Data
    
End Sub
Sincerely,
Leith Ross


EDIT: Looks like either the colon or the word website causes the underlining. I will send a copy to you via email.
http://www.mrexcel.com/forum/) + 21<br />            N = InStr(I, PageSource,
 
Upvote 0

Forum statistics

Threads
1,224,587
Messages
6,179,738
Members
452,940
Latest member
Lawrenceiow

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top