Can't extract URL from HTML

Sharid

Well-known Member
Joined
Apr 22, 2007
Messages
1,066
Office Version
  1. 2016
Platform
  1. Windows
Hi

Can someone please help me I am trying to get at href link from some html code (as show in GREEN below), Most of my code does what I wants but does not get the right info. This is the HTML bit

HTML:
<div >
        <div class="si-inner">
            <div class="si-content ">
                <div class="bdg-90">
                            <div class="mbg">
        <a href="http://www.mylink.co.uk/usr/spigenuk?_trksid=p2047675.l2559" aria-label="Member ID: spigenuk" id="mbgLink"> <span class="mbg-nw">spigenuk</span></a>
        <span class="mbg-l">

I have tweaked and re-tweaked my code and now feel I might have messed some thing up and can't work out what.

I have highlighted the key bit in RED and Blue below.

There is a list of URL on sheet2 "URL LIST" that the code checks and pastes the data extracted into Sheet1

Code:
Private Sub CommandButton8_Click()

[COLOR=#ff8c00]'Count url in sheet2[/COLOR]
With Worksheets("URL LIST")
    lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
    Sheets("URL LIST").Range("L1").Value = lastRow
    End With
  
[COLOR=#ff8c00]' Run main code[/COLOR]
Dim wb As Workbook
Dim x As Variant
Dim i, j, k, l As Integer
Dim r As Long, lr As Long
Dim wsSheet As Worksheet, links As Variant, ie As Object, link As Variant
Dim rw As Long

    i = 2
    k = 2
    l = 2
    [COLOR=#ff8c00]'SHEET2 as sheet with URL[/COLOR]
    Set wb = ThisWorkbook
    Set wsSheet = wb.Sheets("URL LIST")
    
    [COLOR=#ff8c00]'Set IE = InternetExplorer[/COLOR]
    Set ie = CreateObject("InternetExplorer.Application")
    
    rw = wsSheet.Cells(wsSheet.Rows.Count, "A").End(xlUp).Row
    links = wsSheet.Range("A1:A" & rw)
    
 [COLOR=#ff8c00]   'IE Open Time per page 5sec and check links on Sheet2 Column A[/COLOR]
    With ie
       .Visible = True
       Application.Wait (Now + TimeValue("00:00:5"))
       
       For Each link In links
           .navigate (link)
           While .Busy Or .READYSTATE <> 4: DoEvents: Wend

Dim doc As HTMLDocument [COLOR=#ff8c00]'variable for document or data which need to be extracted out of webpage[/COLOR]
Set doc = ie.document
Dim dd As Variant
On Error Resume Next
[COLOR=#ff0000]dd = doc.getElementsByClassName("[/COLOR][COLOR=#000080]mbg[/COLOR][COLOR=#ff0000]")(0).innerText[/COLOR]
On Error Resume Next

[COLOR=#ff8c00]'Paste in this sheet[/COLOR]
Sheet1.Cells(Sheet1.Cells(Sheet1.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = dd
         
[COLOR=#ff8c00]'Deletes duplicates in column A Sheet1[/COLOR]
    Columns(1).RemoveDuplicates Columns:=Array(1)
    
    
 [COLOR=#ff8c00]' Put no1 in sheet2 column F[/COLOR]
  Sheets("URL LIST").Range("F" & Rows.Count).End(xlUp).Offset(1, 0).Value = 1
 
[COLOR=#ff8c00] 'Count No1 in sheet2 Column F[/COLOR]
With Worksheets("URL LIST")
    lastRow = .Cells(.Rows.Count, "F").End(xlUp).Row
    Sheets("URL LIST").Range("L2").Value = lastRow
    End With
  Call CommandButton9_Click

[COLOR=#ff8c00]'navigate links[/COLOR]
       Next link
       
[COLOR=#ff8c00]'Close IE Browser[/COLOR]
    .Quit
    End With
    
    Set ie = Nothing
    
End Sub

Thanks for having a look
 

Excel Facts

What is the last column in Excel?
Excel columns run from A to Z, AA to AZ, AAA to XFD. The last column is XFD.
For the record, this is what I am after not what is stated in the first post. I did not realise that when i placed it in the html container it will turn most of it blue.
I have not put it in the html container this time

I need the bit in green

"http://www.mylink.co.uk/usr/spigenuk?_trksid=p2047675.l2559" aria-label="Member ID: spigenuk" id="mbgLink"> spigenuk
 
Last edited:
Upvote 0
Can anyone help on this one I am really stuck.

Thanks
 
Upvote 0
John_w

Thank you very much. :biggrin::biggrin::biggrin:

You are a God among us mortals
 
Upvote 0

Forum statistics

Threads
1,223,214
Messages
6,170,771
Members
452,353
Latest member
strainu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top