Save webpage hyperlink in excel to PDF

Indieswirl

New Member
Joined
Aug 29, 2022
Messages
9
Office Version
  1. 365
Platform
  1. Windows
Hi all,
Please note I am very new to VBA.
I have a list of hyperlinks in Column D and I found the below VBA code. It opens a web browser and give me a pop up to save the PDF. But it seems to have just googled the text in the cell rather than opening the link itself.

What i am really after is a way to:
1. Use the first link D2 which is to a webpage,
2. Save as PDF
3. Saved in location on my computer , hopefully a way to setup by the code,
3. Go to next row down D3 in same column and repeat. (I imagine this would be something like "D2:D63") and a loop of some sort.

Thanks in advance.

VBA Code:
Sub print_PDF()

    Dim Explorer As Object
    Dim eQuery As Long ' return value
    Dim i As Integer
    Dim fTime As Single
    
    Set Explorer = CreateObject("InternetExplorer.Application") ' Connect to Explorer
    Dim url As String
    url = ThisWorkbook.ActiveSheet.Range("D2").Value
    Explorer.Navigate url ' Open document from local or web!

TryAgain:

        'Wait 2 seconds to let IE load

        fTime = Timer

        Do While fTime > Timer - 2

            DoEvents

        Loop

        eQuery = Explorer.QueryStatusWB(6)  ' print command

        If eQuery And 2 Then

            Explorer.ExecWB 6, 2, "", ""   ' Print (6), displaying dialog (2)

            'Wait for 2 seconds while IE prints

            fTime = Timer

            Do While fTime > Timer - 2

                DoEvents

            Loop

        Else

            GoTo TryAgain

        End If

 

End Sub
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Hi @Indieswirl - as luck would have it, I wrote a routine and an article a few months ago on just this topic!

mrexcel.com/board/excel-articles/website-snapshots.55/

If your URLs are in column D, you could download all the pages as PDF files, and complete the full PDF File path in Column E, for example, with:

VBA Code:
Sub DownloadPDFFiles()
Dim TargetURL as String
Dim TargetRow as Long

For TargetRow = 2 to 63
    TargetURL = Cells(TargetRow, 4).Value
    Cells(TargetRow, 5).Value = GetSnapShot(TargetURL, "D:\OutputPath\", FullPagePDF)
Next

End Sub

Or something along those lines. It requires the use of a Chromium-based browser - that's Chrome, Edge, Opera, and Brave, I think. Would that help?

Oh, and of course it requires the code I set out in the article at the link :-)
 
Upvote 0
My hyperlinks have been changed to text, rather than the "https://blahblah.com" - is there a way instead of .value in the code that it will find the address associated with the link?
See below image for example. Column C is the scientific species name of a plant - the link is sitting under the species name - i have updated the text display in the top few rows to show the data behind the text.

Example_issue1.JPG
 
Upvote 0
A
My hyperlinks have been changed to text, rather than the "https://blahblah.com" - is there a way instead of .value in the code that it will find the address associated with the link?
See below image for example. Column C is the scientific species name of a plant - the link is sitting under the species name - i have updated the text display in the top few rows to show the data behind the text.

View attachment 73630
Note I updated the TargetURL = Cells(TargetRow, 4).Value to TargetURL = Cells(TargetRow, 3).Value -- so its generating screen shots of the right column.
But I'm not sure its just the hyperlink text, the
VBA Code:
 For TargetRow = 2 to 63
doesn't seem to be selecting the correct rows, its somewhat random when all the rows are listed (e.g. 2 To 63). (Knowing that it currently isnt registering the "text hyperlink" as anything and needs the web address).
For example in the first few rows it is reading the http:// and giving me a screen shot, but even when i change the numbers 2 To 4 for example, it would give me only the screen shot in row 4, i thought the 'To" function would complete the task for row 2,3,4.
 
Upvote 0
My hyperlinks have been changed to text, rather than the "https://blahblah.com" - is there a way instead of .value in the code that it will find the address associated with the link?
See below image for example. Column C is the scientific species name of a plant - the link is sitting under the species name - i have updated the text display in the top few rows to show the data behind the text.

View attachment 73630
Wasn't sure that you were going to get back to me.
You can try replacing .Value with .Hyperlinks(1).Address
 
Upvote 0
A

Note I updated the TargetURL = Cells(TargetRow, 4).Value to TargetURL = Cells(TargetRow, 3).Value -- so its generating screen shots of the right column.
But I'm not sure its just the hyperlink text, the
VBA Code:
 For TargetRow = 2 to 63
doesn't seem to be selecting the correct rows, its somewhat random when all the rows are listed (e.g. 2 To 63). (Knowing that it currently isnt registering the "text hyperlink" as anything and needs the web address).
For example in the first few rows it is reading the http:// and giving me a screen shot, but even when i change the numbers 2 To 4 for example, it would give me only the screen shot in row 4, i thought the 'To" function would complete the task for row 2,3,4.
What code are you using? I'm not very good at guessing.
 
Upvote 0
I am using your code from the first reply and the link to your other thread you sent me to. Also - The Hyperlink(1).Address worked.

It just seems to be skipping some of the rows. I was playing around with the number of rows and with this code it gave me 8 seemingly random screenshots - the correct link was used. Its just not doing from 2 to 30 for some reason..?

VBA Code:
Sub DownloadPDFFiles()

Dim TargetURL As String
Dim TargetRow As Long

For TargetRow = 2 To 30

    TargetURL = Cells(TargetRow, 3).Hyperlinks(1).Address
    Cells(TargetRow, 7).Value = GetSnapShot(TargetURL, "C:\Users\XXXX\Documents\VBA_Prints\New folder\Test\", FullPagePDF)


Next TargetRow

End Sub
Rich (BB code):
 
Upvote 0
It looks like i get one screenshot per millisecond, is there code to add into the "Website Snapshots code" (Added below) that adds in a wait (maybe one second or even one millisecond on the loop or something?

VBA Code:
    Option Explicit
    
    Enum SnapShotType
        ScreenShotPNG
        FullPagePDF
    End Enum
    
    Enum PreferredBrowser
        MSEdge
        Chrome
        Brave
    End Enum

    #If VBA7 Then
        Private Declare PtrSafe Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #Else
        Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #End If

    Sub TestSnapshot()

        Dim URL As String, OutputPath As String

        URL = "https://news.microsoft.com/2001/04/11/farewell-clippy-whats-happening-to-the-infamous-office-assistant-in-office-xp/"
        OutputPath = "D:\TEMP\"

        Debug.Print GetSnapShot(URL, OutputPath, ScreenShotPNG, Chrome)

    End Sub

    Function GetSnapShot(ByVal TargetURL As String, _
                         ByVal OutputPath As String, _
                         Optional ByVal Snap As SnapShotType = ScreenShotPNG, _
                         Optional ByVal BrowserName As PreferredBrowser = Chrome)

        Const ARGUMENT = " --headless --disable-gpu --blink-settings=scriptEnabled=true --window-size=768,1280 "
        
        Dim BrowserPath As String, Browser As String, Filename As String, Extension As String, Ret As Long, PID As Long
        Dim CommandLine As String, CLArguments As String, AdditionalParameter As String

        Browser = Switch(BrowserName = MSEdge, "msedge.exe", BrowserName = Chrome, "chrome.exe", BrowserName = Brave, "brave.exe")
        BrowserPath = DoubleQuote(GetProgramLocation(Browser))
        
        If Len(Trim(BrowserPath)) <= 2 Then
            MsgBox "Unable to locate the designated browser." & vbNewLine & "Exiting the procedure.", _
            vbCritical Or vbOKOnly, "Cannot locate browser."
            Exit Function
        End If
        
        ' The MakeSureDirectoryPathExists API will check whether a given path exists, and if not, will create the necessary directories.
        Ret = MakeSureDirectoryPathExists(OutputPath)
        Extension = IIf(Snap = FullPagePDF, ".pdf", ".png")
        AdditionalParameter = IIf(Snap = FullPagePDF, "--print-to-pdf=", " --screenshot=")
        Filename = DoubleQuote(OutputPath & "SnapShot_" & Format(Now, "yyyymmdd-hhmmssms") & Extension, True)
        CLArguments = ARGUMENT & AdditionalParameter & Filename & TargetURL
        CommandLine = BrowserPath & CLArguments
        
        ' Execute the instructions in a minimised window - other options include: vbHide, vbNormalFocus
        Debug.Print "Executing: " & CommandLine
        Shell CommandLine, vbMinimizedNoFocus
        
        GetSnapShot = mid(Trim(Filename), 2, Len(Trim(Filename)) - 2)
        
    End Function

    Function DoubleQuote(Optional ByVal SourceText As String, Optional ByVal TrailingSpace As Boolean = False)

        DoubleQuote = Chr(34) & SourceText & Chr(34) & IIf(TrailingSpace, Chr(32), vbNullString)

    End Function

    Function GetProgramLocation(ByVal ExeFilename As String)
        
        ' This function will check the registry for any registered applications with the given filename;
        ' it will return the full path if successful.
        Const REGISTRYADDRESS = "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\"
        
        On Error Resume Next
        GetProgramLocation = CreateObject("WScript.Shell").RegRead(REGISTRYADDRESS & ExeFilename & "\")

    End Function
 
Upvote 0
Fixed it- Thanks @Dan_W
Final code below!

VBA Code:
    Option Explicit
    
    Enum SnapShotType
        ScreenShotPNG
        FullPagePDF
    End Enum
    
    Enum PreferredBrowser
        MSEdge
        Chrome
        Brave
    End Enum

    #If VBA7 Then
        Private Declare PtrSafe Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #Else
        Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #End If

   
   

    Function GetSnapShot(ByVal TargetURL As String, _
                         ByVal OutputPath As String, _
                         Optional ByVal Snap As SnapShotType = ScreenShotPNG, _
                         Optional ByVal BrowserName As PreferredBrowser = Chrome)
                         

        Const ARGUMENT = " --headless --disable-gpu --blink-settings=scriptEnabled=true --window-size=768,1280 "
        
        Dim BrowserPath As String, Browser As String, Filename As String, Extension As String, Ret As Long, PID As Long
        Dim CommandLine As String, CLArguments As String, AdditionalParameter As String

        Browser = Switch(BrowserName = MSEdge, "msedge.exe", BrowserName = Chrome, "chrome.exe", BrowserName = Brave, "brave.exe")
        BrowserPath = DoubleQuote(GetProgramLocation(Browser))
        
        If Len(Trim(BrowserPath)) <= 2 Then
            MsgBox "Unable to locate the designated browser." & vbNewLine & "Exiting the procedure.", _
            vbCritical Or vbOKOnly, "Cannot locate browser."
            Exit Function
        End If
        
        ' The MakeSureDirectoryPathExists API will check whether a given path exists, and if not, will create the necessary directories.
        Ret = MakeSureDirectoryPathExists(OutputPath)
        Extension = IIf(Snap = FullPagePDF, ".pdf", ".png")
        AdditionalParameter = IIf(Snap = FullPagePDF, "--print-to-pdf=", " --screenshot=")
        Filename = DoubleQuote(OutputPath & "SnapShot_" & Format(Now, "yyyymmdd-hhmmssmss") & Extension, True)
        CLArguments = ARGUMENT & AdditionalParameter & Filename & TargetURL
        CommandLine = BrowserPath & CLArguments
    
        Application.Wait (Now + TimeValue("0:00:01"))
        
        ' Execute the instructions in a minimised window - other options include: vbHide, vbNormalFocus
        Debug.Print "Executing: " & CommandLine
        Shell CommandLine, vbMinimizedNoFocus
        
        GetSnapShot = Mid(Trim(Filename), 2, Len(Trim(Filename)) - 2)
        
    End Function

    Function DoubleQuote(Optional ByVal SourceText As String, Optional ByVal TrailingSpace As Boolean = False)

        DoubleQuote = Chr(34) & SourceText & Chr(34) & IIf(TrailingSpace, Chr(32), vbNullString)

    End Function

    Function GetProgramLocation(ByVal ExeFilename As String)
        
        ' This function will check the registry for any registered applications with the given filename;
        ' it will return the full path if successful.
        Const REGISTRYADDRESS = "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\"
        
        On Error Resume Next
        GetProgramLocation = CreateObject("WScript.Shell").RegRead(REGISTRYADDRESS & ExeFilename & "\")

    End Function

Sub DownloadPDFFiles()

Dim TargetURL As String
Dim TargetRow As Long

 'Update number for how many rows 
For TargetRow = 2 To 63

    TargetURL = Cells(TargetRow, 3).Hyperlinks(1).Address
    Cells(TargetRow, 7).Value = GetSnapShot(TargetURL, "C:\Users\XXXX\Documents\VBA_Prints\New folder\", FullPagePDF)


Next TargetRow

End Sub
 
Upvote 0

Forum statistics

Threads
1,223,886
Messages
6,175,196
Members
452,616
Latest member
intern444

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top