• If you would like to post, please check out the MrExcel Message Board FAQ and register here. If you forgot your password, you can reset your password.
  • Excel articles and downloadable files provided in the articles have not been reviewed by MrExcel Publishing. Please apply the provided methods / codes and open the files at your own risk. If you have any questions regarding an article, please use the Article Discussion section.
D

Website Snapshots

Excel Version
  1. 365
  2. 2021
  3. 2019
  4. 2016
It is possible to take website snapshots quickly and easily with any chromium-based browser (this includes Chrome (obviously), MS Edge, Brave, and Opera) as they all include headless browser functionality - this means that you are able to use VBA give your browser instructions to generate screenshots (in PNG) and full-page PDF captures. Google set out how to go about it from the command line here, but I have cobbled together the following function to demonstrate how you could make use of this feature with VBA. The code below comprises four procedures:-

1. TestSnapshot - this is a basic subroutine demonstrating how to use the function. Please be sure to adjust the output path as necessary and change the URL to something you'd actually like to take a snapshot of;
2. GetSnapshot - this is the key procedure. It requires as arguments a target URL and an output path. By default, it will produce a PNG snapshot, but you can opt for a PDF capture instead. Also by default, it will use the Chrome browser;
3. DoubleQuote - a helper function to wrap a string in quotation marks, with the option of adding a trailing space; and
4. GetProgramLocation - returns to full location of the browser executable file.

all of which should be placed in a single module.

VBA Code:
    Option Explicit
    
    Enum SnapShotType
        ScreenShotPNG
        FullPagePDF
    End Enum
    
    Enum PreferredBrowser
        MSEdge
        Chrome
        Brave
    End Enum

    #If VBA7 Then
        Private Declare PtrSafe Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #Else
        Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal DirPath As String) As Long
    #End If

    Sub TestSnapshot()

        Dim URL As String, OutputPath As String

        URL = "https://news.microsoft.com/2001/04/11/farewell-clippy-whats-happening-to-the-infamous-office-assistant-in-office-xp/"
        OutputPath = "D:\TEMP\"

        Debug.Print GetSnapShot(URL, OutputPath, ScreenShotPNG, Chrome)

    End Sub

    Function GetSnapShot(ByVal TargetURL As String, _
                         ByVal OutputPath As String, _
                         Optional ByVal Snap As SnapShotType = ScreenShotPNG, _
                         Optional ByVal BrowserName As PreferredBrowser = Chrome)

        Const ARGUMENT = " --headless --disable-gpu --blink-settings=scriptEnabled=true --window-size=768,1280 "
        
        Dim BrowserPath As String, Browser As String, Filename As String, Extension As String, Ret As Long, PID As Long
        Dim CommandLine As String, CLArguments As String, AdditionalParameter As String

        Browser = Switch(BrowserName = MSEdge, "msedge.exe", BrowserName = Chrome, "chrome.exe", BrowserName = Brave, "brave.exe")
        BrowserPath = DoubleQuote(GetProgramLocation(Browser))
        
        If Len(Trim(BrowserPath)) <= 2 Then
            MsgBox "Unable to locate the designated browser." & vbNewLine & "Exiting the procedure.", _
            vbCritical Or vbOKOnly, "Cannot locate browser."
            Exit Function
        End If
        
        ' The MakeSureDirectoryPathExists API will check whether a given path exists, and if not, will create the necessary directories.
        Ret = MakeSureDirectoryPathExists(OutputPath)
        Extension = IIf(Snap = FullPagePDF, ".pdf", ".png")
        AdditionalParameter = IIf(Snap = FullPagePDF, "--print-to-pdf=", " --screenshot=")
        Filename = DoubleQuote(OutputPath & "SnapShot_" & Format(Now, "yyyymmdd-hhmmss") & Extension, True)
        CLArguments = ARGUMENT & AdditionalParameter & Filename & TargetURL
        CommandLine = BrowserPath & CLArguments
        
        ' Execute the instructions in a minimised window - other options include: vbHide, vbNormalFocus
        Debug.Print "Executing: " & CommandLine
        Shell CommandLine, vbMinimizedNoFocus
        
        GetSnapShot = mid(Trim(Filename), 2, Len(Trim(Filename)) - 2)
        
    End Function

    Function DoubleQuote(Optional ByVal SourceText As String, Optional ByVal TrailingSpace As Boolean = False)

        DoubleQuote = Chr(34) & SourceText & Chr(34) & IIf(TrailingSpace, Chr(32), vbNullString)

    End Function

    Function GetProgramLocation(ByVal ExeFilename As String)
        
        ' This function will check the registry for any registered applications with the given filename;
        ' it will return the full path if successful.
        Const REGISTRYADDRESS = "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\"
        
        On Error Resume Next
        GetProgramLocation = CreateObject("WScript.Shell").RegRead(REGISTRYADDRESS & ExeFilename & "\")

    End Function

Note that it may take a few moments for a browser to generate the file.

Improvements

The above function is just a basic demonstration of how to execute the headless browser functionality - once you become more familiar with the code, there are a number of improvements you can make to it:
  • For starters, you likely won't need to query the registry to find out the full path of your preferred browser every time you run this, so you can perhaps replace this GetProgramLocation function with a hardcoded path of the executable file.
  • Once you learn to trust yourself to provide an output path that actually exists, you can likely stop relying on the API call too - personally, I don't trust myself to do this correctly, so I'll probably keep on using the API ;-)
  • When I use this routine, it's during research projects, so I usually add the project name to any generated filenames in order to easily and quickly differentiate groups of files.
Alternatives

There are a multitude of other (and better) options, but this one doesn't require making API calls, or installing software or charge fees. Another option you tend to see on this forum is Selenium. On the one hand, it can be tricky to set up, it appears as though no-one is updating the software, and reviews of its user-friendliness are mixed; but on the other hand, it does provide greater control over the browser. If interested, here is a brief guide to installing and setting up Selenium on your computer, and here I reproduced the sample selenium script to generate a website snapshot.

As always, feedback and comments are encouraged and always appreciated. Thank you.
Author
Dan_W
Views
2,620
First release
Last update

Ratings

0.00 star(s) 0 ratings

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top