Downloading files from protected website with VBA-Excel

victorjagu

New Member
Joined
Dec 9, 2019
Messages
7
Office Version
  1. 2010
Platform
  1. Windows
I'm trying to download a file from a protected webpage (from my job, so I cannot post the url).
When I save the bytes (I'm using a WinHTTP request), in an .xls, the file appears to be corrupted.
My code (without sensitive information) and a little code taken by "HTTP Header Live".


VBA Code:
option explicit

Sub SaveFileFromURL()

Dim FileNum As Long
Dim FileData() As Byte
Dim WHTTP As Object
Dim SetCookieString As String
Dim mainUrl As String
Dim fileUrl As String
Dim SavePath As String
Dim strAuthenticate As String

'this is the url to login, extracted with HTTP Header Live.
'however, the url shown in the browser is: "https://www.website.com/retro/default.asp?idioma=ES"
mainUrl = "https://www.website.com/retro/logincheck.asp"

'this is the url for to download the file, but need to send a cookie as credentials
'(if you login via browser and paste the link, it will popup the saveas dialog, but if you don't login, it will return an internal error)
fileUrl = "https://www.website.com/retro/VerBordero.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls"

'the path where to save the file with the extension I know it will have.
SavePath = "C:\Users\Victor\Desktop\bordero.xls"

'authetication should be:
strAuthenticate = "txtUser=MYUSER&txtpwd=MYPASS&lg=es"

'I login to catch the cookie that it is suppose to allow me to download the file.
Set WHTTP = CreateObject("WinHTTP.WinHTTPrequest.5.1")
WHTTP.Open "POST", mainUrl, False
WHTTP.Send strAuthenticate

'if it is correct:
If WHTTP.Status = 200 Then

'I get the cookie
SetCookieString = WHTTP.getResponseHeader("Set-Cookie")

'Then you have to GET direct file url
WHTTP.Open "GET", fileUrl, False
WHTTP.setRequestHeader "Content-Type", "application/x-msexcel"
WHTTP.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
WHTTP.setRequestHeader "Accept-Language", "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3"
WHTTP.setRequestHeader "Cookie", SetCookieString
WHTTP.Send

'if i get the file
If WHTTP.Status = 200 Then

'download bytes
FileData = WHTTP.responseBody
Set WHTTP = Nothing

'Save the bytes into file
FileNum = FreeFile
Open SavePath For Binary Access Write As #FileNum
Put #FileNum, 1, FileData
Close #FileNum

End If

End If

End Sub

The code runs. I got "OK" in all checked status but when I try to open the file (that have 622kb, the same as if I download it manually), Excel prompts me with a message saying that there is too much data for a cell. If I click on "open anyway", I can read some parts.
It looks like the file is being saved in different encoding.
Some code from HTTP Header Live.
LOGIN:
https://www.website.com/retro/logincheck.asp
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 38
Origin: Create Your Website for Free | Free Site Builder | Website.com
Connection: keep-alive
Referer: https://www.website.com/retro/default.asp?idioma=ES
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
txtUser=MYUSER&txtpwd=MYPASS&lg=es
POST: HTTP/2.0 302 Found
date: Wed, 04 Dec 2019 22:20:17 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: text/html
expires: Wed, 04 Dec 2019 22:20:18 GMT
location: retro.asp
content-length: 130
via: 2.0 www.website.com
X-Firefox-Spdy: h2
FILE:
https://www.website.com/retro/VerBorderoGRxls.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Referer: https://www.website.com/retro/borderos_resumen.asp
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
GET: HTTP/2.0 200 OK
date: Wed, 04 Dec 2019 22:21:40 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: application/x-msexcel
expires: Wed, 04 Dec 2019 22:21:40 GMT
content-length: 637440
via: 2.0 www.website.com
X-Firefox-Spdy: h2
 

Excel Facts

What do {} around a formula in the formula bar mean?
{Formula} means the formula was entered using Ctrl+Shift+Enter signifying an old-style array formula.
Welcome to MrExcel forums.

As you might appreciate, it's very difficult to help you without seeing the complete headers that a browser sends and receives, and which you must emulate exactly with this WinHttpRequest method.
VBA Code:
WHTTP.Open "POST", mainUrl, False
WHTTP.Send strAuthenticate

LOGIN:
https://www.website.com/retro/logincheck.asp
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
In the code above, you're not setting the Content-Type header, and the Send argument should be in brackets, since you're using late binding of WinHttpRequest.
VBA Code:
WHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
WHTTP.send (strAuthenticate)
See these other threads for examples of using WinHttpRequest or XMLhttpRequest to download files:




I also recommend using Fiddler to analyse the http requests - you can compare the requests that a browser sends and receives with those of your VBA code and look for differences.

If you can't get the WinHttpRequest method to work, then try the IE and UIAutomation method, as detailed in Using UIAutomationClient to automate the Save As file download in IE11
 
Last edited:
Upvote 0
Thank you very much John_w. I've seen your suggestions without success.

The thing is that the file is actually downloaded but it is corrupted, I mean when I open the Excel file or even as a txt file, I am able to read some parts of the file, but others are incorrectly shown. Can it be a enconding problem?

I am going to try with "Fiddler" and I will post back with any new progresses.

Thanks again
 
Upvote 0
Hello again!

After several tries, I could navigate with IE11 and clic the button download using VBA.

I can download pdf files successfully, but when I try to download an Excel file, it prompt me dialog that I can't bypass (I don't want to use application.sendkeys).

The prompt is:
Sin título.png


I have tried with FindWindowEx and GetWindow and all these, but I can't find the button "save as" ("guardar como").

Any ideas of what kind of popup is and how could I attack it?

thank you!
 
Upvote 0
That pop-up window looks the same as the "What do you want to do with xxxxx.xxx?" problem I described in the UIAutomation thread:


You can either use the code in that post to handle the dialogue windows (it handles the normal IE download notification bar and this odd pop-up window) or do the equivalent using FindWindow, GetWindow, etc.
 
Upvote 0
Wow, that's amazing! I will try to understand it and come back again.

Thank you very much for your support!!!
 
Upvote 0
HoHo, it works great!!! thank you very much.

Your code works perfectly when I replace the english text captions with the spanish ones, so perfect.

I was even able to make it work with winhttp. You gave me the tip, the url had a little thing wrong.

Thanks again.

best regards
 
Upvote 0
Excellent! Thanks for the update and pleased that the code also works for you.

Yes, I expected you would need to change some of the English strings in the code to Spanish, such as
VBA Code:
            p1 = InStr(IEdialogueTextString, "What do you want to do with ")
Did you also need to change any of the control names? For example:

VBA Code:
    Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Notification")
    Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Internet Explorer")
        Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Save as")
If you would post the lines you changed to Spanish then I can produce a language-independent version of the code with a section which defines the strings for each language.
 
Upvote 0
Of course!!! =)

I have only use it for the "what do you want to do with xxxxx?" popup, so I guess it would be the same with the other ones.

Unfortunately, I changed all of the English-text into Spanish (even the controls) to make it work (in fact, I download "Accessibility Insights" to check some "text-names".

VBA Code:
Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Nombre:")    'instead of: "File name:""'

VBA Code:
Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Notificación")    'instead of "Notification"'

Code:
Set ControlName = UIAutomation.CreatePropertyCondition(UIA_NamePropertyId, "Texto de la barra de notificación")    'instead of: Notification bar Text

Finally, in this loop I had to change the "instr" with a "like" because in Spanish the message was a little bit different.
VBA Code:
        'Find the Notification bar Text element in the Frame Notification Bar pane and wait until it contains "download has completed"
        
        NotificationBarTextString = ""
        Do
            Set NotificationBarText = NotificationToolbar.FindFirst(TreeScope_Children, NameAndType)
            DoEvents
            Sleep 200
            If Not NotificationBarText Is Nothing Then
                NotificationBarTextString = NotificationBarText.GetCurrentPropertyValue(UIA_ValueValuePropertyId)
            End If
            If DebugMode Then Debug.Print Time; NotificationBarTextString
        Loop Until NotificationBarTextString Like "La descarga de*se completó."      'here is the change

Other minor language-fixes were:
Save=Guardar
Save as = Guardar como
Open = Abrir
Close = Cerrar
Cancel = Cancelar
What do you want to do with = "¿Qué quieres hacer con"

The "Internet Explorer" text does not change.

Does this help?
Tell me if I can help you somehow, please.
 
Upvote 0
This is very helpful, thanks. I have now added language-dependent string definitions (English or Spanish) to the UIAutomation download function:


Finally, in this loop I had to change the "instr" with a "like" because in Spanish the message was a little bit different.

VBA Code:
        'Find the Notification bar Text element in the Frame Notification Bar pane and wait until it contains "download has completed"  

        NotificationBarTextString = ""
        Do
            Set NotificationBarText = NotificationToolbar.FindFirst(TreeScope_Children, NameAndType)
            DoEvents
            Sleep 200
            If Not NotificationBarText Is Nothing Then
                NotificationBarTextString = NotificationBarText.GetCurrentPropertyValue(UIA_ValueValuePropertyId)
            End If
        Loop Until NotificationBarTextString Like "La descarga de*se completó."      'here is the change
This won't work as I intended because the code is meant to extract the file name of the downloaded file from the Notification bar, by using Instr to find the text before and after the file name. For example, if the Notification bar text is "The xxxx yyyy.zzz download has completed", the code looks for "The " and " download has completed" and extracts the file name "xxxx yyyy.zzz". In Spanish, I believe the equivalent is "La descarga de xxxx yyyy.zzz se completó". Could you confirm this please?
 
Upvote 0

Forum statistics

Threads
1,224,813
Messages
6,181,109
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top