victorjagu
New Member
- Joined
- Dec 9, 2019
- Messages
- 7
- Office Version
- 2010
- Platform
- Windows
I'm trying to download a file from a protected webpage (from my job, so I cannot post the url).
When I save the bytes (I'm using a WinHTTP request), in an .xls, the file appears to be corrupted.
My code (without sensitive information) and a little code taken by "HTTP Header Live".
The code runs. I got "OK" in all checked status but when I try to open the file (that have 622kb, the same as if I download it manually), Excel prompts me with a message saying that there is too much data for a cell. If I click on "open anyway", I can read some parts.
It looks like the file is being saved in different encoding.
Some code from HTTP Header Live.
LOGIN:
https://www.website.com/retro/logincheck.asp
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 38
Origin: Create Your Website for Free | Free Site Builder | Website.com
Connection: keep-alive
Referer: https://www.website.com/retro/default.asp?idioma=ES
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
txtUser=MYUSER&txtpwd=MYPASS&lg=es
POST: HTTP/2.0 302 Found
date: Wed, 04 Dec 2019 22:20:17 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: text/html
expires: Wed, 04 Dec 2019 22:20:18 GMT
location: retro.asp
content-length: 130
via: 2.0 www.website.com
X-Firefox-Spdy: h2
FILE:
https://www.website.com/retro/VerBorderoGRxls.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Referer: https://www.website.com/retro/borderos_resumen.asp
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
GET: HTTP/2.0 200 OK
date: Wed, 04 Dec 2019 22:21:40 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: application/x-msexcel
expires: Wed, 04 Dec 2019 22:21:40 GMT
content-length: 637440
via: 2.0 www.website.com
X-Firefox-Spdy: h2
When I save the bytes (I'm using a WinHTTP request), in an .xls, the file appears to be corrupted.
My code (without sensitive information) and a little code taken by "HTTP Header Live".
VBA Code:
option explicit
Sub SaveFileFromURL()
Dim FileNum As Long
Dim FileData() As Byte
Dim WHTTP As Object
Dim SetCookieString As String
Dim mainUrl As String
Dim fileUrl As String
Dim SavePath As String
Dim strAuthenticate As String
'this is the url to login, extracted with HTTP Header Live.
'however, the url shown in the browser is: "https://www.website.com/retro/default.asp?idioma=ES"
mainUrl = "https://www.website.com/retro/logincheck.asp"
'this is the url for to download the file, but need to send a cookie as credentials
'(if you login via browser and paste the link, it will popup the saveas dialog, but if you don't login, it will return an internal error)
fileUrl = "https://www.website.com/retro/VerBordero.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls"
'the path where to save the file with the extension I know it will have.
SavePath = "C:\Users\Victor\Desktop\bordero.xls"
'authetication should be:
strAuthenticate = "txtUser=MYUSER&txtpwd=MYPASS&lg=es"
'I login to catch the cookie that it is suppose to allow me to download the file.
Set WHTTP = CreateObject("WinHTTP.WinHTTPrequest.5.1")
WHTTP.Open "POST", mainUrl, False
WHTTP.Send strAuthenticate
'if it is correct:
If WHTTP.Status = 200 Then
'I get the cookie
SetCookieString = WHTTP.getResponseHeader("Set-Cookie")
'Then you have to GET direct file url
WHTTP.Open "GET", fileUrl, False
WHTTP.setRequestHeader "Content-Type", "application/x-msexcel"
WHTTP.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
WHTTP.setRequestHeader "Accept-Language", "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3"
WHTTP.setRequestHeader "Cookie", SetCookieString
WHTTP.Send
'if i get the file
If WHTTP.Status = 200 Then
'download bytes
FileData = WHTTP.responseBody
Set WHTTP = Nothing
'Save the bytes into file
FileNum = FreeFile
Open SavePath For Binary Access Write As #FileNum
Put #FileNum, 1, FileData
Close #FileNum
End If
End If
End Sub
The code runs. I got "OK" in all checked status but when I try to open the file (that have 622kb, the same as if I download it manually), Excel prompts me with a message saying that there is too much data for a cell. If I click on "open anyway", I can read some parts.
It looks like the file is being saved in different encoding.
Some code from HTTP Header Live.
LOGIN:
https://www.website.com/retro/logincheck.asp
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 38
Origin: Create Your Website for Free | Free Site Builder | Website.com
Connection: keep-alive
Referer: https://www.website.com/retro/default.asp?idioma=ES
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
txtUser=MYUSER&txtpwd=MYPASS&lg=es
POST: HTTP/2.0 302 Found
date: Wed, 04 Dec 2019 22:20:17 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: text/html
expires: Wed, 04 Dec 2019 22:20:18 GMT
location: retro.asp
content-length: 130
via: 2.0 www.website.com
X-Firefox-Spdy: h2
FILE:
https://www.website.com/retro/VerBorderoGRxls.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Referer: https://www.website.com/retro/borderos_resumen.asp
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
GET: HTTP/2.0 200 OK
date: Wed, 04 Dec 2019 22:21:40 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: application/x-msexcel
expires: Wed, 04 Dec 2019 22:21:40 GMT
content-length: 637440
via: 2.0 www.website.com
X-Firefox-Spdy: h2