Excel VBA automation for websites with Login

Ghouul

New Member
Joined
Oct 22, 2015
Messages
4
Hi everyone,

I have a similar problem to the following thread: http://www.mrexcel.com/forum/excel-questions/688091-excel-visual-basic-applications-automation-internet-explorer-web-site.html#post3404965

Explanation:
1. I have to download several csv-files daily from a website that needs authentification.
2. After that I have to tranform the results and put them into a database.

The real problem here is number 1.

Downloading files from websites without authentification was solved in the thread posted above.
But dealing with login is a bit different. My first thought was to include the cookie into the header of the post-request.
This works great if you have the cookie, but the only way I know to get this data is to inspect the network with fiddler after manually downloading the files.

But how can I login to the website with "post-requests" and get the cookie?
Or is there another way to solve the login issue?

I can also provide some of the raw data from fiddler if it helps.

Thanks in advance :)
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
I've never used the XMLhttp GET/POST request method with a site which requires a login. HTMLDocument.cookie contains the page's cookie, so request the page with a XMLhttp GET, put the response in a HTMLDocument object and access the .cookie property. You can then use the cookie string in subsequent GET/POST requests by specifying it in XMLhttp .setRequestHeader "Set-Cookie", HTMLdoc.cookie.

However, with the login aspect it is probably easier to write IE automation code to log into the site and download the files and then you don't need to bother with cookies. And if, once logged in, the csv files have direct URLs, you can download them with UrlDownloadToFile.
 
Upvote 0
At work I use this function:

URL is concatenated via a formula on the cover sheet, obviously, parts have been omitted and your URL will be different

URL ="https://xxxx.com/upload/monthend?Action=GetFile&UserName="&Login&"&Password="&Password&"ValuationDate="&TEXT(Value_Date, "MMMYY")&"&Type=Results&Name="&Temp_File&"&FileType=Excel"

Folderpath and Filename are generated elsewhere.

Code:
Private Function DownloadxxxxData(ByRef sFolderPath As String, ByRef sFileName As String, ByRef URL As String) As Boolean

Dim WinHttpReq  As Object
Dim oStream     As Object
Dim myURL       As String
Dim fso         As Object

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    
    With WinHttpReq
        .Open "GET", URL, False
        .send
        myURL = .responseBody
    
        If .Status = 200 Then
            Set oStream = CreateObject("ADODB.Stream")
            With oStream
                .Open
                .Type = 1
                .Write WinHttpReq.responseBody
            End With
            
            'If file exists, delete
            With fso
                On Error Resume Next
                    If Not Len(Dir(sFolderPath, vbDirectory)) Then MkDir sFolderPath
                    If .FileExists(sFolderPath & sFileName) Then .DeleteFile sFolderPath & sFileName
                On Error GoTo 0
            End With
            
            'Save and close file
            With oStream
                .SaveToFile sFolderPath & sFileName
                .Close
            End With
            
            DownloadTotemData = True
            
        End If
    End With
    
    'Clear variables
    Set WinHttpReq = Nothing
    Set oStream = Nothing
    Set fso = Nothing
    
End Function
 
Upvote 0
Thanks for your comments so far :)

The thing is, using direct URLs to download the files won't work. Correct me if I'm wrong, but puting an URL into the adressbar, calls the "GET"-method. It seems that GET seems to not be permitted on that side.
This is why I want to use POST.

Here is my Raw data from fiddler. I put some XXX in because of sensitive data ;)

POST https://www.xxx.com/xxIndex/Bin/downloadindex.asp HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Referer: *******
Accept-Language: de-DE
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
Host: www.xxx.com
Content-Length: 442
DNT: 1
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: bac_persist=153523371.26655.0000; ASPSESSIONIDQCRQCRDR=LEKNOIEAIPADGBECJJEJCMNM; BIGipServerbaml_p_debt=338015403.20480.0000; ASPSESSIONIDSCSRDQBR=BAPOABDADENMEIIPGEBJIFBF; BIGipServerbaml_p_core=237352107.20480.0000; ASPSESSIONIDQARQBSAQ=KKPABBDAIOJOGBBBIKDBCIFH; ASPSESSIONIDACSTASDS=BGOAABDADLGGBOKJKHJJDMIF; dtCookie=F6EB08CF5664E29A68BF6D68297BDDA1|xxx.com|1|_default|1; dtPC=-; WT_FPC=id=218a7f4713c753916c91445518194146:lv=1445518252955:ss=1445518194146; dtLatC=33|51.5|15; SMSESSION=xF+RxFjEGPZHYPL8djfEWIAjysdyNjZhAvyiVG4sSNrUvnZnqgYTiHfcjppab/Lw4fjZKhXnkPuQU5fv3AP9hHBtc7F1rbpKDU7riwUxMF4ndduAmwjERJHQvlbreIDsB0MHUFnnR6ngdch7h0qpcFkxUKBh48sPnkgYAfaY+4SIsMCk38TsHGsbutfYzA4Tt8XmKgwfgryxuwvwcMD5GLBgEIPwUiC9PiJuAzHyZIGsHlAlu4HEFB7Jz2EYgfFDwtXgedaIJ6GaSSC3JJ8hs+8RYZ5l7FKmYDkuw0kqxk7yXFBV0gdx8MKMTNH1Ti1db6SG6JGPmheEXdJhN8yHLyvoo9JXO/iBgvBpUFMrKPZVOG5GP7Vp/7/rmIrkZDpyEXbrIyC/U4WPDHPkGkUcesGr2wxmc+ZALPUR6OMifpPRlhhYl2Gg7OcE3TQhX6TwnqFuA+LKJZ7OzfkngtPXW4WljHlhBHWhfxf2/xa0zvh/bwJ1KI6ngREnBjJbNmWq6XvCzKbbSX/dbSV6lqG1fHeyIEY3t5jLnHn731UJnYczScJF5HLYti0N0A3krdPEZ8IZrRn+5exn87pjNWmSADaSBz6uuhuXpMTo0jOMIwd26/BZGlkzm18eLAYbL0BpDHLiD2A3J/6FIAhK2NcMqR5QLBmc3ZZEa8vmH6wX9ldG0D51MLOeFJvGwdcXg92M8NgZwoerFLPYCk11CzoaoIduqSKI/AcI0ll79s4B9QFvASA1yraYzBuFbuxfCuWEVEWsLuMrldNt2m4I2MCsbRd+PYCe7g8one/+gqjGPl9X6RPx3XbIBHrjXJdW0DxrOC7SXh8Tz3a1Gd1EB9qXNGsxUxYfgkmEBjb/rEL7uidTEJZjveM1auPSHy37Zw70QA13Q6F6Vh+j6PEZcLCWUBHHJKluxxvFkA04nIqGqnAUyeBJQjm1fshBxctZxjIwcPYIoxmbqf0L8FenW024bAx99hU3h3vvRYHNswAS5YQr9h2SSz/0BDbHFhAqK1S2ClkfmzCxIh5gGzEhn57Li5E24I9g6m+N7Pd90u6KBsY3Vb9+cZMuNzoDoAPR4ItC3qkaj6fSzx1b3JqtJZeVhGy4hKsfqe8TwSuXxSLwk20y/YanuPQF64NjPNKt6iXAEXPsx/nW+CchzWjdT0/Y/9kXMpoX80tUZTV8RmjHTwN6aN/D3aH5nM4srkdUpSMEYzRHrye8LXB05e4x26B07GRDcnovuGuBtBOjHMeTu1rqx9IMvNqjLvyvcLMYSkqM; GENX_PRD_SMDATA=QdWsrjGnAyM8tHe1CZQDzVQnlX52ADUPvKHWCFHHI9R9B3/Clbmre76O6Oig12GnIbbK60kaeqDBmb0mLBKOa5EoWULNhbHvvtz9ltKVEoiXJUsJRhmpJ/gXhra5Eh2+W9teLeJD4J4xBG32OpUofK8VlInn04/w17JhOqFy6c4vFkD2YK0QlRNx2dUIhXxcms+Lxn3M8mHWKg2JpHNTO1Wzk//9ye6vdoh30HwkxBG1GW0oSW5zsQYzKrKwuME84ap89BZQbaUl90Tb74OXRTXpWJeKNGTIqWTv6evWla0QhfyRsGx4HcuLLU76vyAkAqwXI/PrJGpSVvVR2mMtqQ==; TCA_PRD=1; DMBRAND=MLX

datevalue=10%2F21%2F2015&hdnAsofdate=10%2F21%2F2015&hidCuspID=&ColState=&hdnSrcApp=&hdnPortID=&hdnMthEndDate=&hdnViewState=&txtIndex=ECV0&txtFromDate=10%2F21%2F2015&SelOldNew=1&cboConfiguration=13023&cboFilter=0&iPage=&IndexVal=&Param1=&strWhere=&Param2=&SortVariable=&SortVariable1=&SortVariable2=&SortVariable3=&SortBy1=&SortBy2=&SortBy3=&SortString=&SortType=&Download=&OrderBy=&ConfigId=13023&CriteriaID=&NoOfConfig=&Source=1&ProfileName=


The last part is the request send.

I tried to combine the request and adress to a complete URL. Entering it into the adressbar (when logged in) didn't work.

In addition: Using the following URLDownloadToFile-Sub gives me a strange result (downloading the source code of the website):

Sub URLDOWN()
Dim strUrl As String
strUrl = "https://www.xxx.com/MLIndex/Bin/downloadindex.asp?datevalue=10%2F22%2F2015&hdnAsofdate=10%2F22%2F2015&hidCuspID=&ColState=&hdnSrcApp=&hdnPortID=&hdnMthEndDate=&hdnViewState=&txtIndex=Q694&txtFromDate=10%2F22%2F2015&SelOldNew=1&cboConfiguration=13023&cboFilter=0&iPage=&IndexVal=&Param1=&strWhere=&Param2=&SortVariable=&SortVariable1=&SortVariable2=&SortVariable3=&SortBy1=&SortBy2=&SortBy3=&SortString=&SortType=&Download=&OrderBy=&ConfigId=13023&CriteriaID=&NoOfConfig=&Source=1&ProfileName="



Dim strSavePath As String
Dim returnValue As Long
strSavePath = "C:\Users\yyy\Downloads\Awesome.csv"
returnValue = URLDownloadToFile(0, strUrl, strSavePath, 0, 0)
If returnValue = 0 Then
MsgBox "xxxx downloaded OK!"
Else
MsgBox "Tray again or exit"
End If
End Sub
 
Upvote 0
Remember, I said "If" about the direct URLs. The URLs might be relative links, in which case construct the full URL from the base site URL and the relative link. It might be generated by a Javascript function..., etc.. All guesswork without the URL.

The browser sends a GET request for the initial page, e.g. login page. It sends a POST request when sending form data, e.g. to download the file:

Content-Type: application/x-www-form-urlencoded

I'm not sure, but I think the XMLhttpRequest object handles the cookie and session ids internally, so if you want to use this GET/POST method then I would concentrate on ensuring that the form data string is constructed correctly. You can see the form data parameter names and values in Fiddler (WebForms or Raw tab within the Inspectors tab) or your browser's developer tools

I can't really help further without the URL and access to the site.
 
Upvote 0
After some further research and some testing, I did find a solution :)
Here it is:

To get all the information needed, I did the whole process of Logging into the website and donwloading a file manually inspecting everything via Fiddler or the stadard inspection tool of IE.
There I saw 2 POST-requests that are important:

1. The first POST-request sends my credentials to the server, logging me in and setting cookies.
2. With the method "getAllReponseHeaders" you get what it says: Header-information. Within this, I filtered for "Set-Cookie".
3. I built a "Cookiestring" out of all these "Set-Cookies"-lines.
4. The second POST-request sends my download-request for the csv-file. Here I added the method "setRequestHeader "Cookie", Cookiestring" to let the server know that im authenticated.

I was so happy after I figured this out :) It took me a whole week...

The only thing that is still unclear: Must/Should I log out?
 
Upvote 0

Forum statistics

Threads
1,223,228
Messages
6,170,871
Members
452,363
Latest member
merico17

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top