Excel and VBA: using MSXML2.XMLHTTP to log in to a website

Nelson78

Office Version
  1. 2007
Hello everybody.

I'm new to fetching web pages with GET and POST requests.

So far, I haven't had many problems scraping data from a URL.


Code:
Sub getintoSITE()

    Dim URL As String, strResponse As String
    Dim objHTTP As Object

    URL = "............"
    
    Set objHTTP = CreateObject("MSXML2.XMLHTTP")

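    With objHTTP
        ' Synchronous GET (third argument False = wait for the response).
        ' A GET has no body, so no Content-Type header is needed.
        .Open "GET", URL, False
        .send
        strResponse = .responseText

        ' A cell holds at most 32,767 characters, so a long page may not fit.
        Sheets(3).Range("A1") = strResponse

    End With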

End Sub

The next step is getting past authentication.


This page holds the form for entering the login and password:

Code:
https://xxxxx/xxxxx/xxxxx/login.aspx

If the login succeeds, I'm redirected to the page I want:

Code:
https://xxxxx/xxxxx/yyyyy.aspx


How can I handle this?

I know I have to send a POST request with the credentials. I also have Fiddler on my PC to inspect cookies.

My first question is: should the POST request target the page that contains the login form, or the page I'm redirected to afterwards?

I mean, something like this?

Code:
Sub getintoSITE()

    Dim URL_login As String, URL_goal As String, strResponse As String
    Dim objHTTP As Object

    URL_login = "https://xxxxx/xxxxx/xxxxx/login.aspx"
    URL_goal = "https://xxxxx/xxxxx/yyyyy.aspx"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "POST", URL_login, False
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"

...............................


Thanks in advance for your tips.
 
After some more digging, I think the first steps are as follows:

1) send a GET request to the login page in order to:
A - get the session cookies
B - get the hidden fields of the login form
2) send a POST request to log in, using A and B
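
A minimal sketch of these two steps, assuming the WinHttp.WinHttpRequest.5.1 object is swapped in for MSXML2.XMLHTTP (this is a substitution, not what the code in this thread uses). As far as I know, a WinHttpRequest object keeps the cookies it receives and sends them back on later requests made with the same object, so step A should need no manual handling. The helper names NewSession, HttpGet and HttpPostForm are made up for illustration. Since the login form quoted in this thread declares action="login.aspx", the POST goes to the login page URL, not to the page you are redirected to afterwards.

```vba
' Hypothetical helper: one object for the whole login sequence, so that
' cookies set by one response are available to the next request.
Function NewSession() As Object
    Set NewSession = CreateObject("WinHttp.WinHttpRequest.5.1")
End Function

' Step 1: synchronous GET; returns the page HTML.
Function HttpGet(ByVal http As Object, ByVal url As String) As String
    http.Open "GET", url, False
    http.Send
    HttpGet = http.ResponseText
End Function

' Step 2: synchronous POST of an already URL-encoded form body.
Function HttpPostForm(ByVal http As Object, ByVal url As String, _
                      ByVal body As String) As String
    http.Open "POST", url, False
    http.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.Send body
    HttpPostForm = http.ResponseText
End Function
```

The intended order would be: Set http = NewSession(), then html = HttpGet(http, URL_login), then build the POST body from html, then HttpPostForm(http, URL_login, body), and finally HttpGet(http, URL_goal) with the same object. Checking http.Status after each call helps to see where the sequence breaks.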

Inspecting, in Fiddler, the response to the GET request for the login page:

Code:
HTTP/1.1 200 OK
Date: Mon, 24 Jun 2019 12:23:52 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Set-Cookie: ASP.NET_SessionId=blablabla123456; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=iso-8859-1
Content-Length: 13878
Set-Cookie: BIGipServerpool_xxx.yyyyyyyy.zz_http=123456789.12345.0000; path=/; Httponly; Secure
Vary: Accept-Encoding
Connection: Keep-Alive

As for the form, its structure is as follows:

Code:
<form name="form" method="post" action="login.aspx" onsubmit="javascript:return submitform();" id="form">
<div>
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="ZZZ456" id="ZZZ456" value="" />
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJLTQ5MTk5MDYzD2QWAgIDD2QWAgITDzwrAAQBAA8WCB4VRW5hYmxlRW1iZWRkZWRTY3JpcHRzZx4cRW5hYmxlRW1iZWRkZWRCYXNlU3R5bGVzaGVldGceElJlc29sdmVkUmVuZGVyTW9kZQspclRlbGVyaWsuV2ViLlVJLlJlbmRlck1vZGUsIFRlbGVyaWsuV2ViLlVJLCBWZXJzaW9uPTIwMTguMS4xMTcuMzUsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49MTIxZmFlNzgxNjViYTNkNAIeF0VuYWJsZUFqYXhTa2luUmVuZGVyaW5naGRkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBQpidG5QcmltYXJ5ciW9RVsKlsB2m0RSfaD/1/R+Ulc=" />
</div>

<script type="text/javascript">

'... then stuff, in which 

function submitform()

'...other stuff

</form>

Now, I need some tips...
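
One possible way to pull a hidden field such as __VIEWSTATE out of the downloaded HTML is plain string searching. The sketch below is only that: the function name GetHiddenValue is made up, and it assumes the markup looks like the form above, that is, double-quoted attributes with value="..." appearing after name="..." inside the same input tag. A real HTML parser would be more robust.

```vba
' Returns the value attribute of the input whose name is fieldName,
' or an empty string if the field (or its value attribute) is not found.
' Assumes double-quoted attributes and that value follows name in the tag.
Function GetHiddenValue(ByVal html As String, ByVal fieldName As String) As String
    Const VALUE_ATTR As String = "value="""
    Dim posName As Long, posTagEnd As Long
    Dim posValue As Long, posQuote As Long

    ' Locate name="fieldName" (closing quote included, so that
    ' __VIEWSTATE does not match __VIEWSTATEGENERATOR).
    posName = InStr(1, html, "name=""" & fieldName & """", vbTextCompare)
    If posName = 0 Then Exit Function

    ' The value attribute must sit before the end of the same tag.
    posTagEnd = InStr(posName, html, ">")
    If posTagEnd = 0 Then Exit Function
    posValue = InStr(posName, html, VALUE_ATTR, vbTextCompare)
    If posValue = 0 Or posValue > posTagEnd Then Exit Function

    ' Copy everything between the opening and closing quotes.
    posValue = posValue + Len(VALUE_ATTR)
    posQuote = InStr(posValue, html, """")
    If posQuote = 0 Then Exit Function
    GetHiddenValue = Mid$(html, posValue, posQuote - posValue)
End Function
```

For example, GetHiddenValue(strResponse, "__VIEWSTATE") would return the Base64 text of the view state, and GetHiddenValue(strResponse, "__EVENTTARGET") an empty string for the form shown above.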
 
Little by little, I'm making progress.

When I scrape the hidden fields (__VIEWSTATE, __PREVIOUSPAGE, ...), all of them seem to keep the same value every time, except

Code:
__EVENTVALIDATION

I mean: if I scrape it twice within a few minutes, it has the same value. If I scrape it once and then again an hour later, it has changed.

Could it have some kind of expiration?
 

To try answering my own question: as long as the session has not expired (after 10 minutes, I think), __EVENTVALIDATION keeps the same value.
 
Some more progress has been made.

I have extracted the following values and used them to build the (very long) string
Code:
strPostData

Here are the fields:
Code:
__VIEWSTATE
__LASTFOCUS
__EVENTARGUMENT
__EVENTTARGET
__VIEWSTATEGENERATOR
__PREVIOUSPAGE
__EVENTVALIDATION
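
When these values are joined into the POST string, each one has to be URL-encoded, because Base64 text such as __VIEWSTATE contains +, / and = characters that would otherwise be misread by the server. Office 2007 has no WorksheetFunction.EncodeURL (it arrived in Excel 2013), so the sketch below encodes by hand. The names UrlEncode and BuildFormBody are made up for illustration, and the encoder assumes the text fits in single bytes (code points 0 to 255), which matches the iso-8859-1 charset of the page but is still an assumption.

```vba
' Percent-encodes everything except unreserved characters
' (A-Z, a-z, 0-9, "-", ".", "_", "~"). Assumes code points 0 to 255.
Function UrlEncode(ByVal text As String) As String
    Dim i As Long, code As Long
    Dim ch As String, result As String

    For i = 1 To Len(text)
        ch = Mid$(text, i, 1)
        code = AscW(ch)
        Select Case code
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                result = result & ch
            Case 0 To 255
                result = result & "%" & Right$("0" & Hex$(code), 2)
            Case Else
                Err.Raise 5, "UrlEncode", "Character outside 0-255 at position " & i
        End Select
    Next i
    UrlEncode = result
End Function

' Joins parallel arrays of field names and values into name=value&name=value.
' Both arrays are assumed to have the same bounds.
Function BuildFormBody(ByVal names As Variant, ByVal values As Variant) As String
    Dim i As Long, body As String

    For i = LBound(names) To UBound(names)
        If Len(body) > 0 Then body = body & "&"
        body = body & UrlEncode(CStr(names(i))) & "=" & UrlEncode(CStr(values(i)))
    Next i
    BuildFormBody = body
End Function
```

The login and password fields would be appended to the same arrays; their names have to be read from the real form, since they are not shown in this thread.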


Furthermore, two cookies are involved in the process:
Code:
Cookie 1 = "abc"
Cookie 2 = "def"


How can I send the cookies in the POST request (my attempt below has not worked so far)?
Code:
Set reqHttp = CreateObject("MSXML2.XMLHTTP")
reqHttp.Open "POST", URL, False
reqHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
' A request carries a single Cookie header with the pairs joined by "; ".
' Assumes cookie1 and cookie2 each hold one name=value pair.
reqHttp.setRequestHeader "Cookie", cookie1 & "; " & cookie2
reqHttp.send strPostData
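
If the cookies do have to be set by hand, one option is to build the Cookie header value from the raw response headers of the earlier GET, such as the Fiddler capture above. The sketch below is plain string handling. The function name BuildCookieHeader is made up, and it assumes the raw headers are available as text, for instance from GetAllResponseHeaders on a WinHttp.WinHttpRequest.5.1 object. MSXML2.XMLHTTP reportedly manages cookies itself and may not expose HttpOnly cookies to code, so this may not work with that object.

```vba
' Collects the name=value part of every Set-Cookie line and joins the
' pairs with "; ", dropping attributes such as path, HttpOnly and Secure.
Function BuildCookieHeader(ByVal rawHeaders As String) As String
    Const PREFIX As String = "set-cookie:"
    Dim headerLines() As String
    Dim i As Long, semi As Long
    Dim lineText As String, pair As String, result As String

    ' Normalise line endings, then look at one header line at a time.
    headerLines = Split(Replace(rawHeaders, vbCrLf, vbLf), vbLf)
    For i = LBound(headerLines) To UBound(headerLines)
        lineText = Trim$(headerLines(i))
        If LCase$(Left$(lineText, Len(PREFIX))) = PREFIX Then
            pair = Trim$(Mid$(lineText, Len(PREFIX) + 1))
            semi = InStr(pair, ";")
            If semi > 0 Then pair = Left$(pair, semi - 1)
            If Len(pair) > 0 Then
                If Len(result) > 0 Then result = result & "; "
                result = result & pair
            End If
        End If
    Next i
    BuildCookieHeader = result
End Function
```

For the two Set-Cookie lines captured earlier in this thread, the result would be a single string of the form ASP.NET_SessionId=...; BIGipServerpool_...=..., which can then be passed once to setRequestHeader "Cookie".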
 