Browser Automation

RonaTa

New Member
Joined
May 14, 2017
Messages
3
Hi!

I usually find answers to my questions by searching this and other forums. Now that I have been struggling with a problem I can't figure out, I am posting my first thread.
I have been trying to learn to use VBA for scraping web pages. As the URL is a page used at work, I am not at liberty to share the link. I hope the issue can be understood anyway; it is probably very basic.

When the page opens it is a login page. After logging in there is a search field where I want to enter a number and then scrape the result of that search.

Code so far:

Code:
Sub solid()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim UserN As MSHTML.IHTMLElement
    Dim PassW As MSHTML.IHTMLElement
    Dim SearchS As MSHTML.IHTMLElement
    Dim HtmlButtons As MSHTML.IHTMLElementCollection
    Dim htmlLinks As MSHTML.IHTMLElementCollection
    Dim htmlButton As MSHTML.IHTMLElement

    IE.Visible = True
    IE.navigate "#"

    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop

    Set HTMLDoc = IE.Document

    Set UserN = HTMLDoc.getElementById("j_username")
    UserN.Value = "***"

    Set PassW = HTMLDoc.getElementById("j_password")
    PassW.Value = "**"

    Set HtmlButtons = HTMLDoc.getElementsByTagName("input")
    HtmlButtons(2).Click ' here i successfully log in to a new page.

    Set htmlLinks = HTMLDoc.getElementsByTagName("a")

    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop

    For Each htmlButton In htmlLinks
        Debug.Print htmlButton.tagName, htmlButton.innerText
    Next htmlButton

End Sub


My initial problem was that when I tried to put a value in the search field after logging in (after HtmlButtons(2).Click), nothing happened.
When I loop through the elements at the end of my code, it still prints elements from the first page despite the successful login. Why is that?
 

Update:

I think it has to do with the IE and/or HTMLDoc object being set while the initial page is loaded, and still referring to that page after it changes at login.
To clarify my question:

This is the first part of my code. It opens a page, fills in the username and password fields and successfully logs in.
Code:
Sub solid()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim UserN As MSHTML.IHTMLElement
    Dim PassW As MSHTML.IHTMLElement
    Dim SearchS As MSHTML.IHTMLElement
    Dim HtmlButtons As MSHTML.IHTMLElementCollection
    Dim htmlButton As MSHTML.IHTMLElement

    IE.Visible = True
    IE.Navigate "URL"

    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop

    Set HTMLDoc = IE.Document

    Set UserN = HTMLDoc.getElementById("j_username")
    UserN.Value = "#"

    Set PassW = HTMLDoc.getElementById("j_password")
    PassW.Value = "#"

    Set HtmlButtons = HTMLDoc.getElementsByTagName("input")
    HtmlButtons(2).Click

My second part follows. It is supposed to populate a search field and then press a search button.
Code:
    Set SearchS = HTMLDoc.getElementById("onelinesearch")
    SearchS.Value = "986806318"

    Set HtmlButtons = HTMLDoc.getElementsByTagName("input")

    For Each htmlButton In HtmlButtons
        If htmlButton.getAttribute("classname") = "btnSearch" Then htmlButton.Click
    Next htmlButton


If I run just the second part of my code (together with the declarations, of course) with the site already logged in, it works. But if I run the whole thing together, it does not. I get run-time error '91' on the first action of my second part.

I experimented a lot with setting the objects to Nothing and then setting them again, and tried using all new objects for my second part. But nothing seems to work.
Does anyone have some pointers?
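
For context, run-time error '91' ("Object variable or With block variable not set") is what VBA raises when a member is accessed on an object variable that is Nothing. Here that would happen if getElementById("onelinesearch") finds no such element in the document HTMLDoc refers to, so SearchS stays Nothing and SearchS.Value fails. A minimal sketch of a guard, assuming a hypothetical helper named TrySetValue (not part of the code above) and late binding so it needs no extra references:

```vba
' Hypothetical helper: sets the value of an element only if it exists.
' Returns True on success, False if the element was not found.
Function TrySetValue(ByVal Doc As Object, ByVal ElementId As String, _
                     ByVal NewValue As String) As Boolean
    Dim El As Object

    Set El = Doc.getElementById(ElementId)

    If El Is Nothing Then
        ' The element is not in this document; report instead of raising error 91
        Debug.Print "Element not found: " & ElementId
        Exit Function
    End If

    El.Value = NewValue
    TrySetValue = True
End Function
```

Called as TrySetValue(HTMLDoc, "onelinesearch", "986806318"), it returns False instead of stopping with error 91, which makes it easier to see whether the document being searched is really the page after login.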
 
Hello and welcome to MrExcel.

Sorry to see you have not got an answer yet. I don't have an answer either, but I am experiencing the exact same issue when trying to scrape updated data after a search (the IE object retains the original data after I push a value into a search box, even though the IE page has updated).

Hopefully someone more knowledgeable than I am sees this and can assist.

I don't think we can use the WinHttpRequest method, given that we are changing the webpage after navigating to it. It's as if we need the IE object to refresh itself with the currently open webpage without invoking the IE.Navigate command.

Andrew
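
One way to get that "refresh" without calling IE.Navigate is to read IE.Document again once the new page has loaded, since a document reference taken before the page changed most likely still points at the old page. A rough sketch under that assumption; the Sub name SearchAfterLogin is hypothetical, the element id is the one from the first post, and 4 is the numeric value of READYSTATE_COMPLETE:

```vba
' Hypothetical sketch: call this right after the login button has been clicked.
Sub SearchAfterLogin(ByVal IE As Object)
    Dim Doc As Object
    Dim SearchBox As Object

    ' Wait until IE reports that it has finished loading the new page
    Do While IE.Busy Or IE.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Ask IE for its document again instead of reusing the earlier reference
    Set Doc = IE.Document

    Set SearchBox = Doc.getElementById("onelinesearch")
    If Not SearchBox Is Nothing Then
        SearchBox.Value = "986806318"
    End If
End Sub
```

The same idea applies after a search: wait, then set the document variable from IE.Document again before reading the results.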
 
Solution for my problem:

I put a "Do While IE.Busy" loop after the login, where the page changes, and put a "DoEvents" inside the loop.

Code:
Do While IE.Busy
    DoEvents
Loop

I do not fully understand why this works while, for example, "Do While IE.ReadyState <> READYSTATE_COMPLETE: Loop" does not. I read that it has something to do with DoEvents yielding control to the operating system so that pending events can be processed.
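
A possible way to package this wait so it cannot hang forever is sketched below. It assumes a hypothetical function named WaitForPage and a default timeout of 30 seconds, neither of which comes from the thread. It checks both IE.Busy and IE.ReadyState (4 = READYSTATE_COMPLETE) and calls DoEvents on every pass so Excel is not stuck in a tight loop while waiting.

```vba
' Hypothetical helper: waits until IE has finished loading or the timeout expires.
' Returns True if the page finished loading, False on timeout.
Function WaitForPage(ByVal IE As Object, _
                     Optional ByVal TimeoutSeconds As Long = 30) As Boolean
    Dim Deadline As Date

    Deadline = DateAdd("s", TimeoutSeconds, Now)

    Do While IE.Busy Or IE.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents                             ' yield so pending events can be processed
        If Now > Deadline Then Exit Function ' give up; the function returns False
    Loop

    WaitForPage = True
End Function
```

For example, after HtmlButtons(2).Click one could write: If WaitForPage(IE) Then Set HTMLDoc = IE.Document. That way the second part of the code works on the page that is actually open.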
 