Importing Web Page HTML elements text using VBA

mattu1990

Board Regular
Joined
Mar 9, 2013
Messages
52
Hi,

I'm new to VBA (and to this forum) and was wondering if someone could help me with a problem getting some web page data into Excel 2010 using VBA.

Basically the scenario I am trying to create is very similar to this YouTube tutorial. Excel VBA Pull Data From A Website - YouTube

My scenario however is set up with the following titles in cell A1, B1, C1, D1 and E1 : POST CODE, OUTLET, ADDRESS, TELEPHONE, EMAIL

The result I want to achieve is I enter a post code into cell A2 for example, Excel then uses IE to navigate to the relevant web page as defined in the VBA code. I then want the following to happen:


  • The InnerText of the web page's h1 tag is then inserted into the OUTLET cell (B2)
  • The first instance of the p tag is then inserted into the ADDRESS cell (C2)
  • The second instance of the p tag is then inserted into the TELEPHONE cell (D2)
  • The third instance of the p tag is then inserted into the EMAIL cell (E2)

All instances of the p tag are contained in a div element called div class="adBox_content" . There are also 5 other DIVs above that DIV in the hierarchy (whether that helps or not I don't know?)

Using the YouTube tutorial link, the method has worked for me using the getElementsByTagName("h1").innerText However, when I try adding a second getElementsByTagName("p")(01).innerText the whole thing fails.

So I'm left with two problems; I can't make the VBA get more than one element at a time from the page, I can only either have the h1 or the first instance of the p tag. I've tried all the getElementBy methods and none of them seem to work in getting the second and third
instances to show.


I also need the code to make the data be put on the same row ONLY as where the post code was entered. In this scenario for example of entering a post code into A2, the OUTLET needs to land in cell B2 only, ADDRESS C3 only etc.

By following the youtube tutorial above by giving the cells names to refer to in the code, the data ends up being inputted in all further rows with identical cell names. I need it to not do that.

Any help would be hugely appreciated. The code is needed for around 300 rows of post codes that will be entered and refreshed every week or so.

Thanks,
Matt
 
Last edited:
That got it Mike and oh my..how much faster is that. Quite incredible.

So what's the scenario with this then. Does the original method use Excel's version of an IE object, where as the second method uses a more modern approach?

Just interested as the speed difference is crazy.
 
Upvote 0

Excel Facts

Excel motto
Not everything I do at work revolves around Excel. Only the fun parts.
Hi Matt

Yes, I think that is it precisely.

Where the previous method formats the whole web page, the new method just accesses the source code and therein lies the improvement.

I now use the new method for interacting with web pages because there was a problem with IE10 not completing the page formatting on occasions.

Have a good weekend.
 
Upvote 0
HI Mike i need the similar code for search of postal code to address in Singapore search, could you help ?


Private Sub Worksheet_Change(ByVal Target As Range)
Dim Tgtrw As Long
Tgtrw = Target.Row
If Target.Row = Range("A" & Tgtrw).Row And _
Target.Column = Range("A" & Tgtrw).Column Then

Dim IE As New InternetExplorer
'IE.Visible = True
IE.navigate "gothere.sg:" & Range("A" & Tgtrw).Value
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim sDD As String
sDD = Trim(Doc.getElementById("panel").getElementsByClassName("place")(0).getElementsByClassName("locf")(0).innerText)
IE.Quit
Dim aDD As Variant
aDD = Split(sDD, " ")
Range("B" & Tgtrw).Value = aDD(0)
Range("C" & Tgtrw).Value = aDD(1) & Space(2) + aDD(2) & Space(2) + aDD(3) & Space(2) + aDD(4)
End If
End Sub
This is what I am looking for : 1. In Excel column A: I key in postal code and hit enter 2. once I enter the excel search the web page(gothere.sg) and return address like Column B with Blk no Column C with street address(Ignore Singapore and postal code returned)
Web Page output : 101 Serangoon North Avenue 1 Singapore 550101
please check the above url for html coding
Example : postal code : 550101 will return 101 Serangoon North Avenue 1 Singapore 550101 this should be printed in excel as Column B: 101 Column C: Serangoon North Avenue 1 and dont want to print the singapore 550101. help is much appreciate, I tried to fine tune with no luck.Thank you
 
Upvote 0
Hi

Well done you almost got it spot on.

Here is the code to complete it :-
Code:
 IE.Navigate "gothere.sg"
 Do: DoEvents: Loop Until IE.ReadyState = READYSTATE_COMPLETE
 Set Doc = IE.Document
 Doc.getElementById("q").Value = Range("A" & Tgtrw).Value
 Doc.getElementById("ss").Click
 sDD = Trim(Doc.getElementById("panel").getElementsByClassName("place")(0).getElementsByClassName("locf")(0).innerText)
 IE.Quit
 
 aDD = Split(sDD, " ")
 Range("B" & Tgtrw).Value = aDD(0)
 Range("C" & Tgtrw).Value = aDD(1) & Space(2) + aDD(2) & Space(2) + aDD(3) & Space(2) + aDD(4)

ie replace the code from IE.Navigate to the end with the above.

hth
 
Upvote 0
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Posta Code
[/TD]
[TD]Blk
[/TD]
[TD]Address
[/TD]
[/TR]
[TR]
[TD]550101
[/TD]
[TD]101
[/TD]
[TD]Serangoon Noth Ave 1
[/TD]
[/TR]
[TR]
[TD]760329
[/TD]
[TD]329
[/TD]
[TD]Yishun Ring Road
[/TD]
[/TR]
[TR]
[TD]530952
[/TD]
[TD]952
[/TD]
[TD]Hougang Avenue 9
[/TD]
[/TR]
</tbody>[/TABLE]
Thanks for your reply, i have tried it as advise, but still i am stuck it didnt worked, i actually dont want from Singapore 550101 to be imported to excel. I want the output as above.


Private Sub Worksheet_Change(ByVal Target As Range)
Dim Tgtrw As Long
Tgtrw = Target.Row
If Target.Row = Range("A" & Tgtrw).Row And _
Target.Column = Range("A" & Tgtrw).Column Then

Dim IE As New InternetExplorer
'IE.Visible = True
IE.navigate "gothere.sg/maps#q:"
'& Range("A" & Tgtrw).Value
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
'Dim Doc As HTMLDocument
Set Doc = IE.document
' Dim sDD As String
Doc.getElementById("q").Value = Range("A" & Tgtrw).Value
Doc.getElementById("ss").Click
sDD = Trim(Doc.getElementById("panel").getElementsByClassName("place")(0).getElementsByClassName("locf")(0).innerText)
IE.Quit
' Dim aDD As Variant
aDD = Split(sDD, " ")
Range("B" & Tgtrw).Value = aDD(0)
Range("C" & Tgtrw).Value = aDD(1) & Space(2) + aDD(2) & Space(2) + aDD(3) & Space(2) + aDD(4)
End If
End Sub
 
Upvote 0
Hi

Excuse me the Postal code you supplied produced the required results and you didn't supply any alternative codes.

Your original code did not work and I supplied you with code that worked.

Change the following line :-
Code:
 Range("C" & Tgtrw).Value = aDD(1) & Space(2) + aDD(2) & Space(2) + aDD(3) & Space(2) + aDD(4)

to

 Range("C" & Tgtrw).Value = aDD(1) & Space(2) + aDD(2) & Space(2) + aDD(3) & IIf(aDD(4) <> "Singapore", Space(2) + aDD(4), "")

Btw, if the spacing of the address parts is not mandatory then you could get by with the following :-
Code:
 sDD = Trim(Application.WorksheetFunction.Substitute(Doc.getElementById("panel").getElementsByClassName("place")(0).getElementsByClassName("locf")(0).innerText, "Singapore " & Range("A" & Tgtrw).Value, ""))
 
 Range("B" & Tgtrw).Value = Left(sDD, Application.WorksheetFunction.Find(" ", sDD) - 1)
 Range("C" & Tgtrw).Value = Right(sDD, Len(sDD) - Application.WorksheetFunction.Find(" ", sDD))

hth
 
Upvote 0

Forum statistics

Threads
1,223,236
Messages
6,170,915
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top