VBA Web scraper

ThomasA83

Board Regular
Joined
Mar 10, 2009
Messages
95
Hi,

I have been reading through many VBA web scraper posts, however I still can't figure out to get my code to work, so hope you can help me.

The solution I am after is as follow:

1. Go to MSC: Global Container Shipping Company
2. Insert Value from column A2 in the search box (a. in image illustration)
3. Hit search to get the search result
4. If not search result, then continue to step 6.
5. Return 2 values from the search result (b. & c. in the image illustration),
6. Continue procedure from step 2, but with next row (A3), until end of active value in column A

I have managed to get it to work until step 3, but can't get it to work with step 4-6..

Attached is the current Excel file, including the VBA code and the image illustration: WeTransfer Download-link

VBA Code:
Sub Search()
 
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim result As String 'string variable that will hold our result link
 
    'initiating a new instance of Internet Explorer and asigning it to objIE
    Set objIE = New InternetExplorer
 
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
 
    'navigate IE to this web page (a pretty neat search engine really)
    objIE.navigate "https://www.msc.com/track-a-shipment"
 
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'in the search box put cell "A2" value
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_txtBolSearch_TextField").Value = _
      Sheets("Sheet1").Range("A2").Value
 
    'click the 'search' button
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_hlkSearch").Click
 
    'wait again for the browser
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'define the Bill of Lading value from the website
    a = objIE.document.getElementsByClassName("bolToggle").Item(0).innerText
    
    Sheets("Sheet1").Range("B2") = a
   
    objIE.Quit
 
'exit our Search subroutine
End Sub

Thanks.
 

Excel Facts

Remove leading & trailing spaces
Save as CSV to remove all leading and trailing spaces. It is faster than using TRIM().
Hi. I've had a look at your code and at the workbook you uploaded. First, can I say - thank you for putting so much effort into explaining what it is your looking for and what you've done. The detail you went into is appreciated.

So I've managed to get the answer, in part. The code keeps breaking for me at point 1 - this is because I am constantly being asked to identify which area of the world I'm currently located. This could be because I use VPN - who knows, but I assume that if you make it past that, then you don't encounter the same problem.

Your code is pretty much on track, but I propose you replace the a = line with the following
VBA Code:
    tmpResult = objIE.document.getElementsByTagName("h3")(0).innerText
    If InStr(tmpresult, "No matching tracking") > 0 Then
        Sheets("Sheet1").Range("B2") = "No tracking information"
    Else
        a = Split(Split(tmpresult, ": ")(1), " ")(0)
        Sheets("Sheet1").Range("B2") = a
    End If
Basically, the first line extracts the Bill of Lading text from the H3 tag, and then it extracts the BOL number from that text. It worked for me, save for the second and third last numbers, for which there is apparently no tracking information.

Let me know if it works for you.

D
 
Upvote 0
The code keeps breaking for me at point 1 - this is because I am constantly being asked to identify which area of the world I'm currently located.
Hi, same issue without any VPN …​
 
Upvote 0
Thanks, @Marc L . I suspected that might be the case.
@ThomasA83 - does the code work for you? If you're not being prompted for your location each time you log in *you say that you get as far as clicking the search button), I'm inclined to leave it alone - provided the rest of the code works...
 
Upvote 0
Hi Dan,
Thanks for your answer and strange that the first part dont work for you?

I tried inserting your code, so it look like below. but it debug at "a = Split(Split(tmpresult, ": ")(1), " ")(0)". Any idea why?

VBA Code:
Sub Search()
 
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim result As String 'string variable that will hold our result link
 
    'initiating a new instance of Internet Explorer and asigning it to objIE
    Set objIE = New InternetExplorer
 
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
 
    'navigate IE to this web page (a pretty neat search engine really)
    objIE.navigate "https://www.msc.com/track-a-shipment"
 
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'in the search box put cell "A2" value
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_txtBolSearch_TextField").Value = _
      Sheets("Sheet1").Range("A2").Value
 
    'click the 'search' button
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_hlkSearch").Click
 
    'wait again for the browser
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'define the Bill of Lading value from the website
    
    tmpresult = objIE.document.getElementsByTagName("h3")(0).innerText
    
    If InStr(tmpresult, "No matching tracking") > 0 Then
        Sheets("Sheet1").Range("B2") = "No tracking information"
    Else
        a = Split(Split(tmpresult, ": ")(1), " ")(0)
        Sheets("Sheet1").Range("B2") = a
    End If
   
    objIE.Quit
 
'exit our Search subroutine
End Sub
 
Upvote 0
I'm very sorry - I didn't see a notification to say that you had responded. I suspect that it's coming up with an error there because it isn't finding a : in the string, which is odd because when I ran the code it worked each time. I wonder if you're seeing a different version of the site than the rest of us perhaps. Anyway, it isn't crtical, you can just delete that line if you like and on the next line, replace a with tmpresult.
 
Upvote 0
I'm very sorry - I didn't see a notification to say that you had responded. I suspect that it's coming up with an error there because it isn't finding a : in the string, which is odd because when I ran the code it worked each time. I wonder if you're seeing a different version of the site than the rest of us perhaps. Anyway, it isn't crtical, you can just delete that line if you like and on the next line, replace a with tmpresult.
Hi Dan,
No worries and thanks for your reply.
I have tried many things since, but cant get it to work.
I think it might be something where the website identify you country and then show a slightly different page based on that, and thats why it is hard to receive a solution for you and others :(
The mainpage is www.msc.com and then it ask you for your region, where I select Europe, and on the next page I click Track a Shipment, and then I am on the page where I do the Search and would like to retrieve data from.
When I use your code (adjusted below) and remove the split part, then it returns a value to B2, however the value retrieved is "Get in touch", since that is the first "H2" tag in the page I see :(

Any idea how I can move on, below is the inspect element from IE that I see:
1622705838338.png

VBA Code:
Sub Search()
 
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim result As String 'string variable that will hold our result link
 
    'initiating a new instance of Internet Explorer and asigning it to objIE
    Set objIE = New InternetExplorer
 
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
 
    'navigate IE to this web page (a pretty neat search engine really)
    objIE.navigate "https://www.msc.com/track-a-shipment"
 
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'in the search box put cell "A2" value
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_txtBolSearch_TextField").Value = _
      Sheets("Sheet1").Range("A2").Value
 
    'click the 'search' button
    objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_hlkSearch").Click
 
    'wait again for the browser
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
 
    'define the Bill of Lading value from the website
    
    tmpresult = objIE.document.getElementsByTagName("h3")(0).innerText
    
    If InStr(tmpresult, "No matching tracking") > 0 Then
        Sheets("Sheet1").Range("B2") = "No tracking information"
    Else
        Sheets("Sheet1").Range("B2") = tmpresult
    End If
   
    objIE.Quit
 
'exit our Search subroutine
End Sub
 
Upvote 0
Hi. So I've made some adjustments to the script and got this:
Booking NoBill of ladingDischarge Date
11220779801MEDUHO74029226/11/2019
038CHS1566677MEDUKB79455102/09/2020
038CHS1590739MEDUUB34278728/06/2020
038CHS1595748MEDUUA40420810/09/2020
046JHL2004320MEDUJ131831313/05/2020
048BC0210053MEDUBC93158210/06/2020
048BC0216910MEDUBC99872212/11/2020
06-10-01473-01No matching tracking informationN/A
06-10-01546-01No matching tracking informationN/A
06-10-01602-01No matching tracking informationN/A
11220796735MEDUH186918729/02/2020
 
Upvote 0
This is the script I used:
VBA Code:
Sub Search()
 
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim result As String 'string variable that will hold our result link
    
    Dim BOL As Object
    Dim objContainer As Object
    
    Dim DischargedStatus As String, BillOfLading As String
    
    'initiating a new instance of Internet Explorer and asigning it to objIE
    Set objIE = New InternetExplorer
 
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
 
    'navigate IE to this web page (a pretty neat search engine really)
    objIE.navigate "https://www.msc.com/track-a-shipment"
 
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    
    ' Selects United Kingdom
    objIE.document.getElementsByClassName("flags gbr")(0).Click
    PAUSE 0.5
    
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    
    ' Rejects newsletter option
    objIE.document.getElementsByClassName("button button-secondary expand js-reject")(0).Click
    PAUSE 0.5
    
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    
    ' Accepts all cookies
    objIE.document.getElementsByClassName("optanon-allow-all accept-cookies-button")(0).Click
    PAUSE 0.5
    
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    
    For i = 2 To 12
    
        ' Put cell "A2" value in the search box
        objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_txtBolSearch_TextField").Value = _
        Sheets("Sheet1").Range("A" & i).Value
    
        ' Click the 'search' button
        objIE.document.getElementById("ctl00_ctl00_plcMain_plcMain_TrackSearch_hlkSearch").Click
 
        ' Wait for the browser
        Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
        
        PAUSE 0.5
        
        Set res = objIE.document.getElementsByClassName("copyPanel")
        
        If InStr(res(0).innerText, "No matching tracking information") = 0 Then
        
            Set objContainer = objIE.document.getElementsByClassName("accordion trackingView ")(0)
            Set BOL = objContainer.getElementsByClassName("bolToggle")(0)
            
            BillOfLading = Split(Split(BOL.innerText, ": ")(1), " (")(0)
          
            Set ResultTable = objIE.document.getElementsByClassName("resultTable")(1)
          
            For Each tr In ResultTable.getElementsByTagName("tbody")(0).Rows
                
                If Trim(tr.Cells(1).innerText) = "Discharged" Then
                    DischargedStatus = Trim(tr.Cells(2).innerText)
                    Exit For
                End If
              
                PAUSE 0.5
            
            Next
        Else
            BillOfLading = "No matching tracking information"
            DischargedStatus = "N/A"
        End If
        
        Sheets("Sheet1").Range("B" & i) = BillOfLading
        Sheets("Sheet1").Range("C" & i) = DischargedStatus
    
    Next
    
    objIE.Quit
 
 End Sub
Sub PAUSE(Period As Single)
    Dim T As Single
    T = Timer
    Do
         DoEvents
    Loop Until T + Period < Timer
 End Sub
 
Upvote 0
Solution
Sorry - I had meant to send this part as welll:
I added a few extra lines at the beginning to deal with the country selection and newsletter/cookies pop-ups, so you may or may not want to change these. Otherwise the other adjustments are to either extract the desired info, or to deal with the possibility that there is no available tracking information. I found that IE tends to move a bit too fast for this site, so I would suggest that you use the PAUSE script that I included above if encounter any errors. It's usually the case that IE hasn't found a given HTML element because that element has yet to load.

Let me know how it goes. Fingers crossed.
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,178
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top