Scraping Website Elements

jalexmiller0

New Member
Joined
Nov 10, 2017
Messages
8
I received some incredible help from here the first time I posted and I am back with another question related to web scraping.

I mainly just need the status result and delivery/projected delivery date. I have figured out a way to loop through these results however some of the load numbers aren't traceable. So if I just exported the results into excel not every result would match up with the correct load number.

I have some code, below, that I commented out. Where i was working on trying to find the corresponding load numbers on the results page. The numbers are stored in a list under the reference numbers section under CRNs. I was trying to figure out a way to check if the load numbers are in the results and if not then leave those cells blank in excel or put an error message saying they couldn't be tracked. But there are several other numbers stored in the list and I am struggling to find just the load number. Our numbers always have two letters to start then 5 digits after them. See examples below.


Here is the HTML for the list:

HTML:
                                                                                    LB86215, 1118701-SU, 1144844-SU-B, 1144846-SU, 1144848-SU, 1144935-SU,                                                                                     LB86215, 1118701-SU, 1144844-SU-B, 1144846-SU, 1144848-SU, 1144935-SU, 141069352                                                                                                                                                • LB86215                                                                    • 1118701-SU                                                                    • 1144844-SU-B                                                                    • 1144846-SU                                                                    • 1144848-SU                                                                    • 1144935-SU                                                                    • 141069352



Here are some example load numbers that I have been working with:
[TABLE="width: 64"]
<tbody>[TR]
[TD="width: 64"]LB86215[/TD]
[/TR]
[TR]
[TD]LA30394[/TD]
[/TR]
[TR]
[TD]K476310[/TD]
[/TR]
[TR]
[TD]K948859[/TD]
[/TR]
[TR]
[TD]K749405[/TD]
[/TR]
[TR]
[TD]K862112[/TD]
[/TR]
[TR]
[TD]K891456[/TD]
[/TR]
[TR]
[TD]K853033[/TD]
[/TR]
[TR]
[TD]K853780[/TD]
[/TR]
[TR]
[TD]LA14638[/TD]
[/TR]
[TR]
[TD]LA14643[/TD]
[/TR]
[TR]
[TD]LA14657[/TD]
[/TR]
[TR]
[TD]LA07291[/TD]
[/TR]
[TR]
[TD]LA26198[/TD]
[/TR]
[TR]
[TD]K995816[/TD]
[/TR]
[TR]
[TD]K853040[/TD]
[/TR]
[TR]
[TD]K978092[/TD]
[/TR]
[TR]
[TD]K855130[/TD]
[/TR]
</tbody>[/TABLE]

Here is my full code so far:

Code:
Public Sub ABF_Track_Load_Numbers()    
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim URL As String
    Dim loadNumberCells As Range
    Dim traceNumbers As String
    Dim results As HTMLDivElement
'    Dim StatusDiv As HTMLDivElement
    Dim Status As IHTMLElement
    Dim StatusInfo As IHTMLElement
    Dim Delivery As IHTMLElement
    Dim ProjDelivery As IHTMLElement
    Dim infoTables As IHTMLElementCollection
    Dim i As Long
    Dim result As String
    Dim test As HTMLDivElement
    Dim testtext As String
    Dim loadNumber As String
    Dim loadNumberLong As IHTMLElement
    Dim loadNumberRaw As String
    Dim loadNumberRow As Variant
    Dim divs, div
    Dim ele As Object
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
        
    
    'Get pickup numbers from column A in 1st sheet starting at A2
    
    With Worksheets(1)
        Set loadNumberCells = .Range("A2", .Cells(Rows.Count, "A").End(xlUp))
    End With
    loadNumberCells.Offset(, 1).ClearContents
    traceNumbers = Join(Application.Transpose(loadNumberCells), vbNewLine)
    
    URL = "https://arcb.com/tools/tracking.html"


    'Get existing IE window open at page, if any
    
    'Set IE = Get_IE_Window2(URL)
    If IE Is Nothing Then Set IE = New SHDocVw.InternetExplorer
    
    With IE
        .Visible = True
        'SetForegroundWindow .Hwnd
        .navigate URL
        While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend
        Set HTMLdoc = .Document
    End With
    
    Application.Wait (Now + TimeValue("0:00:03"))
    
    Set evt = IE.Document.createEvent("keyboardevent")
    evt.initEvent "change", True, False
    
    With HTMLdoc
        'Put trace numbers in input box and click Trace
        .all("trackingNumbers").Value = traceNumbers
        .all("trackingNumbers").dispatchEvent evt
        .getElementsByClassName("btn btn-primary")(0).Click
        While .ReadyState <> "complete": DoEvents: Wend
    End With
    
    'Waits for tracking to finish loading
    Application.Wait (Now + TimeValue("0:00:06"))
    
    IE.Visible = True
    
    'Extract results
    
'    Set results = HTMLdoc.getElementById("results")
'    Set divs = HTMLdoc.getElementsByClassName("vm.expandCRNs")
'    Debug.Print HTMLdoc.getElementsByClassName("vm.expandCRNs")
'    Set div = HTMLdoc.getElementsByTagName("div")
'    For Each ele In HTMLdoc.getElementsByClassName("vm.expandCRNs")
'        If InStr(ele.class, "crn in vm.shipment.referenceNumbers.customerReferenceNumbers track by $index") > 0 Then
'            Debug.Print ele.innerText
'        End If
'    Next ele
'    For Each div In divs
'        Debug.Print "*************************************"
'        Debug.Print div.ChildNodes(0).toString
'        Debug.Print div.getElementsByTagName("span")(0).innerText
'        Debug.Print div.getElementsByTagName("span")(1).innerText
'        '  etc...
'    Next div
'    i = 0


'    For Each loadNumberLong In HTMLdoc.getElementsByTagName("span")
'        If loadNumberLong.className = "modal-text-together" Then
'            loadNumberRaw = loadNumberLong.innerText
'                If InStr(1, loadNumberRaw, "SU", vbTextCompare) <> 0 Then
'                    GoTo GoHere
'                ElseIf InStr(1, loadNumberRaw, "SU", vbTextCompare) = 0 Then
'                    loadNumber = Mid(loadNumberRaw, 3, 8)
'                    Debug.Print loadNumber
'
'                End If
'        End If
'GoHere:
'    Next loadNumberLong
  
'Error message the appears at the top of the results when a load number isn't traceable 
    For Each test In HTMLdoc.getElementsByClassName("alert alert-warning ng-scope")
        If test.innerText <> "" Then
            Debug.Print test.innerText
        End If
    Next test
    


'Gets status info        
    For Each Status In HTMLdoc.getElementsByTagName("span")
        If Status.className = "status-code ng-binding" Then
            If Status.innerText = "" Then
                Debug.Print "Check status info online"
            Else: Debug.Print Status.innerText
            End If
        End If
    Next Status
    
'Gets delivery info 
    For Each Delivery In HTMLdoc.getElementsByTagName("div")
        If Delivery.className = "col-xs-4 delivery-node-description" Then
            Debug.Print Delivery.innerText
        End If
    Next Delivery


    
        'Find this pro number in Excel cells and if found put result in adjacent cell
        
        loadNumberRow = Application.Match(loadNumber, loadNumberCells, 0)
        If Not IsError(loadNumberRow) Then
            loadNumberCells(loadNumberRow, 2).Value = Status
        Else
            MsgBox "load number " & loadNumber & " in results not found in cells " & loadNumberCells.Address
        End If
'
    
    
End Sub
 

Excel Facts

How can you automate Excel?
Press Alt+F11 from Windows Excel to open the Visual Basic for Applications (VBA) editor.

Forum statistics

Threads
1,224,823
Messages
6,181,176
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top