All,
I am trying to scrape some data from a password protected website. A quick search provided some code to login to the website. After the login, I am trying to grab some specific information; however, all of my attempts are coming up short. I was hoping to get some help understanding the HTML. Some research tells me that the element I am trying to access is buried in nested tables and forms. I am having trouble understanding how to access the elements. Using internet explorer, I used F12 to access the HTML code and then Ctrl + B to inspect the element. This provided me with what I thought would be enough but I am not getting anywhere.
The dropdown I am trying to access is shown below. When I hover over the tabs, several show ID names an several show class names. Others such as 'table' and 'tr' are just labeled as elements with no class or ID.
My assumption was I could get through the nesting to the bottom with something like:
It doesn't seem to be working. Thanks in advanced for the help. Please let me know if any other information is required!
Reference: How to Use Excel VBA to Log-In to a Website -
I am trying to scrape some data from a password protected website. A quick search provided some code to login to the website. After the login, I am trying to grab some specific information; however, all of my attempts are coming up short. I was hoping to get some help understanding the HTML. Some research tells me that the element I am trying to access is buried in nested tables and forms. I am having trouble understanding how to access the elements. Using internet explorer, I used F12 to access the HTML code and then Ctrl + B to inspect the element. This provided me with what I thought would be enough but I am not getting anywhere.
The dropdown I am trying to access is shown below. When I hover over the tabs, several show ID names an several show class names. Others such as 'table' and 'tr' are just labeled as elements with no class or ID.
My assumption was I could get through the nesting to the bottom with something like:
VBA Code:
Set eleBdy = HTMLDoc.getElementById("Bdy")
Set eledataView = eleBdy.getElementById("dataView")
Set cHeader = eledataView.getElementsByClassName("cheader")
It doesn't seem to be working. Thanks in advanced for the help. Please let me know if any other information is required!
Reference: How to Use Excel VBA to Log-In to a Website -
VBA Code:
Dim HTMLDoc As HTMLDocument
Dim oBrowser As InternetExplorer
Sub Website_Login_Test()
Dim oHTML_Element As IHTMLElement
Dim sURL As String
On Error GoTo Err_Clear
sURL ="[URL='https://www.examplewebsite.co.uk/login.aspx'][U]https://www.examplewebsite.co.uk/login.aspx[/U][/URL]"
Set oBrowser = New InternetExplorer
oBrowser.Silent = True
oBrowser.timeout = 60
oBrowser.navigate sURL
oBrowser.Visible = True
Do
' Wait till the Browser is loaded
Loop Until oBrowser.readyState = READYSTATE_COMPLETE
Set HTMLDoc = oBrowser.Document
HTMLDoc.all.loginID.Value = "exampleusername"
HTMLDoc.all.Password.Value = "examplepassword"
For Each oHTML_Element In HTMLDoc.getElementsByTagName("input")
If oHTML_Element.Type = "submit"
_Then oHTML_Element.Click: Exit For
Next
Set eleBdy = HTMLDoc.getElementById("Bdy")
Set eledataView = eleBdy.getElementById("dataView")
Set cHeader = eledataView.getElementsByClassName("cheader")
' oBrowser.Refresh ' Refresh If Needed
Err_Clear:
If Err <> 0 Then
Err.Clear
Resume Next
End If
End Sub
HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css">
<META HTTP-EQUIV="X-UA-Compatible" CONTENT="IE=edge">
<META NAME="GENERATOR" CONTENT="IBM Software Development Platform">
<META NAME="keywords" CONTENT="BMS2, PA Department of Transportation, Registration, Doing Business">
<META NAME="description" CONTENT="Bridge Management System">
<LINK rel="stylesheet" type="text/css" href="css/body.css">
<LINK rel="stylesheet" type="text/css" href="css/calendar.css">
<LINK rel="stylesheet" type="text/css" href="css/menu.css">
<LINK rel="stylesheet" type="text/css" href="css/banner.css">
<LINK rel="stylesheet" type="text/css" href="css/toolbar.css">
<LINK rel="stylesheet" type="text/css" href="css/bms.css">
<LINK rel="stylesheet" type="text/css" href="https://fonts.googleapis.com/css?family=Montserrat:400,500,900">
<link rel="stylesheet" type="text/css" href="https://fonts.googleapis.com/css?family=Roboto+Mono">
<script language=JavaScript src="javascript/common.js"></script>
<script language=JavaScript src="javascript/buttonhandler.js"></script>
<script language=JavaScript src="javascript/CascadeMenu.js"></script>
<script language=JavaScript src="javascript/ECMSCommon.js"></script>
<script language=JavaScript src="javascript/validation.js"></script>
<script language=JavaScript src="javascript/BMSCommon.js"></script>
<script language=JavaScript src="javascript/ajax.js"></script>
<script language=JavaScript src="javascript/toolbar.js"></script>
<TITLE>Structure Home</TITLE>
<script src="javascript/jquery-1.6.4.min.js"></script>
<script src="javascript/jquery-ui.min-1.8.16.js"></script>
<script src="javascript/json2.js"></script>
<script src="javascript/jquery.infoDialog.js"></script>
<script src="javascript/jquery.structureHome.js" charset="UTF-8"></script>
<link rel="stylesheet" href="css/jquery-ui-1.8.16.css" type="text/css">
</HEAD>
<BODY ID="Bdy" OnLoad="bodyOnLoad();" onResize="bodyOnResize();" leftmargin="0" rightmargin="0" topmargin="0" marginwidth="0" marginheight="0">
<!--
<table cellpadding="0" cellspacing="0" border="0" width="100%" background="images/bms2Header1.gif">
<tr height="61"><td> </td></tr>
</table>
-->
<style>
.dropdown {
position: absolute;
display: inline-block;
float: right;
padding-right: 25px;
top: 55px;
right: 5px;
}
.dropdown-content {
display: none;
position: absolute;
background-color: #f9f9f9;
min-width: 160px;
box-shadow: 0px 8px 16px 0px rgba(0,0,0,0.2);
padding: 12px 16px;
z-index: 1;
}
.dropdown:hover .dropdown-content {
display: block;
}
.dropdown-header {
color: #FFFFFF;
}
.dropdown-header:hover {
cursor: pointer;
}
.dropdown-section {
border-style: solid;
border-width: 0px 0px 1px 0px;
/* font-weight: 500; */
}
</style>
<table class="banner">
<colgroup>
<col WIDTH='30%' >
<col WIDTH='40%' >
<col WIDTH='30%' >
</colgroup>
<tr class="banner">
<td class="bannerleft">
<div class="penndotLink">
<a target="_blank" href="http://www.penndot.gov" class="banner" >
<img class="banner" alt="PennDOT" title="PennDOT" src="images/PennDOTColorLogo2.gif" height="81">
</a>
</div>
<div class="paLink">
<a target="_blank" href="http://www.pa.gov" class="bannerlink" >
<img class="banner" alt="Pennsylvania" title="Pennsylvania" src="images/pagov_logo_2.png" style="height:24px;">
</a>
</div>
<div class="homelink">
<a href="SVCOMLogin?action=init" class="bannerlink">
<img class="banner" alt="BMS Home" title="BMS Home" src="images/house_logo_2.png" style="width:25px;">
</a>
</div>
</td>
<td class = "bannercenter" align="center">
<div>
<span style="color: #0B2B6E;font-size: 50px; font-weight: 900;">BMS</span><span style="color: #FFCC33;font-size: 50px; font-weight: 900;">2</span>
</div>
<div class="bannerenvironmenttext">
</div>
</td>
<td class="bannerright">
<div style="position: absolute; top: 10px; right: 5px; float: right;">
<table style="width: 330px;">
<tr><td colspan="4" width="200px"><span style="color: #FFFFFF;font-size: 18px;">BRIDGE MANAGEMENT SYSTEM</span> <span style="color: #FFCC33;font-size: 30px; line-height: .8;">2</span></td></tr>
</table>
</div>
<div class="dropdown">
<span class="dropdown-header">Welcome Aaron Rovenolt</span>
<img class="banner" src="images/user_logo.png" style="width:20px; position:relative; bottom: -1px;">
<div class="dropdown-content">
<p class="dropdown-section">Account Actions</p>
<!-- <p><a href="#">iForms Password</a></p> -->
<p><a href="SVCOMLogin?action=logout">Logout</a>
<p><br></p>
<p class="dropdown-section">User Information</p>
<p>Userid: arovenolt</p>
<p>District: </p>
<p>Email: arovenolt@larsondesigngroup.com</p>
</div>
</div>
</td>
</tr>
</table>
<IFRAME id='button1_frame' class='button' src='javascript:false;' ></IFRAME>
<DIV ID='button1' class='button' >
<DIV ID='buttonItem1_2' class='buttonItem' title='Return to Structure Home' cmd="javascript:doPDButton("href","PDTagServlet?action=pop&value=2","")" onBefore="checkSelectionConfirmation("","","")" >
<A href='' ID='buttonItem1_2_Link' onclick='return false;' class='buttonItem' >Structure Home</A>
</DIV>
<DIV ID='buttonItem1_3' class='buttonItem' title='Return to Structure Home' cmd="javascript:doPDButton("href","PDTagServlet?action=pop&value=3","")" onBefore="checkSelectionConfirmation("","","")" >
<A href='' ID='buttonItem1_3_Link' onclick='return false;' class='buttonItem' >Structure Home</A>
</DIV>
</DIV>
<DIV ID='toolbarBar' class='toolbarBar' >
<DIV ID='buttonBar1' class='toolbarButton' menu='button1' dropdownId='Toolbar_Arrow_1' title='Back' cmd="javascript:doPDButton("href","PDTagServlet?action=pop&value=2","")" onBefore="checkSelectionConfirmation("","","")" >
<A href='' ID='buttonBar1_Link' onclick='return false;' class='toolbarButton' ><img src='images/i_back.gif' border=0 align='left'>Back <Span Id='Toolbar_Arrow_1' class='toolbarArrow'>6</Span> </A>
</DIV>
<DIV ID='buttonBar2' class='toolbarButton' title='Save' cmd="javascript:doPDButton("submit","pdform","")" onBefore="checkSelectionConfirmation("true","setAction('Save')","")" >
<A href='' ID='buttonBar2_Link' onclick='return false;' class='toolbarButton' ><img src='images/i_save.gif' align='left' > Save</A>
</DIV>
<DIV ID='buttonBar3' class='toolbarButton' title='Print' cmd="javascript:doPDButton("window","PDTagServlet?action=printerFriendly&jspName=WEB-INF/jsp/INVstructureHome.jsp","")" onBefore="checkSelectionConfirmation("","","")" >
<A href='' ID='buttonBar3_Link' onclick='return false;' class='toolbarButton' ><img src='images/i_printer.gif' border=0 align='left'></A>
</DIV>
<DIV ID='buttonBar4' class='toolbarButton' title='Click here to lauch screen help.' cmd="javascript:doPDButton("script","openScreenHelpWindow('WEB-INF/jsp/INVstructureHome.jsp')","")" onBefore="checkSelectionConfirmation("","","")" >
<A href='' ID='buttonBar4_Link' onclick='return false;' class='toolbarButton' ><img src='images/i_help.gif' border=0 align='left'>Help</A>
</DIV>
</DIV>
<SCRIPT> InitToolbar(); </SCRIPT>
<!-- Include file for displaying current ECMSMessages -->
<div id="dataView" style="width:100%; height:100%; overflow:auto;">
<script language=JavaScript src="javascript/calendarControl.js"></script>
<table><tr><td>
<script type="text/javascript">
function submitHeader(s){
document.pdformheader.action.value = s;
document.pdformheader.submit();
}
</script>
<style>
span.validation_counts {
color: white;
border-radius: 5px;
padding: 0px 6px 0px 6px;
border-color:white;
border-width: 2px;
}
span.validation_failures {
background: #A93226;
padding: 0px 6px 1px 6px;
}
span.zero_validation_failures {
color: #A93226;
padding: 0px 6px 1px 6px;
}
span.validation_errors {
background: #BA4A00;
}
span.zero_validation_errors {
color: #BA4A00;
}
span.validation_warnings {
background: #D4AC0D;
}
span.zero_validation_warnings {
color: #D4AC0D;
}
span.validation_infos {
background: #2E86C1;
}
span.zero_validation_infos {
color: #2E86C1;
}
</style>
<form name="pdformheader" action = "SVCOMScreenHeader">
<input type="hidden" name=action>
<TABLE class="title">
<colgroup>
<col width="100%" class="left" />
<col width="12%" class="center" />
</colgroup>
<TBODY>
<tr class="title">
<td class="title">
Inventory - Structure Home<input type=hidden name=HEADER_TITLE value='Inventory - Structure Home'>
</td>
<TD valign="bottom" bgcolor="#4791C5" nowrap>
<A class="qlinkshowhide" id="hqlinks"
href="javascript: toggleQlinks('hqlinks','qlinks')">
Hide Quick Links
</A>
</TD>
</tr>
</TBODY>
</TABLE>
<TABLE class="cheader" >
<colgroup>
<col width="100%" class="center" />
</colgroup>
<TBODY>
<TR>
<TD style="padding: 5px;">
<TABLE class="body">
<COLGROUP width="10%"/>
<COLGROUP width="18%"/>
<COLGROUP width="10%"/>
<COLGROUP width="10%"/>
<COLGROUP width="9%"/>
<COLGROUP width="16%"/>
<COLGROUP width="4%"/>
<COLGROUP width="23%"/>
<TBODY>
<TR>
<TD class="labelright" nowrap>5A01 SR ID:</TD>
<TD class="dataleft" nowrap>
<input type='text' name='HEADER_BRIDGE_ID' value='08018704200018' onBlur='return true;' onkeydown="return processEnterKey('findbysrid');" size='18' maxlength='14' class="PDTextbox" >
<input id="findbysrid" class="goButton" align="middle" type="image" alt="Find Structure by SR ID" title="Find Structure by SR ID"
src="images/go5.gif" name="sridgo" width="21" height="21"
onclick="submitHeader('SHOWBYSRID');"/>
</TD>
<TD class="labelright" nowrap>5A03 BRKEY:</TD>
<TD class="dataleft" nowrap>
<input type='text' name='HEADER_BRKEY' value='6105' onBlur='return true;' onkeydown="return processEnterKey('findbybrkey');" size='6' maxlength='6' class="PDTextbox" >
<input id="findbybrkey" class="goButton" align="middle" type="image" alt="Find Structure by BRKEY" title="Find Structure by BRKEY"
src="images/go5.gif" name="brkeygo" width="21" height="21"
onclick="submitHeader('SHOWBYBRKEY');"/>
</TD>
<TD class="labelright" nowrap>Agency ID:</TD>
<TD class="dataleft" nowrap>
<input type='text' name='HEADER_AGENCY_ID' value='' onBlur='return true;' onkeydown="return processEnterKey('findbyagid');" size='15' maxlength='15' class="PDTextbox" >
<input id="findbyagid" class="goButton" align="middle" type="image" alt="Find Structure by Agency ID" title="Find Structure by Agency ID"
src="images/go5.gif" name="agencyidgo" width="21" height="21"
onclick="submitHeader('SHOWBYAGENCYID');"/>
</TD>
<TD class="labelright" nowrap>Go To:</TD>
<TD class="dataleft" nowrap>
<select name=HEADER_CODE size=1 maxlength='30' class="PDDropdown" >
<option value='99' false>ADMIN CONFIGURATION</option>
<option value='52' false>Bulletin Board Message List</option>
<option value='98' false>Configuration - Element NBE</option>
<option value='82' false>Email Configuration</option>
<option value='83' false>Email Notifications</option>
<option value='78' false>FHWA Submittal</option>
<option value='79' false>Feature Manager</option>
<option value='5' false>Inspection - Agency Inspection</option>
<option value='10' false>Inspection - Comments</option>
<option value='32' false>Inspection - Fracture Critical</option>
<option value='3' false>Inspection - Load Rating</option>
<option value='59' false>Inspection - Misc</option>
<option value='2' false>Inspection - Ratings, Schedule</option>
<option value='63' false>Inspection - Safety Features</option>
<option value='62' false>Inspection - Signings</option>
<option value='57' false>Inspection - Signs and Lights</option>
<option value='15' false>Inspection - Underwater</option>
<option value='58' false>Inspection - Walls</option>
<option value='47' false>Inspection-Element List</option>
<option value='14' false>Inventory - Agency Bridge</option>
<option value='11' false>Inventory - BP Assignment</option>
<option value='7' false>Inventory - Design</option>
<option value='8' false>Inventory - Drawings Notes</option>
<option value='4' false>Inventory - Features Intersected</option>
<option value='6' false>Inventory - Inspection Planning</option>
<option value='55' false>Inventory - Misc</option>
<option value='64' false>Inventory - Paint</option>
<option value='43' false>Inventory - Posting</option>
<option value='61' false>Inventory - Risk Assessment</option>
<option value='56' false>Inventory - Signs and Lights</option>
<option value='84' false>Inventory - Structure Group</option>
<option value='1' selected true>Inventory - Structure Home</option>
<option value='9' false>Inventory - Structure Units</option>
<option value='20' false>Inventory - Tunnel</option>
<option value='54' false>Inventory - Walls</option>
<option value='66' false>Other - APRAS Data</option>
<option value='41' false>Other - Maintenance - Completed Maintenance</option>
<option value='42' false>Other - Maintenance - Proposed Maintenance</option>
<option value='53' false>Other - Reports</option>
<option value='27' false>Other - Structure EDMS Document List</option>
<option value='81' false>Other - Validate Structure</option>
<option value='75' false>Special Permission</option>
<option value='80' false>Validation Expressions</option>
</select>