Hello all!
I posted a couple of questions about an ongoing project a while back. With the help of people here, and some modifications to the code, I now have code that signs in to the website specified in a certain cell, using input boxes for the username and password. The code skips the sign-in process if the user is already logged in and goes straight to the data-scraping part. It then uses the TR and TD tags to copy all the data from the HTML table into Excel. After that it does some error checking, some formatting, and various other things, and then schedules the same thing to happen again every ten minutes. (I will trim the code down to only the troublesome part, however.)
I have one main problem now. In the HTML table, some of the rows have an image in one column (always the same column, the 2nd column, which appears in column C in Excel because column A contains only the table name of the scraped table). I found where it is in the HTML and will paste one row of the table below. (It is heavily edited to protect proprietary stuff, but the row with the image is mostly exact from the source, with a few edits that I kept track of so I can change the wording back.) The HTML looks like this:
EDIT: I couldn't post the exact HTML because it got garbled on the website, so I took out the < and </ on a bunch of the tags.
Code:
TR>
TR STUFF
TD STUFF>
TD>
THIS IS THE IMPORTANT PART RIGHT HERE:
TD colspan=1 align="RIGHT" class="USUAL CLASS" OTHER STUFF HERE DON'T THINK WILL BE NEEDED>
FONT Face='Arial,Verdana,Helvetica' Size=1 COLOR="BLACK"> img src= '../images/Icons/S.gif' alt='This indicates STATUS' width=10 align=middle> Font color='#FONTCOLOR'>GFont>/FONT>
TD>
TD STUFF">
<STUFF>
/TD>
TD STUFF
/TD>
TDSTUFF</TD>
TD STUFF/TD>
THERE ARE LIKE 20-30 MORE TD AND /TD TAGS THEN:
TR>
THEN THE NEXT ROW STARTS, SOME HAVE THAT IMAGE THING AND SOME DON'T
I can see where this appears in the HTML, but I CANNOT get this image to appear when running my code. I will post only the part that actually scrapes the data below.
Code:
Set ws = ThisWorkbook.Worksheets("WORKSHEET NAME")
For Each tbl In doc.getElementsByTagName("TABLE")
    tabno = tabno + 1
    nextrow = nextrow + 1
    Set rng = ws.Range("B" & nextrow)
    'column A gets the table name
    rng.Offset(, -1) = "Table " & tabno
    For Each rw In tbl.Rows
        For Each cl In rw.Cells
            'copy the cell text, then move one column to the right
            rng.Value = cl.outerText
            Set rng = rng.Offset(, 1)
            I = I + 1
        Next cl
        nextrow = nextrow + 1
        'move down one row and back to the starting column
        Set rng = rng.Offset(1, -I)
        I = 0
        'below I am trying to insert a status bar update for every new row
        Application.StatusBar = "Approx. " & nextrow / 5.3 & "% complete."
    Next rw
Next tbl
With this information (the HTML sample row and the current code), would anybody be able to point me in the right direction so that this code ALSO either puts the image into Excel OR puts some kind of marker in that cell (an S, maybe)? I see that the HTML says img src= '../images/Icons/S.gif' alt='This indicates STATUS'. What does the alt= part mean? Could the code simply pull the alt text, so that wherever this image appears the cell would say 'This indicates STATUS'?
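For reference, alt is the standard HTML attribute that holds the alternative text shown when an image cannot be displayed. One possible direction, as an untested sketch only: look for an img element inside each cell and put its alt text in front of the cell text. The function name CellTextWithAlt is made up for illustration, and the sketch assumes cl is the same kind of table cell object used in the For Each cl In rw.Cells loop above.

```vba
' Sketch only (hypothetical helper): returns the text to write for one table cell.
' Assumes cl is an HTML table cell object, as in the For Each cl In rw.Cells loop.
Function CellTextWithAlt(ByVal cl As Object) As String
    Dim imgs As Object
    Dim txt As String

    txt = cl.outerText
    ' Collect any <img> elements nested inside this cell
    Set imgs = cl.getElementsByTagName("img")
    If imgs.Length > 0 Then
        ' Put the first image's alt text in front of the cell text
        txt = Trim$(imgs.Item(0).alt & " " & txt)
    End If
    CellTextWithAlt = txt
End Function
```

If something like this works, the line rng.Value = cl.outerText in the loop would become rng.Value = CellTextWithAlt(cl), and cells with the image would start with 'This indicates STATUS'.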
Bonus points if anybody can come up with a better way to update the status bar. (I am currently just guessing that there will be a bit over 500 rows, so I divide the row count by a static 5.3 to get an approximate percent complete.)
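One idea for the status bar, again only as an untested sketch: instead of dividing by a guessed constant, read the real row count from the table with tbl.Rows.Length and compute the percentage from that. The procedure name ShowRowProgress and the rowsDone counter are made up for illustration; rowsDone would have to be a counter that restarts at zero for each table.

```vba
' Sketch only (hypothetical helper): status bar text based on the table's real row count.
' Assumes tbl is an HTML table object, as in the For Each tbl loop,
' and rowsDone counts the rows already copied from that table.
Sub ShowRowProgress(ByVal tbl As Object, ByVal rowsDone As Long)
    Dim totalRows As Long

    totalRows = tbl.Rows.Length
    ' Guard against an empty table to avoid dividing by zero
    If totalRows > 0 Then
        Application.StatusBar = "Row " & rowsDone & " of " & totalRows & _
            " (" & Format(rowsDone / totalRows, "0%") & " complete)"
    End If
End Sub
```

It would be called just before Next rw, and setting Application.StatusBar = False after the last table hands the status bar back to Excel.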
Thanks for any and all assistance!