Hello, I want to write a tool that will help me scrape a webpage.
I would like to step through EVERY element on a page and at least debug print it or write it to a WS. The processing for the debug print or the WS is not important, and I can do that myself.
I am using MSXML2.XMLHTTP.
I know how to get the DIV elements and then step through each DIV and similar, but I do not know how to get all of the children, of every type, starting from the main element.
Basically: how do you get all of the divs, then within each div process all of the tables, all of the spans, all of the classes, etc., and then drill down through each of those recursively?
I know how to get the innertext, innerhtml, outertext, outerhtml, ID and all of that, so I do not need those details (not that it would hurt).
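To make the recursive drill-down concrete, here is a rough sketch of the shape I mean. It is written in Python rather than VBA, using only the standard-library `html.parser`, and the names `Node`, `TreeBuilder`, `walk`, and `describe` are made up for illustration. The idea is simply: visit an element, then recurse into each of its children whatever their tag. I assume the same pattern would carry over to a VBA routine that loops over each element's child collection once the response text is loaded into an HTML document object, but that part is untested.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not stay open on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """Hypothetical minimal element: a tag, its attributes, and its children."""
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = dict(attrs)
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects from HTML text."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: add it but never leave it open.
        self.stack[-1].children.append(Node(tag, attrs))

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray closing tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def walk(node, depth=0):
    """Yield (depth, node) for this node and, recursively, every descendant."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def describe(html):
    """Return (depth, tag, id, class) for every element, in document order."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [(depth, node.tag, node.attrs.get("id"), node.attrs.get("class"))
            for depth, node in walk(builder.root)
            if node is not builder.root]


if __name__ == "__main__":
    sample = '<div id="main"><table><tr><td><span class="x">hi</span></td></tr></table></div>'
    for depth, tag, elem_id, elem_class in describe(sample):
        print("  " * depth + tag, elem_id, elem_class)
```

The point is that `walk` never asks for a particular tag name. It takes whatever children an element has, so tables, spans, and anything else nested in a div are all reached by the same loop.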
It should not really matter which page I use for what I am looking for, but here is one of the pages: Acorn Nut Fasteners - Luke Rivets for Handles - (.186HD x 4-40US x .9L) - Stainless Steel | KnifeKits.com
I am very surprised I have not found samples of how to do this.
I don't think this would be hard, but I am just not quite sure how.
Thanks
Mc