DataBlake
Well-known Member
- Joined: Jan 26, 2015
- Messages: 781
- Office Version: 2016
- Platform: Windows
Hi all,
I have a rather complicated (to me) scrape to do, though I'm sure it would be simpler for someone experienced. I've never done any programming outside of local files, so I'm unsure about most things to do with web scraping, but I have been trying to research it. I know that I want to use XMLHTTP requests so that I don't actually open web pages, as I will be scraping my own eBay storefront of 70,000 listings. eBay doesn't have a way for you to download your own information without API access, so I have to scrape my store.
Luckily, eBay does have easy page navigation in the URL, so I imagine I would loop through the pages until "0 results shown in all categories" is displayed, which would indicate the end of the page navigation.
Example: Auto Addictions | eBay Stores
Each page has 48 listings (or fewer on the last page).
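Here is a rough sketch of that page loop, written in Python purely to illustrate the logic. `fetch_page` is a placeholder for whatever request code ends up being used, and `max_pages` is a safety cap I made up; neither comes from eBay.

```python
from typing import Callable, Iterator

# Text that, per the description above, marks a page past the last one.
END_MARKER = "0 results shown in all categories"


def iter_store_pages(
    fetch_page: Callable[[int], str], max_pages: int = 5000
) -> Iterator[str]:
    """Yield the HTML of each results page until the end marker shows up.

    fetch_page(n) is assumed to return the HTML of store page n as a string.
    max_pages is only a guard against looping forever if the marker changes.
    """
    for page_number in range(1, max_pages + 1):
        html = fetch_page(page_number)
        if END_MARKER in html:
            break
        yield html
```

With 48 listings per page, 70,000 listings would come to roughly 1,460 pages, so the cap leaves plenty of headroom.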
Now for the actual things I need scraped. If you click on a listing like this:
16x8 Black Moto Metal Mo970 Wheels 8x6.5 0 for sale online | eBay
I need these elements:
HTML:
#itemTitle
#mm-saleOrgPrc
.section > table:nth-child(2)
#descItemNumber
#container
If #mm-saleOrgPrc is not there, then grab #prcIsum in its place.
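The fallback could look something like this sketch, which uses only Python's standard `html.parser`. The helper names `text_of_id` and `listing_price` are mine, and it assumes reasonably well-formed markup where every non-void tag is closed.

```python
from html.parser import HTMLParser
from typing import Optional

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _TextOfId(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, element_id: str):
        super().__init__()
        self.element_id = element_id
        self.depth = 0        # > 0 while inside the target element
        self.pieces = None    # stays None if the id is never seen

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif self.pieces is None and dict(attrs).get("id") == self.element_id:
            self.depth = 1
            self.pieces = []

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.pieces.append(data)


def text_of_id(html: str, element_id: str) -> Optional[str]:
    parser = _TextOfId(element_id)
    parser.feed(html)
    parser.close()
    if parser.pieces is None:
        return None
    return " ".join("".join(parser.pieces).split())  # collapse whitespace


def listing_price(html: str) -> Optional[str]:
    """Prefer #mm-saleOrgPrc; fall back to #prcIsum when it is missing or empty."""
    return text_of_id(html, "mm-saleOrgPrc") or text_of_id(html, "prcIsum")
```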
The table and container are what mainly throw me off. The table, or "item specifics", has the headers on the left and the values on the right of each pair (so the values we want are in the 2nd and 4th columns).
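One way to read that label/value layout is sketched below. It assumes the table's HTML has already been isolated (for example via the `.section > table:nth-child(2)` selector) and that each row holds cells in label, value, label, value order; `parse_item_specifics` is a name I made up.

```python
from html.parser import HTMLParser


class _RowCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # text pieces of the currently open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = " ".join("".join(self._cell).split())
            self.rows[-1].append(text)
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def parse_item_specifics(table_html: str) -> dict:
    """Return {label: value}; labels are kept as written, e.g. 'Brand:'."""
    collector = _RowCollector()
    collector.feed(table_html)
    collector.close()
    specs = {}
    for cells in collector.rows:
        # Columns 1 and 3 are labels, columns 2 and 4 are their values.
        for label, value in zip(cells[0::2], cells[1::2]):
            if label:
                specs[label] = value
    return specs
```

Keying by label rather than by position means a listing that lacks a field simply has no entry for it.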
For the #container, I need the inner HTML.
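To show what I mean by inner HTML, here is a sketch that returns the markup between an element's opening and closing tags using Python's standard `html.parser`. The `inner_html` helper is hypothetical, it only handles the first element with the given id, and it assumes properly nested markup.

```python
from html.parser import HTMLParser
from typing import Optional

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _InnerHtml(HTMLParser):
    def __init__(self, element_id: str):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.element_id = element_id
        self.depth = 0      # > 0 while inside the target element
        self.parts = []
        self.found = False
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            if dict(attrs).get("id") == self.element_id:
                self.depth, self.found = 1, True
            return
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied without changing depth.
        if self.depth > 0 and not self.done:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or self.done or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True          # closing tag of the target itself
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def inner_html(html: str, element_id: str) -> Optional[str]:
    parser = _InnerHtml(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts) if parser.found else None
```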
#descItemNumber is also found in the URL (I think).
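If the item number really is in the URL, pulling it out could be as simple as the sketch below. The URL shapes it accepts (`/itm/<number>` and `/itm/<title>/<number>`) are my assumption and would need checking against real listing links.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumed shapes: /itm/<number> or /itm/<title-slug>/<number>
_ITEM_PATH = re.compile(r"/itm/(?:[^/]+/)?(\d+)$")


def item_number_from_url(url: str) -> Optional[str]:
    """Return the trailing numeric id of a listing URL, or None."""
    path = urlparse(url).path.rstrip("/")  # query string is dropped here
    match = _ITEM_PATH.search(path)
    return match.group(1) if match else None
```

The result stays a string so that a long item number is not mangled by numeric formatting when it lands in a spreadsheet.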
Here is an example of what the following listings should produce:
16x8 Black Moto Metal Mo970 Wheels 8x6.5 0 for sale online | eBay
One (1) 17x8 Black Rhino Arsenal ET 30 Black 5x120 Wheel Rim | eBay
One (1) 20x8.5 ET 35 NICHE Targa M131 Silver Wheel 5x114.3 | eBay
**Note: I cut the inner HTML of the description down a lot because it was a little much, and separated the beginning and the end with periods.
Title: | Price: | Ebay Item Number: | Condition: | Back Spacing: | Bolt Pattern: | Bolt Pattern 2: | Color: | Number of Bolts: | Manufacturer Part Number: | Brand: | Rim Diameter: | Rim Width: | Offset: | Description:
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Set of 4 16x8 Mo970 Black Machine 8x165.1 Wheels Rims SILVERADO 2500 | $540.00 | 283545274424 | New | 4.5 | 8x165.10 | 8X6.5 | Gloss Black Machined Face | 8 | MO97068080300 | Moto Metal | 16 | 8 | 0 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
Single 17x8 Black Rhino Arsenal ET 30 Black 5x120 Wheel Rim | $220.00 | 372904320262 | New | 5.68 | 5x120 |  | TEXTURED MATTE BLACK | 5 | 1780ARS305120M76 | BLACK RHINO | 17 | 8 | 30 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
1 New 20x8.5 ET 35 NICHE Targa M131 Silver Wheel 5x114.3 | $325.00 | 283433240008 | New | 6.13 | 5x114.3 | 5X4.5 | Gloss Silver Machined | 5 | M131208565+35 | Niche | 20 | 8.5 | 35 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
There is a lot of variance in the listings; for example, the Black Rhino listing does not have Bolt Pattern 2 in the item specifics table.
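To cope with fields that come and go, each listing could be kept as a label-to-value mapping and written against a fixed header, leaving blanks where a listing lacks a field. This is only a sketch using Python's `csv` module; the column list is copied from my example sheet and `rows_to_csv` is a made-up name.

```python
import csv
import io

# Column headers taken from the example sheet in this post.
COLUMNS = [
    "Title:", "Price:", "Ebay Item Number:", "Condition:", "Back Spacing:",
    "Bolt Pattern:", "Bolt Pattern 2:", "Color:", "Number of Bolts:",
    "Manufacturer Part Number:", "Brand:", "Rim Diameter:", "Rim Width:",
    "Offset:", "Description:",
]


def rows_to_csv(listings) -> str:
    """Write listings (dicts keyed by column header) as CSV text.

    restval="" leaves a blank cell for any missing field, and
    extrasaction="ignore" drops item specifics that are not in COLUMNS.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(listings)
    return buffer.getvalue()
```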
There are also listings for wheel and tire packages that look like this:
5) 17x9 Fuel Covert Black 35" Toyo MT Wheels Rims Tires 5x5 Jeep Wrangler JK JL | eBay
For these I JUST need the inner HTML of the description, the title, and the eBay item number.
I believe the description for these tire package listings is "body > table:nth-child(1)" instead of #container.
A qualifier for these listings could be that the item specifics or description contain "tire", or that the item specifics include "Aspect Ratio" or "Section Width".
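That qualifier could be sketched roughly as follows, taking the item specifics as a label-to-value mapping and the description as raw HTML text. The function names are mine, and the plain substring test for "tire" is the rule exactly as I described it, so a wheel-only listing whose description mentions tires would also match.

```python
TIRE_SPEC_LABELS = ("aspect ratio", "section width")


def is_tire_package(specs: dict, description_html: str) -> bool:
    """Apply the qualifier described above to one listing."""
    labels = " ".join(specs).lower()  # joins the dict keys (the labels)
    if any(label in labels for label in TIRE_SPEC_LABELS):
        return True
    spec_text = " ".join(f"{k} {v}" for k, v in specs.items()).lower()
    return "tire" in spec_text or "tire" in description_html.lower()


def description_selector(tire_package: bool) -> str:
    """Pick the description selector I believe applies to each listing type."""
    return "body > table:nth-child(1)" if tire_package else "#container"
```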
Title: | Price: | Ebay Item Number: | Condition: | Back Spacing: | Bolt Pattern: | Bolt Pattern 2: | Color: | Number of Bolts: | Manufacturer Part Number: | Brand: | Rim Diameter: | Rim Width: | Offset: | Description:
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Set of 4 16x8 Mo970 Black Machine 8x165.1 Wheels Rims SILVERADO 2500 | $540.00 | 283545274424 | New | 4.5 | 8x165.10 | 8X6.5 | Gloss Black Machined Face | 8 | MO97068080300 | Moto Metal | 16 | 8 | 0 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
Single 17x8 Black Rhino Arsenal ET 30 Black 5x120 Wheel Rim | $220.00 | 372904320262 | New | 5.68 | 5x120 |  | TEXTURED MATTE BLACK | 5 | 1780ARS305120M76 | BLACK RHINO | 17 | 8 | 30 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
1 New 20x8.5 ET 35 NICHE Targa M131 Silver Wheel 5x114.3 | $325.00 | 283433240008 | New | 6.13 | 5x114.3 | 5X4.5 | Gloss Silver Machined | 5 | M131208565+35 | Niche | 20 | 8.5 | 35 | <div id="compatibility"><h3>*Please Note eBay's built in compatibility checker . . . . . We have carefully selected the industry’s best brands to make sure you get the highest quality wheel available. Our team of wheel experts is always available (during business hours) to pick the right fitment and style for you.</em></p>
Details about 5) 17x9 Fuel Covert Black 35" Toyo MT Wheels Rims Tires 5x5 Jeep Wrangler JK JL |  | 283623003050 |  |  |  |  |  |  |  |  |  |  |  | <font rwr="1" style="font-family: Arial;"><font rwr="1" style="font-family: Arial;"><font style="font-size: 14pt;" size="4" color="#000000"><strong> <p align="center"><font size="4.5" face="Arial" color="#ff0010"><strong>*Please Note eBay's built in compatibility checker does not properly......................<strong>Most of the pictures we use in our listings are stock pictures provided from the manufacture. Actual product may vary in lip size, number of bolts and wheel size. Please verify with us before purchasing if there are any questions!</strong></p></font></div></font></font></font></font></font></font></font></div>