Learn Excel 2013 - "Harvesting Scraped Web Pages": Podcast #1685

If you like this content, please consider visiting the video on YouTube and subscribe to the MrExcel Channel to show your support!
This video has been published on Apr 10, 2013.
On Monday we looked at getting Data from the Web; yesterday we looked at gathering Links and Importing them to our Workbook, one Worksheet per Link; today we look at working with that harvested Data. After a little review of the last couple of days, Today in Episode #1685, Bill moves into what we can do with our new Data. Set the Data to your specifications for your Worksheet; use a Pivot Table to get to the most important of that Data quickly; think outside of the Excel Podcast Box and consider how you may apply this information in your work, hobbies or home! And be sure to tune in tomorrow for another Podcast in this series to see a few awesome new tricks with Data from Bill "MrExcel" Jelen!

**This Podcast involves Intermediate to Advanced Excel User information; but test your skills and give it try - even if you've never done it before!! Challenge your abilities and learn what Microsoft Excel can really do for you! Have fun!**

This mini-series is a really cool overview of some of our earlier Podcasts and how to integrate a variety of Formulas and Functions to evaluate Real Estate.

...This blog is the video podcast companion to the book, Learn Excel 2007 through Excel 2010 from MrExcel. Download a new two minute video every workday to learn one of the 512 Excel Mysteries Solved! and 35% More Tips than the previous edition of Bill's book! http://www.mrexcel.com/learn2010/LE20

And for more information on Excel 2013, check out "Microsoft 2013 InDepth" -- by Bill Jelen. Excel 2013 In Depth is the beyond-the-basics, beneath-the-surface guide for everyone working with Excel 2013. Excel expert and MVP Bill Jelen provides specific, tested, proven solutions to the problems Excel users run into every day: the types of challenges other books ignore or oversimplify. Jelen thoroughly covers all facets of working with Excel 2013. http://www.amazon.com/Excel-2013-Dept
maxresdefault.jpg


Transcript of the video:
MrExcel Podcast is sponsored by Easy-XL Learn Excel from MrExcel podcast episode 1685: Harvesting Scraped Web Pages.
Hey welcome back to the MrExcel netcast, I'm Bill Jelen.
Well, all this week we've been talking about this project.
Yesterday we went through—and had a macro that pulled down the web page for all of these links here; about 70 different pages.
I have all these different sheets but now I need to collect the information from these various worksheets and you have to kind of go through and see how they have things organized.
It’s stuff always in the same location, like for example here I’m interested in sales amount –does it always start in row 36?
Well, unfortunately in this case it doesn't because the value is above-- there may be more or less especially if you get down here where they have extra feature information, some have 2, some have 3, some have 4, so unfortunately I realized I'm going to have to kind of loop through everything and we actually have several different loops going on here.
First we're looping from Row 2 to 71 on our data page, that tells me each parcel.
Remember we prepended P before the parcel number and that allows me to select Worksheets(ThisParcel) Had that been numeric, they would have thought that I was talking about instead of parcel number 2147652, they would have thought I meant like the 241 000th worksheet in the workbook and so then once we select that worksheet we'll figure out how many rows we have in that worksheet and I realized that some of the information I needed is below the sales information so go cruising down from one to last row looking column C to see if we have the heading of the “Year Built” if it is then save that in a variable called “This Year.” Cruise down through column A column 1, see if it says “Building Area Information” and we can get ThisSqFt and then here's the loop that kind of does the whole thing.
It's looking for the site address saves that, looking for the lot size, saves that, and then once we find the heading of “Sale Date” in column B and then goes and looks at the next ten records and grabs the date, the amount, and type of sale and then finishes off all the loops.
Now this was a hassle to go through and write this.
You know this was, you know probably an hour's worth of code but the payoff here-- see we start off with a completely blank sales page will switch over to VBA and we'll run this, bam, that's all it took.
Went through all 71 pages found all the information I needed and would come back to sales and we now have for every one of these properties there were 70 properties but far more sales than that going back since the development was built.
Now a couple of oddities with this data, any time that there is a transfer like within the family the the property appraiser puts it in at a hundred dollars, so we're going to get rid of that and also these PTs that's where we had a sale where they were partitioning out some properties.
So PTs, PRs and quick claims those were probably short sales so we're going to get rid of those.
All right, so $ per SF so that is sales amount divided by square foot =G2/D2 and Year Sold Column =YEAR(F2) which is the year of the sales date.
Copy those down.
All right so, you have some easy things I can do.
I can see was the whole neighbourhood build up in the 60s or are there some newer houses?
Yes, there's some newer houses.
The range of sizes we go from a small of 1,400 square feet to a large of 2,700 square feet.
We have a couple of dogs Marylyn’s looking for a large lot for the dogs so 0.19 acres to 0.20 acres so there's a few that have larger lots but for the most part no.
So let's take a look at the $ per SF (dollars per square foot) trend.
We will do a pivot table for that.
Insert pivot table click OK, year sold down the left hand side $ per SF and we want to average this instead of sum it so let’s look at the average click OK.
Oh shoot, Number Format, Currency, 0 click OK.
I'm also interested in how many are sold each year, so in this particular street, unfortunately we're just selling a few each year so you know one large house might you know cause problems with the other data but let's just chart it and see what we get.
We will create a little pivot chart here with lines click OK and so there were some interesting bumps here in 73 through 76.
I'm not sure what that is but this I've come to recognize this as the typical Florida housing boom and bust this happens on almost every street that I've looked at.
So right now 2012 $113.
If I drew a straight line forecast and I realized I could do a trend line but I would have to throw out the bad data here if I just kind of draw a straight line extrapolate it out looks like we might be somewhere between 112 and 150 dollars per square foot for houses on the street today.
All right so you know we started out a couple of days ago just collecting data from a single web page visiting every single web page and now pulling the interesting information back into a sheet where I can sort, filter, slice, dice, pivot and start to you know get a picture of the neighbourhood.
All right, well, hey I want to thank you for stopping by.
We’ll see you next time for another netcast from MrExcel.
 

Forum statistics

Threads
1,223,700
Messages
6,173,909
Members
452,536
Latest member
Chiz511

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top