Data scraping of molar masses from website

avogadrosnumber

New Member
Joined
Feb 8, 2015
Messages
15
Hi All,
I simply do not have the skill set to solve a problem I have. I am creating a database for my students of compounds in chemistry. I have an excel list of the compound names. I have also created a url link (using the concatenated function) that enables me to go to google and there I can find the molar mass for that compound. I have several thousand compounds and I would like to be able to scrape the molar masses off of the website for each of the compounds and paste it into a cell, next to the compound, as a plain text entry. I will then create a second column and round the scraped data to two decimal places.

column 1: Sodium bromide
column 2: ="https://www.google.com/#q="&B14&"+molar+mass"
column 3: https://www.google.com/#q=sodium bromide+molar+mass
column 4: This is where I need help - I want to paste in the data from the web link that displays the molar mass for sodium bromide

Obviously this is one compound - I have several hundred such compounds and I want to collect the molar masses WITHOUT cutting and pasting.

Thank you for any help. My VBA skills are non-existent so if you could please comment the code, if you can, I could perhaps learn/understand the syntax.

Thanks,
Sam
 

Excel Facts

What is the shortcut key for Format Selection?
Ctrl+1 (the number one) will open the Format dialog for whatever is selected.
Re: High school chemistry teacher needs help with a data scraping of molar masses from website

What version of Excel are you using?
 
Upvote 0
Re: High school chemistry teacher needs help with a data scraping of molar masses from website

If you are using Excel 2016, this can be much more easily accomplished by getting into the M Query stuff. But, if not, I have some code that will scrape internet pages for you. I tried doing it the way you had it by doing Google searches, but part way through it starts to fail because Google starts with the Captcha BS asking if you are a human. So, I changed the URLs to wikipedia URLs such as, https://en.wikipedia.org/wiki/Acetaldehyde.

My URLs are in column H and the Molar masses that will be returned are going into column D, so you will need to adjust those parts of the code to fit your worksheet. I have commented the code in the areas that will need to be adjusted. I went to a random webpage to get the chemical compounds to test this out, and it returned 93% of the Molar masses. You probably won't get 100%, but it should get you close. My problem was mostly with having compounds that had several variants such as mono, di and tri. OK, I think that's it, here's the code....

Code:
Sub ScrapeMolarMass()
Dim IE As New InternetExplorer
Dim AR()
Dim Res()
Dim obj As Object
Dim r As Range
Dim t As HTMLTable
Dim ro As Object


Set r = Range("H3", Range("H" & Rows.Count).End(xlUp).Address) 'This is the column where I have the wikipedia links
AR() = r.Value
Set r = r.Offset(, -4) 'This offsets the range 'r' to the left 4 columns i.e. column D.  Change that -4 value to whatever column you want
Res() = r.Value


With IE
    .Visible = False
        For i = 1 To UBound(AR)
            On Error Resume Next
            .navigate AR(i, 1)
            
                While .readyState <> 4
                    DoEvents
                Wend
                
                Set obj = .document.getElementsByClassName("infobox bordered")
                Set t = obj(0)
                For Each ro In t.Rows
                    If InStr(ro.innerText, "Molar mass") > 0 Then
                        Res(i, 1) = ro.innerText
                        Exit For
                    End If
                Next ro


        Next i
        r.Value = Res
        .Quit
End With
Set IE = Nothing
End Sub
 
Upvote 0
Re: High school chemistry teacher needs help with a data scraping of molar masses from website

Thank you so much for taking the time to help with this. I am going to give it a whirl and see how it works. I will let you know how it works.
Sam
 
Upvote 0
Re: High school chemistry teacher needs help with a data scraping of molar masses from website

In your instructions you mention array formulas (Ctrl + Shift + Enter). I would imagine in your module though this does not apply.
I tried running this - I pasted your link in H3 ( https://en.wikipedia.org/wiki/Acetaldehyde.) and then attempted to run the macro. I get the following error: Compile error: User-defined type not defined
 
Upvote 0
The array formula stuff is just part of my signature. Doesn't apply to this post. And, I forgot. In the visual basic editor, you have to go to tools, references, and add a reference to Microsoft internet controls.
 
Upvote 0
Sorry to bother you again - I don't think I have near the references I need in tools to run this script. I get the same error but this time three rows down. I am going to google it and try and figure this out but I remain unsure what other references you would recommend I install. a screens shot of all the ones you have installed would work - if at all possible. Again, thank you for your time.
Sam
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top