Extracting portions of URLs in Excel

TraderDavis

New Member
Joined
Feb 12, 2015
Messages
4
I have a list of URLs that I need to sift through, such as:
facebook.com
home.facebook.com
live.fanfooty.com.au
fanfooty.com.au
desktop.spotify.com
zlink.app.spotify.com

I only need the main part of the domain (facebook.com, spotify.com, fanfooty.com.au), and there are many different variations of these duplicates. I've been trying to figure out a conditional format that will highlight the duplicate cells containing the main part (so it should highlight "home.facebook.com" and not "facebook.com"), but I think it may not be possible. I only need to highlight the cells so I can deal with them on a case-by-case basis. Perhaps this can be done in VBA?

This is as far as I got with the conditional format... it doesn't work of course.
Code:
=AND(LEN($A10)-LEN(SUBSTITUTE($A10,".",""))>=2,OR(ISERROR(SEARCH(".com.",$A10)),ISERROR(SEARCH(".co.",$A10))),NOT(ISERROR(MATCH(MID($A10,FIND("@",SUBSTITUTE(A10,".","@",LEN(A10)-(LEN(SUBSTITUTE(A10,".",""))+1)))+1,(FIND("@",SUBSTITUTE(A10,".","@",LEN(A10)-LEN(SUBSTITUTE(A10,".",""))))+1-FIND("@",SUBSTITUTE(A10,".","@",LEN(A10)-(LEN(SUBSTITUTE(A10,".",""))+1)))-1)),A:A)))
 

Excel Facts

Create a Pivot Table on a Map
If your data has zip codes, postal codes, or city names, select the data and use Insert, 3D Map. (Found to right of chart icons).
Give this a try to extract in another column what you need:

=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(A1,"."," ")," ",REPT(" ",100)),(1*ISNUMBER(SEARCH(".??#",A1&"#"))+2)*100))," ",".")
Excel Workbook
AB
1facebook.comfacebook.com
2home.facebook.comfacebook.com
3live.fanfooty.com.aufanfooty.com.au
4fanfooty.com.aufanfooty.com.au
5desktop.spotify.comspotify.com
6zlink.app.spotify.comspotify.com
Sheet
 
Upvote 0
that would give you a result of either 200 or 300, depends on how your domain ends. If it ends with .??, it returns a result of 300; else, it returns a result of 200.
A nice trick to determine whether you need the final two words or three words.

Thanks, this works perfectly! Could you explain how the ISNUMBER(SEARCH part works?
 
Upvote 0
=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(SUBSTITUTE(A1,"."," ")," ",REPT(" ",100)),(1*ISNUMBER(SEARCH(".??#",A1&"#"))+2)*100))," ",".")

The inner Substitute converts each period to space, the outer Substitute converts each space to 100 spaces, so each part of the Url is separated with 100 spaces. So 100 characers from the right embeds the last part, 200 the last two parts, 300 the last three parts of the URL.

It is supposed that if the Url ends in „.??#” (two characters after the last period), you need three parts to keep, if it does not end with that then you need only the last two parts to extract. Search for „.??#” gives a numeric value in the former case, so 1*Isnumber will be 1, 1+2 = 3, so 3*100 = 300 characters will be extracted from the right end. If Search cannot find the „.??#” string then Isnumber and 1*isnumber will be zero, zero +2 will be 2, and 2*100 characters will be extracted from the right end.
The string „.??” alone would be also found at other locations of the URL, for example .fa(cebook.com) therefore „#” is put to the end of A1 to restrict the search for „.??” (period followed by two characters) only to the end of the string in A1.
Trim deletes the excess spaces, the wrapping Substitute converts the spaces to periods.
 
Upvote 0

Forum statistics

Threads
1,220,965
Messages
6,157,119
Members
451,398
Latest member
rjsteward

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top