Extracting Dates from a website -e.g. https://core.cro.ie/ (for each company in a list)

Djro69

New Member
Joined
Nov 2, 2020
Messages
15
Office Version
  1. 2019
Platform
  1. Windows
Anyone know the best way to get Next Annual Return date from https://core.cro.ie if I have all the website addresses in column B?

So, layout of my spreadsheet is as follows:
Column A = Company name
Column B = CRO website address for that company
Column C = I would like to have automatically populated with all the Next Annual Return dates

e.g.

A - B - C

GOOGLE IRELAND LIMITED - https://core.cro.ie/e-commerce/company/search/401323 - ????? (30/09/2023)

ADOBE SOFTWARE TRADING COMPANY LIMITED - https://core.cro.ie/e-commerce/company/search/394859 - ????? (03/09/2023)​

MUSGRAVE LIMITED - https://core.cro.ie/e-commerce/company/search/41155 - ?????? (30/09/2023)​


Essentially, I'd like to get the spreadsheet to check a lot of companies in one go.

(FYI - Unfortunately, I don't understand how the website address is made for each but I can get them manually for now. But if anyone can advise that too, I'd be keen to hear. i,e, I don't know why Google Ireland above has a search ending in 401323... I don't see what that number is to them or if it is even related, it seems static anyway.)
 

Excel Facts

How can you automate Excel?
Press Alt+F11 from Windows Excel to open the Visual Basic for Applications (VBA) editor.
Power Query to give you something like this
1683796879986.png


Follow this for the steps, a very thorough walkthrough
Import Multiple Pages from Web with One Query in Excel

This is the code I have to tidy up the data in the PQ Advanced editor for the Function (Fx...Name) Connection only

Power Query:
(URL) as table =>

let
    Source = Json.Document(Web.Contents(URL)),
    data = Source[data],
    #"Converted to Table" = Record.ToTable(data),
    #"Pivoted Column" = Table.Pivot(#"Converted to Table", List.Distinct(#"Converted to Table"[Name]), "Name", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"entityRegAddress", "entityMailAddress", "entityRegDate", "entityRegNumber", "entityStatusDesc", "entityStatusExtDesc", "entityStatusEffectiveDate", "entityTypeId", "entityTypeDesc", "personInd", "entityTitle", "entityLastName", "entityFirstName", "entityMiddleName", "entityExpiryDate", "entityRegEmail", "entityMobile", "entitySuffix", "entityAttributes"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"annualReturnDueDate", type datetime}}),
    #"Split Column by Delimiter" = Table.SplitColumn(Table.TransformColumnTypes(#"Changed Type", {{"annualReturnDueDate", type text}}, "en-IE"), "annualReturnDueDate", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"annualReturnDueDate.1", "annualReturnDueDate.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"annualReturnDueDate.1", type date}, {"annualReturnDueDate.2", type time}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Changed Type1",{"annualReturnDueDate.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"annualReturnDueDate.1", "AnnualReturnDueDate"}})
in
    #"Renamed Columns"


Code for URLS in PQ Advanced editor
Power Query:
let
    Source = Excel.CurrentWorkbook(){[Name="URLs"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Company Code", Int64.Type}, {"URL", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "FetchCompany", each fxAnnual_Return_Date([URL])),
    #"Expanded FetchCompany" = Table.ExpandTableColumn(#"Added Custom", "FetchCompany", {"entityId", "entityRegName", "AnnualReturnDueDate"}, {"entityId", "entityRegName", "AnnualReturnDueDate"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded FetchCompany",{"Company Code", "URL"})
in
    #"Removed Columns"
 
Upvote 0
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top