Data Types Issue

Mchllwoods

New Member
Joined
Feb 15, 2019
Messages
18
Office Version
  1. 365
Platform
  1. Windows
When I get the Top 250 Movies from the IMDb site I don't get all the right information to bring into Excel and I have to type them in manually. Then when I do bring them into Power Query some rows won't show the cause of the Data Type error or just don't have the information. Next day or so when the list on the site changes my Power Query won't refresh to accommodate it. Also when I enter Directors and Cast member names in Power Query and split them into First, Middle, and Last Name or Suffix as will. the names won't go into the proper column. Link below is the list I'm trying 2 bring to Excel. Please help!

 

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).
I am sorry, but am not able to support you with PowerQuery
If you wish, you can import that table using web scraping techniques.

For that, you need the "Selenium environment" that should be installed on your computer.
If you are interested, you'll find the instruction on how installing it in this message: extract data from web to inside sheet without open website &refersh

After the installation you could use the following macro:
VBA Code:
Sub GetImdb()
Dim WPage As Object
'
'Crea Driver:
Set WPage = CreateObject("Selenium.WebDriver")
myUrl = "https://www.imdb.com/chart/top/"         'Your url
WPage.Start "chrome", myUrl
WPage.Get "/"
WPage.Wait 200
WPage.FindElementsByTag("Table")(1).AsTable.ToExcel Sheets("Sheet1").Range("B2")    '<<< Set a position
WPage.Quit
'
With Sheets("Sheet1").Range("B2").CurrentRegion
    .WrapText = False
    .EntireColumn.AutoFit
End With
End Sub

This will open Chrome, open the IMDB page and import the table in Sheet1-B2

I didn't understand what you do next with the information, but of course the macro can be enhanced to add more action on the imported data
 
Upvote 0
You MUST Transform. You can't expect it to come in without it. Even if it comes in perfectly (ignoring Data Types), it's always worth first checking.
Also, turn Automatic Type off (Data -> Get Data -> Query Options, under Global -> Data Load, select Never detect column types... This will bite you more often than not, and unless needed Data Types shouldn't be set until the very end, and then just select all columns (Ctrl-A) and use Transform -> Detect Data Types. The big problem is that it hard codes column names which could change, especially when inserting a step.
Here's the code I used:
Power Query:
let
    Source = Web.Page(Web.Contents("https://www.imdb.com/chart/top/")),
    Data0 = Source{0}[Data],
    #"Removed Other Columns" = Table.SelectColumns(Data0,{"Rank & Title", "IMDb Rating"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Other Columns", "Rank & Title", Splitter.SplitTextByEachDelimiter({"."}, QuoteStyle.Csv, false), {"Rank & Title.1", "Rank & Title.2"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter", "Rank & Title.2", Splitter.SplitTextByEachDelimiter({"("}, QuoteStyle.Csv, false), {"Rank & Title.2.1", "Rank & Title.2.2"}),
    #"Replaced Value" = Table.ReplaceValue(#"Split Column by Delimiter1",")","",Replacer.ReplaceText,{"Rank & Title.2.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Replaced Value",{{"Rank & Title.1", "Rank"}, {"Rank & Title.2.1", "Title"}, {"Rank & Title.2.2", "Released"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Rank", Int64.Type}, {"Title", type text}, {"Released", Int64.Type}, {"IMDb Rating", type number}})
in
    #"Changed Type"
That resulted in this in PQ:
1670172311794.png

and this table
Book1
ABCD
1RankTitleReleasedIMDb Rating
21 The Shawshank Redemption 19949.2
32 The Godfather 19729.2
43 The Dark Knight 20089
54 The Godfather Part II 19749
65 12 Angry Men 19579
76 Schindler's List 19938.9
IMDBTop250

No Data Type problems!
 
Upvote 0

Forum statistics

Threads
1,223,989
Messages
6,175,808
Members
452,670
Latest member
nogarth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top