Power Query Fuzzy Lookup across one row

sauzee

New Member
Joined
Aug 24, 2017
Messages
22
Office Version
  1. 365
Platform
  1. Windows
Hi All

I'm trying to get all fuzzy lookups within a row to help with analysis. This thread has proved very helpful


But I'm struggling to exactly implement the solution. I have attached a much smaller version of the table and my current code up to the merge and fuzzy match is below. Many thanks!

Power Query:
let
    Source = Excel.CurrentWorkbook(),
    Table1 = Source{[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Table1,{{"Name1", type text}, {"Count1", type text}}),
    #"Merged Queries" = Table.FuzzyNestedJoin(#"Changed Type", {"Name1"}, #"Changed Type", {"Name1"}, "Changed Type", JoinKind.LeftOuter, [IgnoreCase=true, IgnoreSpace=true, Threshold=0.85, SimilarityColumnName="Similarity"]),
    #"Renamed Columns" = Table.RenameColumns(#"Merged Queries",{{"Changed Type", "Table"}})
in
    #"Renamed Columns"
 

Attachments

  • Screenshot 2022-10-21 122601.png
    Screenshot 2022-10-21 122601.png
    23.3 KB · Views: 33

Excel Facts

Waterfall charts in Excel?
Office 365 customers have access to Waterfall charts since late 2016. They were added to Excel 2019.
I am a bit confused, are you joining the table to itself ? Do you only have 1 table ?

Yes, it could be set up as two identical tables. But it's one list of names and I'm trying to find the fuzzy matches within that one list. So made sense just to fuzzy lookup the initial table to itself.

I could break it into two tables by perhaps including one table with a high count and then match it against the names with a lower count. But I'm intrigued to see the full list first.

Thanks
 
Upvote 0
Not sure how well that will work for you. If you set a high threshold it will only return itself or all exact matches.
If you set a lower threshold you are going to get multiple records for each line.

What are you hoping to achieve ?
 
Upvote 0
Not sure how well that will work for you. If you set a high threshold it will only return itself or all exact matches.
If you set a lower threshold you are going to get multiple records for each line.

What are you hoping to achieve ?

Yes it will create a problem with one name being matched to another, then that same name being matched back the other way. That's something I'll just have to go through manually.

It will be also match each name with itself but these will be easy enough to filter out

Thanks
 
Upvote 0
So what is it doing or not doing, that you want us to look at ?
So what is it doing or not doing, that you want us to look at ?
You outlined a solution to get all matches across one row here


Getting all the matches on one row would be great (which I can then split by delimeter)

but I couldn't quite get the code you mentioned here to create a table like that


I'm not sure where where I'm going wrong, it seems pretty straightforward but I just couldn't replicate the output you got

Thanks
 
Upvote 0
So something like this (from the names in my earlier example)

Getting rid of the duplicates would be incredible but I honest can't visualise how to do that yet. Creating a helper column of all the separate strings then filtering out duplicates should be possible. But investigate that after!

Thanks
 

Attachments

  • Screenshot 2022-10-21 145217.png
    Screenshot 2022-10-21 145217.png
    7.2 KB · Views: 19
Upvote 0
Its pretty late here, so I will have a look tomorrow.
If you can give me a bit more data via an XL2BB that would be great, otherwise I will try retyping just the 3 lines in your screenshot.
 
Upvote 0
Yes of course. Here's a table with a few more example.

Name1Count1
Jose A Hernandez81
Jose A. Fernandez1
Micheal Carter-Demary35
Micheal Carter Demary2
Christian Iglesias5
Gaetan Weydts Caessens5
Phillip Mac OGrady5
Jose A. Hernandez5
Michael Carter Demary351
Iglesias Cristian25
Cristian Iglesias22
Gaetan Weydts-Caesens50
Gaetan Weydts Caesens2
Phillip Mac O'Grady122
Phillip Mac O''Grady2


Many thanks, I really appreciate the help!
 
Upvote 0

Forum statistics

Threads
1,223,996
Messages
6,175,858
Members
452,676
Latest member
woodyp

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top