A new fuzzy lookup formula without VBA or an add-in

Excali (New Member, joined Jul 9, 2021, 1 message; Office Version: 365; Platform: Windows)
When you want to do a fuzzy lookup without VBA, an add-in, or Power Query, it can be useful to use dynamic arrays and write a formula that tests n-gram substrings to evaluate text similarity.

Below is a detailed view of how string similarity is computed from n-gram substrings:

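n-gram similarity.xlsx, sheet "Detailed single formula"

|    | B          | C           | D      | E                 | F     | G |
|----|------------|-------------|--------|-------------------|-------|---|
| 1  | find_text  | within_text | n-gram |                   |       |   |
| 2  | ELOQUENTLY | INELOQUENT  | 3      |                   |       |   |
| 3  | N          |             |        |                   |       |   |
| 4  | 10         |             |        | n-gram similarity | 0,750 | =D6/(N-n_gram+1) |
| 5  |            |             |        | single formula    | 0,750 | =SOMME(1*ESTNUM(CHERCHE(STXT($B$2;SEQUENCE(NBCAR($B$2)-n_gram+1);n_gram);within_text)))/(N-n_gram+1) |
| 6  |            | Total       | 6      |                   |       |   |
| 7  | 1          | ELO         | 1      |                   |       |   |
| 8  | 2          | LOQ         | 1      |                   |       |   |
| 9  | 3          | OQU         | 1      |                   |       |   |
| 10 | 4          | QUE         | 1      |                   |       |   |
| 11 | 5          | UEN         | 1      |                   |       |   |
| 12 | 6          | ENT         | 1      |                   |       |   |
| 13 | 7          | NTL         | 0      |                   |       |   |
| 14 | 8          | TLY         | 0      |                   |       |   |

Column G shows the FORMULATEXT output of a French-locale Excel (SOMME = SUM, ESTNUM = ISNUMBER, CHERCHE = SEARCH, STXT = MID, NBCAR = LEN), and decimals are displayed with a comma. Six of the eight 3-grams of ELOQUENTLY occur in INELOQUENT, hence 6/8 = 0,750.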
Cell Formulas

| Range   | Cell | Formula |
|---------|------|---------|
| B4      | B4   | =LEN(B2) |
| F4      | F4   | =D6/(N-n_gram+1) |
| G4:G5   | G4   | =FORMULATEXT(F4) |
| F5      | F5   | =SUM(1*ISNUMBER(SEARCH(MID($B$2,SEQUENCE(LEN($B$2)-n_gram+1),n_gram),within_text)))/(N-n_gram+1) |
| D6      | D6   | =SUM(D7#) |
| B7:B14  | B7   | =SEQUENCE(LEN(B2)-n_gram+1) |
| C7:C14  | C7   | =MID(B2,B7#,n_gram) |
| D7:D14  | D7   | =1*ISNUMBER(SEARCH(C7#,within_text)) |

Dynamic array formulas.
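Named Ranges

| Name        | Refers To                          | Cells         |
|-------------|------------------------------------|---------------|
| N           | ='Detailed single formula'!$B$4    | F4:F5         |
| n_gram      | ='Detailed single formula'!$D$2    | B7:C7, F4:F5  |
| within_text | ='Detailed single formula'!$C$2    | F5, D7        |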
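For readers who want to cross-check the worksheet result outside Excel, the same computation can be sketched in Python. This is only an illustration under stated assumptions: the function name `ngram_similarity` is hypothetical, matching is made case-insensitive to mimic SEARCH, and the wildcard handling of SEARCH (`?`, `*`, `~`) is not reproduced.

```python
def ngram_similarity(find_text: str, within_text: str, n: int) -> float:
    """Share of the n-grams of find_text that occur somewhere in within_text."""
    count = len(find_text) - n + 1            # N - n_gram + 1 in the sheet
    if count <= 0:
        raise ValueError("find_text must be at least n characters long")
    haystack = within_text.lower()            # SEARCH ignores case
    # Equivalent of MID(find_text, SEQUENCE(count), n): one substring per start position
    grams = [find_text[i:i + n].lower() for i in range(count)]
    # Equivalent of SUM(1*ISNUMBER(SEARCH(...)))
    hits = sum(1 for g in grams if g in haystack)
    return hits / count


print(ngram_similarity("ELOQUENTLY", "INELOQUENT", 3))  # 0.75
```

The guard on `count` covers the case where find_text is shorter than n, in which case the worksheet formula would return an error instead of a score.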


By wrapping all the string matching in a single dynamic array formula, one can build a fuzzy lookup that returns the most similar text value:

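n-gram similarity.xlsx, sheet "Fuzzy Lookup"

|    | B                | C               | D               |
|----|------------------|-----------------|-----------------|
| 1  | find_text        | n-gram          | Matching text   |
| 2  | JACKSON, Michael | 2               | Michael JACKSON |
| 3  |                  |                 |                 |
| 4  |                  |                 |                 |
| 5  |                  |                 |                 |
| 6  | within_text      | Simple formulae | Array formula   |
| 7  | Michael LORRIE   | 0,4             | 0,4             |
| 8  | Jackson FIVE     | 0,4             | 0,4             |
| 9  | The Jacksons     | 0,4             | 0,4             |
| 10 | Prince of Pop    | 0               | 0               |
| 11 | Michael JACKSON  | 0,8             | 0,8             |
| 12 | Jackson, Stewart | 0,533333333     | 0,533333333     |
| 13 | Latoya Jackson   | 0,4             | 0,4             |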
Cell Formulas

| Range   | Cell | Formula |
|---------|------|---------|
| D2      | D2   | =INDEX(within_text_list[within_text],MATCH(MAX( MMULT(1*ISNUMBER(SEARCH(MID(find_text,SEQUENCE(1,LEN(find_text)-n_gram+1),n_gram),within_text_list[within_text]))/(LEN(find_text)-n_gram+1), SEQUENCE(LEN(find_text)-n_gram+1,1,1,0)) ), MMULT(1*ISNUMBER(SEARCH(MID(find_text,SEQUENCE(1,LEN(find_text)-n_gram+1),n_gram),within_text_list[within_text]))/(LEN(find_text)-n_gram+1), SEQUENCE(LEN(find_text)-n_gram+1,1,1,0)),0)) |
| D7:D13  | D7   | =MMULT(1*ISNUMBER(SEARCH(MID($B$2,SEQUENCE(1,LEN($B$2)-n_gram+1),n_gram),within_text_list[within_text]))/(LEN($B$2)-n_gram+1), SEQUENCE(LEN($B$2)-n_gram+1,1,1,0)) |
| C7:C13  | C7   | =SUM(1*ISNUMBER(SEARCH(MID(find_text,SEQUENCE(1,LEN(find_text)-n_gram+1),n_gram),B7)))/(LEN(find_text)-n_gram+1) |

Dynamic array formulas.
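Named Ranges

| Name                     | Refers To               | Cells          |
|--------------------------|-------------------------|----------------|
| 'Fuzzy Lookup'!find_text | ='Fuzzy Lookup'!$B$2    | D7, D2, C7:C13 |
| 'Fuzzy Lookup'!n_gram    | ='Fuzzy Lookup'!$C$2    | D7, D2, C7:C13 |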
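Cells with Conditional Formatting

| Cell    | Condition                | Cell Format | Stop If True |
|---------|--------------------------|-------------|--------------|
| C7:C13  | Cell Value: top 1 values | text        | NO           |
| D7:D13  | Cell Value: top 1 values | text        | NO           |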
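The array formula in D2 can be read as three steps: build a 0/1 matrix with one row per candidate text and one column per n-gram of find_text, sum each row by multiplying the matrix by a column of ones (the role of MMULT with SEQUENCE(...,1,1,0)), then return the candidate with the highest score. The Python sketch below mirrors those steps as a cross-check only. The names `hit_matrix` and `fuzzy_lookup` are hypothetical, matching is case-insensitive like SEARCH, and SEARCH wildcards are not reproduced.

```python
def hit_matrix(find_text, candidates, n):
    """Rows = candidates, columns = n-grams of find_text, cells = 1 or 0."""
    count = len(find_text) - n + 1
    grams = [find_text[i:i + n].lower() for i in range(count)]
    return [[1 if g in c.lower() else 0 for g in grams] for c in candidates]


def fuzzy_lookup(find_text, candidates, n):
    """Return (best candidate, all scores); the first maximum wins, as with MATCH(..., 0)."""
    matrix = hit_matrix(find_text, candidates, n)
    ones = [1] * (len(find_text) - n + 1)     # column of ones, like SEQUENCE(count,1,1,0)
    # Row sums written as a matrix-by-vector product, then divided by the n-gram count
    scores = [sum(h * o for h, o in zip(row, ones)) / len(ones) for row in matrix]
    return candidates[scores.index(max(scores))], scores


names = ["Michael LORRIE", "Jackson FIVE", "The Jacksons", "Prince of Pop",
         "Michael JACKSON", "Jackson, Stewart", "Latoya Jackson"]
best, scores = fuzzy_lookup("JACKSON, Michael", names, 2)
print(best)                               # Michael JACKSON
print([round(s, 3) for s in scores])      # [0.4, 0.4, 0.4, 0.0, 0.8, 0.533, 0.4]
```

With the sample data this reproduces the scores shown in columns C and D of the sheet. When several candidates share the top score, both this sketch and MATCH(..., 0) return the first one in list order.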


Obviously, the formula could be simplified once the LET() function is widely available :)

Hope this helps

Excali
 

Thank you for sharing. That's an interesting utility.

The formulae are pretty complex, and it would be easy to make mistakes when creating them. I'm wondering if it wouldn't be preferable to write a VBA function to make life easier for users.
 