How to make vlookup partial match digits from left to right?

Fractalis

Active Member
Joined
Oct 11, 2011
Messages
328
Office Version
  1. 365
Platform
  1. Windows
Hi all,

I'm trying to get a formula to do exact and partial matches.

I use INDEX(...MATCH(...)) to look for exact matches and if fails, then I try to do partial matches using VLOOKUP() but is not showing expected output.

The partial match should show the "best match", this is select the number that has more numbers in common in same order. In other words, select the number that
shares the longest common prefix. The minimun prefix length to match would be 4 digits, is only match 3 digits or none show "NOT FOUND".

I hope make sense, thanks for any help.

Examples:

For number 50660000 in D3, the more similar numbers in column B are 5067000,5063005,506600002 and within these 3, the more similar is 506600002 because shares common
prefix of length 8.
50660000
50660000
2

For number 3520913 in D6, the more similar numbers in column B are 352021,352621,352091000 and within these 3, the more similar is 352091000 because shares common
prefix of length 6.
3520913
352091000

For number 13603399617 in D9, the more similar numbers in column B are 1360339,136033 and within these 2, the more similar is 1360339 because shares common prefix of length 7.
13603399617
1360339

For number 32486 in D7, the more similar number in column B is 32475 but only has the first 3 numbers in common, so show "NOT FOUND"
32486
32475

Excel Formula:
=IFERROR(IFERROR(
        INDEX($A$2:$A$29,MATCH(D2,$B$2:$B29,0)),  --> This do exact match
        VLOOKUP(D2&"*",$A$1:$B$29,2,TRUE)         --> If exact match fails, try partial match
        ),
"NOT FOUND")

Search numbers.xlsx
ABCDEF
1CODENUMBERNUMBERCURRENT OUTPUTEXPECTED OUTPUT
2XZS90506700050660000NOT FOUNDXZS76
3XZS785063005423780000NOT FOUNDXZS13
4XZS7650660000239370XZS81XZS81
5XZS293247539335NOT FOUNDNOT FOUND
6XZS3942060800553520913NOT FOUNDXZS71
7XZS6442060232486NOT FOUNDNOT FOUND
8XZS35423791807631NOT FOUNDNOT FOUND
9XZS13423780013603399617NOT FOUNDXZS99
10XZS9642378076725257000XZS05XZS05
11XZS1642397030
12XZS055257000
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
18XZS271809859
19XZS991360339
20XZS41136033
21XZS981802692
22XZS60180825
23XZS93393533000
24XZS583932059
25XZS31393205800448
26XZS89352021
27XZS46352621
28XZS71352091000
29XZS8139370
Sheet1
Cell Formulas
RangeFormula
E2:E10E2=IFERROR(IFERROR( INDEX($A$2:$A$29,MATCH(D2,$B$2:$B29,0)), VLOOKUP(D2&"*",$A$1:$B$29,2,TRUE) ), "NOT FOUND")
 

Excel Facts

Who is Mr Spreadsheet?
Author John Walkenbach was Mr Spreadsheet until his retirement in June 2019.
This VBA custom function seems pretty close. But I'm not sure about your logic for the "Not Found" ones. Especially with, '1807631'. Because it matches more than the first 3 characters. I added threshold as an argument in the function so you can adjust it.

Prcatice.xlsx
DEF
1NUMBEREXPECTED OUTPUTVBA
250660000XZS76XZS76
3423780000XZS13XZS13
439370XZS81XZS81
539335NOT FOUNDNot Found
63520913XZS71XZS71
732486NOT FOUNDNot Found
81807631NOT FOUNDXZS56
913603399617XZS99XZS99
105257000XZS05XZS05
Sheet4
Cell Formulas
RangeFormula
F2:F10F2=RLOOKUP(D2,$B$2:$B$29,$A$2:$A$29,3)


VBA Code:
Function RLOOKUP(xVal As String, r As Range, lk As Range, threshold As Integer)
Dim AR() As Variant:    AR = r.Value2
Dim HS As Integer:      HS = 0
Dim TMP As Integer:     TMP = 0
Dim IDX As Integer:     IDX = 0

Dim b() As Byte:        b = xVal
Dim t() As Byte:

For i = 1 To UBound(AR)
    t = CStr(AR(i, 1))
    TMP = 0
    If UBound(b) < UBound(t) Then
        For j = 0 To UBound(b) Step 2
            If b(j) = t(j) Then TMP = TMP + 1
        Next j
    Else
        For j = 0 To UBound(t) Step 2
            If b(j) = t(j) Then TMP = TMP + 1
        Next j
    End If
    If TMP > HS Then
        HS = TMP
        IDX = i
    End If
Next i
If HS <= threshold Then
    RLOOKUP = "Not Found"
Else
    RLOOKUP = lk.Cells(IDX).Value
End If
End Function
 
Upvote 1
I'm almost there, but I have a discrepancy for 1807631, since there are two values which start with 1807 in column B.

Book8
ABCDEF
1CODENUMBERNUMBERCURRENT OUTPUTEXPECTED OUTPUT
2XZS90506700050660000XZS76XZS76
3XZS785063005423780000XZS13XZS13
4XZS7650660000239370XZS81XZS81
5XZS293247539335NOT FOUNDNOT FOUND
6XZS3942060800553520913XZS71XZS71
7XZS6442060232486NOT FOUNDNOT FOUND
8XZS35423791807631XZS56NOT FOUND
9XZS13423780013603399617XZS99XZS99
10XZS9642378076725257000XZS05XZS05
11XZS1642397030
12XZS055257000
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
18XZS271809859
19XZS991360339
20XZS41136033
21XZS981802692
22XZS60180825
23XZS93393533000
24XZS583932059
25XZS313.93206E+11
26XZS89352021
27XZS46352621
28XZS71352091000
29XZS8139370
Sheet1
Cell Formulas
RangeFormula
E2:E10E2=IFERROR( LOOKUP(FALSE,ISERROR(INDEX($A$2:$A$29,MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0))), INDEX($A$2:$A$29,MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0))), "NOT FOUND")
 
Upvote 0
Oops, slight amendment to be made, there seems to be an error when entered as an array formula, which is needed for Excel 2019. Made it slightly more compact as well. Previous discrepancy still stands.

Book9
ABCDEF
1CODENUMBERNUMBERCURRENT OUTPUTEXPECTED OUTPUT
2XZS90506700050660000XZS76XZS76
3XZS785063005423780000XZS13XZS13
4XZS7650660000239370XZS81XZS81
5XZS293247539335NOT FOUNDNOT FOUND
6XZS3942060800553520913XZS71XZS71
7XZS6442060232486NOT FOUNDNOT FOUND
8XZS35423791807631XZS56NOT FOUND
9XZS13423780013603399617XZS99XZS99
10XZS9642378076725257000XZS05XZS05
11XZS1642397030
12XZS055257000
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
18XZS271809859
19XZS991360339
20XZS41136033
21XZS981802692
22XZS60180825
23XZS93393533000
24XZS583932059
25XZS313.93206E+11
26XZS89352021
27XZS46352621
28XZS71352091000
29XZS8139370
Sheet1
Cell Formulas
RangeFormula
E2:E10E2=IFERROR( INDEX($A$2:$A$29, LOOKUP(2,SIGN(MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0)), MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0))), "NOT FOUND")
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 1
Solution
This VBA custom function seems pretty close. But I'm not sure about your logic for the "Not Found" ones. Especially with, '1807631'. Because it matches more than the first 3 characters. I added threshold as an argument in the function so you can adjust it.

Prcatice.xlsx
DEF
1NUMBEREXPECTED OUTPUTVBA
250660000XZS76XZS76
3423780000XZS13XZS13
439370XZS81XZS81
539335NOT FOUNDNot Found
63520913XZS71XZS71
732486NOT FOUNDNot Found
81807631NOT FOUNDXZS56
913603399617XZS99XZS99
105257000XZS05XZS05
Sheet4
Cell Formulas
RangeFormula
F2:F10F2=RLOOKUP(D2,$B$2:$B$29,$A$2:$A$29,3)


VBA Code:
Function RLOOKUP(xVal As String, r As Range, lk As Range, threshold As Integer)
Dim AR() As Variant:    AR = r.Value2
Dim HS As Integer:      HS = 0
Dim TMP As Integer:     TMP = 0
Dim IDX As Integer:     IDX = 0

Dim b() As Byte:        b = xVal
Dim t() As Byte:

For i = 1 To UBound(AR)
    t = CStr(AR(i, 1))
    TMP = 0
    If UBound(b) < UBound(t) Then
        For j = 0 To UBound(b) Step 2
            If b(j) = t(j) Then TMP = TMP + 1
        Next j
    Else
        For j = 0 To UBound(t) Step 2
            If b(j) = t(j) Then TMP = TMP + 1
        Next j
    End If
    If TMP > HS Then
        HS = TMP
        IDX = i
    End If
Next i
If HS <= threshold Then
    RLOOKUP = "Not Found"
Else
    RLOOKUP = lk.Cells(IDX).Value
End If
End Function
Thanks so much, it seems to work. I think your output is correct, including the output for 1807631, that your code is taking the first appearence.
 
Upvote 0
Oops, slight amendment to be made, there seems to be an error when entered as an array formula, which is needed for Excel 2019. Made it slightly more compact as well. Previous discrepancy still stands.

Book9
ABCDEF
1CODENUMBERNUMBERCURRENT OUTPUTEXPECTED OUTPUT
2XZS90506700050660000XZS76XZS76
3XZS785063005423780000XZS13XZS13
4XZS7650660000239370XZS81XZS81
5XZS293247539335NOT FOUNDNOT FOUND
6XZS3942060800553520913XZS71XZS71
7XZS6442060232486NOT FOUNDNOT FOUND
8XZS35423791807631XZS56NOT FOUND
9XZS13423780013603399617XZS99XZS99
10XZS9642378076725257000XZS05XZS05
11XZS1642397030
12XZS055257000
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
18XZS271809859
19XZS991360339
20XZS41136033
21XZS981802692
22XZS60180825
23XZS93393533000
24XZS583932059
25XZS313.93206E+11
26XZS89352021
27XZS46352621
28XZS71352091000
29XZS8139370
Sheet1
Cell Formulas
RangeFormula
E2:E10E2=IFERROR( INDEX($A$2:$A$29, LOOKUP(2,SIGN(MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0)), MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0))), "NOT FOUND")
Press CTRL+SHIFT+ENTER to enter array formulas.
Thanks so much. It works pretty fine even without entering as array formula. It seems to be a complex formula to me :). Nice!
 
Upvote 0
Oops, slight amendment to be made, there seems to be an error when entered as an array formula, which is needed for Excel 2019. Made it slightly more compact as well. Previous discrepancy still stands.

Book9
ABCDEF
1CODENUMBERNUMBERCURRENT OUTPUTEXPECTED OUTPUT
2XZS90506700050660000XZS76XZS76
3XZS785063005423780000XZS13XZS13
4XZS7650660000239370XZS81XZS81
5XZS293247539335NOT FOUNDNOT FOUND
6XZS3942060800553520913XZS71XZS71
7XZS6442060232486NOT FOUNDNOT FOUND
8XZS35423791807631XZS56NOT FOUND
9XZS13423780013603399617XZS99XZS99
10XZS9642378076725257000XZS05XZS05
11XZS1642397030
12XZS055257000
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
18XZS271809859
19XZS991360339
20XZS41136033
21XZS981802692
22XZS60180825
23XZS93393533000
24XZS583932059
25XZS313.93206E+11
26XZS89352021
27XZS46352621
28XZS71352091000
29XZS8139370
Sheet1
Cell Formulas
RangeFormula
E2:E10E2=IFERROR( INDEX($A$2:$A$29, LOOKUP(2,SIGN(MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0)), MATCH(MID(D2,1,ROW(INDIRECT("4:"&LEN(D2))))&"*",$B$2:$B$29&"",0))), "NOT FOUND")
Press CTRL+SHIFT+ENTER to enter array formulas.
I'd like to ask if I have 2 databases to search from (A1:Bn and D1:Dn), should I repeat twice your formula with both input ranges? for example. I'd like to search first in DB1 and if not found there, search in DB2. Is possible to use your formula with both ranges but searching first in A1:An and if not found search in D1:Dn but entering a single formula? I hope make sense

CODE_DB1NUMBERCODE_DB2NUMBERNUMBER TO SEARCHOUTPUT
XZS905067000XZS6746727579950660000
XZS785063005XZS6826879423780000
XZS76506600002XZS692687639370
XZS2932475XZS702482539335
XZS394206080055XZS71248263520913
XZS64420602XZS72963931032486
XZS3542379XZS7396393111807631
XZS134237800XZS749639311113603399617
XZS964237807672XZS7512462535257000
XZS1642397030XZS761268789
XZS055257000XZS771345990
XZS47421903
XZS15421905
XZS404206020899
XZS5618072141180
XZS0318072144650
 
Upvote 0
The easiest way is to probably nest the second lookup in another IFERROR().

I'm surprised it works without an array formula. Are you still using Excel 2019? There are more compact options (i.e. XLOOKUP, XMATCH) if you've upgraded to excel 2021 and later

Book1
ABCDEFGH
1CODE_DB1NUMBERCODE_DB2NUMBERNUMBER TO SEARCHOUTPUT
2XZS905067000XZS6746727579950660000XZS76
3XZS785063005XZS6826879423780000XZS13
4XZS76506600002XZS692687639370NOT FOUND
5XZS2932475XZS702482539335NOT FOUND
6XZS394206080055XZS71248263520913NOT FOUND
7XZS64420602XZS72963931032486NOT FOUND
8XZS3542379XZS7396393111807631XZS56
9XZS134237800XZS749639311113603399617NOT FOUND
10XZS964237807672XZS7512462535257000XZS05
11XZS1642397030XZS761268789
12XZS055257000XZS771345990
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
Sheet2
Cell Formulas
RangeFormula
H2:H10H2=IFERROR( IFERROR( INDEX($A$2:$A$17, LOOKUP(2,SIGN(MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$B$2:$B$17&"",0)), MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$B$2:$B$17&"",0))), INDEX($D$2:$D$12, LOOKUP(2,SIGN(MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$E$2:$E$12&"",0)), MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$E$2:$E$12&"",0)))), "NOT FOUND")
 
Upvote 0
The easiest way is to probably nest the second lookup in another IFERROR().

I'm surprised it works without an array formula. Are you still using Excel 2019? There are more compact options (i.e. XLOOKUP, XMATCH) if you've upgraded to excel 2021 and later

Book1
ABCDEFGH
1CODE_DB1NUMBERCODE_DB2NUMBERNUMBER TO SEARCHOUTPUT
2XZS905067000XZS6746727579950660000XZS76
3XZS785063005XZS6826879423780000XZS13
4XZS76506600002XZS692687639370NOT FOUND
5XZS2932475XZS702482539335NOT FOUND
6XZS394206080055XZS71248263520913NOT FOUND
7XZS64420602XZS72963931032486NOT FOUND
8XZS3542379XZS7396393111807631XZS56
9XZS134237800XZS749639311113603399617NOT FOUND
10XZS964237807672XZS7512462535257000XZS05
11XZS1642397030XZS761268789
12XZS055257000XZS771345990
13XZS47421903
14XZS15421905
15XZS404206020899
16XZS5618072141180
17XZS0318072144650
Sheet2
Cell Formulas
RangeFormula
H2:H10H2=IFERROR( IFERROR( INDEX($A$2:$A$17, LOOKUP(2,SIGN(MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$B$2:$B$17&"",0)), MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$B$2:$B$17&"",0))), INDEX($D$2:$D$12, LOOKUP(2,SIGN(MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$E$2:$E$12&"",0)), MATCH(MID(G2,1,ROW(INDIRECT("4:"&LEN(G2))))&"*",$E$2:$E$12&"",0)))), "NOT FOUND")
Excellent. It works, the only issue is that now appears the meassage "Calculating (8 threads).... X%" in the right low corner of excel sheet and gets stuck for several seconds. I think is because the numbers to search in Column G are about 2500, Data base1 it has more than 4k rows and DB2 has almost 4k rows.

I don't have the option to change to Excel 2021 since is a corporative MS office.
 
Upvote 0

Forum statistics

Threads
1,224,816
Messages
6,181,138
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top