Lookup for cell contains most of the words in a sentence.

  • Thread starter Thread starter Legacy 143009
  • Start date Start date
L

Legacy 143009

Guest
Hello everyone,

I am looking for a formula input to D2. The aim is to find the cell in Column A that contains most of the words given in C2 and return Column B value.

Column AColumn BColumn CColumn D
1The quick brown1Given sentence:Desired result:
2fox2The quick brown fox jumps over the lazy dog.3
3jumps over the lazy3
4dog4
 
Hi Peter,

Surely there will be no difference between picking either of the same amount of matches since they both meets the condition and there is no way not know which one is which. So, in this case first match will be adequate.

Most of the time words will be unique. Let me tell you about my actual need. I have a list of products and sometimes product name varies in purchase list. So let say, my product neme is "Electrical Insulated Gloves" and, I have "Electrical Tape" and "Insulated Gloves" in my purchase list. I want to match the quantity of "Insulation Gloves" in this case.

Briefly, most of the time the words are unique and in order. But there can be some skipped words. So, "Electrical Gloves" is also possible. All I need is some kind of a partial match.
 
Upvote 0

Excel Facts

Spell Check in Excel
Press F7 to start spell check in Excel. Be careful, by default, Excel does not check Capitalized Werds (whoops)
Does that mean that the column A list might have a fairly small number of words in a single cell, say max of 3 or 4 (or 5)?
 
Upvote 0
To be more precise, it is very very unlikely to have both "Electrical Gloves" and "Electrical Work Glows" in the same purchase order so that type scenario can be neglected. But both have equal significance in general.
 
Upvote 0
It won't be safe to limit like that.
Then in that case, with your version of Excel, I think that I would be pusuing a vba solution. Is that acceptable?
Also, your goal was to find the row with the highest number of word matches and return the value from column B. What should happen when there is not one clear winner like there has been in all your examples so far? For example ..

Flashbond.xlsm
ABCD
1Electrical Insulated1Desired result:
2Insulated Gloves2Electrical Insulated Gloves??
3Electrical Gloves3
Sheet1
 
Upvote 0
Flashbond.xlsm
ABCD
1Electrical Insulated1Desired result:
2Insulated Gloves2Electrical Insulated Gloves??
3Electrical Gloves3
Sheet1
It is a very unlikely scenario. First match is ok. I can also write a vba also but your approaches always more efficient than mines. I will be looking forward to see. Thanks for your time and effort.
 
Upvote 0
First match is ok.
OK, here is one way employing a UDF. If there is any punctuation involved that could introduce some problems.

VBA Code:
Function LookupMost(r As Range, s As String) As Variant
  Dim RX As Object
  Dim a As Variant
  Dim i As Long, nMax As Long, n As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.IgnoreCase = True
  a = r.Value
  For i = 1 To UBound(a)
    RX.Pattern = "\b(" & Replace(a(i, 1), " ", "|") & ")\b"
    n = RX.Execute(s).Count
    If n > nMax Then
      nMax = n
      LookupMost = a(i, 2)
    End If
  Next i
End Function

Flashbond.xlsm
ABCD
1Electrical1Result
2Insulated Gloves2Electrical Insulated Gloves2
3Electrical Gloves3
4
5The quick brown over lazy dog1Result
6fox2The quick brown fox jumps over the lazy dog.1
7jumps over the lazy3
8dog4
9
10jumps over the lazy1Result
11fox2The quick brown fox jumps over the lazy dog.3
12The quick brown over lazy dog3
13dog4
Sheet1
Cell Formulas
RangeFormula
D2D2=LookupMost(A1:B3,C2)
D6,D11D6=LookupMost(A5:B8,C6)
 
Upvote 0
Solution
Any punctuation can be neglected.
That is simple to say but not so simple to tell Excel how to do it. ;)
Do you actually have any punctuation in your data? If so, could you post a few sets of representative sample data that shows the sort of punctuation variation that can occur?
 
Upvote 0
It is all in Turkish :) Mostly "/", "-" and "."
Like:
Electrical Glowes L/XL
Or
Anti-static Rope 7.00MM

They are mostly the part of the word. So main delimeter is " ". Thats why I said "can be neglected".
 
Upvote 0

Forum statistics

Threads
1,223,227
Messages
6,170,853
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top