Find "brand" keywords in free text fields

TrimFunction

New Member
Joined
Jan 9, 2018
Messages
18
background: I'm building a reference file of branded product names. My aim is to have a clean list that can be used to search through a free text fields from a CRM system.

I want to find brand-names, but do not want common words like "account" or "management", since that will return almost every entry. the list of terms i search for is usually about 300+, using {=INDEX($B$2:$B$300,MATCH(1,COUNTIF(I3,"*"&$A$2:$A$300&"*"),0))}.

Question: What is the best way to "scrub" out the brand names from free text?

my current process is start with the list:

[TABLE="width: 311"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Original Product Name[/TD]
[/TR]
[TR]
[TD]Accounts Receivable, JD Edwards EnterpriseOne[/TD]
[/TR]
[TR]
[TD]IBM Kenexa Employee Assessments[/TD]
[/TR]
[TR]
[TD]Catalog Management, PeopleSoft[/TD]
[/TR]
[TR]
[TD]SAP Ariba Invoice Management

Text to columns, sort, then manually read through to determine which are "brand" names:

Column A Column B
[TABLE="width: 194"]
<colgroup><col><col></colgroup><tbody>[TR]
[TD]Accounts[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Ariba[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Assessments[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Catalog[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Edwards[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Employee[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]EnterpriseOne[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]IBM[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Invoice[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]JD[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Kenexa[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Management[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Management,[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]PeopleSoft[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Receivable,[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]SAP[/TD]
[TD]USE[/TD]
[/TR]
</tbody>[/TABLE]

I've been using countifs to determine how frequent the word occurs and LEN to quickly go through short words, but this is not efficient.

Is there a better way to do this?[/TD]
[/TR]
[TR]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
 

Excel Facts

Will the fill handle fill 1, 2, 3?
Yes! Type 1 in a cell. Hold down Ctrl while you drag the fill handle.
Is there a better way to do this?
Probably.

But from what you have posted, I can't really tell what you have, where, and exactly what you are trying to do.

Can you give us some small dummy sample data and the expected results and clearly show or describe where various data is and where the results should go?
Better (clearer) ways to present your sample data are given in a link in my signature block below.
 
Upvote 0
How about PowerQuery (Get&Transform) ?

[Table="width:, class:head"]
[tr=bgcolor:#FFFFFF][td]Table1[/td][td][/td][td]Table2 (user defined)[/td][td][/td][td]Result table[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#5B9BD5]Product name[/td][td][/td][td=bgcolor:#5B9BD5]Common[/td][td][/td][td=bgcolor:#70AD47]Keywords[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#DDEBF7]Accounts Receivable, JD Edwards EnterpriseOne[/td][td][/td][td=bgcolor:#DDEBF7]Accounts[/td][td][/td][td=bgcolor:#E2EFDA]Ariba[/td][/tr]

[tr=bgcolor:#FFFFFF][td]IBM Kenexa Employee Assessments[/td][td][/td][td]Assessments[/td][td][/td][td]Edwards[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#DDEBF7]Catalog Management, PeopleSoft[/td][td][/td][td=bgcolor:#DDEBF7]Catalog[/td][td][/td][td=bgcolor:#E2EFDA]EnterpriseOne[/td][/tr]

[tr=bgcolor:#FFFFFF][td]SAP Ariba Invoice Management[/td][td][/td][td]Employee[/td][td][/td][td]IBM[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td=bgcolor:#DDEBF7]Invoice[/td][td][/td][td=bgcolor:#E2EFDA]JD[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td]Management[/td][td][/td][td]Kenexa[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td=bgcolor:#DDEBF7]Receivable[/td][td][/td][td=bgcolor:#E2EFDA]PeopleSoft[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]SAP[/td][/tr]
[/table]


Table1: source
Table2: user-defined words that should not be taken into account

for table1:
Code:
[SIZE=1]let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Replaced Value" = Table.ReplaceValue(Source,",","",Replacer.ReplaceText,{"Product name"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Product name", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Product name")
in
    #"Split Column by Delimiter"[/SIZE]
for result table:
Code:
[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Renamed Columns" = Table.RenameColumns(#"Sorted Rows",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]

example file
 
Upvote 0
I forgot about duplicates, so

for result table
Code:
[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,179
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top