Find "brand" keywords in free text fields

TrimFunction · Oct 8, 2018

background: I'm building a reference file of branded product names. My aim is to have a clean list that can be used to search through a free text fields from a CRM system.

I want to find brand-names, but do not want common words like "account" or "management", since that will return almost every entry. the list of terms i search for is usually about 300+, using {=INDEX($B$2:$B$300,MATCH(1,COUNTIF(I3,"*"&$A$2:$A$300&"*"),0))}.

Question: What is the best way to "scrub" out the brand names from free text?

my current process is start with the list:

[TABLE="width: 311"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Original Product Name[/TD]
[/TR]
[TR]
[TD]Accounts Receivable, JD Edwards EnterpriseOne[/TD]
[/TR]
[TR]
[TD]IBM Kenexa Employee Assessments[/TD]
[/TR]
[TR]
[TD]Catalog Management, PeopleSoft[/TD]
[/TR]
[TR]
[TD]SAP Ariba Invoice Management

Text to columns, sort, then manually read through to determine which are "brand" names:

Column A Column B
[TABLE="width: 194"]
<colgroup><col><col></colgroup><tbody>[TR]
[TD]Accounts[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Ariba[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Assessments[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Catalog[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Edwards[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Employee[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]EnterpriseOne[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]IBM[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Invoice[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]JD[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Kenexa[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Management[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]Management,[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]PeopleSoft[/TD]
[TD]USE[/TD]
[/TR]
[TR]
[TD]Receivable,[/TD]
[TD]SKIP-Common[/TD]
[/TR]
[TR]
[TD]SAP[/TD]
[TD]USE[/TD]
[/TR]
</tbody>[/TABLE]

I've been using countifs to determine how frequent the word occurs and LEN to quickly go through short words, but this is not efficient.

Is there a better way to do this?[/TD]
[/TR]
[TR]
[TD][/TD]
[/TR]
</tbody>[/TABLE]

Peter_SSs · Oct 9, 2018

TrimFunction said:
Is there a better way to do this?

Probably.

But from what you have posted, I can't really tell what you have, where, and exactly what you are trying to do.

Can you give us some small dummy sample data and the expected results and clearly show or describe where various data is and where the results should go?
Better (clearer) ways to present your sample data are given in a link in my signature block below.

sandy666 · Oct 9, 2018

How about PowerQuery (Get&Transform) ?

[Table="width:, class:head"]
[tr=bgcolor:#FFFFFF][td]Table1[/td][td][/td][td]Table2 (user defined)[/td][td][/td][td]Result table[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#5B9BD5]Product name[/td][td][/td][td=bgcolor:#5B9BD5]Common[/td][td][/td][td=bgcolor:#70AD47]Keywords[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#DDEBF7]Accounts Receivable, JD Edwards EnterpriseOne[/td][td][/td][td=bgcolor:#DDEBF7]Accounts[/td][td][/td][td=bgcolor:#E2EFDA]Ariba[/td][/tr]

[tr=bgcolor:#FFFFFF][td]IBM Kenexa Employee Assessments[/td][td][/td][td]Assessments[/td][td][/td][td]Edwards[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#DDEBF7]Catalog Management, PeopleSoft[/td][td][/td][td=bgcolor:#DDEBF7]Catalog[/td][td][/td][td=bgcolor:#E2EFDA]EnterpriseOne[/td][/tr]

[tr=bgcolor:#FFFFFF][td]SAP Ariba Invoice Management[/td][td][/td][td]Employee[/td][td][/td][td]IBM[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td=bgcolor:#DDEBF7]Invoice[/td][td][/td][td=bgcolor:#E2EFDA]JD[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td]Management[/td][td][/td][td]Kenexa[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td=bgcolor:#DDEBF7]Receivable[/td][td][/td][td=bgcolor:#E2EFDA]PeopleSoft[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]SAP[/td][/tr]
[/table]

Table1: source
Table2: user-defined words that should not be taken into account

for table1:

Code:

[SIZE=1]let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Replaced Value" = Table.ReplaceValue(Source,",","",Replacer.ReplaceText,{"Product name"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Replaced Value", {{"Product name", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Product name")
in
    #"Split Column by Delimiter"[/SIZE]

for result table:

Code:

[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Renamed Columns" = Table.RenameColumns(#"Sorted Rows",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]

example file

sandy666 · Oct 9, 2018

I forgot about duplicates, so

for result table

Code:

[SIZE=1]let
    Source = Table.NestedJoin(Table1,{"Product name"},Table2,{"Common"},"Table3",JoinKind.LeftAnti),
    #"Removed Other Columns" = Table.SelectColumns(Source,{"Product name"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"Product name", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Duplicates",{{"Product name", "Keywords"}})
in
    #"Renamed Columns"[/SIZE]

Find "brand" keywords in free text fields

TrimFunction

New Member

Peter_SSs

MrExcel MVP, Moderator

sandy666

Banned - Rules violations

sandy666

Banned - Rules violations

Similar threads

Share this page

Find "brand" keywords in free text fields

TrimFunction

New Member

Peter_SSs

MrExcel MVP, Moderator

sandy666

Banned - Rules violations

sandy666

Banned - Rules violations

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock