Analyzing list of text fields

Marty446 · Dec 6, 2018

Good morning/afternoon/evening,

I work for an insurance company and I'm running into the following problem:

I'm trying to standardise 'free-text clauses' in our insurance system, clauses for insurance products which, for one reason or another, were not entered using a standard clause.

The difficulty here is that, for over 20 years now, people have been using (slightly) different text to describe similar problems, or (slightly) similar text to describe different problems.

I want to analyse my list of cells containing text (some 17,000 rows).

I aim to find out:

- how to group the texts by similarity, and how to adjust the definition of 'similarity', e.g. 60% similar, 70% similar etc.

- how to ascertain percentages of similar text, e.g. 120 groups of similar text, group 1 comprising 4% of the data, group 2 comprising 6% etc.

- how to group the data based on key words, e.g. group all text clauses which contain words 'X', 'Y', 'Z'.

- how to exclude certain words or phrases, such as 'and' or 'the client has indicated', from the formulas used for the above, so that they are ignored when calculating similarity.

Due to the sensitive nature of the data, I cannot post any examples of the data I am working with.

Any and all tips will be greatly appreciated, thanks in advance.

Akuini · Dec 8, 2018

Marty446 said:
I want to analyse my list of cells containing text (some 17,000 rows).

I aim to find out:

- how to group the texts by similarity, and how to adjust the definition of 'similarity', e.g. 60% similar, 70% similar etc.

- how to ascertain percentages of similar text, e.g. 120 groups of similar text, group 1 comprising 4% of the data, group 2 comprising 6% etc.

- how to group the data based on key words, e.g. group all text clauses which contain words 'X', 'Y', 'Z'.

- how to exclude certain words or phrases, such as 'and' or 'the client has indicated', from the formulas used for the above, so that they are ignored when calculating similarity.

There's an add-in from Microsoft called 'Fuzzy Lookup Add-In for Excel'.
I've never used it, but it might help.

https://www.microsoft.com/en-us/download/details.aspx?id=15011
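
If the add-in does not fit, the four goals in the question could also be approached outside Excel. The following is only a minimal sketch in Python, not something proposed in this thread, and it assumes the clauses have been exported from the workbook into a list of strings. It uses the standard library's difflib.SequenceMatcher as one possible similarity measure. The stop words, the stop phrase list, the sample clauses and all function names are made up for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lists; replace with the words and phrases you want ignored.
STOP_PHRASES = ["the client has indicated"]
STOP_WORDS = {"and", "the"}


def normalise(text):
    """Lower-case the text, drop stop phrases and stop words, collapse spacing."""
    cleaned = text.lower()
    for phrase in STOP_PHRASES:
        cleaned = cleaned.replace(phrase, " ")
    words = re.findall(r"[a-z0-9]+", cleaned)
    return " ".join(w for w in words if w not in STOP_WORDS)


def similarity(a, b):
    """Return a score between 0 and 1 for two already normalised strings."""
    return SequenceMatcher(None, a, b).ratio()


def group_by_similarity(texts, threshold=0.7):
    """Greedy grouping: a text joins the first group whose representative
    scores at least `threshold`; otherwise it starts a new group."""
    groups = []  # list of (normalised representative, [original texts])
    for text in texts:
        key = normalise(text)
        for representative, members in groups:
            if similarity(key, representative) >= threshold:
                members.append(text)
                break
        else:
            groups.append((key, [text]))
    return groups


def group_shares(groups):
    """Return (representative, count, percentage of all rows) per group."""
    total = sum(len(members) for _, members in groups)
    return [(rep, len(members), 100.0 * len(members) / total)
            for rep, members in groups]


def contains_all(text, keywords):
    """True if every keyword occurs as a whole word in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return all(k.lower() in words for k in keywords)


# Invented sample clauses, standing in for the real (confidential) data.
clauses = [
    "The client has indicated that flood damage is excluded",
    "Flood damage is excluded",
    "Theft cover applies only to locked premises",
    "Theft cover applies to locked premises only",
]

groups = group_by_similarity(clauses, threshold=0.7)
shares = group_shares(groups)
for representative, count, percent in shares:
    print(f"{percent:5.1f}%  {count} rows  {representative}")

flood_clauses = [c for c in clauses if contains_all(c, ["flood", "excluded"])]
```

Changing `threshold` to 0.6 or 0.8 adjusts what counts as 'similar'. Because each text is compared only against one representative per group, the result depends on row order, and a character-based score can put clauses with similar wording but different meaning into the same group, so the groups would still need a manual review.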
