A comprehensive fuzzy match (counting bananas!)

Stuck1

Board Regular
Joined: Sep 3, 2009
Messages: 73

Hi all,

This is a tough one to explain, but I’ll do my best as I’ve not found anyone who can offer a solution.

I have a list of measures that a number of organizations are monitoring, and I've been asked to find out which organizations are measuring similar things. There are thousands of measures, so I’ve used the fuzzy match plug-in to identify similarities in the text, with the threshold set at 0.6.

The problem is that I now need to state categorically who measures similar topics. However, the way the fuzzy match works means that some similar measures might be matched for some organizations and not for others, because of differences in how the measures are worded.

In my example (see below), we’ve got one organization measuring ‘Banana supply’, and the fuzzy match shows that another 3 are measuring the same sort of thing. However, further down the list I’ve got an organization measuring ‘Supply of local bananas’, and the match against this measure has revealed that a further organization is measuring something along the same lines. So, if I wanted to state who has an interest in counting bananas, I should be listing a total of 5 organizations. I need to somehow amalgamate the two groups.

This is a very simple example. I’m guessing that I won’t be able to avoid doing a manual check at some point, but if anyone can give me any hints as to how I can cut hours out of this task I’d really, really appreciate it.

Let me know if this isn’t clear and I’ll try and clarify.

[TABLE="width: 636"]
[TR][TD]Organisation[/TD][TD]Measure[/TD][TD]Matched organisation[/TD][TD]Similar measure[/TD][/TR]
[TR][TD]Org 1[/TD][TD]Banana supply[/TD][TD]Org 2[/TD][TD]Supply of bananas[/TD][/TR]
[TR][TD][/TD][TD][/TD][TD]Org 3[/TD][TD]Banana Volume[/TD][/TR]
[TR][TD][/TD][TD][/TD][TD]Org 4[/TD][TD]The number of bananas[/TD][/TR]
[TR][TD]Org 2[/TD][TD]Supply of local bananas[/TD][TD]Org 1[/TD][TD]Banana supply[/TD][/TR]
[TR][TD][/TD][TD][/TD][TD]Org 3[/TD][TD]Banana Volume[/TD][/TR]
[TR][TD][/TD][TD][/TD][TD]Org 4[/TD][TD]The number of bananas[/TD][/TR]
[TR][TD][/TD][TD][/TD][TD]Org 5[/TD][TD]Supply of yellow bananas[/TD][/TR]
[/TABLE]
