Similar texts with minor differences - How to get just ONE (a standard one)

heytoluca (New Member, joined May 26, 2014, 9 messages)
Hello,


I have a very large database that contains names with minor differences in column A, such as:
A
EXCEL AMERICAN COMPANY S.A.
EXCEL AMERICAN COMPANY SA.
EXCEL AMERICAN COMPANY, S.A.
ASSOCIATION, INDUSTRIAL
ASSOCIATION INDUSTRIAL
L.A. COMPANY CO
LA COMPANY CO


As you can see, this makes a mess when dealing with numbers associated with them and makes analysis more difficult.
What I've been trying to figure out is how to quickly get in column B a standard name for each one, for example:
B
EXCEL AMERICAN COMPANY S.A.
ASSOCIATION INDUSTRIAL
L.A. COMPANY CO


I was thinking of using a combination of the FIND and EXTRACT formulas, or maybe LEFT as well. (Maybe VBA?)
Please use your Excel wizardry to help a fellow comrade!


David
 

Hello David

You can eliminate the comma and the period to get a "standard" name.

| Row | A | B |
| 1 | EXCEL AMERICAN COMPANY S.A. | EXCEL AMERICAN COMPANY SA |
| 2 | EXCEL AMERICAN COMPANY SA. | EXCEL AMERICAN COMPANY SA |
| 3 | EXCEL AMERICAN COMPANY, S.A. | EXCEL AMERICAN COMPANY SA |
| 4 | ASSOCIATION, INDUSTRIAL | ASSOCIATION INDUSTRIAL |
| 5 | ASSOCIATION INDUSTRIAL | ASSOCIATION INDUSTRIAL |
| 6 | L.A. COMPANY CO | LA COMPANY CO |
| 7 | LA COMPANY CO | LA COMPANY CO |

Cell: Formula
B1: =SUBSTITUTE(SUBSTITUTE(A1,".",""),",","")
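
For anyone doing the same cleanup outside Excel, here is a minimal Python sketch of what the nested SUBSTITUTE formula does: it removes every period and every comma from the name. The function name `strip_punctuation` and the sample list are only illustrative, not part of the workbook.

```python
def strip_punctuation(name: str) -> str:
    """Mirror SUBSTITUTE(SUBSTITUTE(A1,".",""),",",""): drop periods and commas."""
    return name.replace(".", "").replace(",", "")


# Sample names taken from column A in the question.
names = [
    "EXCEL AMERICAN COMPANY S.A.",
    "EXCEL AMERICAN COMPANY SA.",
    "EXCEL AMERICAN COMPANY, S.A.",
    "ASSOCIATION, INDUSTRIAL",
    "ASSOCIATION INDUSTRIAL",
    "L.A. COMPANY CO",
    "LA COMPANY CO",
]

cleaned = [strip_punctuation(name) for name in names]
for original, result in zip(names, cleaned):
    print(f"{original!r} -> {result!r}")
```

With these seven inputs the cleaned list should contain only three distinct values, matching column B in the table above.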


Thanks a lot! This effectively helped standardize the names.

However, I still have a name or two that are not standardized. For example:

DURACORP S.A.
DURACORP S. A.

Results in:

DURACORP SA
DURACORP S A

Is there anything I can add to the formula to get:

DURACORP SA

On both?



Thanks a LOT
David
 
Hi David

Code:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),",",""),"S A","SA")
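
A rough Python sketch of the same idea, in case it helps to see the steps spelled out: strip periods and commas, then collapse "S A" into "SA". One thing to watch is that a plain replacement hits "S A" anywhere in the name, not just the company suffix. The second function is a hypothetical variant (not something suggested in this thread) that only touches a trailing "S A".

```python
import re


def normalize_sa(name: str) -> str:
    """Mirror the formula: drop periods and commas, then replace "S A" with "SA"."""
    cleaned = name.replace(".", "").replace(",", "")
    return cleaned.replace("S A", "SA")


def normalize_sa_suffix(name: str) -> str:
    """Hypothetical variant: only collapse "S A" when it is the final token."""
    cleaned = name.replace(".", "").replace(",", "").strip()
    # \b keeps the match from starting inside a word such as "LOS".
    return re.sub(r"\bS A$", "SA", cleaned)


for sample in ["DURACORP S.A.", "DURACORP S. A.", "LOS ANGELES CO"]:
    print(sample, "->", normalize_sa(sample), "|", normalize_sa_suffix(sample))
```

Both functions turn the two DURACORP spellings into the same text. They differ on a name like "LOS ANGELES CO", where the plain replacement also joins "LOS" and "ANGELES".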
 

Many thanks!

You made me realize I can add as many "filters" as I want to that formula.

However (sorry if I'm becoming annoying), I just noticed there are a few names that can't fit into the one-size-fits-all formula we're developing, for example:

EXCEL CORPORATION LUMBER DBE
EXCEL CORPORATION LU DBE
QUICK CO AND FIXTURES
QUICK COAND FIXTURES

Any help with these? Maybe a LEFT formula?

Thanks a LOT!
David
 
Hi

I would suggest eliminating the spaces and then taking the n leftmost characters.
In my example n is 10.
Code:
=LEFT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),",","")," ",""),10)
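
As a sketch of how this key behaves, here is the same logic in Python: remove periods, commas and spaces, keep the first n characters, and group the original names by that key. The names `match_key` and `groups` are made up for illustration. Note that any two names sharing their first 10 compacted characters land in the same group, so a short n can merge companies that are really different.

```python
from collections import defaultdict


def match_key(name: str, n: int = 10) -> str:
    """Mirror LEFT(SUBSTITUTE(...),10): strip ". , space", keep the first n characters."""
    compact = name.replace(".", "").replace(",", "").replace(" ", "")
    return compact[:n]


# Sample names from the post above.
names = [
    "EXCEL CORPORATION LUMBER DBE",
    "EXCEL CORPORATION LU DBE",
    "QUICK CO AND FIXTURES",
    "QUICK COAND FIXTURES",
]

# Group the original spellings under their shared key.
groups = defaultdict(list)
for name in names:
    groups[match_key(name)].append(name)

for key, members in groups.items():
    print(key, "->", members)
```

The four sample names collapse into two groups. Raising n makes the key stricter, and lowering it makes it more forgiving but more likely to merge unrelated names.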
 

Hey! Thanks a lot for your answer and your time. This worked great!

Thanks again!
David
 