Compare addresses and show percentage of similarities

fokeiro

New Member
Joined
Jun 29, 2016
Messages
15
hi guys. im tryign to compare addresses on column A vs column B and needs to show what is the percentage of similarity. this is my code so far, one sheet called CM, sheet2 called ABC, sheet 3 called MATCH copy both to here and show percentage. my code is below but im not sure why something completely different show a big % of similarity. see attached image and code. ty guys.


excelhelp.png

[TABLE="width: 344"]
<tbody>[TR]
[TD][TABLE="width: 344"]
<tbody>[TR]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
[TABLE="width: 344"]
<tbody>[TR]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[/TR]
</tbody>[/TABLE]
<strike></strike>
Code:
 Application.ScreenUpdating = False
LR = Sheets("CM").Cells(Rows.Count, 1).End(xlUp).Row
With Sheets("MATCH")
.Range("B1:B" & LR).Value = Sheets("CM").Range("B1:B" & LR).Value
.Range("C1:C" & LR).Value = Sheets("ABC").Range("B1:B" & LR).Value
.Range("A1:A" & LR).Value = Sheets("CM").Range("A1:A" & LR).Value
For r = 2 To LR
If Trim(.Cells(r, "A").Value) = Trim(.Cells(r, "B").Value) Then
.Cells(r, "C").Value = 1
GoTo Done
End If
Str1 = Trim(.Cells(r, "B").Value)
Str2 = Trim(.Cells(r, "C").Value)
Len1 = Len(Str1)
Len2 = Len(Str2)
Same = 0
For c = 1 To Len2
If InStr(1, Str1, Mid(Str2, c, 1), 1) Then
Same = Same + 1
Str1 = Replace(Str1, Mid(Str2, c, 1), "*", 1, 1)
End If
Next c
.Cells(r, "D").Value = Same / Len1
Done:
Next r
End With
Application.ScreenUpdating = True
 
for some reason im getting a error 6 - overflow on this line

.Cells(r, "E").Value = (maxlen - LevDist(str1, str2)) / maxlen

but if I comment it out seems still works without the error
 
Upvote 0

Excel Facts

Select all contiguous cells
Pressing Ctrl+* (asterisk) will select the "current region" - all contiguous cells in all directions.
shg tried the ratio but while with lev dist i get 85% in this example with ratio i get 23% , only extra is # 436 on column B string

[TABLE="width: 792"]
<colgroup><col><col></colgroup><tbody>[TR]
[TD]101 E ROMANA ST PENSACOLA FL 32502[/TD]
[TD]101 E ROMANA ST # 436 PENSACOLA FL 32502[/TD]
[/TR]
</tbody>[/TABLE]
 
Upvote 0
Yes it does.

OTOH, 1871 NW SOUTH RIVER DR UNIT 17 MIAMI FL 33125 and 108-136 MARTIN LUTHER KING JR BLVD APT 1317 NEWARK NJ 07104 have an edit distance ratio of 34%, which seems like a awful lot of similarity, vs a qTR of 0.33%.

Reckon it depends on which best serves your purpose.
 
Last edited:
Upvote 0
I just keep getting a overflow on this line . Seems the code wants to do 9 rows if there is only let say 3 it throws in a overflow I guess trying to check with null cells

Code:
Sub Test1()
Dim str1 As String, str2 As String
    Application.ScreenUpdating = False
    LR = Sheets("CM").Cells(Rows.Count, 1).End(xlUp).Row
    With Sheets("MATCH")
        .Range("B1:B" & LR).Value = Sheets("CM").Range("B1:B" & LR).Value
        .Range("C1:C" & LR).Value = Sheets("ABC").Range("B1:B" & LR).Value
        .Range("A1:A" & LR).Value = Sheets("CM").Range("A1:A" & LR).Value
        LR = 9
        For r = 2 To LR
            If Trim(.Cells(r, "B").Value) = Trim(.Cells(r, "C").Value) Then
                .Cells(r, "D").Value = 1
                GoTo Done
            End If
            str1 = Trim(.Cells(r, "B").Value)
            str2 = Trim(.Cells(r, "C").Value)
            Len1 = Len(str1)
            Len2 = Len(str2)
            Same = 0
            For c = 1 To Len2
                If InStr(1, str1, Mid(str2, c, 1), 1) Then
                    Same = Same + 1
                    str1 = Replace(str1, Mid(str2, c, 1), "*", 1, 1)
                End If
            Next c
            .Cells(r, "D").Value = Same / Len1
Done:
            str1 = LCase(.Cells(r, "B"))
            str2 = LCase(.Cells(r, "C"))
            maxlen = IIf(Len(str1) > Len(str2), Len(str1), Len(str2))
  [COLOR=#ff0000]         .Cells(r, "E").Value = (maxlen - LevDist(str1, str2)) / maxlen  <----overflow[/COLOR]
        Next r
    End With
    Application.ScreenUpdating = True
End Sub
 
Last edited:
Upvote 0
I wondered if I should put a check in there. You're getting the error because both str1 and str2 are empty. So the maxlen is 0, so you're dividing by zero. You can just put something like this:

Code:
            str1 = LCase(.Cells(r, "B"))
            str2 = LCase(.Cells(r, "C"))
            maxlen = IIf(Len(str1) > Len(str2), Len(str1), Len(str2))
            If maxlen = 0 Then
              .Cells(r, "E").Value = "No addresses"
           Else
             .Cells(r, "E").Value = (maxlen - LevDist(str1, str2)) / maxlen
           End If

 
Upvote 0
eric you the man, that did the trick, figure it was diving by 0 and saw LR defines how many rows it shold work on, so now with more rows on LR, still no overflow, perfect thank you !!!
 
Upvote 0

Forum statistics

Threads
1,224,847
Messages
6,181,321
Members
453,032
Latest member
Pauh

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top