Compare two columns for Exact match and Similar matches

tommyb

Board Regular
Joined
Aug 3, 2008
Messages
66
Hi all,

I have two spreadsheets (SheetA and SheetB). SheetA has 4 columns, with ColumnA having names. SheetB has only 1 column, ColumnA, containing a different set of names.

I need to match SheetA.ColumnA with SheetB.ColumnA, and all matches need the data from SheetA copied to a new spreadsheet. However, I also need near matches, for example, Liberty Services should match with Liberty Insurance, and John & Sons Ltd should match with John Sons Ltd

I'm an intermediate with VBA, but I honestly just don't know where to start with this. The ideal solution would be a button with a macro that is placed on one of the sheets. Ideally I don't want to use the Excel Fuzzy add-in because this may be rolled out to multiple users.

Any help will be greatly appreciated!
 

Excel Facts

What is the fastest way to copy a formula?
If A2:A50000 contain data. Enter a formula in B2. Select B2. Double-click the Fill Handle and Excel will shoot the formula down to B50000.
Code:
Sub myMacro()
    sht1 = "SheetA"
    sht2 = "SheetB"
    sht3 = "Sheet3"
    lastRowA = Sheets(sht1).Range("A" & Rows.Count).End(xlUp).Row
    lastRowB = Sheets(sht2).Range("A" & Rows.Count).End(xlUp).Row
    i = 1  'i = 2 if you have headers in sht1
    iii = 1  'iii = 2 if you have headers in sht3
    Do Until i > lastRowA
        ii = 1  'ii = 2 if you have headers in sht2
        Do Until ii > lastRowB
            sht1Value = Sheets(sht1).Range("A" & i).Value
            sht2Value = Sheets(sht2).Range("A" & ii).Value
            myMatch = myMatchFunction(sht1Value, sht2Value)
            If myMatch = TRUE Then
                Sheets(sht3).Range("A" & iii).Value = Sheets(sht1).Range("A" & i).Value
                Sheets(sht3).Range("B" & iii).Value = Sheets(sht1).Range("B" & i).Value
                Sheets(sht3).Range("C" & iii).Value = Sheets(sht1).Range("C" & i).Value
                Sheets(sht3).Range("D" & iii).Value = Sheets(sht1).Range("D" & i).Value
                iii = iii + 1
            End If
            ii = ii + 1
        Loop
        i = i + 1
    Loop
End Sub
Take notice where I wrote myMatch = myMatchFunction(sht1Value, sht2Value)
I still have to post the myMatchFunction to this thread for this macro to work.
Finding near matches is a very complicated task and I wanted to do it seperately from your other requirements. myMatchFunction will bring back a value of TRUE or FALSE. I have a macro that does something similar on my computer. If I'm lucky, it does exactly what you want so that I don't have to make a new one. I just woke up and need to have breakfast. Very soon I will post the myMatchFunction macro and then you can try out this code.
 
Upvote 0
Here's the function part of the code. Both codes are needed to work.
Code:
Function myMatchFunction(a, b)
    splitA = SPLIT(a, " ")
    splitB = SPLIT(b, " ")
    uA = UBOUND(splitA)
    uB = UBOUND(splitB)
    If uA > 0 AND uB > 0 Then
        i = 0
        Do Until i > uA
            ii = 0
            Do Until ii > uB
                If splitA(i) = splitB(ii) Then
                    c = c + 1
                End If
                ii = ii + 1
            Loop
            i = i + 1
        Loop
    ElseIf uA > 0 Then
        i = 0
        Do Until i > uA
            If splitA(i) = b Then
                c = c + 1
            End If
            i = i + 1
        Loop
    ElseIf uB > 0 Then
        i = 0
        Do Until i > uB
            If splitB(i) = a Then
                c = c + 1
            End If
            i = i + 1
        Loop
    ElseIf a = b Then
            myMatchFunction = TRUE
            Exit Function
    End If
    If (c >= uA - 1 AND c <= uA + 1) OR (c >= uB - 1 AND c <= uB + 1) Then
        myMatchFunction = TRUE
    Else
        myMatchFunction = FALSE
    End If 
End Function
 
Upvote 0
The only drawback with myMatchFunction is that if you have a string in the A column of sht1 or sht2 that has repeating values, it won't work. For example, "kool awesome kool" and "kool awesome". Those are near matches because 2 of the 3 words match. But my code will read that 3 match out of 3. So in this case, it doesn't make a difference because it outputs when we want it to. But lets make things more complicated and say the two strings are "kool awesome kool awesome kool" and "kool awesome kool awesome". As a human, I know that 4 out of 5 words match. The code however will think that 10 out of the 5 words match. And since the code is checking if there is a match with at most 1 word not matching, then this will not be considered a match and return FALSE when we want it to return TRUE. So in conclusion, if you expect to have strings with the same words repeating in a cell, then the myMatchFunction will not work for you.

There was another idea I had but it would rely on the 2 strings being in the same order. So for example, "kool awesome" and "kool beans awesome" are in the same order. "kool awesome" and "awesome kool beans" are not in the same order. Let me know if you prefer this method and I'll write the code. If you have mixed up words like that, then this method will not work for you.

Something I would advise to you is to create a new threaded discussion on just this matter of finding near matches in a function. That way you can still use the Sub myMacro I gave you and apply the new myMatchFunction to it. Use the myMatchFunction I gave you if you want, but see if anyone else has a better idea of how to get a near match. And make sure they put the results as TRUE or FALSE in a function so that myMacro will work.
 
Upvote 0
Disregard my last message. I had an idea so that we don't run into the problems I mentioned in my last message. Find the below code in myMatchFunction.
Code:
If splitA(i) = splitB(ii) Then
    c = c + 1
End If
Once you find the above code in myMatchFunction, replace it with the below code.
Code:
If splitA(i) = splitB(ii) Then
    c = c + 1
    splitB(ii) = ""
End If
What this will do is turn the value in splitB into a blank so that later it will not match any other values to it from splitA. Such a simple solution. Can't believe I didn't think of it sooner.
 
Upvote 0
Hi.

Wow, all I can say is thank you so much for taking the time to do write this out for me!

I've implemented the code, and I'm finding my matches are being quadrupled..?
 
Upvote 0
I got all mixed up with the function and decided to start over. This one is way easier. I'm pretty sure it works. I mean I tested it, but I didn't use a big dataset. Try it out. Let me know if something is wrong.
Code:
Sub myMacro()
    sht1 = "SheetA"
    sht2 = "SheetB"
    sht3 = "Sheet3"
    lastRowA = Sheets(sht1).Range("A" & Rows.Count).End(xlUp).Row
    lastRowB = Sheets(sht2).Range("A" & Rows.Count).End(xlUp).Row
    i = 1  'i = 2 if you have headers in sht1
    iii = 1  'iii = 2 if you have headers in sht3
    Do Until i > lastRowA
        ii = 1  'ii = 2 if you have headers in sht2
        Do Until ii > lastRowB
            sht1Value = Sheets(sht1).Range("A" & i).Value
            sht2Value = Sheets(sht2).Range("A" & ii).Value
            myMatch = myMatchFunction(sht1Value, sht2Value)
            If myMatch = True Then
                Sheets(sht3).Range("A" & iii).Value = Sheets(sht1).Range("A" & i).Value
                Sheets(sht3).Range("B" & iii).Value = Sheets(sht1).Range("B" & i).Value
                Sheets(sht3).Range("C" & iii).Value = Sheets(sht1).Range("C" & i).Value
                Sheets(sht3).Range("D" & iii).Value = Sheets(sht1).Range("D" & i).Value
                iii = iii + 1
            End If
            ii = ii + 1
        Loop
        i = i + 1
    Loop
End Sub
Function myMatchFunction(a, b)
    mySplitA = Split(a, " ")
    mySplitB = Split(b, " ")
    uBndA = UBound(mySplitA)
    uBndB = UBound(mySplitB)
    i = 0
    Do Until i > uBndA
        ii = 0
        Do Until ii > uBndB
            If mySplitA(i) = mySplitB(ii) Then
                mySplitB(ii) = ""
                myMatch = myMatch + 1
                Exit Do
            End If
            ii = ii + 1
        Loop
        i = i + 1
    Loop
    If myMatch <> Empty And myMatch >= uBndA And myMatch <= uBndA + 1 _
    And myMatch >= uBndB And myMatch <= uBndB + 1 Then
        myMatchFunction = True
    Else
        myMatchFunction = False
    End If
End Function
 
Upvote 0
With the code above. You can have a value in sheetA cell A2 that says "kool kids run" and have a value that says "kool run" in sheetB cell A50 or A100 or it doesn't matter and it will find it. It doesn't have to to be in SheetB cell A2 just like the original was in sheetA cell A2. If this is not what you wanted then let me know. I can make it so that the code only checks the same cell in both sheets for near matches. That is way easier than what I did. It just requires I remove the second Do Loop from the myMacro. Then change all the "ii" into "i". Then delete lastRowSht2 code because it isn't needed.
 
Last edited:
Upvote 0
I can't let that code work. Thats my problem : I post the actual problem in text : From SHEET1, compare if column "E" is EQUAL to SHEET2 column "F" AND
From SHEET1, compare if column "F" is EQUAL to SHEET2" column "I".
If its not found , write in SHEET1 column "G" : "Not found"
If its found do this :
Compare SHEET1 if column "I" is EQUAL to SHEET2 column "AB" AND
Compare SHEET1 if column "J" is EQUAL to SHEET2 column "AC"
If BOTH compared values are the same then write in SHEET1 column "H" : "OK"
THEN
Compare SHEET1 if column "Y" is EQUAL to SHEET2 column "BW" AND
Compare SHEET1 if column "Z" is EQUAL to SHEET2 column "BX"
If BOTH compared values are the same then write in SHEET1 column "I" : "OK"
NOTE THAT : Columns "F" - "I" - "E" - "F" contains the company name. All other columns contains € values.
 
Upvote 0
I don't know what you mean. How about you paste your dataset on here so I know what you are talking about and I don't have to use my imagination to put together a dataset. I'm not that imaginative. Furthermore, in your recent comment you are talking about a "Sheet1". I thought you didn't have a sheet1? You said in your original post that you had a sheetA and a sheetB and wanted the output to go in Sheet3. From what I can tell in your above comment, you are not searching through column A as you stated in your original post. You are searching in a lot of other columns. You just made this super complicated. So now I don't want to spend much of my time going back and forth with you about what needs to happen so here's what I need from you. I need your SheetA dataset and your SheetB dataset. I also need a dataset of what your Sheet3 should look like after you run the macro. This will avoid further confusion about how to write the macro.
 
Upvote 0

Forum statistics

Threads
1,222,689
Messages
6,167,647
Members
452,127
Latest member
jayneecm

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top