Fuzzy Matching - new version plus explanation

al_b_cnu

Well-known Member
Joined
Jul 18, 2003
Messages
4,546
It has been a while since I originally posted my Fuzzy matching UDF’s on the board, and several variants have appeared subsequently.

I thought it time to ‘put the record straight’ & post a definitive version which contains slightly more efficient code, and better matching algorithms, so here it is.

Firstly, I must state that the Fuzzy matching algorithms are very CPU hungry, and should be used sparingly. If for instance you require to lookup a match for a string which starts with, contains or ends with a specified value, this can be performed far more efficiently using the MATCH function:
Fuzzy Examples.xls
ABCDE
1Starts WithEndsContains
2BilljelenBill
3Mr Bill Jelen433
4Bill Jelen
5Joe Bloggs
6Fred Smith
MATCH Example


... Continued ...
 
Hi Greg, OK, first cut ready.
The macro is in a 'master' workbook, which also contains a parameter sheet to minimise code changes.
The parameter sheet currently looks like this:
GetBarcodes.xlsm
ABC
1KeywordValueComment
2Group HeadingBrandHeading in Database and Reseller files of column which MUST match exactly for an entry to be a candidate for comparison
3Match HeadingTitleHeading in Database and Reseller files of column to be fuzzy matched
4DB QuantityQuantityHeading in Database of column containing quantity
5DB BarcodeBarcodeHeading in Database of column containing Barcode
6# Matches per Entry5Return the best 5 matching barcodes
7Min % Match5%Ignore any matches in 'Title' column below 5%
8Match Algorithm4Set to '2' to match pairs then triplets then quads etc, or '4' for Levenstein match. Algorithm 4 is more accurate but slower.
Parameters
Cell Formulas
RangeFormula
C6C6="Return the best "&B6&" matching barcodes"
C7C7="Ignore any matches in '"&B3&"' column below "&TEXT(B7,"#%")
Cells with Conditional Formatting
CellConditionCell FormatStop If True
A:CExpression=$A1<>""textNO


When you run the macro it will prompt for the input Reseller files, and create a new workbook containing a sheet for the results of each reseller. The sheet names will be the terminal name of the Reseller file (which I assume to be an Excel file), so, for example if the input file is "c:\documents\XYZ Company.xlsx", the sheetname will be "XYZ Company".
The parameter sheet specifies that the 5 best %age match barcodes are to be output, and the % match must be at least 5%.
The results for Reseller 1, data as supplied by you, are:
Book15
ABCDEFGHIJKLMN
1BrandTitleQuantityBarcodeBarcode #1#1 % MatchBarcode #2#2 % MatchBarcode #3#3 % MatchBarcode #4#4 % MatchBarcode #5#5 % Match
2GarnierRulldeodorant Garnier mineral action 50ml50ml360054247511254.00%360054247512952.00%360054247513652.00%360054247517452.00%360054080154846.00%
3Coca-ColaSoft Coca-Cola Zero 2L2l544900013184377.00%500011265137973.00%474305000004555.00%544900013180555.00%500011265130050.00%
4FELIXFelix Magushapu kaste Sweet&Sour 500g ananassiga500g475002250409840.00%474002904899240.00%474002904952440.00%474002907263838.00%474002907262135.00%
5ColgateHambapasta Triple Action, COLGATE, 75 ml75ml6920354835971100.00%871895122680774.00%871895122671573.00%871895105086067.00%871895125001767.00%
Reseller 1

When the macro exits, it leaves the workbook for you to save as appropriate.
Is this what you're looking for?
 
Upvote 0

Excel Facts

Using Function Arguments with nested formulas
If writing INDEX in Func. Arguments, type MATCH(. Use the mouse to click inside MATCH in the formula bar. Dialog switches to MATCH.
Hi Alan,
wow this is just impressive actually!
And yes, I think it's exactly what I'm looking for 😊
 
Upvote 0
OK, in addition to thecode below you will need to add the FuzzyVLookup code module into yuor 'Driver' worksheet., you will also need to create two sheets:
* Parameters: format and contents as above
* Database: As per your example. NOTE that the data MUST be sorted into ascending Brand sequence. For efficiency the code will search the first and last row for each brand and thus reduce the matching to within those boundaries.

You will need to copy the Reseller files into the same folder, when you run the macro you will be prompted for the file(s). Use [Ctrl-Click] or [Shift-Click] to select multiple files.

VBA Code:
Option Explicit

Type Params
    GroupHeading As String
    MatchHeading As String
    MatchesCount As Long
    MinPercent As Single
    Algorithm As Single
    DBQuantity As String
    DBBarcode As String
End Type
Dim mudtParameters As Params

Type BrandBounds
    BrandName As String
    BrandLB As Long
    BrandUB As Long
End Type

Dim mdicBrandBounds As Object

Type DatabaseData
    Brand As String
    Title As String
    Qty As String
    BarCode As String
End Type
Dim mudtDatabase() As DatabaseData

Type BrandMatches
'    BrandName As String
    BarCode As String
    BrandPercent As Single
End Type

Sub click_GetBarcodes()

Dim lUB As Long
Dim lLastUB  As Long
Dim lRow As Long

Dim saCurName() As String
Dim sCurTerminalName As String
Dim sPrevBrand As String
Dim sCurBrand As String

Dim vCurFilename As Variant
Dim vResellerFiles As Variant
Dim vaDatabase As Variant
Dim vaBrandLimits As Variant
Dim vaDicBrandItem As Variant

Dim WBcur As Workbook
Dim wbResults As Workbook

Dim wsDB As Worksheet
Dim wsCur As Worksheet

mudtParameters = GetParameters()

Application.DisplayAlerts = False

'*********************************
'**Get database data into array **
'*********************************
Set wsDB = ThisWorkbook.Sheets("Database")
With wsDB.UsedRange
    vaDatabase = wsDB.Range("A1").Resize(.Rows.Count, .Columns.Count).Value
End With

'*************************************************************************************
'** Set up Brands Pointers dictionary.                                              **
'** Note that the database entries MUST be sorted into ascending brand sequence!    **
'*************************************************************************************
Set mdicBrandBounds = Nothing
Set mdicBrandBounds = CreateObject("Scripting.Dictionary")

lUB = 0
sPrevBrand = ""
ReDim vaDicBrandItem(1 To 3)
For lRow = 2 To UBound(vaDatabase, 1)
    sCurBrand = NormaliseName(vaDatabase(lRow, 1))
    If sCurBrand <> "" Then
        If sCurBrand < sPrevBrand Then
            MsgBox prompt:="Database file MUST be sorted into ascending Brand sequence", _
                    Buttons:=vbOKOnly + vbCritical, _
                    Title:="Database Data Error"
            Set mdicBrandBounds = Nothing
            Exit Sub

        ElseIf sCurBrand = sPrevBrand Then
            '** Update End row for current brand
            vaDicBrandItem(3) = lRow
        Else
            '** Here if new brand entry row encountered **
            If sPrevBrand <> "" Then
                On Error Resume Next
                mdicBrandBounds.Add Key:=sPrevBrand, Item:=vaDicBrandItem
                On Error GoTo 0
            End If
            ReDim vaDicBrandItem(1 To 3)
            vaDicBrandItem(1) = sCurBrand
            vaDicBrandItem(2) = lRow
            vaDicBrandItem(3) = lRow
           
            sPrevBrand = sCurBrand
           
        End If
    End If
Next lRow
On Error Resume Next
mdicBrandBounds.Add Key:=sPrevBrand, Item:=vaDicBrandItem        '** Write final entry **
On Error GoTo 0

'**************************************************
'** Store database in the udt array mudtDatabase **
'**************************************************
Call PopulateDatabaseEntries(DBArray:=vaDatabase)

'***********************************
'** Get input reseller file names **
'***********************************
vResellerFiles = Application.GetOpenFilename(FileFilter:="Excel Files (*.xls*), *.xls*", _
                                            Title:="Please select Reseller Excel file(s)", _
                                            MultiSelect:=True)
If IsArray(vResellerFiles) = False Then Exit Sub

'****************************
'** Create output workbook **
'****************************
Set wbResults = Nothing
On Error Resume Next
Set wbResults = Workbooks.Add
On Error GoTo 0
If wbResults Is Nothing Then
    If Err.Number > 0 Then
        MsgBox prompt:=Err.Description, _
                Buttons:=vbOKOnly + vbCritical, _
                Title:="Cannot create Results Workbook"
       
    End If
    Exit Sub
End If

'****************************
'** Process Reseller files **
'****************************
For Each vCurFilename In vResellerFiles
   
    Call ProcessInputFile(DBData:=vaDatabase, _
                          InputFileName:=vCurFilename, _
                          ResultsWB:=wbResults)
   
Next vCurFilename
On Error Resume Next
wbResults.Sheets(1).Delete
On Error GoTo 0

Application.StatusBar = False

End Sub

Private Sub PopulateDatabaseEntries(ByVal DBArray As Variant)
'************************************
'** Create entries in mudtDataBase **
'************************************
Dim lRow As Long
Dim lCol As Long
Dim lColBrand As Long
Dim lColTitle As Long
Dim lColQty As Long
Dim lColBarCode As Long
Dim lEntriesPtr As Long

Dim sCurHeading As String

For lCol = 1 To UBound(DBArray, 2)
    sCurHeading = NormaliseName(CStr(DBArray(1, lCol)))
    Select Case sCurHeading
        Case NormaliseName(mudtParameters.GroupHeading)
            lColBrand = lCol
        Case NormaliseName(mudtParameters.MatchHeading)
            lColTitle = lCol
        Case NormaliseName(mudtParameters.DBQuantity)
            lColQty = lCol
        Case NormaliseName(mudtParameters.DBBarcode)
            lColBarCode = lCol
    End Select
Next lCol

ReDim mudtDatabase(1 To 1)
lEntriesPtr = 1
mudtDatabase(1).Brand = ""
On Error Resume Next
For lRow = 2 To UBound(DBArray, 1)
    If Trim$(DBArray(lRow, lColBrand)) <> "" Then
        lEntriesPtr = lEntriesPtr + 1
        ReDim Preserve mudtDatabase(1 To lEntriesPtr)
        On Error Resume Next
        mudtDatabase(lEntriesPtr).BarCode = CStr(DBArray(lRow, lColBarCode))
        mudtDatabase(lEntriesPtr).Brand = NormaliseName(DBArray(lRow, lColBrand))
        mudtDatabase(lEntriesPtr).Qty = CStr(DBArray(lRow, lColQty))
        mudtDatabase(lEntriesPtr).Title = CStr(DBArray(lRow, lColTitle))
        On Error GoTo 0
    End If
Next lRow

End Sub

Private Function GetParameters() As Params
'***********************************************
'** Return parameters from sheet 'parameters' **
'***********************************************
Dim lRow As Long
Dim lEndRow As Long

Dim sCurKeyword As String
Dim sCurValue As String

Dim vaParamData As Variant

'** Store parameter data into array **
vaParamData = ThisWorkbook.Sheets("Parameters").Range("A1").CurrentRegion.Resize(, 2).Value

For lRow = 2 To UBound(vaParamData, 1)
    sCurKeyword = LCase$(Replace(vaParamData(lRow, 1), " ", ""))    '** Remove all spaces and convert to lowercase
    Select Case sCurKeyword
        Case ""
            '** Ignore empty keyword cells **
        Case "groupheading"
            GetParameters.GroupHeading = NormaliseName(vaParamData(lRow, 2))
        Case "matchheading"
            GetParameters.MatchHeading = NormaliseName(vaParamData(lRow, 2))
        Case "#matchesperentry"
            GetParameters.MatchesCount = Val(vaParamData(lRow, 2))
        Case "min%match"
            GetParameters.MinPercent = Val(vaParamData(lRow, 2))
        Case "matchalgorithm"
            GetParameters.Algorithm = Val(vaParamData(lRow, 2))
        Case "dbquantity"
            GetParameters.DBQuantity = CStr(vaParamData(lRow, 2))
        Case "dbbarcode"
            GetParameters.DBBarcode = CStr(vaParamData(lRow, 2))
    End Select
Next lRow

End Function
Private Function NormaliseName(ByVal NameX As String) As String
'*************************************************************************************
'** Remove all but "abcdefghijklmnopqrstuvwxyz0123456789", and convert to lowercase **
'*************************************************************************************
Dim lPtr As Long

Dim sChar As String
Dim sResult As String

For lPtr = 1 To Len(NameX)
    sChar = LCase$(Mid$(NameX, lPtr, 1))
    If InStr("abcdefghijklmnopqrstuvwxyz0123456789", sChar) > 0 Then sResult = sResult & sChar
Next lPtr
NormaliseName = sResult
End Function

Private Sub ProcessInputFile(ByVal DBData As Variant, _
                             ByVal InputFileName As Variant, _
                             ByRef ResultsWB As Workbook)

Dim lCol As Long
Dim lRow As Long
Dim lPtr As Long
Dim lMustMatchCol As Long
Dim lMatchCol As Long
Dim lLB As Long
Dim lUB As Long
Dim lDBRow As Long
Dim lPtr1 As Long
Dim lPtr2 As Long
Dim lSheetCount As Long

Dim sCurHeading As String
Dim sTerminalName As String
Dim sCurResellerBrand As String
Dim sCurResellerTitle As String
Dim sCurDBTitle As String

Dim sngCurMatchPercent As Single

Dim udtBrandmatches() As BrandMatches

Dim vaCurData As Variant
Dim vaMatchResultsData As Variant
Dim vaCurDicBrandItem As Variant

Dim WB As Workbook

Dim WS As Worksheet
Dim wsResults As Worksheet

lMustMatchCol = 0
lMatchCol = 0

sTerminalName = GetTerminalName(InputFileName)

Application.StatusBar = "Processing " & sTerminalName
Application.ScreenUpdating = False

On Error Resume Next
Set WB = Nothing
Set WB = Workbooks.Open(Filename:=InputFileName, _
                        UpdateLinks:=True, _
                        ReadOnly:=True, _
                        corruptload:=xlRepairFile)
If Err.Number > 0 Then
    MsgBox prompt:=Err.Description, Buttons:=vbOKOnly + vbCritical, Title:="Unable to open file " & sTerminalName
End If
On Error GoTo 0
If WB Is Nothing Then Exit Sub

Set WS = WB.Sheets(1)
With WS.UsedRange
    vaCurData = WS.Range("A1").Resize(.Rows.Count, .Columns.Count).Value                '** Get input Reseller data
End With

'** Initialise results array **
ReDim vaMatchResultsData(1 To UBound(vaCurData, 1), 1 To mudtParameters.MatchesCount * 2) '** set size of array for Results
For lCol = 1 To mudtParameters.MatchesCount
    lPtr = ((lCol - 1) * 2) + 1
    vaMatchResultsData(1, lPtr) = "Barcode #" & lCol
    vaMatchResultsData(1, lPtr + 1) = "#" & lCol & " % Match"
Next lCol

'** Check that we have the 2 required heading columns in row 1**
For lCol = 1 To UBound(vaCurData, 2)
    sCurHeading = NormaliseName(CStr(vaCurData(1, lCol)))
    If sCurHeading = mudtParameters.GroupHeading Then lMustMatchCol = lCol
    If sCurHeading = mudtParameters.MatchHeading Then lMatchCol = lCol
Next lCol
If lMustMatchCol > 0 _
And lMatchCol > 0 Then
    '** Process the reseller **
    lSheetCount = ResultsWB.Worksheets.Count
    Set wsResults = ResultsWB.Sheets.Add(after:=ResultsWB.Sheets(lSheetCount))        '** Add a new worksheet to the results workbook
    On Error Resume Next
    wsResults.Name = sTerminalName              '** set the sheetname to the reseller file terminal name
    On Error GoTo 0
'    ReDim vaResults(1 To UBound(vaCurData, 1), 1 To mudtParameters.MatchesCount * 2)
    '** MAIN LOOP **
    For lRow = 2 To UBound(vaCurData, 1)
        With Application
            .ScreenUpdating = True
            .StatusBar = "Processing Reseller file " & sTerminalName & ", row " & lRow & " of " & UBound(vaCurData, 1)
            .ScreenUpdating = False
        End With
        sCurResellerBrand = NormaliseName(vaCurData(lRow, lMustMatchCol))
        sCurResellerTitle = vaCurData(lRow, lMatchCol)
        ReDim vaCurDicBrandItem(1 To 3)
        On Error Resume Next
        vaCurDicBrandItem = mdicBrandBounds.Item(sCurResellerBrand)
        On Error GoTo 0

        If sCurResellerBrand = vaCurDicBrandItem(1) Then
       
            ReDim udtBrandmatches(1 To mudtParameters.MatchesCount + 1)     '>>
            For lPtr = 1 To UBound(udtBrandmatches)                         '>> Initialise array
                udtBrandmatches(lPtr).BarCode = ""                          '>> NOTE that we have a dummy array entry at the end
                udtBrandmatches(lPtr).BrandPercent = 0                      '>>
            Next lPtr                                                       '>>
           
            sCurResellerTitle = vaCurData(lRow, lMatchCol)
            For lDBRow = vaCurDicBrandItem(2) To vaCurDicBrandItem(3)
                sngCurMatchPercent = FuzzyPercent(String1:=sCurResellerTitle, _
                                                  String2:=mudtDatabase(lDBRow).Title, _
                                                  Algorithm:=mudtParameters.Algorithm, _
                                                  Normalised:=False)
                If sngCurMatchPercent >= mudtParameters.MinPercent Then
                    For lPtr1 = 1 To mudtParameters.MatchesCount
                        If sngCurMatchPercent > udtBrandmatches(lPtr1).BrandPercent Then
                            For lPtr2 = mudtParameters.MatchesCount - 1 To lPtr1 Step -1
                                If udtBrandmatches(lPtr2).BarCode <> "" Then
                                    udtBrandmatches(lPtr2 + 1).BarCode = udtBrandmatches(lPtr2).BarCode
                                    udtBrandmatches(lPtr2 + 1).BrandPercent = udtBrandmatches(lPtr2).BrandPercent
                                End If
                            Next lPtr2
                            udtBrandmatches(lPtr1).BarCode = mudtDatabase(lDBRow).BarCode
                            udtBrandmatches(lPtr1).BrandPercent = sngCurMatchPercent
                            Exit For
                        End If
                    Next lPtr1
                End If
            Next lDBRow
            For lCol = 1 To mudtParameters.MatchesCount
                If udtBrandmatches(lCol).BrandPercent > 0 Then
                    lPtr = ((lCol - 1) * 2) + 1
                    vaMatchResultsData(lRow, lPtr) = "'" & udtBrandmatches(lCol).BarCode
                    vaMatchResultsData(lRow, lPtr + 1) = udtBrandmatches(lCol).BrandPercent
                End If
            Next lCol
        End If
    Next lRow
   
    '** Store results into worksheet **
    wsResults.Range("A1").Resize(UBound(vaCurData, 1), UBound(vaCurData, 2)).Value = vaCurData
    With wsResults.Range("A1").Offset(, UBound(vaCurData, 2))
        For lCol = 1 To mudtParameters.MatchesCount
            lPtr = ((lCol - 1) * 2)
'            .Offset(, lPtr).Columns.NumberFormat = "@"
            With .Offset(, lPtr + 1).Resize(wsResults.Rows.Count, 1)
                .NumberFormat = "0.00%"
                .HorizontalAlignment = xlLeft
            End With
        Next lCol
        .Resize(UBound(vaMatchResultsData, 1), UBound(vaMatchResultsData, 2)).Value = vaMatchResultsData
    End With
    wsResults.UsedRange.Resize(1).Font.Bold = True
    wsResults.Cells.EntireColumn.AutoFit
   
End If

WB.Close savechanges:=False

Application.ScreenUpdating = True

End Sub

Private Function GetTerminalName(ByVal Filenamex As Variant) As String
'*****************************************************************
'** Return final element of filename (excluding file extension) **
'*****************************************************************
Dim lUB As Long

Dim saSplit() As String
Dim saSplit2() As String

saSplit = Split(Filenamex, Delimiter:=Application.PathSeparator)
lUB = UBound(saSplit)
saSplit2 = Split(saSplit(lUB), ".")
GetTerminalName = saSplit2(0)
End Function
The macro itself is called "click_GetBarcodes", for convenience I placed a button to call the macro in the Parameters sheet.

Note the progress commentary in the bottom Left corner.

Let me know how you get on

Best wishes

Alan
 
Upvote 0
Hi Alan,
I've tried to follow your instructions but I'm really a newbie regarding VBA and also I don't know how to code, my apologies :(

But hopefully you can help me, so this is what I've done:
1) I created a folder with 4 files for now.
2) I've added the Database file + 1 reseller file looking exactly the same as in my example.
3) I created a FuzzyLookup.xlsm with a VBA code that I found from a previous post you made in this thread.
4) I created a Parameters.xlsm with the code you put in your previous answer.

But when I launch GetBarcodes I get an Error (see screenshot)
So I definitely messed up somewhere but I'm not sure where?!

One precision however, I'm running Excel on MacOS.

I've added the files on my Dropbox so can take a look
Fuzzy Lookup

Thanks a lot for your help!
Greg

Screenshot 2024-04-13 at 23.49.17.jpg
 
Upvote 0
Hi Greg,
Try creating two worksheets 'Parameters' and 'Database' in the driver workbook and populate them with the parameters and the database data, ensuring that the Database data is sorted into Brand sequence.
If you still get issues click the [Debug] button and note the failing line of code.
Cheers
Alan
 
Upvote 0
Hi Alan,
Ok, so I tried to add the database into the same worksheet as the parameters and now I got a different error which seems to be MacOS related.

Error is: Runtime Error 429 - ActiveX Component Can't Create Object
=> The line is: Set mdicBrandBounds = CreateObject("Scripting.Dictionary")

What should I do?
Thanks!

Greg
 
Upvote 0
Hmmm, it seems that the dictionary object doesn't work too well on Mac.
I'll have to replace it with an alternative
 
Upvote 0
Hi Greg, yes I found that, but I can't test it, so I've replaced the dictionary with a collection which should work.
I've also added 2 more parameter entries to allow you to optionally show the Database Title and/or Quantity in the results, which I thought may be useful for checking purposes.
Here's the Parameters sheet contents:
GetBarcodes V2.xlsm
ABC
1KeywordValueComment
2Group HeadingBrandHeading in Database and Reseller files of column which MUST match exactly for an entry to be a candidate for comparison
3Match HeadingTitleHeading in Database and Reseller files of column to be fuzzy matched
4DB QuantityQuantityHeading in Database of column containing quantity
5DB BarcodeBarcodeHeading in Database of column containing Barcode
6# Matches per Entry5Return the best 5 matching barcodes
7Min % Match5%Ignore any matches in 'Title' column below 5%
8Match Algorithm4Set to '2' to match pairs then triplets then quads etc, or '4' for Levenstein match. Algorithm 4 is more accurate but slower.
9Show DB TitleYesData item in Database to be shown in results Value must start with 'Y' to be included
10Show DB QuantityNoData item in Database to be shown in results Value must start with 'Y' to be included
Parameters
Cell Formulas
RangeFormula
C6C6="Return the best "&B6&" matching barcodes"
C7C7="Ignore any matches in '"&B3&"' column below "&TEXT(B7,"#%")
Cells with Conditional Formatting
CellConditionCell FormatStop If True
A:CExpression=$A1<>""textNO

Here's some partial results:
Book3
ABCDEFGHIJ
1BrandTitleQuantityBarcodeBarcode #1#1 % Match#1 DB titleBarcode #2#2 % Match#2 DB title
2GARNIERRulldeo.GARNIER Action Control 50ml50ml360054247511260.00%Rulldeodorant Mineral Action Control, GARNIER, 50 ml360054247517446.00%Rulldeodorant Action Thermic Women, GARNIER, 50 ml
3Coca-ColaSoft Drink COCA-COLA Zero 2L2l544900013184361.00%Coca-Cola Zero 2L500011265137957.00%Coca-Cola Zero 1L
4FELIXMagushapu kaste ananassiga FELIX 500g500g900029583013049.00%Pastakaste ürdiga, FELIX, 360 g475002250511846.00%Pastakaste, FELIX, 500 g
5ColgateCOLGATE HAMBAPASTA TRIPLE ACTION 75ML75ml871895126573861.00%Colgate hambapasta lastele kids 3-5a 50ml692035483597152.00%Hambapasta Triple Action, COLGATE, 75 ml
Reseller 3


Here's the code:
VBA Code:
Option Explicit

Type Params
    GroupHeading As String
    MatchHeading As String
    MatchesCount As Long
    MinPercent As Single
    Algorithm As Single
    DBQuantity As String
    DBBarcode As String
    ShowTitle As Boolean
    ShowQty As Boolean
End Type
Dim mudtParameters As Params

Type BrandBounds
    BrandName As String
    BrandLB As Long
    BrandUB As Long
End Type

Dim mcolBrandbounds As Collection

Type DatabaseData
    Brand As String
    Title As String
    Qty As String
    BarCode As String
End Type
Dim mudtDatabase() As DatabaseData

Type BarCodeMatches
    BarCode As String
    BrandPercent As Single
    MatchText As String
    Qty As String
End Type

Sub click_GetBarcodes()

Dim lUB As Long
Dim lLastUB  As Long
Dim lRow As Long

Dim saCurName() As String
Dim sCurTerminalName As String
Dim sPrevBrand As String
Dim sCurBrand As String

Dim vCurFilename As Variant
Dim vResellerFiles As Variant
Dim vaDatabase As Variant
Dim vaBrandLimits As Variant
Dim vaDicBrandItem As Variant
Dim vaBrandBounds As Variant

Dim udtBrandBound As BrandBounds

Dim WBcur As Workbook
Dim wbResults As Workbook

Dim wsDB As Worksheet
Dim wsCur As Worksheet

mudtParameters = GetParameters()

Application.DisplayAlerts = False

'*********************************
'**Get database data into array **
'*********************************
Set wsDB = ThisWorkbook.Sheets("Database")
With wsDB.UsedRange
    vaDatabase = wsDB.Range("A1").Resize(.Rows.Count, .Columns.Count).Value
End With

'*************************************************************************************
'** Set up Brands bounds collection                                              **
'** Note that the database entries MUST be sorted into ascending brand sequence!    **
'*************************************************************************************
Set mcolBrandbounds = New Collection
lUB = 0
sPrevBrand = ""
ReDim vaBrandBounds(1 To 3)
For lRow = 2 To UBound(vaDatabase, 1)
    sCurBrand = NormaliseName(vaDatabase(lRow, 1))
    If sCurBrand <> "" Then
        If sCurBrand < sPrevBrand Then
            MsgBox prompt:="Database file MUST be sorted into ascending Brand sequence", _
                    Buttons:=vbOKOnly + vbCritical, _
                    Title:="Database Data Error"
            Set mcolBrandbounds = Nothing
            Exit Sub

        ElseIf sCurBrand = sPrevBrand Then
            '** Update End row for current brand
            vaBrandBounds(3) = lRow
        Else
            '** Here if new brand entry row encountered **
            If sPrevBrand <> "" Then
                On Error Resume Next
                mcolBrandbounds.Add key:=sPrevBrand, Item:=vaBrandBounds
                On Error GoTo 0
            End If
            vaBrandBounds(1) = sCurBrand
            vaBrandBounds(2) = lRow
            vaBrandBounds(3) = lRow
            
            sPrevBrand = sCurBrand
            
        End If
    End If
Next lRow
On Error Resume Next
mcolBrandbounds.Add key:=sPrevBrand, Item:=vaBrandBounds        '** Write final entry **
On Error GoTo 0

'**************************************************
'** Store database in the udt array mudtDatabase **
'**************************************************
Call PopulateDatabaseEntries(DBArray:=vaDatabase)

'***********************************
'** Get input reseller file names **
'***********************************
vResellerFiles = Application.GetOpenFilename(FileFilter:="Excel Files (*.xls*), *.xls*", _
                                            Title:="Please select Reseller Excel file(s)", _
                                            MultiSelect:=True)
If IsArray(vResellerFiles) = False Then Exit Sub

'****************************
'** Create output workbook **
'****************************
Set wbResults = Nothing
On Error Resume Next
Set wbResults = Workbooks.Add
On Error GoTo 0
If wbResults Is Nothing Then
    If Err.Number > 0 Then
        MsgBox prompt:=Err.Description, _
                Buttons:=vbOKOnly + vbCritical, _
                Title:="Cannot create Results Workbook"
        
    End If
    Exit Sub
End If

'****************************
'** Process Reseller files **
'****************************
For Each vCurFilename In vResellerFiles
    
    Call ProcessInputFile(DBData:=vaDatabase, _
                          InputFileName:=vCurFilename, _
                          ResultsWB:=wbResults)
    
Next vCurFilename
On Error Resume Next
wbResults.Sheets(1).Delete
On Error GoTo 0

Application.StatusBar = False

End Sub

Private Sub PopulateDatabaseEntries(ByVal DBArray As Variant)
'************************************
'** Create entries in mudtDataBase **
'************************************
Dim lRow As Long
Dim lCol As Long
Dim lColBrand As Long
Dim lColTitle As Long
Dim lColQty As Long
Dim lColBarCode As Long
Dim lEntriesPtr As Long

Dim sCurHeading As String

For lCol = 1 To UBound(DBArray, 2)
    sCurHeading = NormaliseName(CStr(DBArray(1, lCol)))
    Select Case sCurHeading
        Case NormaliseName(mudtParameters.GroupHeading)
            lColBrand = lCol
        Case NormaliseName(mudtParameters.MatchHeading)
            lColTitle = lCol
        Case NormaliseName(mudtParameters.DBQuantity)
            lColQty = lCol
        Case NormaliseName(mudtParameters.DBBarcode)
            lColBarCode = lCol
    End Select
Next lCol

ReDim mudtDatabase(1 To 1)
lEntriesPtr = 1
mudtDatabase(1).Brand = ""
On Error Resume Next
For lRow = 2 To UBound(DBArray, 1)
    If Trim$(DBArray(lRow, lColBrand)) <> "" Then
        lEntriesPtr = lEntriesPtr + 1
        ReDim Preserve mudtDatabase(1 To lEntriesPtr)
        On Error Resume Next
        mudtDatabase(lEntriesPtr).BarCode = CStr(DBArray(lRow, lColBarCode))
        mudtDatabase(lEntriesPtr).Brand = NormaliseName(DBArray(lRow, lColBrand))
        mudtDatabase(lEntriesPtr).Qty = CStr(DBArray(lRow, lColQty))
        mudtDatabase(lEntriesPtr).Title = CStr(DBArray(lRow, lColTitle))
        On Error GoTo 0
    End If
Next lRow

End Sub

Private Function GetParameters() As Params
'***********************************************
'** Return parameters from sheet 'parameters' **
'***********************************************
Dim lRow As Long
Dim lEndRow As Long

Dim sCurKeyword As String
Dim sCurValue As String

Dim vaParamData As Variant

'** Store parameter data into array **
vaParamData = ThisWorkbook.Sheets("Parameters").Range("A1").CurrentRegion.Resize(, 2).Value

For lRow = 2 To UBound(vaParamData, 1)
    sCurKeyword = LCase$(Replace(vaParamData(lRow, 1), " ", ""))    '** Remove all spaces and convert to lowercase
    Select Case sCurKeyword
        Case ""             '** Ignore empty keyword cells **
            
        Case "groupheading"
            GetParameters.GroupHeading = NormaliseName(vaParamData(lRow, 2))
        Case "matchheading"
            GetParameters.MatchHeading = NormaliseName(vaParamData(lRow, 2))
        Case "#matchesperentry"
            GetParameters.MatchesCount = Val(vaParamData(lRow, 2))
        Case "min%match"
            GetParameters.MinPercent = Val(vaParamData(lRow, 2))
        Case "matchalgorithm"
            GetParameters.Algorithm = Val(vaParamData(lRow, 2))
        Case "dbquantity"
            GetParameters.DBQuantity = CStr(vaParamData(lRow, 2))
        Case "dbbarcode"
            GetParameters.DBBarcode = CStr(vaParamData(lRow, 2))
        Case "showdbtitle"
            GetParameters.ShowTitle = LCase$(Left$(vaParamData(lRow, 2), 1)) = "y"
        Case "showdbquantity"
            GetParameters.ShowQty = LCase$(Left$(vaParamData(lRow, 2), 1)) = "y"
    End Select
Next lRow

End Function
Private Function NormaliseName(ByVal NameX As String) As String
'*************************************************************************************
'** Remove all but "abcdefghijklmnopqrstuvwxyz0123456789", and convert to lowercase **
'*************************************************************************************
Dim lPtr As Long

Dim sChar As String
Dim sResult As String

For lPtr = 1 To Len(NameX)
    sChar = LCase$(Mid$(NameX, lPtr, 1))
    If InStr("abcdefghijklmnopqrstuvwxyz0123456789", sChar) > 0 Then sResult = sResult & sChar
Next lPtr
NormaliseName = sResult
End Function

Private Sub ProcessInputFile(ByVal DBData As Variant, _
                             ByVal InputFileName As Variant, _
                             ByRef ResultsWB As Workbook)

Dim lCol As Long
Dim lRow As Long
Dim lPtr As Long
Dim lMustMatchCol As Long
Dim lMatchCol As Long
Dim lLB As Long
Dim lUB As Long
Dim lDBRow As Long
Dim lPtr1 As Long
Dim lPtr2 As Long
Dim lSheetCount As Long
Dim lResultsColumnCount As Long

Dim sCurHeading As String
Dim sTerminalName As String
Dim sCurResellerBrand As String
Dim sCurResellerTitle As String
Dim sCurDBTitle As String

Dim sngCurMatchPercent As Single

Dim udtBarCodeMatches() As BarCodeMatches
Dim udtBrandBounds As BrandBounds

Dim vaCurData As Variant
Dim vaMatchResultsData As Variant
Dim vaCurBrandItem As Variant

Dim WB As Workbook

Dim WS As Worksheet
Dim wsResults As Worksheet

lMustMatchCol = 0
lMatchCol = 0

sTerminalName = GetTerminalName(InputFileName)

Application.StatusBar = "Processing " & sTerminalName
Application.ScreenUpdating = False

On Error Resume Next
Set WB = Nothing
Set WB = Workbooks.Open(Filename:=InputFileName, _
                        UpdateLinks:=True, _
                        ReadOnly:=True, _
                        corruptload:=xlRepairFile)
If Err.Number > 0 Then
    MsgBox prompt:=Err.Description, Buttons:=vbOKOnly + vbCritical, Title:="Unable to open file " & sTerminalName
End If
On Error GoTo 0
If WB Is Nothing Then Exit Sub

Set WS = WB.Sheets(1)
With WS.UsedRange
    vaCurData = WS.Range("A1").Resize(.Rows.Count, .Columns.Count).Value                '** Get input Reseller data
End With

'** Initialise results array **
lResultsColumnCount = 2
If mudtParameters.ShowQty = True Then lResultsColumnCount = lResultsColumnCount + 1
If mudtParameters.ShowTitle = True Then lResultsColumnCount = lResultsColumnCount + 1
ReDim vaMatchResultsData(1 To UBound(vaCurData, 1), 1 To mudtParameters.MatchesCount * lResultsColumnCount) '** set size of array for Results
For lCol = 1 To mudtParameters.MatchesCount
    lPtr = ((lCol - 1) * lResultsColumnCount) + 1
    vaMatchResultsData(1, lPtr) = "Barcode #" & lCol
    vaMatchResultsData(1, lPtr + 1) = "#" & lCol & " % Match"
    lPtr1 = lPtr + 1
    If mudtParameters.ShowTitle = True Then
        lPtr1 = lPtr1 + 1
        vaMatchResultsData(1, lPtr1) = "#" & lCol & " DB " & mudtParameters.MatchHeading
    End If
    If mudtParameters.ShowQty = True Then
        lPtr1 = lPtr1 + 1
        vaMatchResultsData(1, lPtr1) = "#" & lCol & " DB Quantity"
    End If
Next lCol

'** Check that we have the 2 required heading columns in row 1**
For lCol = 1 To UBound(vaCurData, 2)
    sCurHeading = NormaliseName(CStr(vaCurData(1, lCol)))
    If sCurHeading = mudtParameters.GroupHeading Then lMustMatchCol = lCol
    If sCurHeading = mudtParameters.MatchHeading Then lMatchCol = lCol
Next lCol
If lMustMatchCol > 0 _
And lMatchCol > 0 Then
    '** Process the reseller **
    lSheetCount = ResultsWB.Worksheets.Count
    Set wsResults = ResultsWB.Sheets.Add(after:=ResultsWB.Sheets(lSheetCount))        '** Add a new worksheet to the results workbook
    On Error Resume Next
    wsResults.Name = sTerminalName              '** set the sheetname to the reseller file terminal name
    On Error GoTo 0
    '** MAIN LOOP **
    For lRow = 2 To UBound(vaCurData, 1)
        With Application
            .ScreenUpdating = True
            .StatusBar = "Processing Reseller file " & sTerminalName & ", row " & lRow & " of " & UBound(vaCurData, 1)
            .ScreenUpdating = False
        End With
        sCurResellerBrand = NormaliseName(vaCurData(lRow, lMustMatchCol))
        sCurResellerTitle = vaCurData(lRow, lMatchCol)

        If CollectionKeyExists(coll:=mcolBrandbounds, key:=sCurResellerBrand) Then
            ReDim vaCurBrandItem(1 To 3)
            vaCurBrandItem = mcolBrandbounds.Item(sCurResellerBrand)
        
            '** Initialise array **
            ReDim udtBarCodeMatches(1 To mudtParameters.MatchesCount + 1)
            For lPtr = 1 To UBound(udtBarCodeMatches)
                With udtBarCodeMatches(lPtr)
                    .BarCode = ""
                    .BrandPercent = 0
                    .MatchText = ""
                    .Qty = ""
                End With
            Next lPtr
            
            sCurResellerTitle = vaCurData(lRow, lMatchCol)
            For lDBRow = vaCurBrandItem(2) To vaCurBrandItem(3)
                sngCurMatchPercent = FuzzyPercent(String1:=sCurResellerTitle, _
                                                  String2:=mudtDatabase(lDBRow).Title, _
                                                  Algorithm:=mudtParameters.Algorithm, _
                                                  Normalised:=False)
                If sngCurMatchPercent >= mudtParameters.MinPercent Then
                    For lPtr1 = 1 To mudtParameters.MatchesCount
                        If sngCurMatchPercent > udtBarCodeMatches(lPtr1).BrandPercent Then
                            For lPtr2 = mudtParameters.MatchesCount - 1 To lPtr1 Step -1
                                If udtBarCodeMatches(lPtr2).BrandPercent <> 0 Then
                                    With udtBarCodeMatches(lPtr2 + 1)
                                        .BarCode = udtBarCodeMatches(lPtr2).BarCode
                                        .BrandPercent = udtBarCodeMatches(lPtr2).BrandPercent
                                        .MatchText = udtBarCodeMatches(lPtr2).MatchText
                                        .Qty = udtBarCodeMatches(lPtr2).Qty
                                    End With
                                End If
                            Next lPtr2
                            With udtBarCodeMatches(lPtr1)
                                .BarCode = mudtDatabase(lDBRow).BarCode
                                .BrandPercent = sngCurMatchPercent
                                .MatchText = mudtDatabase(lDBRow).Title
                                .Qty = mudtDatabase(lDBRow).Qty
                            End With
                            Exit For
                        End If
                    Next lPtr1
                End If
            Next lDBRow
            For lCol = 1 To mudtParameters.MatchesCount
                If udtBarCodeMatches(lCol).BrandPercent > 0 Then
                    lPtr = ((lCol - 1) * lResultsColumnCount) + 1
                    vaMatchResultsData(lRow, lPtr) = "'" & udtBarCodeMatches(lCol).BarCode
                    vaMatchResultsData(lRow, lPtr + 1) = udtBarCodeMatches(lCol).BrandPercent
                    lPtr1 = lPtr + 1
                    If mudtParameters.ShowTitle = True Then
                        lPtr1 = lPtr1 + 1
                        vaMatchResultsData(lRow, lPtr1) = udtBarCodeMatches(lCol).MatchText
                    End If
                    If mudtParameters.ShowQty = True Then
                        lPtr1 = lPtr1 + 1
                        vaMatchResultsData(lRow, lPtr1) = udtBarCodeMatches(lCol).Qty
                    End If
                End If
            Next lCol
        End If
    Next lRow
    
    '** Store results into worksheet **
    wsResults.Range("A1").Resize(UBound(vaCurData, 1), UBound(vaCurData, 2)).Value = vaCurData
    lResultsColumnCount = 2
    If mudtParameters.ShowTitle = True Then lResultsColumnCount = lResultsColumnCount + 1
    If mudtParameters.ShowQty = True Then lResultsColumnCount = lResultsColumnCount + 1
    With wsResults.Range("A1").Offset(, UBound(vaCurData, 2))
        For lCol = 1 To mudtParameters.MatchesCount
            lPtr = ((lCol - 1) * lResultsColumnCount)
            With .Offset(, lPtr + 1).Resize(wsResults.Rows.Count, 1)
                .NumberFormat = "0.00%"
                .HorizontalAlignment = xlLeft
            End With
        Next lCol
        .Resize(UBound(vaMatchResultsData, 1), UBound(vaMatchResultsData, 2)).Value = vaMatchResultsData
    End With
    wsResults.UsedRange.Resize(1).Font.Bold = True
    wsResults.Cells.EntireColumn.AutoFit
    
End If

WB.Close savechanges:=False

Application.ScreenUpdating = True

End Sub

Function CollectionKeyExists(coll As Collection, key As String) As Boolean

    On Error GoTo EH

    IsObject (coll.Item(key))
    
    CollectionKeyExists = True
EH:
End Function

Private Function GetTerminalName(ByVal Filenamex As Variant) As String
'*****************************************************************
'** Return final element of filename (excluding file extension) **
'*****************************************************************
Dim lUB As Long

Dim saSplit() As String
Dim saSplit2() As String

saSplit = Split(Filenamex, Delimiter:=Application.PathSeparator)
lUB = UBound(saSplit)
saSplit2 = Split(saSplit(lUB), ".")
GetTerminalName = saSplit2(0)
End Function
 
Upvote 0
Ok so I think there is progress, but unfortunately I got a new error.
Might be related to MacOS also

=> See screenshot

Screenshot 2024-04-14 at 18.06.40.jpg
 
Upvote 0

Forum statistics

Threads
1,225,757
Messages
6,186,848
Members
453,379
Latest member
gabriellegonzalez

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top