Hello. This is my first question here so apologies in advance for any mistakes or oversights.
I am working with a spreadsheet that contains ~10,000 rows of people's details. There are duplicate entries for each person (same surname, DOB etc.), but the postcodes are frequently different. Some entries for a group of people have a postcode For example like this:
I want my code to look through the surnames and DOB iteratively to find a matching group. Then compare the postcodes in that group to determine whether there is a match or not. If there is a match, the correct ID is the first one in the group. If not, I would specify there is a mismatch.
Here is what I have written so far. It works on the majority but there are cases when it doesn't output the correct comment.
There are some errors with this code, notably for groups where there are 2 of the same postcode, using Picard above as an example:
Whereas all the comments should be 'Mismatch'.
Also for groups where they should all be 'Mismatch' it sometimes says the first entry as 'Correct to ...' and the rest 'Mismatch' when in fact the whole group should be 'Mismatch'. This led me to write the code at the bottom that would make the cell above 'Mismatch' but I want it to work correctly without this.
I have been trying to fiddle around with it for hours but to no avail.
Thank you for your assistance. I hope this is somewhat clear!
I am working with a spreadsheet that contains ~10,000 rows of people's details. There are duplicate entries for each person (same surname, DOB etc.), but the postcodes are frequently different. Some entries for a group of people have a postcode For example like this:
A | B | C | D | E |
ID | Last Name | DOB | Postcode | Comment |
1 | Solo | 13/07/1942 | SW11 5WS | Mismatch |
2 | Solo | 13/07/1942 | Mismatch | |
3 | Solo | 13/07/1942 | Mismatch | |
4 | Rodgers | 22/06/1981 | PQ5 7ST | Correct to ID 4 |
5 | Rodgers | 22/06/1981 | PQ5 7ST | Correct to ID 4 |
6 | Picard | 31/01/1950 | AB1 2CD | Mismatch |
7 | Picard | 31/01/1950 | AB1 2CD | Mismatch |
8 | Picard | 31/01/1950 | Mismatch | |
9 | Picard | 31/01/1950 | Mismatch | |
10 | Jones | 17/09/1970 | YU5 0RT | Mismatch |
11 | Jones | 17/09/1970 | FD7 1SA | Mismatch |
12 | David | 24/07/1990 | Correct to ID 12 | |
13 | David | 24/07/1990 | Correct to ID 12 |
I want my code to look through the surnames and DOB iteratively to find a matching group. Then compare the postcodes in that group to determine whether there is a match or not. If there is a match, the correct ID is the first one in the group. If not, I would specify there is a mismatch.
Here is what I have written so far. It works on the majority but there are cases when it doesn't output the correct comment.
VBA Code:
Sub ProcessNames()
Dim wb As Workbook, ws As Worksheet
Dim dict As Object, sKey As String
Dim iLastRow As Long, iRow As Long
Dim Count As Long
Set dict = CreateObject("Scripting.Dictionary")
Set wb = ThisWorkbook
Set ws = wb.Sheets("Sheet1")
iLastRow = ws.Cells(Rows.Count, 1).End(xlUp).Row
' this count was just some improvisation to make the code work
Count = 0
For iRow = 2 To iLastRow
' create key using LastName and DOB
sKey = UCase(ws.Cells(iRow, "B")) & Format(Cells(iRow, "C"), "DDMMYYYY")
If dict.exists(sKey) Then
If ws.Cells(iRow, "D") = ws.Cells(dict(sKey), "D") Or ws.Cells(iRow, "D") = ws.Cells(dict(sKey), "D") = "" Then
Count = Count + 1
dict(sKey) = (iRow - Count)
End If
Else
dict(sKey) = iRow
Count = 0
End If
Next
' update correct IDs in Comment
For iRow = 2 To iLastRow
sKey = UCase(ws.Cells(iRow, "B")) & Format(Cells(iRow, "C"), "DDMMYYYY")
If ws.Cells(iRow, "D") = ws.Cells(dict(sKey), "D") Or ws.Cells(iRow, "D") = ws.Cells(dict(sKey), "D") = "" Then
ws.Cells(iRow, "E") = "Correct to ID " & ws.Cells(dict(sKey), "A")
Else
ws.Cells(iRow, "E") = "Mismatch"
End If
Next
' Ideally I wouldn't need this but it can't seem to make it work without it
For iRow = 2 To iLastRow
sKey = UCase(ws.Cells(iRow, "B")) & Format(Cells(iRow, "C"), "DDMMYYYY")
If ws.Cells(iRow, "E") = "Mismatch" Then
ws.Cells((iRow - 1), "E") = "Mismatch"
End If
Next
End Sub
There are some errors with this code, notably for groups where there are 2 of the same postcode, using Picard above as an example:
A | B | C | D | E |
6 | Picard | 31/01/1950 | AB1 2CD | Correct to ID 6 |
7 | Picard | 31/01/1950 | AB1 2CD | Correct to ID 6 |
8 | Picard | 31/01/1950 | Mismatch | |
9 | Picard | 31/01/1950 | Mismatch |
Whereas all the comments should be 'Mismatch'.
Also for groups where they should all be 'Mismatch' it sometimes says the first entry as 'Correct to ...' and the rest 'Mismatch' when in fact the whole group should be 'Mismatch'. This led me to write the code at the bottom that would make the cell above 'Mismatch' but I want it to work correctly without this.
I have been trying to fiddle around with it for hours but to no avail.
Thank you for your assistance. I hope this is somewhat clear!