I know there is a lot of threads about this and I have tried to copy and follow them, but have to luck - I have no experience with macros and basic formulas only. I have a large file of nearly 70,000 rows containing college application information that needs to be cleaned to send to another data collection agency.
I cleaned out to 24,004 rows that match my criteria and now I must remove all duplicate entries. What I need to do it is find duplicate rows that contain the same last name (Column D) and date of birth (Column F), highlight them, and go back and verify by hand if they are the same person who applied under different user id or actually different people who happen to have the same name and dob. After these are verified, I need to delete them because we pay per person for the data collection!
I have already gone through and removed duplicate user id rows, however you do not have to include this and it could be imputed wrong so cannot rely only on that.
I have attached a short example of the workbook.
Thank you!
[TABLE="width: 471"]
<tbody>[TR]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[TD]F[/TD]
[/TR]
[TR]
[TD]UserID[/TD]
[TD]first_name[/TD]
[TD]middle_name[/TD]
[TD]last_name[/TD]
[TD]suffix[/TD]
[TD]dob[/TD]
[/TR]
[TR]
[TD]11111111[/TD]
[TD]First 1[/TD]
[TD]Middle 1[/TD]
[TD]Last 1[/TD]
[TD][/TD]
[TD]1919-11-23 00:00:00[/TD]
[/TR]
[TR]
[TD]22222222[/TD]
[TD]First 2[/TD]
[TD]Middle 2[/TD]
[TD]Last 2[/TD]
[TD][/TD]
[TD]1952-12-31 00:00:00[/TD]
[/TR]
[TR]
[TD]33333333[/TD]
[TD]First 3[/TD]
[TD]Middle 3[/TD]
[TD]Last 3[/TD]
[TD][/TD]
[TD]1988-11-25 00:00:00[/TD]
[/TR]
[TR]
[TD]44444444[/TD]
[TD]First 4[/TD]
[TD]Middle 4[/TD]
[TD]Last 4[/TD]
[TD][/TD]
[TD]1990-12-30 00:00:00[/TD]
[/TR]
[TR]
[TD]55555555[/TD]
[TD]First 5[/TD]
[TD]Middle 5[/TD]
[TD]Last 5[/TD]
[TD][/TD]
[TD]1991-01-22 00:00:00[/TD]
[/TR]
[TR]
[TD]66666666[/TD]
[TD]First 6[/TD]
[TD]Middle 6[/TD]
[TD]Last 6[/TD]
[TD][/TD]
[TD]1991-03-22 00:00:00[/TD]
[/TR]
[TR]
[TD]77777777[/TD]
[TD]First 7[/TD]
[TD]Middle 7[/TD]
[TD]Last 7[/TD]
[TD][/TD]
[TD]1991-10-28 00:00:00[/TD]
[/TR]
[TR]
[TD]88888888[/TD]
[TD]First 8[/TD]
[TD]Middle 8[/TD]
[TD]Last 8[/TD]
[TD][/TD]
[TD]1991-11-12 00:00:00[/TD]
[/TR]
[TR]
[TD]99999999[/TD]
[TD]First 9[/TD]
[TD]Middle 9[/TD]
[TD]Last 9[/TD]
[TD][/TD]
[TD]1991-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]10101010[/TD]
[TD]First 10[/TD]
[TD]Middle 10[/TD]
[TD]Last 10[/TD]
[TD][/TD]
[TD]1991-12-27 00:00:00[/TD]
[/TR]
[TR]
[TD]12345678[/TD]
[TD]First 11[/TD]
[TD]Middle 11[/TD]
[TD]Last 11[/TD]
[TD][/TD]
[TD]1992-01-24 00:00:00[/TD]
[/TR]
[TR]
[TD]12121212[/TD]
[TD]First 12[/TD]
[TD]Middle 12[/TD]
[TD]Last 12[/TD]
[TD][/TD]
[TD]1992-10-27 00:00:00[/TD]
[/TR]
[TR]
[TD]13131313[/TD]
[TD]First 13[/TD]
[TD]Middle 13[/TD]
[TD]Last 13[/TD]
[TD][/TD]
[TD]1993-01-30 00:00:00[/TD]
[/TR]
[TR]
[TD]14141414[/TD]
[TD]First 14[/TD]
[TD]Middle 14[/TD]
[TD]Last 14[/TD]
[TD][/TD]
[TD]1993-12-31 00:00:00[/TD]
[/TR]
[TR]
[TD]15151515[/TD]
[TD]First 15[/TD]
[TD]Middle 15[/TD]
[TD]Last 15[/TD]
[TD][/TD]
[TD]1994-05-16 00:00:00[/TD]
[/TR]
[TR]
[TD]16161616[/TD]
[TD]First 16[/TD]
[TD]Middle 16[/TD]
[TD]Last 16[/TD]
[TD][/TD]
[TD]1995-06-04 00:00:00[/TD]
[/TR]
[TR]
[TD]17171717[/TD]
[TD]First 17[/TD]
[TD]Middle 17[/TD]
[TD]Last 17[/TD]
[TD][/TD]
[TD]1995-06-06 00:00:00[/TD]
[/TR]
[TR]
[TD]18181818[/TD]
[TD]First 18[/TD]
[TD]Middle 18[/TD]
[TD]Last 18[/TD]
[TD][/TD]
[TD]1996-01-10 00:00:00[/TD]
[/TR]
[TR]
[TD]19191919[/TD]
[TD]First 19[/TD]
[TD]Middle 19[/TD]
[TD]Last 19[/TD]
[TD][/TD]
[TD]1996-09-07 00:00:00[/TD]
[/TR]
[TR]
[TD]20202020[/TD]
[TD]First 20[/TD]
[TD]Middle 20[/TD]
[TD]Last 20[/TD]
[TD][/TD]
[TD]1996-10-18 00:00:00[/TD]
[/TR]
[TR]
[TD]21212121[/TD]
[TD]First 21[/TD]
[TD]Middle 21[/TD]
[TD]Last 21[/TD]
[TD][/TD]
[TD]1996-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]22222222[/TD]
[TD]First 22[/TD]
[TD]Middle 22[/TD]
[TD]Last 22[/TD]
[TD][/TD]
[TD]1996-11-25 00:00:00[/TD]
[/TR]
[TR]
[TD]23232323[/TD]
[TD]First 23[/TD]
[TD]Middle 23[/TD]
[TD]Last 23[/TD]
[TD][/TD]
[TD]1996-12-19 00:00:00[/TD]
[/TR]
[TR]
[TD]24242424[/TD]
[TD]First 24[/TD]
[TD]Middle 24[/TD]
[TD]Last 24[/TD]
[TD][/TD]
[TD]1996-12-29 00:00:00[/TD]
[/TR]
[TR]
[TD]25252525[/TD]
[TD]First 25[/TD]
[TD]Middle 25[/TD]
[TD]Last 25[/TD]
[TD][/TD]
[TD]1997-01-07 00:00:00[/TD]
[/TR]
[TR]
[TD]26262626[/TD]
[TD]First 26[/TD]
[TD]Middle 26[/TD]
[TD]Last 26[/TD]
[TD][/TD]
[TD]1998-12-03 00:00:00[/TD]
[/TR]
[TR]
[TD]27272727[/TD]
[TD]First 27[/TD]
[TD]Middle 27[/TD]
[TD]Last 27[/TD]
[TD][/TD]
[TD]1999-02-13 00:00:00[/TD]
[/TR]
[TR]
[TD]28282828[/TD]
[TD]First 28[/TD]
[TD]Middle 28[/TD]
[TD]Last 28[/TD]
[TD][/TD]
[TD]2005-06-05 00:00:00[/TD]
[/TR]
[TR]
[TD]29292929[/TD]
[TD]First 29[/TD]
[TD]Middle 29[/TD]
[TD]Last 29[/TD]
[TD][/TD]
[TD]2011-01-29 00:00:00[/TD]
[/TR]
[TR]
[TD]30303030[/TD]
[TD]First 30[/TD]
[TD]Middle 30[/TD]
[TD]Last 30[/TD]
[TD][/TD]
[TD]2012-01-23 00:00:00[/TD]
[/TR]
[TR]
[TD]31313131[/TD]
[TD]First 31[/TD]
[TD]Middle 31[/TD]
[TD]Last 31[/TD]
[TD][/TD]
[TD]2012-10-12 00:00:00[/TD]
[/TR]
[TR]
[TD]32323232[/TD]
[TD]First 32[/TD]
[TD]Middle 32[/TD]
[TD]Last 32[/TD]
[TD][/TD]
[TD]2012-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]33033033[/TD]
[TD]First 4[/TD]
[TD]Middle 4[/TD]
[TD]Last 4[/TD]
[TD][/TD]
[TD]1990-12-30 00:00:00[/TD]
[/TR]
[TR]
[TD]34343434[/TD]
[TD]First 18[/TD]
[TD]Middle 18[/TD]
[TD]Last 18[/TD]
[TD][/TD]
[TD]1996-01-10 00:00:00[/TD]
[/TR]
[TR]
[TD]35353535[/TD]
[TD]First 9[/TD]
[TD]Middle 9[/TD]
[TD]Last 9[/TD]
[TD][/TD]
[TD]1991-11-15 00:00:00[/TD]
[/TR]
</tbody>[/TABLE]
I cleaned out to 24,004 rows that match my criteria and now I must remove all duplicate entries. What I need to do it is find duplicate rows that contain the same last name (Column D) and date of birth (Column F), highlight them, and go back and verify by hand if they are the same person who applied under different user id or actually different people who happen to have the same name and dob. After these are verified, I need to delete them because we pay per person for the data collection!
I have already gone through and removed duplicate user id rows, however you do not have to include this and it could be imputed wrong so cannot rely only on that.
I have attached a short example of the workbook.
Thank you!
[TABLE="width: 471"]
<tbody>[TR]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[TD]F[/TD]
[/TR]
[TR]
[TD]UserID[/TD]
[TD]first_name[/TD]
[TD]middle_name[/TD]
[TD]last_name[/TD]
[TD]suffix[/TD]
[TD]dob[/TD]
[/TR]
[TR]
[TD]11111111[/TD]
[TD]First 1[/TD]
[TD]Middle 1[/TD]
[TD]Last 1[/TD]
[TD][/TD]
[TD]1919-11-23 00:00:00[/TD]
[/TR]
[TR]
[TD]22222222[/TD]
[TD]First 2[/TD]
[TD]Middle 2[/TD]
[TD]Last 2[/TD]
[TD][/TD]
[TD]1952-12-31 00:00:00[/TD]
[/TR]
[TR]
[TD]33333333[/TD]
[TD]First 3[/TD]
[TD]Middle 3[/TD]
[TD]Last 3[/TD]
[TD][/TD]
[TD]1988-11-25 00:00:00[/TD]
[/TR]
[TR]
[TD]44444444[/TD]
[TD]First 4[/TD]
[TD]Middle 4[/TD]
[TD]Last 4[/TD]
[TD][/TD]
[TD]1990-12-30 00:00:00[/TD]
[/TR]
[TR]
[TD]55555555[/TD]
[TD]First 5[/TD]
[TD]Middle 5[/TD]
[TD]Last 5[/TD]
[TD][/TD]
[TD]1991-01-22 00:00:00[/TD]
[/TR]
[TR]
[TD]66666666[/TD]
[TD]First 6[/TD]
[TD]Middle 6[/TD]
[TD]Last 6[/TD]
[TD][/TD]
[TD]1991-03-22 00:00:00[/TD]
[/TR]
[TR]
[TD]77777777[/TD]
[TD]First 7[/TD]
[TD]Middle 7[/TD]
[TD]Last 7[/TD]
[TD][/TD]
[TD]1991-10-28 00:00:00[/TD]
[/TR]
[TR]
[TD]88888888[/TD]
[TD]First 8[/TD]
[TD]Middle 8[/TD]
[TD]Last 8[/TD]
[TD][/TD]
[TD]1991-11-12 00:00:00[/TD]
[/TR]
[TR]
[TD]99999999[/TD]
[TD]First 9[/TD]
[TD]Middle 9[/TD]
[TD]Last 9[/TD]
[TD][/TD]
[TD]1991-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]10101010[/TD]
[TD]First 10[/TD]
[TD]Middle 10[/TD]
[TD]Last 10[/TD]
[TD][/TD]
[TD]1991-12-27 00:00:00[/TD]
[/TR]
[TR]
[TD]12345678[/TD]
[TD]First 11[/TD]
[TD]Middle 11[/TD]
[TD]Last 11[/TD]
[TD][/TD]
[TD]1992-01-24 00:00:00[/TD]
[/TR]
[TR]
[TD]12121212[/TD]
[TD]First 12[/TD]
[TD]Middle 12[/TD]
[TD]Last 12[/TD]
[TD][/TD]
[TD]1992-10-27 00:00:00[/TD]
[/TR]
[TR]
[TD]13131313[/TD]
[TD]First 13[/TD]
[TD]Middle 13[/TD]
[TD]Last 13[/TD]
[TD][/TD]
[TD]1993-01-30 00:00:00[/TD]
[/TR]
[TR]
[TD]14141414[/TD]
[TD]First 14[/TD]
[TD]Middle 14[/TD]
[TD]Last 14[/TD]
[TD][/TD]
[TD]1993-12-31 00:00:00[/TD]
[/TR]
[TR]
[TD]15151515[/TD]
[TD]First 15[/TD]
[TD]Middle 15[/TD]
[TD]Last 15[/TD]
[TD][/TD]
[TD]1994-05-16 00:00:00[/TD]
[/TR]
[TR]
[TD]16161616[/TD]
[TD]First 16[/TD]
[TD]Middle 16[/TD]
[TD]Last 16[/TD]
[TD][/TD]
[TD]1995-06-04 00:00:00[/TD]
[/TR]
[TR]
[TD]17171717[/TD]
[TD]First 17[/TD]
[TD]Middle 17[/TD]
[TD]Last 17[/TD]
[TD][/TD]
[TD]1995-06-06 00:00:00[/TD]
[/TR]
[TR]
[TD]18181818[/TD]
[TD]First 18[/TD]
[TD]Middle 18[/TD]
[TD]Last 18[/TD]
[TD][/TD]
[TD]1996-01-10 00:00:00[/TD]
[/TR]
[TR]
[TD]19191919[/TD]
[TD]First 19[/TD]
[TD]Middle 19[/TD]
[TD]Last 19[/TD]
[TD][/TD]
[TD]1996-09-07 00:00:00[/TD]
[/TR]
[TR]
[TD]20202020[/TD]
[TD]First 20[/TD]
[TD]Middle 20[/TD]
[TD]Last 20[/TD]
[TD][/TD]
[TD]1996-10-18 00:00:00[/TD]
[/TR]
[TR]
[TD]21212121[/TD]
[TD]First 21[/TD]
[TD]Middle 21[/TD]
[TD]Last 21[/TD]
[TD][/TD]
[TD]1996-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]22222222[/TD]
[TD]First 22[/TD]
[TD]Middle 22[/TD]
[TD]Last 22[/TD]
[TD][/TD]
[TD]1996-11-25 00:00:00[/TD]
[/TR]
[TR]
[TD]23232323[/TD]
[TD]First 23[/TD]
[TD]Middle 23[/TD]
[TD]Last 23[/TD]
[TD][/TD]
[TD]1996-12-19 00:00:00[/TD]
[/TR]
[TR]
[TD]24242424[/TD]
[TD]First 24[/TD]
[TD]Middle 24[/TD]
[TD]Last 24[/TD]
[TD][/TD]
[TD]1996-12-29 00:00:00[/TD]
[/TR]
[TR]
[TD]25252525[/TD]
[TD]First 25[/TD]
[TD]Middle 25[/TD]
[TD]Last 25[/TD]
[TD][/TD]
[TD]1997-01-07 00:00:00[/TD]
[/TR]
[TR]
[TD]26262626[/TD]
[TD]First 26[/TD]
[TD]Middle 26[/TD]
[TD]Last 26[/TD]
[TD][/TD]
[TD]1998-12-03 00:00:00[/TD]
[/TR]
[TR]
[TD]27272727[/TD]
[TD]First 27[/TD]
[TD]Middle 27[/TD]
[TD]Last 27[/TD]
[TD][/TD]
[TD]1999-02-13 00:00:00[/TD]
[/TR]
[TR]
[TD]28282828[/TD]
[TD]First 28[/TD]
[TD]Middle 28[/TD]
[TD]Last 28[/TD]
[TD][/TD]
[TD]2005-06-05 00:00:00[/TD]
[/TR]
[TR]
[TD]29292929[/TD]
[TD]First 29[/TD]
[TD]Middle 29[/TD]
[TD]Last 29[/TD]
[TD][/TD]
[TD]2011-01-29 00:00:00[/TD]
[/TR]
[TR]
[TD]30303030[/TD]
[TD]First 30[/TD]
[TD]Middle 30[/TD]
[TD]Last 30[/TD]
[TD][/TD]
[TD]2012-01-23 00:00:00[/TD]
[/TR]
[TR]
[TD]31313131[/TD]
[TD]First 31[/TD]
[TD]Middle 31[/TD]
[TD]Last 31[/TD]
[TD][/TD]
[TD]2012-10-12 00:00:00[/TD]
[/TR]
[TR]
[TD]32323232[/TD]
[TD]First 32[/TD]
[TD]Middle 32[/TD]
[TD]Last 32[/TD]
[TD][/TD]
[TD]2012-11-15 00:00:00[/TD]
[/TR]
[TR]
[TD]33033033[/TD]
[TD]First 4[/TD]
[TD]Middle 4[/TD]
[TD]Last 4[/TD]
[TD][/TD]
[TD]1990-12-30 00:00:00[/TD]
[/TR]
[TR]
[TD]34343434[/TD]
[TD]First 18[/TD]
[TD]Middle 18[/TD]
[TD]Last 18[/TD]
[TD][/TD]
[TD]1996-01-10 00:00:00[/TD]
[/TR]
[TR]
[TD]35353535[/TD]
[TD]First 9[/TD]
[TD]Middle 9[/TD]
[TD]Last 9[/TD]
[TD][/TD]
[TD]1991-11-15 00:00:00[/TD]
[/TR]
</tbody>[/TABLE]