Match Duplicate Records

amerifax

Board Regular
Joined
Dec 17, 2008
Messages
247
I'm trying to find duplicate occurrences in my spreadsheet.

1. I have seven columns of data.
2. It is the first four columns that might reflect a duplication.
3. When there is a duplication the duplicate roles are always contiguous.
4. Within the spreadsheet some roles might have two or three roles that are duplicates

In the seventh row "G" I would like to place an "X" for all roles that have a duplicate value.
Example:
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212 - X
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Smith John - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Smith Tom - 2314 Main - Waukesha Wi 53022 - 262-255-1212
Johnson Bob - 1221 Main - Milwaukee WI 53027 - 414-444-8989 - X
Johnson Bob - 1221 Main - Milwaukee WI 53027 - 414-444-8989

Thanks
Bob
 
>>gehusi<<

If I have a duplicate match where the name and address and phonecolumns are a perfect match, I will only retain one record.

On the other hand if I have a duplication where the address and phone match, but not the name. There is a high probability that I will keep the record.
Marketing Reasons:
1. same address and phone, different name indicates, an unmarried couple.
2. Same name most likely a family.

Another search is based on the latitude and longitude. This information gives us the ability to determine if an apartment and approximate number of units.

Another words, we have to manipulate the data substantially, what we're dealing with a quarter of 1 million records. I have found that all of the formulas that I'm using a Nextel are not as bad as the one called count if.

I was wondering about the possibility if I sorted on the phone number, and rather than look at any number and going through the database looking for another occurrence. If instead I could look at the past five numbers for comparison. Kind of like a variable.

But one of the biggest items I have is comparing two of those databases and trying to find duplicate values.

So you see, I don't really have any idea of the capabilities of Excel.
I'm running a Pentium 4 CPU 3.20GHz with2.00GB of Ram

Tomorrow, or should I say today, I got to get to bed, I will be trying a AMD Athlon 64x2 Dual Core Processor 4200+

Even if I had one of the fastest machines out there I am wondering if using the =IF(COUNTIF(D:D,D2)>1,"X","") command that cycles through a quarter of 1 million records to find a duplicate. How much faster will it be. Right now it takes me a couple hours plus to run the program

Bob
 
Upvote 0

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.

Forum statistics

Threads
1,224,590
Messages
6,179,756
Members
452,940
Latest member
rootytrip

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top