Lookup a duplicate value in a table only once

BarKv

New Member
Joined
Jul 23, 2018
Messages
8
I have a dataset consisting of approx 200,000 rows. One of the columns is named ProjectNumber, and if I remove the duplicates there is approx 2,000 unique values left. I have a lookup formula to another table with projects, to check if the ProjectNumber exists. I know how to do the lookup formula itself (using MATCH), but the formula takes 7 seconds to finish up checking every of the 200,000 rows. In comparison, when I remove the duplicates, and check only the 2,000 remaining rows, the formula takes less than 0,1 seconds. (I'm using a RangeTimer macro to measure the time.) So, my worksheet with all the 200,000 rows is very slow :/

I know how to remove the duplicates, but I'm making this sheet for others to use, and I would like an error message to be displayed in the table with the 200,000 rows. I only need one of the rows with a non existing ProjectNumber to display the error message.

So, my question is this: Is there any formula that can do a lookup for the ProjectNumber, but if the same ProjectNumber already has been checked, then the formula should do nothing. So basically, I need a formula that checks a duplicate value only once.

What I have tried:
  • IF(ROW([@ProjectNumber])-1=MATCH([@ProjectNumber],[ProjectNumber],0),"Lookup","Do not lookup")
    This formula only does a lookup if the row number equals the row where the ProjectNumber first appears. However, the MATCH part is doing what I'm trying to avoid in the first place, by checking the entire 200,000 rows, for each row. So, this formula also slows Excel down.

Any ideas? :)
 
In B2 enter and copy down:

=IF(ISNA(MATCH(A2,$A$1:A1,0)),IF(ISNA(VLOOKUP(A2,Table,1,0)),"Error: This number does not exist",""),"")

Whoa! The formula did exactly what I wanted :)

However, it worked perfect with only a 100 rows in total. But when I expanded the table to 200,000 rows, Excel stopped responding, ate up all the memory, and finally the computer crashed :/

Even though the MATCH part is genious, it looks like the last few rows with this formula, it needs to check almost all the 200,000 rows. So I'm kinda back to where I started. I will try to break the formula up into several columns, and see if this helps.
 
Upvote 0

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes
Whoa! The formula did exactly what I wanted :)

However, it worked perfect with only a 100 rows in total. But when I expanded the table to 200,000 rows, Excel stopped responding, ate up all the memory, and finally the computer crashed :/

Even though the MATCH part is genious, it looks like the last few rows with this formula, it needs to check almost all the 200,000 rows. So I'm kinda back to where I started. I will try to break the formula up into several columns, and see if this helps.

It looks up every cell if it is a first occurrence.

If you have 200,000 cells to check, that means 200,000 formulas. If you also have many other formulas in the workbook, a performance problem might occur.

By the way, why do you want this check, that is, what is the purpose?
 
Upvote 0
It looks up every cell if it is a first occurrence.

If you have 200,000 cells to check, that means 200,000 formulas. If you also have many other formulas in the workbook, a performance problem might occur.

By the way, why do you want this check, that is, what is the purpose?


Yes, it's great that it only looks up the cell if it is a first occurrence. I was kinda hoping that the fomula (for 200,000 rows) wouldn't be that heavy if it could skip duplicates. I have tried a simple MATCH formula in my table, and whenever I include the whole column in the lookup array, Excel simply cannot finish calculating. I edited your formula a bit, and set the lookup array to the 50 rows before the one I'm checking. That worked well, and I'm down to 0.2 seconds to calculate the MATCH formula alone. Maybe I need to work on this solution somehow, and implement it in my original formula.

The purpose of the check: Customers send their data through a predefined workbook containing tables on each worksheet. The data is later imported into a system, but the data needs to be checked first. For instance, if the table containing tasks refers to a project number which the customer hasn't registered yet, the task itself cannot be imported either. I could maybe just summarize every error on a single worksheet, but the purpose of having these error messages in the table is to quickly identify problem rows. I already have other formulas in the workbook, and some of them are pretty heavy, so that is why I'm trying to find formulas that need less CPU.
 
Upvote 0
I'd say this is a perfect job for VBA

Code:
Sub ProjectNumber()


With ThisWorkbook.Sheets(1)
LRow = .Cells(.Rows.Count, "A").End(xlUp).Row 'make sure all lines refer to the correct locations
PNumber = "x"


For Each Cell In .Range("A2:A" & LRow)
If Cell.Value = PNumber Then GoTo ncell
''''your lookup function''''
PNumber = Cell.Value
ncell:
Next


End With
End Sub

This code will loop through the column, only executing your code for the first Project Number of its kind it finds. If the current Project Number is equal to the previous Project Number, the code skips to the next line.
Be aware that this piece of code assumes your Project Numbers are in either ascending or descending order.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,824
Messages
6,181,186
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top