Access query eliminating duplicates

toveyj

New Member
Joined
Jun 15, 2007
Messages
22
Hi

I have a table of data with duplicate entries for actions that have happened to the same unique ID (Initial_UBRN).
I have got a query that will find just the rows where the latest action - [ACTION Date] field - happened but this gives me
duplicates as there may have been more than one action on the latest date. I am trying to work out
how to go one step further and just pick up one of these duplicated entries. It doesn't
matter which one of the duplicates I return. However this one extra step in a single SQL query is too much for my brain to gasp! I thought fundamentally it is because my subquery is only grouped by Initial_UBRN so I tried adding [ACTION Date] to the GROUP BY statement in the sub-query but then I get all the rows returned, not just the Max(ACTION Date) row. Grouping by MaxACTIONDate isn't allowed and gives an error.

SELECT bk1.*
FROM Book_All_Prov_East AS bk1
INNER JOIN (
SELECT Initial_UBRN, Max([ACTION Date]) AS MaxActionDate FROM Book_All_Prov_East GROUP BY [Initial_UBRN]) AS bk2 ON (bk1.[ACTION Date] = bk2.MaxActionDate) AND (bk1.[Initial_UBRN]=bk2.[Initial_UBRN]);

Any ideas?
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
Make you outer query an Aggregate Query too, and just use the "FIRST" function on each field to find the first occurrence of each one.
 
Upvote 0
I was going to give you different SQL ... but it occurs to me that it would be better to help you get your data normalized. Start by Creating a table where the unique ID is actually unique ...

the table you have now would be a related table if you want to keep all the entries.

or perhaps you wish to remove the duplicates? What if the Action_Date is duplicated as well? Why are there duplicates?

Add this tracking field to your table:

Code:
- dtmAdd, date/time, DefaultValue =Now(), Description: date/time record was added to the table

While this won't help you now, it will help in the future should this happen again. I put tracking fields in all tables.
 
Last edited:
Upvote 0
if you still want the SQL, try something like this:

Code:
SELECT bk1.*
FROM Book_All_Prov_East AS bk1
WHERE bk1.Initial_UBRN
IN (SELECT bk2.Initial_UBRN
      FROM Book_All_Prov_East AS bk2
      WHERE 
         nz(bk2.[ACTION Date],0)= nz(
         dMax("[ACTION Date]", "Book_All_Prov_East", "Initial_UBRN = " & bk2.Initial_UBRN ) ,0)
      AND  bk2.Initial_UBRN = bk1.Initial_UBRN )
ORDER BY ... whatever
 
Upvote 0
Thanks for the suggestions which are really appreciated.
Joe - I tried using the MIN function on ACTION Date in the Outer query but I think I must be
using the wrong syntax so any suggestions on what would work would be appreciated.

Strive4Peace - that main table is built up by importing data from a reporting extract file.
The "Unique_ID" identifies a specific order if you like but different thing can happen to it on different days
so it is not really unique at all. I need to find the last time an action was logged against this number.
I don't want to make it unique in my main table as I need to know about various actions at different times
but for this query I only want to pull out the last one. I tried your SQL but after an hour it was still running.
There are around 200,000 records in the table so it isn't massive. My code above has the merits of running
in a couple of minutes but falls down on not eliminating the last few duplicates as explained in the original
post.
 
Upvote 0
Joe - I tried using the MIN function on ACTION Date in the Outer query but I think I must be
using the wrong syntax so any suggestions on what would work would be appreciated.

You are not trying to use the MIN function on bk1.*, are you?
You can only apply it to specific fields, not the asterisk (for all fields).

Exactly how many fields do you need/want to return?
It may help if you post a small sample of your data with the pertinent fields.
 
Upvote 0
Personally, I usually prefer nesting queries as opposed to subqueries since I find the performance to be better.

Make a Group/Totals query, grouping on your ID and getting the Max Date. Use this query to limit the records from your table by linking on both of the fields.
 
Upvote 0

Forum statistics

Threads
1,223,243
Messages
6,170,967
Members
452,371
Latest member
Frana

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top