Best method to delete duplicates via dax or power query? Concatenates...

Guitarmageddon · Aug 27, 2021

So i have a running sheet that, until now, I have been taking a daily output someone dumps in a folder, and manually pasting it to the end of my running sheet. A pretty obvious case for using power query, yes I know.

Here is what is keeping me from fully implementing a query on the data and saving me huge amounts of time:

1. these files are a simple 1 tab report, of orders the system has automatically created that day. we run the report every morning around 9 and dump it in the folder. Then the next day the process repeats.
2. it gets tricky because we have some orders that generate late at night, but not yet into the next day. So the previous days report doesnt have them, and the next days report DOES have them, but they would correctly reflect a date previous. I hope this is making sense... so a file created on 12-1 would have all 12-1 orders, up to the morning. then, a file dated 12-2 would have all the 12-2 orders, plus whatever generated overnight before 12-1, that missed the cutoff for yesterday mornings report.
3. To add another layer, if these orders aren't APPROVED by a buyer, they can just sit there and get duplicated on each days output, creating unnecessary duplication....

What this boils down to is:
I can get all the files (its retail 4-5-4 calendar YTD) into a query, that's simple enough. My intent was to combine several columns into concatenate that I needed to sniff out the duplicate entries from the situations I lined out above. And then in theory, just have power query somehow identify the duplicates and delete. However, concatenating with M in power query tells me that Im not allowed because I cant concatenate dates/numbers....no I guess thats out. Then if I load my query to the data model, and try to insert a calculated column in there, tells me no dice, you can only concatenate a max of two columns, and I would probably need 5-6, of mixed text/dates/numbers, to sniff out the duplicates.

What might be the best way to attack this?

GraH · Aug 27, 2021

Simply remove duplicates, without selecting a single column. PQ will keep unique records.
Then you can perform a group by on order number with all rows on aggregation. Finally from these tables retain last record. If you have the file creation date that's the best way I'd guess.

Guitarmageddon · Aug 27, 2021

no way....how the heck did I miss such a simple and obvious function..... i didnt realize where it was. Ive learned something today haha.

Guitarmageddon · Aug 27, 2021

So I believe I just need to remove the metadata column where PQ shows the source file name. Then delete duplicates should work as intended.

Guitarmageddon · Aug 27, 2021

GraH said:
GraH said:

Simply remove duplicates, without selecting a single column. PQ will keep unique records.
Then you can perform a group by on order number with all rows on aggregation. Finally from these tables retain last record. If you have the file creation date that's the best way I'd guess.

Click to expand...

Ok, so I have attached a picture for my next question....

I loaded/connected a query of my master date table, which connects the calendar date with my fiscal 4-5-4 retail calendar. I was able to use "related" to bring in the year. Next, I need to connect the "PERIOD" column (aka the month name) from my master date table, via calculated column, to give me the month for both order date, and delivery date. What formula would I be using for this? I tried to use "lookupvalue" but couldnt make that work.

I am defaulting to using a calculated column, because for the business case, I am loading this to a flat table and not a pivot table. Is there a better way to handle?

Guitarmageddon · Aug 27, 2021

for example, here is the MANUAL version of this table, and you can see the "month generated", "delivery month" columns. That is what I am trying to relate via my example above.

Use the "order date" / "delivery date" to my master date table.

Guitarmageddon · Aug 27, 2021

Ok sorry to be rambling here to myself but I think I did get it using LOOKUP VALUE. success!

GraH · Aug 28, 2021

for the learning, that's why most of us if not all are here, right?

I miss the point why you want to create extra columns and fetch data from a related table?
Simply use those fields from your calendar table inside the pivot work.
You can even use a double link to your calendar table. One will be inactive, but with USERELATIONSHIP you switch between both inside your measures.

Guitarmageddon · Aug 28, 2021

GraH said:
for the learning, that's why most of us if not all are here, right?

I miss the point why you want to create extra columns and fetch data from a related table?
Simply use those fields from your calendar table inside the pivot work.
You can even use a double link to your calendar table. One will be inactive, but with USERELATIONSHIP you switch between both inside your measures.

Well the reason I wanted to insert columns is because I wanted to duplicate the flat table that the planners using this report are accustomed to using. The type of data it was felt weird to try and manipulate via a pivot table. I may give that another shot to see.

Can you explain more about USERELATIONSHIP?

GraH · Aug 28, 2021

The masters themselves explain it best. Using USERELATIONSHIP in DAX - SQLBI

Best method to delete duplicates via dax or power query? Concatenates...

Guitarmageddon

Board Regular

GraH

Well-known Member

Guitarmageddon

Board Regular

Guitarmageddon

Board Regular

Guitarmageddon

Board Regular

Attachments

Guitarmageddon

Board Regular

Attachments

Guitarmageddon

Board Regular

Attachments

GraH

Well-known Member

Guitarmageddon

Board Regular

GraH

Well-known Member

Similar threads

Share this page

Best method to delete duplicates via dax or power query? Concatenates...

Board Regular

Well-known Member

Board Regular

Board Regular

Board Regular

Attachments

Board Regular

Attachments

Board Regular

Attachments

Well-known Member

Board Regular

Well-known Member

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock