Guitarmageddon
Board Regular
- Joined
- Dec 22, 2014
- Messages
- 161
So i have a running sheet that, until now, I have been taking a daily output someone dumps in a folder, and manually pasting it to the end of my running sheet. A pretty obvious case for using power query, yes I know.
Here is what is keeping me from fully implementing a query on the data and saving me huge amounts of time:
1. these files are a simple 1 tab report, of orders the system has automatically created that day. we run the report every morning around 9 and dump it in the folder. Then the next day the process repeats.
2. it gets tricky because we have some orders that generate late at night, but not yet into the next day. So the previous days report doesnt have them, and the next days report DOES have them, but they would correctly reflect a date previous. I hope this is making sense... so a file created on 12-1 would have all 12-1 orders, up to the morning. then, a file dated 12-2 would have all the 12-2 orders, plus whatever generated overnight before 12-1, that missed the cutoff for yesterday mornings report.
3. To add another layer, if these orders aren't APPROVED by a buyer, they can just sit there and get duplicated on each days output, creating unnecessary duplication....
What this boils down to is:
I can get all the files (its retail 4-5-4 calendar YTD) into a query, that's simple enough. My intent was to combine several columns into concatenate that I needed to sniff out the duplicate entries from the situations I lined out above. And then in theory, just have power query somehow identify the duplicates and delete. However, concatenating with M in power query tells me that Im not allowed because I cant concatenate dates/numbers....no I guess thats out. Then if I load my query to the data model, and try to insert a calculated column in there, tells me no dice, you can only concatenate a max of two columns, and I would probably need 5-6, of mixed text/dates/numbers, to sniff out the duplicates.
What might be the best way to attack this?
Here is what is keeping me from fully implementing a query on the data and saving me huge amounts of time:
1. these files are a simple 1 tab report, of orders the system has automatically created that day. we run the report every morning around 9 and dump it in the folder. Then the next day the process repeats.
2. it gets tricky because we have some orders that generate late at night, but not yet into the next day. So the previous days report doesnt have them, and the next days report DOES have them, but they would correctly reflect a date previous. I hope this is making sense... so a file created on 12-1 would have all 12-1 orders, up to the morning. then, a file dated 12-2 would have all the 12-2 orders, plus whatever generated overnight before 12-1, that missed the cutoff for yesterday mornings report.
3. To add another layer, if these orders aren't APPROVED by a buyer, they can just sit there and get duplicated on each days output, creating unnecessary duplication....
What this boils down to is:
I can get all the files (its retail 4-5-4 calendar YTD) into a query, that's simple enough. My intent was to combine several columns into concatenate that I needed to sniff out the duplicate entries from the situations I lined out above. And then in theory, just have power query somehow identify the duplicates and delete. However, concatenating with M in power query tells me that Im not allowed because I cant concatenate dates/numbers....no I guess thats out. Then if I load my query to the data model, and try to insert a calculated column in there, tells me no dice, you can only concatenate a max of two columns, and I would probably need 5-6, of mixed text/dates/numbers, to sniff out the duplicates.
What might be the best way to attack this?