Power Query Consultation

AmitM · Jul 7, 2015

Hi all,

I am building a model which tracks forecasts of consumption.

My users will use a template file and enter line items and Qty for each round.
(On a nutshell, let's assume we have a 2 Y of project and repeating forecasts for the following 2 months)

The way I built the system so far is that a power-query imports all the templates located in a folder and imports them into a single power pivot table.

That final table consists many projects and all their progress throughout the repeating assessments.

Here are my concerns:
1) I want to be able to present the total Qty at the end of each project + slice it to by line items
2) I want to show the behavior throughout time (like in a line chart)
3) Not to double count anything

I thought of 2 ways how doing it:
1) Track only the changes only- so that the main table will show the first revision, and when something was added/substracted or the quantities have changed, it will add a line item with the delta. To my understanding, this way I can present the final Qty at any given moment, but to display the behavior I will need to create an accumulative graph.
2) Import all the files and let the system search for the latest timestamp and display only it. That way I can also show the differences between the dates of the files (timestamps) in a graph wise rather easy.

Appreciate your feedbacks which option is better.

If you can, helping to clarify how doing it will be more than appreciated

Thanks a lot

ImkeF · Jul 8, 2015

Hi Amit,
Why not do both: Create one table that contains all the full versions (for your 1) reports) and another table that only shows the differences to the previous version (for your 2) reports). This would make your PP live much easier.

This is how you create the table with the differences:
- Load the previous version in a new query and reverse the signs (switch + to -).
- Append the data from the current round
- Group on your key attributes and SUM on amount.
- This will give you the differences between both versions.
- Filter on amount “is not 0”

AmitM · Jul 10, 2015

Hi,

That is interesting solution

2 questions:
1) won't it create a cumbersome model? many queries and PP
2) If I want to take it to the next level and have my users pick the revision (by date) they want versus the revision (by date) they want also, how do you think that can be accomplished? This way they will be able also to benchmark first revision to the latest, to see if they forecast correctly.

ImkeF · Jul 10, 2015

Hi Amit,
As you might have expected – it depends

2) depends on which answering time you’re envisaging when the users select the versions to compare? How long are your tables?:

a) instance response: aa) PP or ab) PQ with full versions for all (append) <1.1 Mio rows.
b) some secs to mins if you have to refresh your PQ queries.

1) Depends on what makes a model cumbersome for you:
a) need to use advanced DAX
b) need to use PQ as well as PP
c) many identical queries
d) many files you store your data in
e) what else?

re 1a) It’s possible to use PQ only with different xlsx files as a source, but not PP only, as you’re missing the UNION command there. Up to 1,1 Mio rows, you can stick to a pure PQ solution for this – for longer tables, you need at least PP’s muscles (storage & compression), but don’t have to use its brain (DAX).

How about the templates the users fill in their values?: Will they be stored for each round or overwritten?
How will your model be able to identify if theres a new/completed version to consider?

AmitM · Jul 10, 2015

Hi,

The templates will be filled every time and stored in the same place.
The amount of rows in each template shouldn't be more than thousands of rows.
When combining all with PQ to a single PP table, it will contain millions of rows in the future.

I couldn't teach myself DAX as well as I expected. Somehow it is quite difficult for me, so this is the main issue.

Each file name has a different name, and when using PQ I use this name to create fields.

Ex: X1111 siteA 010115.xlsx
X1111 <gs id="9722a5d0-7353-4674-af61-47480bff95d4" ginger_software_uiphraseguid="74110a59-b372-4600-85fc-3cd86419c171" class="GINGER_SOFTWARE_mark">siteA</gs> 040115.xlsx
E2222 <gs id="d42ed8d2-0d15-4a14-8b4f-3e92f5dc1414" ginger_software_uiphraseguid="57893046-6c7e-42c4-bba0-891333676eea" class="GINGER_SOFTWARE_mark">siteB</gs> 010115.xlsx

So as an example, I have 3 templates in the same folder:
1 project with 2 forecasts (the later is the most accurate one, contains all the data about the project)
1 different project (E2222) in a different site, but has the same forecast date (010115).

I thought to use calculate, filter and max as I need to both find the latest of version per project and site, not sure how to do that

Thanks

Power Query Consultation

AmitM

Board Regular

Excel Facts

ImkeF

Well-known Member

AmitM

Board Regular

ImkeF

Well-known Member

AmitM

Board Regular

Similar threads

Forum statistics

Share this page

Power Query Consultation

AmitM

Board Regular

Excel Facts

ImkeF

Well-known Member

AmitM

Board Regular

ImkeF

Well-known Member

AmitM

Board Regular

Similar threads

Forum statistics

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock