I made a PowerPivot model for which I needed combinations (a cross-join between tree dimensions, where A < B < C to filter duplicates). The result would exceed Excel's limit of 1 million rows, and I failed to find an efficient way to do this, i.e. supporting easy refresh on data updates. I considered posting this question on StackOverflow, then saw the PowerPivot-related activity on there is not very high, so then opted to turn here instead.
My workaround (for one cross-join between 2 dimensions at a time) involved transposing (Excel) the secondary dimension so as to make a table, then unpivoting it (Power Query) and loading the result (>1M rows) directly to the data model, filtering out duplicates in PowerPivot to load the results (<1M rows) back to Excel to repeat the process. Not very elegant or efficient.
My reasoning about the alternatives:
Am I missing something? How would you guys approach this?
Potential ways this could be resolved if they exist:
- a way to populate a PowerPivot data model table from a DAX/MDX query
- a way to have DAX Studio directly dump results to the data model or have it communicate with Power Query for the added query refresh functionality
- a way to use MDX/DAX/SQL queries in Power Query's M (potentially through VB?).
My workaround (for one cross-join between 2 dimensions at a time) involved transposing (Excel) the secondary dimension so as to make a table, then unpivoting it (Power Query) and loading the result (>1M rows) directly to the data model, filtering out duplicates in PowerPivot to load the results (<1M rows) back to Excel to repeat the process. Not very elegant or efficient.
My reasoning about the alternatives:
- DAX does have a CROSSJOIN function, but I don't see a way PowerPivot allows directly populating a *table* from a DAX query (as opposed to a column, which fails for cross-joins as the result is multi-column). Trying to directly operate on the results in the query without an intermediate table seems inefficient to me, as I'd like to base multiple calculations on the intermediate cross-joined table.
- DAX Studio could easily do crossjoins using MDX OR using DAX, but can only directly populate Excel (as opposed to the PP data model), meaning it's bound to its 1M row limit (and doesn't support auto-refresh).
- Power Query can do it using the above-mentioned workaround (un-pivoting a table of all combinations), but is too dumb to filter results for duplicates and puts in an extra Excel step, while...
- Excel itself suffers from the 1M row limit and also fails auto-refresh.
Am I missing something? How would you guys approach this?
Potential ways this could be resolved if they exist:
- a way to populate a PowerPivot data model table from a DAX/MDX query
- a way to have DAX Studio directly dump results to the data model or have it communicate with Power Query for the added query refresh functionality
- a way to use MDX/DAX/SQL queries in Power Query's M (potentially through VB?).