Filtering a column based on the values in a different column in the current context

ruve1k

Board Regular
Joined
Aug 31, 2008
Messages
171
It's great to see this new section of the Mr. Excel message board!
I was wondering if any of the Pros could take a look at this thread and give some insight and hopefully suggest something more efficient.
Thanks!
 
Guys you are scaring the new people :)

If you are reading this and are new to PowerPivot, rest assured that what's being discussed here is VERY advanced relative to what you need to know.

Hell, even I don't understand Alberto half the time :)
 
Upvote 0

Excel Facts

Repeat Last Command
Pressing F4 adds dollar signs when editing a formula. When not editing, F4 repeats last command.
ruve1k - I'm intrigued about the context situation. I don't know your dataset, but I'm interested to find out why context wouldn't hold. Without building the model, it's hard to "hit the wall" but if you've already done so, I would be interested to know more.

Great solution Alberto. I was of the belief that EARLIER was recursive and could therefore be as costly as SUMX (although I guess it depends what operations it's performing). Am I wrong or is it simply an "it depends" situation?

Great discussion ruve1k
 
Upvote 0
Guys you are scaring the new people :)

If you are reading this and are new to PowerPivot, rest assured that what's being discussed here is VERY advanced relative to what you need to know.

Hell, even I don't understand Alberto half the time :)

:) Rob, believe it or not, I don't understand Excel questions much more than half the time.

Anwyay, I agree, this is much more advanced than what is normally needed... I tend to like complex discussions, my fault.
Moreover, with "normal" table size I think Ruve's first solution should perform pretty well, under 10 million rows the difference should be very very low.
 
Upvote 0
ruve1k - I'm intrigued about the context situation. I don't know your dataset, but I'm interested to find out why context wouldn't hold. Without building the model, it's hard to "hit the wall" but if you've already done so, I would be interested to know more.

Great solution Alberto. I was of the belief that EARLIER was recursive and could therefore be as costly as SUMX (although I guess it depends what operations it's performing). Am I wrong or is it simply an "it depends" situation?

Great discussion ruve1k

EARLIER is very fast, it just need to go back one level in the stack of row contexts. A better (and cleaner) solution would be to completely normalize the model, going into a pure many-to-many pattern, in that case you can follow the canonical pattern without the need to use EARLIER.
Moreover, with PowerPivot 2.0, if you leverage the m2m pattern, you can use SUMMARIZE, making the formula much faster.

Alberto
 
Upvote 0
Moreover, with "normal" table size I think Ruve's first solution should perform pretty well, under 10 million rows the difference should be very very low.
Alberto,
My data set has only 100K rows.
Should performance be significantly worse if the measure was a distinct count rather than a sum?
E.g.
Code:
=SUMX(VALUES(myTable[Field_2]), 
    CALCULATE([B][COLOR=Red][DistinctCountMeasure][/COLOR][/B],
             ALL(myTable),
             myTable[Field_1]=EARLIER(myTable[Field_2])
    )
 )
...because my formula with a sum measure inside CALCULATE is not giving me trouble. It's actually a version with a distinct count inside the CALCULATE that is really slow.
 
Upvote 0
Alberto,
My data set has only 100K rows.
Should performance be significantly worse if the measure was a distinct count rather than a sum?
E.g.
Code:
=SUMX(VALUES(myTable[Field_2]), 
    CALCULATE([B][COLOR=red][DistinctCountMeasure][/COLOR][/B],
             ALL(myTable),
             myTable[Field_1]=EARLIER(myTable[Field_2])
    )
 )
...because my formula with a sum measure inside CALCULATE is not giving me trouble. It's actually a version with a distinct count inside the CALCULATE that is really slow.

Well... no. Distinct count are very fast in Vertipaq. If your formula is just COUNTROWS (DISTINCT (...)) it should be fast.
If, on the other hand, the formula is more complex... well, it would be much better if you post the complete formula, otherwise any consideration would be pointless.

Alberto
 
Upvote 0
I am actually just using a simple DISTINCTCOUNT([Column]) in the SQL Server 2012 RC0 release and I find that performance is much slower than the same formula with a SUM.
 
Upvote 0
David,
Now that I had a few minutes to breath, I reread your suggestion about the single-column table of distinct values from Filed_1. It sounded like a good idea but when I tried it I didn't see any material impact on the sluggish performance.
Now I'm just wondering conceptually if it make any difference when applying a filter expression to a measure, whether the filter expression is applied to the Field_1 on the fact table or to Field_1 on a related dim table.

Hopefully over the weekend I'll have chance to work on Alberto's solution and see how that performs for me.
 
Upvote 0
Hi ruve1k. That's interesting. I usually find a reasonable performance gain using a related dim table, although more so on huge fact tables where the number of distinct dimension values is relatively small.

I would be interested to hear how you get on with Alberto's solution. Sounds like you've got a fun weekend coming up!

Sorry I couldn't be of more help. If I get chance, I'll try a few options on some of my datasets and see if I come up with anything.
 
Upvote 0

Forum statistics

Threads
1,224,862
Messages
6,181,465
Members
453,045
Latest member
Abraxas_X

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top