I’ve got this theory. It’s based on incomplete understanding, so… I strove forth to complete the understanding and totally failed.
It’s basically that…
=CALCULATE([Calc], MyTable[Col1] = 5, MyTable[Col2] > 10)
should be faster than
=CALCULATE([Calc], FILTER(MyTable, MyTable[Col1] = 5 && MyTable[Col2] > 10))
Let us take this journey of failure together, and see what we can learn
Why would it be faster?
Well, to be honest, I am not REALLY going to be surprised if they have the same perf. I thought I would have to make things more complicated to see a perf difference. But, in general, I believe the first measure should be faster because each parameter passed to calculate only references one column. This is, in fact, required when using this simpler form of filter expression to calculate (you can’t use a measure and you can’t reference more than one column). These requirements get you into “guaranteed fast mode”.
The second measure breaks “the rules” of the simple form, and it references more than 1 column in a single expression. I really don’t think it matters much that we are calling FILTER… it’s that within the FILTER we are referencing two different columns, and the storage engine doesn’t enjoy that complexity… offloading it to the expression engine instead.
Are Those Equivalent?
Well, that was kind of a setup. I knew those expressions wouldn’t return the same value. But… I think I thought (heh) that this would:
=CALCULATE([Calc], FILTER(ALL(MyTable), MyTable[Col1] = 5 && MyTable[Col2] > 10))
but it turns out that wasn’t right either. I believe the actual equivalence is
=CALCULATE([Calc], FILTER(ALL(MyTable[Col1], MyTable[Col2]), MyTable[Col1] = 5 && MyTable[Col2] > 10))
Which… I guess isn’t wildly surprising or anything, but it’s worth nothing. I’m at least mildly amused by passing multiple columns to ALL(), just because I don’t do it that often. Just remember when using the simple form, you are clearing out any existing filters on that column. Also realize that the simpler form is probably just “syntax sugar”, and some added rules to steer you towards better perf.
So… was it faster?
I have no idea. I am literally writing this blog article after flat out giving up. I mean, if I was really motivated today (tough, heading into the Super Bowl!), I would have done this work on a Tabular server… where I have better access to traces and perf counters and such. Because I was being lazy, I spent more time, failed, and eventually gave up. Why? Because Power Pivot is too darn fast!
I mean, I had 15 million rows. Which… isn’t massive but certainly not tiny and is far more than traditional excel would even contemplate. I wasn’t surprised that the measures were fast. But I figured… I would be able to force enough iterations to make them slow.
SlowWrap := AVERAGEX(ALL(MyTable), [SlowCalc])
You feel me? Walk 15 million rows, and for each of the 15 million rows, go run our measures. Well, that was instant. So, I tried like umpteen variations of wrapping wrappers and using SUMX(VALUES(MyTable[SomeColumn]), [SomeWrapper])
and it was just always stupid fast. Then, I realized I wanted to eat some skittles. And so I quit. Because Skittles are delicious. #BeastMode
This was somewhat of a failure in the ol Tiny LIzard Rule #1… don’t solve problems that you don’t have. (Still looking for that elusive Rule #2). But, when you do run into perf trouble that is measure related… It is worth keeping this in mind. When possible, reference just ONE column per clause. I’m sure that has happened to me, I should have blogged about it then rather than trying and failing to make up the data
#GoHawks!
- The streak is alive! – August 10, 2017
- DAX KEEPFILTERS Plus Bonus Complaining! – July 20, 2017
- VAR: The best thing to happen to DAX since CALCULATE() – July 11, 2017
- Review: Analyzing Data with Power BI and Power Pivot for Excel – June 20, 2017
- Power BI Date Table – June 2, 2017
Scott Senkeresty here… Founder and Owner of Tiny Lizard. Yes, I have done some stuff: A Master’s Degree in Computer Science from California Polytechnic State University in beautiful San Luis Obispo, California. Over twelve years developing software for Microsoft, across Office, Windows and Anti-Malware teams… Read More >>
David Hager says
The alternative is to use a slow computer 🙂