My theory for today’s post is to give a really serious look at a typical LTD measure, in hopes that it can reinforce our understand of filter context. Let’s see how it goes
LTD := CALCULATE([Sales], FILTER(ALL(Calendar), Calendar[Date] <= MAX(Calendar[Date]))
We can see at left that indeed it works. The $3,266,374 from 2001 plus the $596,747+$550,817 equals the $4,412,937 in the LTD measure, showing that it works at both the Year level, and the Month level… which is pretty cool.
Before I continue, I would like to say I am kind of tired of writing this pattern, and wouldn’t mind a DATESLTD() function. Though, I wouldn’t want it to work only a “real” date tables… since I use this on say… a table of just years.
Anywho… let’s dissect this guy a bit.
First question: Why do we filter the Calendar table?
The reality is if you instead filtered based on Sales[OrderDate]… you are going to get (pretty darn close to) the same results… in this case. The only differences are going to be associated with where your Sales data ends vs where your Calendar table ends. However, that just isn’t how you do things You are going to filter by your lookup table probably 20 to 1 over your fact table.
Off the top of my head, I can think of two reasons for that. One, performance. There are simply less rows in the Calendar table, so filtering it is real darn fast. If you have millions of rows in your fact table… filtering it would take longer.
The second reason is that it is possible you want the filter to flow to some other table, and not just Calendar to Sales, but say… Calendar to Discount, Taxes, Returns, whatever. Some other fact table that should also be filtered by the Calendar table to correctly calculate [Sales].
Second question: Why do we need the ALL( ) around Calendar?
The all is there to remove filters from the initial filter context. If it were not there… we would only be filtering rows in the current date range. For Nov 2002, our measure would turn into “Nov2002-To-Date”, which looks well, exactly like [Sales]. I just tried adding individual dates onto rows of the pivot, but that just make a new initial context that was day specific, and we got (for example) Nov13th2002-To-Date for the Nov13 cell, which again looks remarkably like [Sales]. If you want to include more rows than just the current context… ALL( ) is kinda required.
Third question: I see you are filtering based on just Calendar[Date], should our ALL actually be ALL(Calendar[Date]) ?
That’s a good observation and question. I really didn’t know you had it in you
Functionally, I can’t think of why it would matter. Because we are filtering at the lowest level (days in our case), I don’t tend to think it matters. It’s not like 12/5/2004 is suddenly going to be in February if we clear the month filters via our table wide ALL. Don’t worry, if I am wrong I am confident Marco will descend from on high to tell us.
This situation would be different if we wrote something like:
In that case, removing all filters off the Calendar would remove the notion of days/weeks/months, causing all rows to return the Year-based values (2nd column at right) on every row.
But ALL(Calendar[CalendarYear]) would leave those filters (on days/weeks) and produce a kinda odd result where we get a correct LTD at the year level, but just normal sales value at the Day/Month/Quarter level (3rd column of numbers in our chart). Obviously a silly measure, but it does illustrate the point that filtering at the most granular level seems… good.
Back to our measure, I don’t have a strong preference. I could imagine some perf impact, but I don’t have an intuitive feel for which style is better. <shrug>.
Forth Question: What’s the deal with using MAX() ?
Man, I dunno. It is so weird . Lemme add some colors here to easily refer to terms…
FILTER(ALL(Calendar), Calendar[Date] <= MAX(Calendar[Date]))
Probably the most important concept in all of DAX is understanding “what is my filter context”.
Based on previous discussion, I think we already understand this one. That Calendar starts in the “original filter context” (rows and columns in our filter table, and any slicers and such). The ALL( ) just nukes off that filter context completely. And we are left with the entire Calendar table.
Now, the FILTER( ) function is going to iterate over every row (from the whole Calendar), doing the comparisons. I am gonna go ahead and guess it is creating a row context to do this iteration… cuz, how else would it work?!
Easily confirmed at this term since it is sooooo nekked! It is not wrapped in any sort of aggregate function, which always totally blows up… unless you are inside a row context (just like evaluating a calculated column!).
Okay, so… we are iterating over the whole table (which started as the initial filter context, and got stripped by the ALL() ), and we are walking one row at a time comparing this column and then we hit that crazy MAX. If we ignore the row/filter context discussion briefly… we should at least point out that MAX( ) totally “makes sense” in that if we are looking at the Month of June… we want to include all the days of June, so we use MAX(Calendar[Date]) to grab June 30 and before. MIN( ) would give us only June 1 for June, which would be non-good.
Okay, so what filter context is used by the MAX( ) ? By inspecting the results, we can “intuit” that it is the original filter context (I would assume it is “whatever filter context existed before you got into the FILTER() ).
It is clearly not the max of the entire calendar table, else every cell would give us the same value, which would be the grand total, and… we don’t get that.
But it is also not just the one row in our row context created by the FILTER() function… because that wouldn’t filter anything out. That would be … is Jan7 <= Jan7? Yep. Is Mar13 <= Mar13? Yep. Nothing would filter out.
So clearly, it is the original filter context. I mean… I guess? By induction. Why is this? I HAVE NO STINKING IDEA. I mean… it just is. I guess? <shrug>. I just calls it like I see’s ‘em.
If our theory is that “aggregate function inside the test expression to FILTER() use the original filter context”… then what would you expect to happen if I did this:
FILTER(ALL(Calendar), MIN(Calendar[Date]) = MAX(Calendar[Date]))
When does the MIN = MAX? Only at the day level — min day of (June) does not equal max(June)… but min(Jun7) does equal max(Jun7).
So, I expect blank’s at the month/quarter/year level, and… something at the day level. And indeed, that is what I get! Hurray! Of course, I’m not sure I expected the grand total value…?
I got the blank Year/Quarter/Month I expected, but in my head… I was going to get just July 1 data on July 1. My head was wrong.
What actually happened (when evaluating say… the 7/1/2001 cell) is that our FILTER( ) function walked ALL(Calendar), then said… okay… for the first row in the Calendar… Jan 1, 2001… does MIN(context) = MAX(context)? Which has nothing to do with Jan 1. context is merely… whatever the original context was (apparently I said 7/1/2001 above). Then it does the same for Jan 2. (Does 7/1 = 7/1?) And when it gets all done, looking at every value in the Calendar table, and 7/1/2001 continued to equal 7/1/2001 for every single row in the Calendar table… they were all included in our final calculation, and thus… Grand Totals For Everyone!
I think I will go ahead and call our understanding of this measure complete If there is a fifth question, I DON’T WANT TO HEAR IT, BECAUSE MY HEAD HURTS AFTER THE 4TH QUESTION!!
Was our experiment successful? Did understanding THE HECK out of the LTD measure aid in our understanding of filter context…?