I’m guessing my posts on Performance will continue to be the most popular for some time, but I’m trying to start filling out some posts on the the Fundamentals now. Last time I talked about Calculated Columns vs Calculated Fields (Measures), today we are going to discuss “Relationships”.
Two Flavors of Tables
There are not actually two flavors of tables. It’s not that black and white. But I think it can be useful to think of tables in this way.
There are “lookup” tables (probably called “dimension” tables if you enjoy that sorta thing) and “data” tables (or “fact” tables, if that’s how you roll). Data tables generally have numbers… stuff you add up like… Sales Dollars. Lookup tables, on the other hand, are smaller and have just unique values.
Looking at the picture at the left, there are a few things to notice.
1) We have created a relationship between these two tables, and that relationship is based on the SalesTerritoryKey. Note the names don’t need to match between the tables. We know they are related because there is a cute line between them.
2) That line… it has a direction. It is pointing from the data table to the lookup table. This is maybe the most critical thing to remember about relationships: relationships have direction.
3) There is only one line between these two tables, and it is “solid”. This means the relationship is “Active”. You can have multiple relationships between the same two tables, but only one of these relationships may be active. A typical use case for this is with dates, where maybe you have an OrderDate and a ShipDate… you have to pick one to be active, then write measures to “activate” the other field as appropriate via the USERELATIONSHIP() function.
Back to our two flavors of tables… the Territories table is a “Lookup Table” and the Sales table is a “Data Table”. In a perfect world, you would have 1 data (fact) table, and maybe a bunch of lookup (dimension) tables (see larger model later in this article). We would add lookup tables for Customers, Date, Products, etc. Reality is rarely this clean.
It is completely acceptable to have “multi-hop” relationships. Maybe that Country field in the Territories table is actually a CountryId that “looks up” more information from another lookup table (making our star schema a snowflake if that’s your preferred terminology). You kinda couldn’t refer to the Territories table as a “lookup table”… at least not with respect to that relationship to a new Countries table, ya know? Also, It is common for a single lookup table to be related to many fact tables. Some tables may be hanging out without any relationships. Etc. It can all get a bit complex, but that is what makes it fun powerful.
Relationships are Many to One
Over time, I am slowly thinking less about my tables (in terms of “lookup tables” and “data tables”) and just thinking about it more in terms of the relationships. And, here the important thing to remember is that they are always many to one. There are many sales for one product. We can refer to data on the “many side” of the relationship and know exactly what we mean.
There are two cases where this many to one wants to be … non-true. The first is easy… many to many relationship. This is not supported. Well, in terms of defining power pivot relationships it is not supported. DAX very much supports it… it just takes a bit more careful thinking. But if you try to create a relationship between two fields and that fields shows up more than once in both tables, you will simply get an error.
The other case… one to one. These don’t happen super frequently, but it is certainly supported. The only thing really interesting here is that Power Pivot can not “intuit” the direction of the relationship. What I mean is that… well, like I said… relationships have direction. When you create a many to one relationships the engine always makes sure to create it in the correct direction without you having to think about it. The “one side” is always the lookup table… and the arrow will point at the lookup table, even if you draw it the other way. However, the engine can not know the correct direction for a one to one relationship… so, make sure your relationship is pointing the expected direction!
What is a Relationship?
That was a lot of words without saying what a relation is or telling you what it is good for .
At a really “mechanical” left… relationships make some intuitive sense. Just the term “Lookup Table” sort of implies the usage. It means that if you are sitting in the fact table you can “lookup” values in the lookup table. Which is to say, if I am looking at a Sales row with ProductId=777, I can go to my Products lookup table and know that the Description=Frisbee and my Cost=$0.88 and my product Category=Toys for that product.
And the other way… if I am sitting on my Products table, I can “know” which Sales were in Category=Food. So… yes, the relationship has direction but you can gather information from the other side of the relationship regardless of which side you on.
But the most important understanding is that any filters applied to a lookup table are also applied to related data tables. And, the reverse is not true. That is where the direction really matters.
It does not matter why this filter is happening. If I have a relationship between my Calendar table and Sales[SalesDate], it does not matter “how” a filter got applied to the Calendar table so that Calendar[Year]=2009… the impact is that the filter will also be applied to the data table (Sales). It could be from a slicer. It could be because you put Year on rows. It could be because you wrote a CALCULATE(). Anything. But, the reverse is not true. A filter on your Sales table has no impact on your Calendar table. Relationships have direction.
Unlearn what you have Learned
Above, you see a pivot table that I created based on the Sales model using the measure:
Total Units := COUNTROWS(Sales)
And we have 4 filters on our pivot table.
- Colors from the Products table, on rows.
- Year from the Calendar table, on rows.
- Country (which must be France) from the Country table, on a slicer
- Gender from the Customers table, on columns
There are two things I want you to notice here:
- I only wrote 1 measure. There is nothing that I wrote specific to Product Count by Color,Year,Country and Gender. I wrote one measure. The count. All the other magic is happening because of these relationships and filtering.
- There was no copying of data via VLOOKUP, etc. The data stays in its original table… I don’t need a single table!
That is a very “normal excel” way of thinking about things. “How do I do a VLOOKUP” is a super common question when folks first get to Power Pivot. You do not need VLOOKUP. You leave your data where it is, and leverage the power of relationships to build pivot tables that pull from multiple tables. You can not do this in old-skool Excel, and you never really enjoyed doing it anyway
This makes you insanely more efficient. Want to view sales by Age? Drop it on your pivot table. Done. Want to view sales by weather? Import weather data, relate OrderDate to weather date… drop “Rainfall” on your table… done.
I already mentioned that Many to Many relationships are not directly supported. There are some great articles on how to deal w/ that, so don’t stop you from loving Power Pivot There is also no support for “Composite Keys”. The relationship is between exactly 1 field in each table. If, in order to uniquely identify a row, you need 2 different columns, you must create a calculated column that does a string concatenation of the fields, then relate based on that. Feels like one they might improve in the future, but not a show stopper.
You also can’t relate a table to itself (a self-join), even if you do some tricky indirection through another table. I actually can’t recall this coming up for me… typically it’s a “parent-child” sorta thing, and for that I use PATH(), PATHITEM(), etc… view some info here. You also might be able to get away with creating a copy of your table?
Wrap It Up
This is getting longer than I anticipated. I feel a bit bad about not cleaning showing/proving that a filter on a data table doesn’t filter lookup tables, but just take my word for it I do bring it up because it has an important implication — which is when you want to filter you almost always want to filter on the lookup table. This way you a filtering the lookup table and the data/fact table. You are also applying that filter on something (typically) small – since lookup tables tend to be much smaller than data/fact tables… better for perf!
I also didn’t really nail down multi-hop relationships, but know they “just work”. A filter on an “outer” lookup table, will filter the “inner” lookup table, which will filer the data table. Easy peezy.
Somehow I didn’t mention the important dax functions RELATED() and RELATEDTABLE(), which seems odd… suppose I will cover them in the future.
As I am writing this – I am realizing that as much as I believe “Power Pivot puts the power in Power BI”… I think you could claim that it is relationships that put the power in Power Pivot!