So, here we are… staring down two tables in the field of insight battle. It’s a standard (if highly simplified) “Budget vs Actuals” situation. We are total n00bs, and we see some fields that look similar, and we think… “sweet, let’s relate those two bad boys!” Alas:
The relationship cannot be created because each column contains duplicate values. Select at least one column that contains only unique values.
Oh. Right. As we discussed here, Power Pivot does not directly support Many to Many relationships. While you can do some DAX magic to work around that…
trying to directly relate these fields causes the error you see above. In case it is not totally clear, the error is just saying the same item showed up multiple times on both sides… eg, there are 4 candles in the Budget table and there are 4 candles in the Actuals table (as in… there is more than just one).
Ok, so what can we do about this?
Well, first we need to recognize there are “real” many to many problems… then there are “fake” ones like the example above
“Fake” Many to Many Problems
I boldly claim the initial example was a totally “fake” Many to Many problem, and I present my evidence below and left. All I did was create a new “lookup” table with just the unique values of the products in it. We made a table that can be the one side of a many to one relationship. Well, technically it’s the one side of two one to many relationships (both Actuals and Budgets).
And this will work super fantastic. Any filters on Products[Product] (say, putting them on Rows of a pivot table) will cause the filters to be applied to both the Actuals and Budget tables, allowing us to do some fancy comparisons.
Assuming we wish to do those comparisons by date… we will have the exact same problem. Trying to relate Actuals[Date] to Budgets[Date] will give us the big ugly error message about duplicate values.
So, we will have a separate lookup table for dates, and relate both Budgets and Actuals tables to that table. Likely, we were going to have a Calendar table anyway – since that is the only way to use all the cool DAX Time Intelligence functions, as described here.
This whole scenario is fairly common. In my experience, it is probably more common that “real” many to many problems. I should mention that Power Query is a great way to create these unique tables of values from raw data… but we will leave that for another day. Well, unless Ken Puls already had a blog entry for that, in which case… hopefully he points us to it in the comments
“Real” Many to Many Problems
It’s past midnight here in the Seattle area, and I refuse to go back and update the previous section to provide a consistent example. I’m a bad bad man like that. Just go with it. Okay, we are now a rather eccentric store selling products you see at right (displayed along with their prices… in our “Products” lookup table). Below that is our Sales table, showing just… for each date, how many of each product did we sell. Great, so far so good.
We write a simple measure…
Total Sales := SUMX(Sales, Sales[Quantity] * RELATED(Product[Price]))
And we can now easily create a pivot table with total sales by product. Hurray for us!
But then, disaster strikes. Our crazy boss decides he wants to “categorize” the products and view total sales by these categories. You like that idea well enough, but the boss decides that some products fit across multiple categories. While a banana is clearly “Food”, it also looks like a gun, so it is a “Weapon”. Naturally.
A slingshot is both a Toy and a Weapon. Etc. So, you end up with a table (like at left), where the same item may show many times, and the same category may show many times.
One product may be in Many categories.
One category may have Many products.
There is a Many to Many relationship between categories and products. Things just got “Real”. Notice, that this was really a business requirement. We don’t really have this problem because of some geek technology weirdness… we have this problem because the business said we have this problem, and now we have to solve it.
From a modeling/relationship standpoint, I don’t really find this too weird actually. Having a Products table and Sales table feels super natural. And then, we just need to add‘in our mapping of Products to Categories. This table that I call “ProductCategoryMap” can’t be used as a “lookup table”, as neither set of values are distinct (the values of Product are not unique, the values of Category… also are not unique).
What I am NOT showing you, because I think it vaguely confuses things at this point… is that you would likely ALSO have a separate (lookup) table for the Categories. One row per category, and you might have things like… Category Manager, Category Id, Category Short Name, whatever. That table would work totally normal. It would filter the ProductCategoryMap, like any other Lookup table filters the fact tables hanging off it.
Okay… so, we have this map (frequently called a “bridge” table). Uh… does it like… do anything?
Spoiler: Nothing good
See at right… if I put Category from the map onto rows, and put our [Total Sales] measure, it’s clearly not “working”.
We really shouldn’t be surprised by that. We are doing a SUM of rows in the Sales table, and there is no relationship between the Sales table, and the ProductCategoryMap table. They are sorta kinda “indirectly” related via the Products table… in that any filter on Products[Product] will filter both Sales and ProductCategoryMap tables per the usual “filters in the lookup table, flow down to the fact/data tables” or “filters on the 1 side of a 1 to many, are applied to the many side”. However you like to think about that.
Okay, let’s boldly write a measure that does work.
Total Sales M2M := CALCULATE([Total Sales], ProductCategoryMap)
I guess that looks weird. I dunno. It used to look totally magic to me, but I am starting to get used to it. It’s easy to remember from a “pattern” standpoint… “Oh, your measure isn’t paying attention to some table used for many to many purposes… well, just pass the bridge table to CALCULATE, and you are good”. That’s kinda… it.
The 2nd (and also 3rd, 4th, etc) parameters to calculate can be a simple expression (like Products[Product] = “Ice Cream”)… but that is really just syntax sugar for the “real” syntax… which is that it takes a table. Now, almost always, we pass a table… because it is returned from a function… say… ALL(), VALUES(), FILTER(), or even SAMEPERIODLASTYEAR(). And what it means is “calculate the given measure… but ONLY for rows that match in this table I am passing you”.
It’s more uncommon what we are doing here… just… flat out passing the whole table. But, it does suit our needs quite nicely. If we put ProductCategoryMap[Category] on rows of a pivot… then our new [Total Sales M2M] measure will be evaluated for each row… and once that Category filter gets applied, we are left with just the Products in our bridge table that map to the Category on rows. (Yes, I realize you need to go re-read that sentence. Sorry, go ahead… I will wait).
Once we are left with just the Products mapping to the correct category… well, apparently CALCULATE also filters the Products table by the same Products. Now, if you are a student of Rob Collie and “filters flow down hill” … this is where the model breaks down a bit… so, feel free to just roll with it, follow the pattern, and don’t worry about it. Rob has been wildly successful without every considering that a filter on the “many side” can impact the “one side” of a relationship… but that is exactly what is happening here. I have learned to adjust that to think “if we have a row context, the filters flow both ways”… and even then, I only think about that when weird stuff is happening or I need to work on a many to many problem. Rob’s mental model works 99% of the time, don’t lose that if you have it.
Anyway, … once the filtered products are applied to the Products table (from the bridge), we are back to normalcy, in that… the filter will also flow into the fact/data table we called “Sales”, and only products within that category will be applied in our SUMX expression.
And you can see that, magically, it works! It’s a bit un-nerving to see that adding Food+Toy+Weapon does not equal the grand total (21.5 + 10.75 + 12 does not equal 32.25). Of course, this is because Rock is in two categories. Banana and Sling Shot… are in two categories. The numbers are totally correct and what you want, just… maybe not what you are used to
And that my friends was A LOT of words just to say “take your measure, wrap it in a calculate, and pass a map/bridge table as a parameter to the calculate”. Such is life.
That a Wrap
The summary here is basically “You probably don’t even have a many to many problem… just go create a lookup table, silly! But if you do… create a bridge table, and pass it to calculate”. Easy Cheezy Lemon Squeezy!
Again, you will probably have a lookup table for each column in your map/bridge… I left off a Category lookup here because it will work “like normal”… the important thing here is the bridge table (which is also the one you pass to CALCULATE to make your M2M measures work).
If nothing else, this should maybe convince you that “Power Pivot does not support Many to Many relationships” is totally false. You just have to do some pretty minor DAX work…and it works super peachy keen.
- The streak is alive! – August 10, 2017
- DAX KEEPFILTERS Plus Bonus Complaining! – July 20, 2017
- VAR: The best thing to happen to DAX since CALCULATE() – July 11, 2017
- Review: Analyzing Data with Power BI and Power Pivot for Excel – June 20, 2017
- Power BI Date Table – June 2, 2017
Scott Senkeresty here… Founder and Owner of Tiny Lizard. Yes, I have done some stuff: A Master’s Degree in Computer Science from California Polytechnic State University in beautiful San Luis Obispo, California. Over twelve years developing software for Microsoft, across Office, Windows and Anti-Malware teams… Read More >>
Donald Parish says
I built this example using Power BI Designer, and it detects the relationships automatically. No magic DAX needed!
Scott Senkeresty says
Yes! Great thing to point out Donald. Designer supports many to many relationships “directly” ! Pretty awesome.
lisa says
hi – how do i use the Power BI Designer? I’m having the same issue and can not resolve it. Any help will be greatly appreciated.
Dan F. says
This was really helpful. Thanks for posting this!
Isaac D says
I get the use of a lookup table as a way of solving the “fake” many-to-many problems, and that works for me as long as I have a static set of unique values.
What do I do when I need to generate the lookup table dynamically by pulling unique values from one or more of my fact tables?
For example, I want to create a table for customer names and phone numbers based on email addresses. All of the incoming connections from the ERP have duplicate values in the EmailAddress fields.
I could manually create a lookup table which contains all the unique email addresses at a given point in time, then relate the fact tables to that lookup table, but then the company adds new customers with new email addresses, and I’ll need to update the lookup table again and again.
Is there a simple way to use DAX to generate a lookup table dynamically using the most recent set of unique email addresses and the customer names and numbers most recently associated with those addresses?
Isaac D says
I should mention that I’m using Excel 2013 with the PowerPivot addin, but I don’t have PowerQuery.
Agustin Garcia says
Hello,
I have a similar case, like the “‘fake’ Many to Many problem” with two tables like:
Budget: Proj Name, Start Date, Department (and some other)
Actuals: Proj Name, Start Date, Department (and some other columns)
One of my needs is to compare variances by Proj Name and by StartDate. I tried by creating two lookup tables (one with ProjName only and the other with Start Date column only)
When I setup the relationships, I can do it only by referring to one of the lookup tables (either PrjName or StartDate), but if I try to create a relationship with the other lookup and the two main tables (Budget and Actuals), it generates but as inactive, stating that another relationship already exists (“There is already an active relationship between these two tables. You can deactivate the existing relationship, and them make this relationship active”)
I basically need to identify in which Quarter/Month the variances occur and then understand which project on that certain Q/Month is causing it
Thanks in advance for looking into it.
ed says
Could you not do this in some cases with concatenating fields? For example, in our ERP system, the OrderHeader table and the OrderLine table require you to join in a many-to-many the InvoiceNumber, OrderNumber, and OrderType. Those 3 fields give a unique string. Easy to do in Access, but DAX won’t do it.
But it seems I could create a new custom calculation field that was simply InvoiceNumber&OrderNumber&OrderType in both tables, then join those in PowerPivot.
Scott Senkeresty says
This is a bit different. While it is true that Power Pivot doesn’t let you create a relationship between two tables using more than 1 field (and for that problem, concatenated fields is a good fix), many to many problems are a bit different. Think of tables “teams” and “players” that includes historical data. 1 team obviously has many players… but… across history… a single player might be on multiple teams too. There is Favre the Packer and Favre the… whatever other freakin team he played for. Manning of the Colts, and Manning of the Broncos. Many to Many.
ed says
Thanks. So my understanding of many-to-many is incorrect? i.e. that definition does not include the example I gave above?
I *do* understand why my example would not work in your scenario, which is why I said “some cases” but if what I showed cannot be within the definition of many-to-many, then yes, my comment is off base.
In any event, I’ve archived your article off in to my Evernote account for future reference and I know it will be helpful, assuming MS doesn’t enhance the DAX engine before I have a situation like described. 🙂