Hi All,
So I'm usually posting on here about my Excel exploits, so something a little different today as I'm coming back to Power BI after a couple of years away.
I'm building a Sankey Chart which has 4 steps to visualise flow - there should be no difference in the start and end figures.
I'm working from three tables:
Table 1 - Source, Mid1, Count
Table 2 - Mid1, Mid2, Count
Table 3 - Mid2, Destination, Count
So it's Source to Destination going through various categories in stages Mid1 and Mid2
I initially used Union and Summarize to create a new table on which to develop the Sankey Chart - visually it worked, I have the 4 stages, but there is a fundamental drop-off. There are 33,000 records (and we end up with that figure at the Destination on the right of the Chart), but the Source (at the left of the Chart) has 99,000, so it is tripling what the original figure should be. I can also see stages from Mid1 and Mid2 in the Source 'column' on the left of the chart.
I've not built a Sankey in Power BI before so was researching how best to structure my data after failing miserably when going at it alone and various articles online seemed to suggest structuring my 'Step' data in different tables and then UNION/SUMMARIZE-ing it. If I do just 3 phases (ie Source, Mid1 and Destination) then I get 66,000 in the first 'Column'. I can't seem to get my head around how to achieve the UNION without duplicating records somehow.
So I guess the question is (being lost right now), can anyone suggest the best way to structure the data for what I'm trying to achieve?
Source has 18 categories
Mid1 has 9 categories
Mid2 has 7 categories
Destination has 3 categories
Obviously each of those 33k records can have any form of journey through these categories
Any help appreciated from you lovely guys, as always!
Mads
So I'm usually posting on here about my Excel exploits, so something a little different today as I'm coming back to Power BI after a couple of years away.
I'm building a Sankey Chart which has 4 steps to visualise flow - there should be no difference in the start and end figures.
I'm working from three tables:
Table 1 - Source, Mid1, Count
Table 2 - Mid1, Mid2, Count
Table 3 - Mid2, Destination, Count
So it's Source to Destination going through various categories in stages Mid1 and Mid2
I initially used Union and Summarize to create a new table on which to develop the Sankey Chart - visually it worked, I have the 4 stages, but there is a fundamental drop-off. There are 33,000 records (and we end up with that figure at the Destination on the right of the Chart), but the Source (at the left of the Chart) has 99,000, so it is tripling what the original figure should be. I can also see stages from Mid1 and Mid2 in the Source 'column' on the left of the chart.
I've not built a Sankey in Power BI before so was researching how best to structure my data after failing miserably when going at it alone and various articles online seemed to suggest structuring my 'Step' data in different tables and then UNION/SUMMARIZE-ing it. If I do just 3 phases (ie Source, Mid1 and Destination) then I get 66,000 in the first 'Column'. I can't seem to get my head around how to achieve the UNION without duplicating records somehow.
So I guess the question is (being lost right now), can anyone suggest the best way to structure the data for what I'm trying to achieve?
Source has 18 categories
Mid1 has 9 categories
Mid2 has 7 categories
Destination has 3 categories
Obviously each of those 33k records can have any form of journey through these categories
Any help appreciated from you lovely guys, as always!
Mads