Poking around on the web I find various formulas for creating histograms. I'm getting different results for each formula and I want to know which one is accurate.
I have a table of purchase orders and I want to get the distribution of how many POs are in various dollar ranges (e.g. how many are for between $100 and $500). In the Orders table each purchase order can have multiple lines, which add up to the total value of the purchase order. So I need to sum [Extended Merch Amt] for all the lines with the same Orders[PO No.] value to get the full PO value. My base measures are:
The competing histogram measures are:
And my results are:
[TABLE="width: 384"]
<tbody>[TR]
[TD]Label[/TD]
[TD]PO Count Distribution[/TD]
[TD]PO Distribution Count[/TD]
[/TR]
[TR]
[TD]Up to $10[/TD]
[TD="align: right"]89,426[/TD]
[TD="align: right"]7,557[/TD]
[/TR]
[TR]
[TD]$10 to $50[/TD]
[TD="align: right"]133,536[/TD]
[TD="align: right"]34,782[/TD]
[/TR]
[TR]
[TD]$50 to $100[/TD]
[TD="align: right"]99,604[/TD]
[TD="align: right"]32,989[/TD]
[/TR]
[TR]
[TD]$100 to $500[/TD]
[TD="align: right"]144,745[/TD]
[TD="align: right"]112,692[/TD]
[/TR]
[TR]
[TD]$500 to $1,000[/TD]
[TD="align: right"]50,103[/TD]
[TD="align: right"]50,861[/TD]
[/TR]
[TR]
[TD]$1,000 to $5,000[/TD]
[TD="align: right"]55,332[/TD]
[TD="align: right"]65,079[/TD]
[/TR]
[TR]
[TD]$5,000 to $10,000[/TD]
[TD="align: right"]8,851[/TD]
[TD="align: right"]12,483[/TD]
[/TR]
[TR]
[TD]$10,000 to $50,000[/TD]
[TD="align: right"]6,533[/TD]
[TD="align: right"]9,261[/TD]
[/TR]
[TR]
[TD]$50,000 to $100,000[/TD]
[TD="align: right"]943[/TD]
[TD="align: right"]1,100[/TD]
[/TR]
[TR]
[TD]$100,000 to $500,000[/TD]
[TD="align: right"]828[/TD]
[TD="align: right"]1,021[/TD]
[/TR]
[TR]
[TD]$500,000 to $1 million[/TD]
[TD="align: right"]107[/TD]
[TD="align: right"]125[/TD]
[/TR]
[TR]
[TD]$1 million to $5 million[/TD]
[TD="align: right"]116[/TD]
[TD="align: right"]121[/TD]
[/TR]
[TR]
[TD]$5 million to $10 million[/TD]
[TD="align: right"]22[/TD]
[TD="align: right"]22[/TD]
[/TR]
[TR]
[TD]$10 million to $100 million[/TD]
[TD="align: right"]9[/TD]
[TD="align: right"]9[/TD]
[/TR]
[TR]
[TD]$100 million +[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]Grand Total[/TD]
[TD="align: right"]328,103[/TD]
[TD="align: right"]328,103[/TD]
[/TR]
</tbody>[/TABLE]
The rows for PO Count Distribution add up to more than the Grand Total so I suspect the PO Distribution Count is the accurate one since I'm doing the VALUES(Orders[PO No.]) but I like the simplicity of PO Count Distribution and am wondering if there's a way to make it work better. Any insight as to why they work as they do, and if there's a simpler solution? Thanks.
I have a table of purchase orders and I want to get the distribution of how many POs are in various dollar ranges (e.g. how many are for between $100 and $500). In the Orders table each purchase order can have multiple lines, which add up to the total value of the purchase order. So I need to sum [Extended Merch Amt] for all the lines with the same Orders[PO No.] value to get the full PO value. My base measures are:
Code:
PO Spend:=SUM ( Orders[Extended Merch Amt] )
Code:
PO Count:=DISTINCTCOUNT ( Orders[PO No.] )
The competing histogram measures are:
Code:
PO Count Distribution:=CALCULATE ( [PO Count],
FILTER (
Orders,
AND (
[PO Spend] >= MIN ( tblDollarRanges[Min] ),
[PO Spend] < MAX ( tblDollarRanges[Max] )
)
)
)
Code:
PO Distribution Count:=CALCULATE ( [PO Count],
FILTER (
VALUES ( Orders[PO No.] ),
COUNTROWS (
FILTER (
tblDollarRanges,
[PO Spend] >= tblDollarRanges[Min]
&& [PO Spend] < tblDollarRanges[Max]
)
)
)
)
And my results are:
[TABLE="width: 384"]
<tbody>[TR]
[TD]Label[/TD]
[TD]PO Count Distribution[/TD]
[TD]PO Distribution Count[/TD]
[/TR]
[TR]
[TD]Up to $10[/TD]
[TD="align: right"]89,426[/TD]
[TD="align: right"]7,557[/TD]
[/TR]
[TR]
[TD]$10 to $50[/TD]
[TD="align: right"]133,536[/TD]
[TD="align: right"]34,782[/TD]
[/TR]
[TR]
[TD]$50 to $100[/TD]
[TD="align: right"]99,604[/TD]
[TD="align: right"]32,989[/TD]
[/TR]
[TR]
[TD]$100 to $500[/TD]
[TD="align: right"]144,745[/TD]
[TD="align: right"]112,692[/TD]
[/TR]
[TR]
[TD]$500 to $1,000[/TD]
[TD="align: right"]50,103[/TD]
[TD="align: right"]50,861[/TD]
[/TR]
[TR]
[TD]$1,000 to $5,000[/TD]
[TD="align: right"]55,332[/TD]
[TD="align: right"]65,079[/TD]
[/TR]
[TR]
[TD]$5,000 to $10,000[/TD]
[TD="align: right"]8,851[/TD]
[TD="align: right"]12,483[/TD]
[/TR]
[TR]
[TD]$10,000 to $50,000[/TD]
[TD="align: right"]6,533[/TD]
[TD="align: right"]9,261[/TD]
[/TR]
[TR]
[TD]$50,000 to $100,000[/TD]
[TD="align: right"]943[/TD]
[TD="align: right"]1,100[/TD]
[/TR]
[TR]
[TD]$100,000 to $500,000[/TD]
[TD="align: right"]828[/TD]
[TD="align: right"]1,021[/TD]
[/TR]
[TR]
[TD]$500,000 to $1 million[/TD]
[TD="align: right"]107[/TD]
[TD="align: right"]125[/TD]
[/TR]
[TR]
[TD]$1 million to $5 million[/TD]
[TD="align: right"]116[/TD]
[TD="align: right"]121[/TD]
[/TR]
[TR]
[TD]$5 million to $10 million[/TD]
[TD="align: right"]22[/TD]
[TD="align: right"]22[/TD]
[/TR]
[TR]
[TD]$10 million to $100 million[/TD]
[TD="align: right"]9[/TD]
[TD="align: right"]9[/TD]
[/TR]
[TR]
[TD]$100 million +[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]Grand Total[/TD]
[TD="align: right"]328,103[/TD]
[TD="align: right"]328,103[/TD]
[/TR]
</tbody>[/TABLE]
The rows for PO Count Distribution add up to more than the Grand Total so I suspect the PO Distribution Count is the accurate one since I'm doing the VALUES(Orders[PO No.]) but I like the simplicity of PO Count Distribution and am wondering if there's a way to make it work better. Any insight as to why they work as they do, and if there's a simpler solution? Thanks.
Last edited: