Aggregate/subset PercentRank() functionality

mountainclimber11

Board Regular
Joined
Dec 1, 2009
Messages
79
We use PERCENTILERANK.INC() in Excel with no issues. The columns below labeled "Pctl Inc" use this formula:

Code:
=PERCENTRANK.INC($B$2:$B$6,B2,3)

However, in other cases we are looking for subset percent rank functionality (i.e. rank a subset of the values based on some other condition/filter). We have come close, but the functionality is slightly/significantly off.

My formula to achieve this is based off this link:
Calculate Percentile and Conditional Ranking in Excel Using SUMPRODUCT

Here is the specific formula I am referring to in that link:
<code style="margin: 0px; padding: 0px; border: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, sans-serif; white-space: inherit;">
Code:
=(SUMPRODUCT(($A$2:$A$6=A2)*(B2>=$B$2:$B$6)))/COUNTIF($A$2:$A$6,A2)


</code>(Note that the site link has a typo in the formula and mine is the one you want to use to test.)

If everything is working properly you will get ranks like this:

[TABLE="width: 307"]
<tbody>[TR]
[TD]Subset Labels[/TD]
[TD]Values to Rk[/TD]
[TD]Pctl Inc[/TD]
[TD]Pctl W/in[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]100[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.25[/TD]
[TD="align: right"]0.5[/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]0.5[/TD]
[TD="align: right"]0[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]-1[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0[/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]0.75[/TD]
[TD="align: right"]1[/TD]
[/TR]
</tbody>[/TABLE]


"Subset Labels" -> tells you where to subset. In this sample there are 2 categories that we want to create subsets for: "a" and "b". That is, we want a percentile rank of all the a's amongst the the a's only. And the same for the b's.
"Values to Rk" -> the values that should be ranked.
"Pctl Inc" -> is just a regular PERCENTRANK.INC() formula. See formula above.
"Pctl W/in" -> this is the column/ranks I am trying to create via a percentile rank "within" the subset. See formula above. The ranks in the table above were manually typed in and do NOT reflect the subset example formula above.

One way to easily test your formula is to change the subset labels so they are all the same and compare the resulting ranks to the a regular PERCENTRANK.INC() formula. In that case (all of the subset labels are the same) if the subset column is correct you should get values like these:

[TABLE="width: 256"]
<tbody>[TR]
[TD="width: 64"]Subset Labels[/TD]
[TD="width: 64"]Values to Rk[/TD]
[TD="width: 64"]Pctl Inc[/TD]
[TD="width: 64"]Pctl W/in[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]100[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.25[/TD]
[TD="align: right"]0.4[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]0.5[/TD]
[TD="align: right"]0.6[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]-1[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.2[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]0.75[/TD]
[TD="align: right"]0.8[/TD]
[/TR]
</tbody>[/TABLE]

So, the goal is to emulate the PERCENTRANK.INC() formula results, but on a subset/aggregated basis. Again the Pctl W/in column above was hand typed in, but it is what we want.

It's a little confusing, but hopefully it is clear enough.

For all formulas and tables in this post the first cell "Subset Labels" is A1.
 
Your 2 examples are not consistent. The Pctl/W/in in the top table goes from 0 to 1 within a group, but the bottom table it goes from 0.2 ( being 1/5 of 1 for 5 values I'd say ). So which way do you want it to work?
 
Upvote 0
Try using an "array formula", e.g. this version

=PERCENTRANK.INC(IF(A$2:A$6=A2,B$2:B$6),B2,3)

confirmed with CTRL+SHIFT+ENTER (so that you see curly braces like { and } around the formula)
 
Upvote 0
[TABLE="width: 768"]
<colgroup><col width="64" span="12" style="width:48pt"> </colgroup><tbody>[TR]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[TD="width: 64"][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 3"]there are 6 a results[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="align: right"]3[/TD]
[TD="align: right"]24[/TD]
[TD="align: right"]15[/TD]
[TD="align: right"]20[/TD]
[TD="align: right"]8[/TD]
[TD="align: right"]12[/TD]
[TD][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]24[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]15[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]smallest[/TD]
[TD="align: right"]3[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]largest[/TD]
[TD="align: right"]24[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]20[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]mean[/TD]
[TD="align: right"]13.66667[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]8[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 4"]assume you want 4 equal percentiles[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]12[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]99[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 5"]there are 22 numbers ie 3,4,5…………..24[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 3"]o ne quarter of this is 5.5[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 5"]do you want to count how many numbers go into bins[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]3-8.5[/TD]
[TD][/TD]
[TD="align: right"]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]8.51 - 14[/TD]
[TD][/TD]
[TD="align: right"]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 2"]14.01 to 19.5[/TD]
[TD="align: right"]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD="colspan: 2"]19.51 to 24[/TD]
[TD="align: right"]2[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
 
Upvote 0
Here's an alternative formula ... but, as I said before, your examples are not consistent, so the formula you gave will generate one output, and mine will cover the other:

=(SUMPRODUCT(($A$2:$A$6=A2)*(B2>=$B$2:$B$6))-1)/(COUNTIF($A$2:$A$6,A2)-1)
 
Upvote 0
Your 2 examples are not consistent. The Pctl/W/in in the top table goes from 0 to 1 within a group, but the bottom table it goes from 0.2 ( being 1/5 of 1 for 5 values I'd say ). So which way do you want it to work?

Grrr... You are correct. My tables are not consistent with my descriptions. Here is the corrected post. Bold text changed/is new.

We use PERCENTILERANK.INC() in Excel with no issues. The columns below labeled "Pctl Inc" use this formula:

Code:
=PERCENTRANK.INC($B$2:$B$6,B2,3)

However, in other cases we are looking for subset percent rank functionality (i.e. rank a subset of the values based on some other condition/filter). We have come close, but the functionality is slightly/significantly off.

My formula to achieve this is based off this link:
Calculate Percentile and Conditional Ranking in Excel Using SUMPRODUCT

Here is the specific formula I am referring to in that link:
<code style="margin: 0px; padding: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, sans-serif; line-height: 13px; color: rgb(51, 51, 51); white-space: inherit; border: 0px; background-color: rgb(250, 250, 250);">Code:
=(SUMPRODUCT(($A$2:$A$6=A2)*(B2>=$B$2:$B$6)))/COUNTIF($A$2:$A$6,A2)


</code>(Note that the site link has a typo in the formula and mine is the one you want to use to test.)

If everything is working properly you will get ranks like this:

[TABLE="class: cms_table, width: 307"]
<tbody>[TR]
[TD]Subset Labels[/TD]
[TD]Values to Rk[/TD]
[TD]Pctl Inc[/TD]
[TD]Pctl W/in[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]100[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.25[/TD]
[TD="align: right"]0.5[/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]0.5[/TD]
[TD="align: right"]0[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]-1[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0[/TD]
[/TR]
[TR]
[TD]b[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]0.75[/TD]
[TD="align: right"]1[/TD]
[/TR]
</tbody>[/TABLE]



"Subset Labels" -> tells you where to subset. In this sample there are 2 categories that we want to create subsets for: "a" and "b". That is, we want a percentile rank of all the a's amongst the the a's only. And the same for the b's.
"Values to Rk" -> the values that should be ranked.
"Pctl Inc" -> is just a regular PERCENTRANK.INC() formula. See formula above.
"Pctl W/in" -> this is the column/ranks I am trying to create via a percentile rank "within" the subset. See formula above. The ranks in the table above were manually typed in and do NOT reflect the subset example formula above.

One way to easily test your formula is to change the subset labels so they are all the same and compare the resulting ranks to the a regularPERCENTRANK.INC() formula. In that case (all of the subset labels are the same) if the subset column is correct you should get the same ranks in the "Pctl Inc" (which is just regularPERCENTRANK.INC()) and "Pctl W/in" columns. However, this is what you get when you use the fomula in the link given above, which is not what we want:

[TABLE="class: cms_table, width: 256"]
<tbody>[TR]
[TD="width: 64"]Subset Labels[/TD]
[TD="width: 64"]Values to Rk[/TD]
[TD="width: 64"]Pctl Inc[/TD]
[TD="width: 64"]Pctl W/in[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]100[/TD]
[TD="align: right"]1[/TD]
[TD="align: right"]1[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.25[/TD]
[TD="align: right"]0.4[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]6[/TD]
[TD="align: right"]0.5[/TD]
[TD="align: right"]0.6[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]-1[/TD]
[TD="align: right"]0[/TD]
[TD="align: right"]0.2[/TD]
[/TR]
[TR]
[TD]a[/TD]
[TD="align: right"]7[/TD]
[TD="align: right"]0.75[/TD]
[TD="align: right"]0.8[/TD]
[/TR]
</tbody>[/TABLE]


So, the goal is to emulate the PERCENTRANK.INC() formula results, but on a subset/aggregated basis. Again the Pctl W/in column above is incorrect. If it were correct it would be exactly the same as the Pctl Inc column.

It's a little confusing, but hopefully it is clear enough.

For all formulas and tables in this post the first cell "Subset Labels" is A1.

PS - I cannot use array formulas in this case, sorry.
 
Upvote 0
Here's an alternative formula ... but, as I said before, your examples are not consistent, so the formula you gave will generate one output, and mine will cover the other:

=(SUMPRODUCT(($A$2:$A$6=A2)*(B2>=$B$2:$B$6))-1)/(COUNTIF($A$2:$A$6,A2)-1)

That is exactly what I am after!! Thank you so much for the quick reply!!
 
Upvote 0
Glenn's suggested formula seems to give the correct results, given your revised post......and in that example those are the same results as produced by my suggested array formula.....but the two would differ in some cases where there are duplicate values - this version would give the same results as my array formula in all cases.....without needing array entry

=1-(COUNTIFS(A$2:A$6,A2,B$2:B$6,">="&B2)-1)/(COUNTIF(A$2:A$6,A2)-1)

and would always give you the same results as PERCENTILE.INC where all subset labels are the same
 
Upvote 0
Glenn's suggested formula seems to give the correct results, given your revised post......and in that example those are the same results as produced by my suggested array formula.....but the two would differ in some cases where there are duplicate values - this version would give the same results as my array formula in all cases.....without needing array entry

=1-(COUNTIFS(A$2:A$6,A2,B$2:B$6,">="&B2)-1)/(COUNTIF(A$2:A$6,A2)-1)

and would always give you the same results as PERCENTILE.INC where all subset labels are the same


Impressive. Thank you! Even better. I thought I'd test it a bit more as well and found a minor flaw with both solutions:

If there is only one of the subset type (ex. only one "b") a #DIV/0 is returned, so I think this maybe the best formula to use:

Code:
=IFERROR(1-(COUNTIFS(A$2:A$6,A2,B$2:B$6,">="&B2)-1)/(COUNTIF(A$2:A$6,A2)-1),1)

... which is just barry's formula wrapped in IFERROR -> 1, which is what PERCENTRANK.INC() does when there is just one item to rank.

Thanks guys!
 
Upvote 0

Forum statistics

Threads
1,226,834
Messages
6,193,215
Members
453,779
Latest member
C_Rules

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top