JenniferMurphy
Well-known Member
- Joined
- Jul 23, 2011
- Messages
- 2,676
- Office Version
- 365
- Platform
- Windows
I need to randomly select an item from a list in which each item is given a relative chance of being selected.
In the table below, the items to be selected are in the Items column and the relative chances of being selected are in the Chances column. Item A has 10 chances to be selected, Item B has 9 chances, Item C has 8 chances, and so on down to Item J, which has 1 chance.
My solution for selecting an item is to add two helper columns. The Cumulative Chances column is just that. It allows me to generate a random integer on [1, N], where N is the sum of the chances. In this example, N=55. Then I just need to find the Range that contains the random number.
My solution for that is the Hits columns. I have provided 5 examples. In each one, the "random" number in the Random number row above is compared with each range. It returns a "1" if the number if greater than that range and a zero otherwise. The sum of the "hits" plus 1 is the index of the selected item.
This works, but I would like a solution that does not need the Hits helper column. It would be even better if it did not need the Cumulative Chances columns, but I think that is unlikely.
I would like to know if there is a reasonably simple worksheet expression that will select the item without the need for the Hits column.
I know that I can easily do this with a fairly simple UDF that loops through the ranges until it finds a hit. That's what I will do if there is not a simple worksheet expression.
This is related to my question in the thread in the link below, but it is a completely different question, so I thought it best to start a new thread.
In the table below, the items to be selected are in the Items column and the relative chances of being selected are in the Chances column. Item A has 10 chances to be selected, Item B has 9 chances, Item C has 8 chances, and so on down to Item J, which has 1 chance.
My solution for selecting an item is to add two helper columns. The Cumulative Chances column is just that. It allows me to generate a random integer on [1, N], where N is the sum of the chances. In this example, N=55. Then I just need to find the Range that contains the random number.
My solution for that is the Hits columns. I have provided 5 examples. In each one, the "random" number in the Random number row above is compared with each range. It returns a "1" if the number if greater than that range and a zero otherwise. The sum of the "hits" plus 1 is the index of the selected item.
This works, but I would like a solution that does not need the Hits helper column. It would be even better if it did not need the Cumulative Chances columns, but I think that is unlikely.
I would like to know if there is a reasonably simple worksheet expression that will select the item without the need for the Hits column.
I know that I can easily do this with a fairly simple UDF that loops through the ranges until it finds a hit. That's what I will do if there is not a simple worksheet expression.
Weighted Random Selection Methods.xlsx | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
B | C | D | E | F | G | H | I | J | K | |||
3 | Random number | 10 | 11 | 52 | 54 | 55 | ||||||
4 | Item selected | A | B | H | I | J | ||||||
5 | ||||||||||||
6 | Ranks | Items | Chances | Cumulative Chances | Range | Hits1 | Hits2 | Hits3 | Hits4 | Hits5 | ||
7 | 1 | A | 10 | 10 | 1 to 10 | 0 | 1 | 1 | 1 | 1 | ||
8 | 2 | B | 9 | 19 | 11 to 19 | 0 | 0 | 1 | 1 | 1 | ||
9 | 3 | C | 8 | 27 | 20 to 27 | 0 | 0 | 1 | 1 | 1 | ||
10 | 4 | D | 7 | 34 | 28 to 34 | 0 | 0 | 1 | 1 | 1 | ||
11 | 5 | E | 6 | 40 | 35 to 40 | 0 | 0 | 1 | 1 | 1 | ||
12 | 6 | F | 5 | 45 | 41 to 45 | 0 | 0 | 1 | 1 | 1 | ||
13 | 7 | G | 4 | 49 | 46 to 49 | 0 | 0 | 1 | 1 | 1 | ||
14 | 8 | H | 3 | 52 | 50 to 52 | 0 | 0 | 0 | 1 | 1 | ||
15 | 9 | I | 2 | 54 | 53 to 54 | 0 | 0 | 0 | 0 | 1 | ||
16 | 10 | J | 1 | 55 | 55 to 55 | 0 | 0 | 0 | 0 | 0 | ||
17 | 55 | 10 | 0 | 1 | 7 | 8 | 9 | |||||
Temp |
Cell Formulas | ||
---|---|---|
Range | Formula | |
G4 | G4 | =INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits1]]+1) |
H4 | H4 | =INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits2]]+1) |
I4 | I4 | =INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits3]]+1) |
J4 | J4 | =INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits4]]+1) |
K4 | K4 | =INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits5]]+1) |
D7:D16 | D7 | =TblTemp[[#Totals],[Items]]-[@Ranks]+1 |
E7:E16 | E7 | =SUM(INDEX([Chances],1):[@Chances]) |
F7:F16 | F7 | =[@[Cumulative Chances]]-[@Chances]+1 & " to " & [@[Cumulative Chances]] |
G7:G16 | G7 | =IF(RndNum1>[@[Cumulative Chances]],1,0) |
H7:H16 | H7 | =IF(RndNum2>[@[Cumulative Chances]],1,0) |
I7:I16 | I7 | =IF(RndNum3>[@[Cumulative Chances]],1,0) |
J7:J16 | J7 | =IF(RndNum4>[@[Cumulative Chances]],1,0) |
K7:K16 | K7 | =IF(RndNum5>[@[Cumulative Chances]],1,0) |
G17 | G17 | =SUBTOTAL(109,[Hits1]) |
H17 | H17 | =SUBTOTAL(109,[Hits2]) |
I17 | I17 | =SUBTOTAL(109,[Hits3]) |
J17 | J17 | =SUBTOTAL(109,[Hits4]) |
K17 | K17 | =SUBTOTAL(109,[Hits5]) |
B7:B16 | B7 | =ROW()-ROW(TblTemp[[#Headers],[Ranks]]) |
B17 | B17 | =SUBTOTAL(109,[Ranks]) |
C17 | C17 | =SUBTOTAL(103,[Items]) |
Named Ranges | ||
---|---|---|
Name | Refers To | Cells |
RndNum1 | =Temp!$G$3 | G7:G16 |
RndNum2 | =Temp!$H$3 | H7:H16 |
RndNum3 | =Temp!$I$3 | I7:I16 |
RndNum4 | =Temp!$J$3 | J7:J16 |
RndNum5 | =Temp!$K$3 | K7:K16 |
This is related to my question in the thread in the link below, but it is a completely different question, so I thought it best to start a new thread.
Help with math problem
I fear my math days are dwindling even faster than I thought. I hope someone can help me with this. I need a formula that will select an item from a list of N items such that: The odds of selecting item 1 is p (0
www.mrexcel.com