Worksheet expression to select items from a list using weighted random numbers

JenniferMurphy

Well-known Member
Joined
Jul 23, 2011
Messages
2,687
Office Version
  1. 365
Platform
  1. Windows
I need to randomly select an item from a list in which each item is given a relative chance of being selected.

In the table below, the items to be selected are in the Items column and the relative chances of being selected are in the Chances column. Item A has 10 chances to be selected, Item B has 9 chances, Item C has 8 chances, and so on down to Item J, which has 1 chance.

My solution for selecting an item is to add two helper columns. The Cumulative Chances column is just that. It allows me to generate a random integer on [1, N], where N is the sum of the chances. In this example, N=55. Then I just need to find the Range that contains the random number.

My solution for that is the Hits columns. I have provided 5 examples. In each one, the "random" number in the Random number row above is compared with each range. It returns a "1" if the number if greater than that range and a zero otherwise. The sum of the "hits" plus 1 is the index of the selected item.

This works, but I would like a solution that does not need the Hits helper column. It would be even better if it did not need the Cumulative Chances columns, but I think that is unlikely.

I would like to know if there is a reasonably simple worksheet expression that will select the item without the need for the Hits column.

I know that I can easily do this with a fairly simple UDF that loops through the ranges until it finds a hit. That's what I will do if there is not a simple worksheet expression.

Weighted Random Selection Methods.xlsx
BCDEFGHIJK
3Random number1011525455
4Item selectedABHIJ
5
6RanksItemsChancesCumulative ChancesRangeHits1Hits2Hits3Hits4Hits5
71A10101 to 1001111
82B91911 to 1900111
93C82720 to 2700111
104D73428 to 3400111
115E64035 to 4000111
126F54541 to 4500111
137G44946 to 4900111
148H35250 to 5200011
159I25453 to 5400001
1610J15555 to 5500000
17551001789
Temp
Cell Formulas
RangeFormula
G4G4=INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits1]]+1)
H4H4=INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits2]]+1)
I4I4=INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits3]]+1)
J4J4=INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits4]]+1)
K4K4=INDEX(TblTemp[Items],TblTemp[[#Totals],[Hits5]]+1)
D7:D16D7=TblTemp[[#Totals],[Items]]-[@Ranks]+1
E7:E16E7=SUM(INDEX([Chances],1):[@Chances])
F7:F16F7=[@[Cumulative Chances]]-[@Chances]+1 & " to " & [@[Cumulative Chances]]
G7:G16G7=IF(RndNum1>[@[Cumulative Chances]],1,0)
H7:H16H7=IF(RndNum2>[@[Cumulative Chances]],1,0)
I7:I16I7=IF(RndNum3>[@[Cumulative Chances]],1,0)
J7:J16J7=IF(RndNum4>[@[Cumulative Chances]],1,0)
K7:K16K7=IF(RndNum5>[@[Cumulative Chances]],1,0)
G17G17=SUBTOTAL(109,[Hits1])
H17H17=SUBTOTAL(109,[Hits2])
I17I17=SUBTOTAL(109,[Hits3])
J17J17=SUBTOTAL(109,[Hits4])
K17K17=SUBTOTAL(109,[Hits5])
B7:B16B7=ROW()-ROW(TblTemp[[#Headers],[Ranks]])
B17B17=SUBTOTAL(109,[Ranks])
C17C17=SUBTOTAL(103,[Items])
Named Ranges
NameRefers ToCells
RndNum1=Temp!$G$3G7:G16
RndNum2=Temp!$H$3H7:H16
RndNum3=Temp!$I$3I7:I16
RndNum4=Temp!$J$3J7:J16
RndNum5=Temp!$K$3K7:K16



This is related to my question in the thread in the link below, but it is a completely different question, so I thought it best to start a new thread.
 

Excel Facts

Do you hate GETPIVOTDATA?
Prevent GETPIVOTDATA. Select inside a PivotTable. In the Analyze tab of the ribbon, open the dropown next to Options and turn it off
Given some uniformly distributed random number that is >=1 and <= the sum of chances, you might try something like this:
Book1
CD
6ItemsChances
7A10
8B9
9C8
10D7
11E6
12F5
13G4
14H3
15I2
16J1
Temp
Cell Formulas
RangeFormula
D7:D16D7=ROWS(TblTemp)-[@Ranks]+1


Book1
MNO
154I
211B
3Random number10A
4Item selectedA
5
6SCAN
710
819
927
1034
1140
1245
1349
1452
1554
1655
Temp
Cell Formulas
RangeFormula
O1:O3O1=IF(AND(N1>=1,N1<=SUM(TblTemp[Chances])),INDEX(TblTemp[Items],XMATCH(N1,SCAN(0,TblTemp[Chances],LAMBDA(a,b,a+b)),1,1)),"out of range")
N4N4=IF(AND(N3>=1,N3<=SUM(TblTemp[Chances])),INDEX(TblTemp[Items],XMATCH(N3,N7#,1,1)),"out of range")
N7:N16N7=SCAN(0,TblTemp[Chances],LAMBDA(a,b,a+b))
Dynamic array formulas.

I've shown this two ways to illustrate that the SCAN function is generating the cumulative sum array, but the single formula approach is shown in cells O1:O3.
 
Upvote 0
Given some uniformly distributed random number that is >=1 and <= the sum of chances, you might try something like this:
Thank you very much for that detailed explanation. Let me play around with it a bit. I'll come back if I have questions.

Thank you
 
Upvote 0

Forum statistics

Threads
1,223,883
Messages
6,175,167
Members
452,615
Latest member
bogeys2birdies

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top