Grouping people by demographics and survey answers into 3 groups.

Billbat

New Member
Joined
Dec 10, 2024
Messages
3
Office Version
  1. 2021
Platform
  1. Windows
Hello,
I have a group of 36 people and their demographics (Race, Age, Gender, Education, and votes) from a questionnaire they have filled out. I was wondering if there way a way to automatically disperse people into 3 groups of 12 in the order of Race, Gender, Age, and education so that we have diverse groups. I would then like to be able to make sure they're answers to a vote are not all the same as well.

I do this manually right now. But then have it going to a seating chart afterwards using indexing.

NameNumberAgeRaceGenderEducation
Aubrey120HispanicFSome College
Ben259WhiteMSome College
Cheyenne330WhiteFSome College
Christina437AsianFSome College
David531HispanicMHigh School
Danny660WhiteMCollege Grad
Dawn749OtherFPost Grad
Denise869WhiteFHigh School
Don955HispanicMCollege Grad
Erica1037BlackFCollege Grad
Geo1130HispanicFSome College
Jamie1235AsianMHigh School
Jennifer1341AsianFPost Grad
Jethro14501/2 White, 1/2 BlackMHigh School
Joe1567WhiteMCollege Grad
Jonna1653WhiteFCollege Grad
Juan1725HispanicMHigh School
Justin1836WhiteMCollege Grad
Kandi1947BlackFHigh School
Kathy2061WhiteFSome College
Kyle2133WhiteMSome College
Leonard2232BlackMSome College
Marci2351WhiteFHigh School
Mark2445BlackMCollege Grad
Melanie2528WhiteFCollege Grad
Michelle2658WhiteFSome College
Nancy2750BlackFHigh School
Nick2830HispanicMHigh School
Sam2918AsianMHigh School
Viki3057WhiteFPost Grad
 

Excel Facts

How to calculate loan payments in Excel?
Use the PMT function: =PMT(5%/12,60,-25000) is for a $25,000 loan, 5% annual interest, 60 month loan.
So if I understand well, you want each group as uniform inside, and as different from other groups as possible. (if it is opposite and you wany grops as similar to each other and as a consequence as different inside gropp, then I'm more convinced by the solution, and the only difference will be that Minimum function shall be used below).

I'd use Solver for this task (If you don't know this tool check in microsoft help - it's preinstalled, but by default not active).

I'd initially assign some group 1-3 to each person. This assigned group would be later used as variable in Solver.

So for each group I'd calculate number of each race, each gender, each education, and each age group (I think it would be better than just dealing with age). This can be done with COUNTIFS and list of Unique values (gender, race, ...) can be produced with ... UNIQUE function.

The diversity between groups in each category can be assessed with standard deviation. and sum of all standard deviations can be treated as a measure or inter-group diversity.

So the data and calculations could look like this (see also results - they were extarcted from main table with FILTER function.
And below is the screenshot with SOLVER setup. Once again - if you want inner diversity within groups but small difference between groups choose minimum in solver goal).

losowanie zróżnicowanych grup.xlsx
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJ
1NameNumberAgeRaceGenderEducationAssigned groupCounter123Race123Gender123Education123Age123
2Aubrey120HispanicFSome College3121212Hispanic0522,516611F10274,041452Some College0373,511885301632,516611
3Ben259WhiteMSome College3White50105M21054,041452High School5804,041452454512,081666
4Cheyenne330WhiteFSome College3Asian1402,081666College Grad5142,081666556112,886751
5Christina437AsianFSome College2Other1000,57735Post Grad2011701073,785939
6David531HispanicMHigh School2Black5202,516611
7Danny660WhiteMCollege Grad31/2 White, 1/2 Black0100,57735
8Dawn749OtherFPost Grad1
9Denise869WhiteFHigh School1sum of std dev13,269598,08290410,63511,27097
10Don955HispanicMCollege Grad3
11Erica1037BlackFCollege Grad1Aim: maximise for differences between groups43,25846
12Geo1130HispanicFSome College2or minimize to have inner diversity but low diversity between groups
13Jamie1235AsianMHigh School2
14Jennifer1341AsianFPost Grad1Result
15Jethro14501/2 White, 1/2 BlackMHigh School2Group1Group2Group3
16Joe1567WhiteMCollege Grad3NameNumberAgeRaceGenderEducationAssigned groupNameNumberAgeRaceGenderEducationAssigned groupNameNumberAgeRaceGenderEducationAssigned group
17Jonna1653WhiteFCollege Grad1Dawn749OtherFPost Grad1Christina437AsianFSome College2Aubrey120HispanicFSome College3
18Juan1725HispanicMHigh School2Denise869WhiteFHigh School1David531HispanicMHigh School2Ben259WhiteMSome College3
19Justin1836WhiteMCollege Grad1Erica1037BlackFCollege Grad1Geo1130HispanicFSome College2Cheyenne330WhiteFSome College3
20Kandi1947BlackFHigh School1Jennifer1341AsianFPost Grad1Jamie1235AsianMHigh School2Danny660WhiteMCollege Grad3
21Kathy2061WhiteFSome College3Jonna1653WhiteFCollege Grad1Jethro14501/2 White, 1/2 BlackMHigh School2Don955HispanicMCollege Grad3
22Mark23145BlackMCollege Grad2Justin1836WhiteMCollege Grad1Juan1725HispanicMHigh School2Joe1567WhiteMCollege Grad3
23Melanie23228WhiteFCollege Grad3Kandi1947BlackFHigh School1Mark23145BlackMCollege Grad2Kathy2061WhiteFSome College3
24Michelle23358WhiteFSome College3Nancy23450BlackFHigh School1Nick23530HispanicMHigh School2Melanie23228WhiteFCollege Grad3
25Nancy23450BlackFHigh School1Marci2351WhiteFHigh School1Sam23618AsianMHigh School2Michelle23358WhiteFSome College3
26Nick23530HispanicMHigh School2Mark2445BlackMCollege Grad1Leonard2232BlackMSome College2Kyle2133WhiteMSome College3
27Sam23618AsianMHigh School2Melanie2528WhiteFCollege Grad1Nick2830HispanicMHigh School2Michelle2658WhiteFSome College3
28Kyle2133WhiteMSome College3Nancy2750BlackFHigh School1Sam2918AsianMHigh School2Viki3057WhiteFPost Grad3
29Leonard2232BlackMSome College2
30Marci2351WhiteFHigh School1
31Mark2445BlackMCollege Grad1
32Melanie2528WhiteFCollege Grad1
33Michelle2658WhiteFSome College3
34Nancy2750BlackFHigh School1
35Nick2830HispanicMHigh School2
36Sam2918AsianMHigh School2
37Viki3057WhiteFPost Grad3
Sheet1
Cell Formulas
RangeFormula
J2:L2J2=COUNTIF($G$2:$G$37,J1)
N2:N7N2=UNIQUE(D2:D37)
O2:Q7O2=COUNTIFS($D$2:$D$37,$N2,$G$2:$G$37,O$1)
R2:R7,AJ2:AJ5,AD2:AD5,X2:X3R2=STDEV(O2:Q2)
T2:T3T2=UNIQUE(E2:E37)
U2:W3U2=COUNTIFS($E$2:$E$37,$T2,$G$2:$G$37,U$1)
Z2:Z5Z2=UNIQUE(F2:F37)
AA2:AC5AA2=COUNTIFS($F$2:$F$37,$Z2,$G$2:$G$37,AA$1)
AG2:AI2AG2=COUNTIFS($C$2:$C$37,"<="&$AF2,$G$2:$G$37,AG$1)
AG3:AI5AG3=COUNTIFS($C$2:$C$37,"<="&$AF3,$G$2:$G$37,AG$1)-SUM(AG$2:AG2)
R9,AJ9,AD9,X9R9=SUM(R2:R8)
N11N11=SUM(N9:AK9)
I17:O28,Y17:AE28,Q17:W28I17=FILTER($A2:$G37,$G2:$G37=J15)
Dynamic array formulas.
Named Ranges
NameRefers ToCells
solver_adj=Sheet1!$G$2:$G$37Y17, Q17, I17, O2:Q7, AG2:AI5, U2:W3, J2:L2, AA2:AC5
solver_lhs1=Sheet1!$G$2:$G$37Y17, Q17, I17, O2:Q7, AG2:AI5, U2:W3, J2:L2, AA2:AC5
solver_lhs2=Sheet1!$G$2:$G$37Y17, Q17, I17, O2:Q7, AG2:AI5, U2:W3, J2:L2, AA2:AC5
solver_lhs3=Sheet1!$G$2:$G$37Y17, Q17, I17, O2:Q7, AG2:AI5, U2:W3, J2:L2, AA2:AC5


1733928639914.png
 
Upvote 0
Hi Kaper,
Thank you this is perfect.
Do you know if there is a way split groups evenly by priority?
For instance divide up Race as close to evenly as possible. Hispanic 2/2/3 White 5/5/5 Asian 2/1/2 Other 0/1/0 Black 3/2/2 and 1/2 White 0/1/0
Then for Gender try to get an even spread of Male and Female with Race being more important than Gender. Lastly trying to add in Education and age if possible.
I understand it is impossible to have everybody split perfectly because the sample is not going to be divisible by 3 every time.

This is the first time I'm using Solver and trying to understand it. I am using the Min to keep the groups at 12 people each.
If I'm understanding this correctly the solver parameters are set to the Sum of std dev. Anything below or equal to 3 is put into group 3, anything greater than or equal to 1 is put in group 2 and anything lower than 1 is turned into an integer 1. Then the counter is saying how many people go in each group?
What is actually changing the Assigned group that people are placed in?

Sorry if this is a lot to answer. Thanks again,
Billbat
 
Upvote 0
So for equal spread use Min as objective.
If you want to have equal spread of race being more important than gender equal spread, then multiply sum of race standard deviations by 2 (or may be even more. If it shall be really dominating, you may consider something like 10).
So change R9 formula to:
Excel Formula:
=2*SUM(R2:R8)

If you treat education or age as lower importance - you can add weight (multiplication by) of for instance 0.5 to these factors.

As for Solver action: you may initially assign all people to one group: write all the same numbers (from range 1..3) in column G.
When you start a Solver (clicking Solve), Solver tries to insert differnt values into these 36 cells and checks if these values first allow geting closer to fulfilling all constraints.
So at first it will focus on geting 12 1s, 12 2s and 12 3s in column G.
Then starts changing values between cells to keep 12 values of each group number, but distributing them that way, that they are as equally divided as possible.

Standard deviation, which i used here is a measure of a dispersion of data. So if we have 15 White - when 5 is assigned to each group standard deviation is 0. but if there is difference of 1 between one group and othr two groups, so
0 0 1, 0 1 1, 3 2 2, 3 3 2, 5 5 6, and so on, the standard deviation is the same in each of these cases (about 0.58)

While solver is running, you may see current value of the objective cell in status bar. It may be worth to give solver more time for next trials - you can do this in options in Solver window.
Another option is rerun the solver (without any changes done manually to column G). it may fing=d a better solution in the second (or may be even later) run.

In my case it was 3 runs in a row when I got such assignment to groups which is the best. And in case of these input data it will not change when we change the weights, because when any "features" can be divided by 3 (like 15 white persons and 3 postgrads) they are disrtibuted equally between groups. and in all other features there is one group with 1 (diffrence (more or less - it does not matter) than 2 other elements in this group.

losowanie zróżnicowanych grup.xlsx
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJ
1NameNumberAgeRaceGenderEducationAssigned groupCounter123Race123Gender123Education123Age123
2Jethro14501/2 White, 1/2 BlackMHigh School21212121/2 White, 1/2 Black0100,57735M6560,57735High School4540,57735303430,57735
3Christina437AsianFSome College1Asian1220,57735F6760,57735Some College4330,57735454330,57735
4Jennifer1341AsianFPost Grad2Black3220,57735Post Grad1110552330,57735
5Jamie1235AsianMHigh School3Hispanic3220,57735College Grad3340,57735703230,57735
6Sam23618AsianMHigh School2Other0010,57735
7Sam2918AsianMHigh School3White5550
8Erica1037BlackFCollege Grad3
9Kandi1947BlackFHigh School2sum of std dev2,8867511,1547011,7320512,309401
10Nancy23450BlackFHigh School1
11Nancy2750BlackFHigh School1Aim: maximise for differences between groups8,082904
12Mark23145BlackMCollege Grad2or minimize to have inner diversity but low diversity between groups
13Leonard2232BlackMSome College3
14Mark2445BlackMCollege Grad1Result
15Aubrey120HispanicFSome College1Group1Group2Group3
16Geo1130HispanicFSome College2NameNumberAgeRaceGenderEducationAssigned groupNameNumberAgeRaceGenderEducationAssigned groupNameNumberAgeRaceGenderEducationAssigned group
17David531HispanicMHigh School2Christina437AsianFSome College1Jethro14501/2 White, 1/2 BlackMHigh School2Jamie1235AsianMHigh School3
18Don955HispanicMCollege Grad3Nancy23450BlackFHigh School1Jennifer1341AsianFPost Grad2Sam2918AsianMHigh School3
19Juan1725HispanicMHigh School3Nancy2750BlackFHigh School1Sam23618AsianMHigh School2Erica1037BlackFCollege Grad3
20Nick23530HispanicMHigh School1Mark2445BlackMCollege Grad1Kandi1947BlackFHigh School2Leonard2232BlackMSome College3
21Nick2830HispanicMHigh School1Aubrey120HispanicFSome College1Mark23145BlackMCollege Grad2Don955HispanicMCollege Grad3
22Dawn749OtherFPost Grad3Nick23530HispanicMHigh School1Geo1130HispanicFSome College2Juan1725HispanicMHigh School3
23Cheyenne330WhiteFSome College3Nick2830HispanicMHigh School1David531HispanicMHigh School2Dawn749OtherFPost Grad3
24Denise869WhiteFHigh School3Michelle23358WhiteFSome College1Melanie23228WhiteFCollege Grad2Cheyenne330WhiteFSome College3
25Jonna1653WhiteFCollege Grad3Viki3057WhiteFPost Grad1Marci2351WhiteFHigh School2Denise869WhiteFHigh School3
26Kathy2061WhiteFSome College3Joe1567WhiteMCollege Grad1Melanie2528WhiteFCollege Grad2Jonna1653WhiteFCollege Grad3
27Melanie23228WhiteFCollege Grad2Justin1836WhiteMCollege Grad1Michelle2658WhiteFSome College2Kathy2061WhiteFSome College3
28Michelle23358WhiteFSome College1Kyle2133WhiteMSome College1Ben259WhiteMSome College2Danny660WhiteMCollege Grad3
29Marci2351WhiteFHigh School2
30Melanie2528WhiteFCollege Grad2
31Michelle2658WhiteFSome College2
32Viki3057WhiteFPost Grad1
33Ben259WhiteMSome College2
34Danny660WhiteMCollege Grad3
35Joe1567WhiteMCollege Grad1
36Justin1836WhiteMCollege Grad1
37Kyle2133WhiteMSome College1
Sheet1
Cell Formulas
RangeFormula
J2:L2J2=COUNTIF($G$2:$G$37,J1)
N2:N7N2=UNIQUE(D2:D37)
O2:Q7O2=COUNTIFS($D$2:$D$37,$N2,$G$2:$G$37,O$1)
R2:R7,AJ2:AJ5,AD2:AD5,X2:X3R2=STDEV(O2:Q2)
T2:T3T2=UNIQUE(E2:E37)
U2:W3U2=COUNTIFS($E$2:$E$37,$T2,$G$2:$G$37,U$1)
Z2:Z5Z2=UNIQUE(F2:F37)
AA2:AC5AA2=COUNTIFS($F$2:$F$37,$Z2,$G$2:$G$37,AA$1)
AG2:AI2AG2=COUNTIFS($C$2:$C$37,"<="&$AF2,$G$2:$G$37,AG$1)
AG3:AI5AG3=COUNTIFS($C$2:$C$37,"<="&$AF3,$G$2:$G$37,AG$1)-SUM(AG$2:AG2)
R9,AJ9,AD9,X9R9=SUM(R2:R8)
N11N11=SUM(N9:AK9)
I17:O28,Y17:AE28,Q17:W28I17=FILTER($A2:$G37,$G2:$G37=J15)
Dynamic array formulas.
Named Ranges
NameRefers ToCells
solver_adj=Sheet1!$G$2:$G$37Y17, Q17, I17, AG2:AI5, AA2:AC5, U2:W3, O2:Q7, J2:L2
solver_lhs1=Sheet1!$G$2:$G$37Y17, Q17, I17, AG2:AI5, AA2:AC5, U2:W3, O2:Q7, J2:L2
solver_lhs2=Sheet1!$G$2:$G$37Y17, Q17, I17, AG2:AI5, AA2:AC5, U2:W3, O2:Q7, J2:L2
solver_lhs3=Sheet1!$G$2:$G$37Y17, Q17, I17, AG2:AI5, AA2:AC5, U2:W3, O2:Q7, J2:L2
 
Upvote 0

Forum statistics

Threads
1,224,586
Messages
6,179,729
Members
452,939
Latest member
WCrawford

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top