Conditional Random Sampling

rizzcombat

New Member
Joined
Feb 22, 2016
Messages
5
I have a single document that has individuals personal information (first name, last name, demographics, etc) listed as the headers in the first column. One of the demographics is which state they're affiliated with. Each individual has a column with their states 2 letter abbreviation associated with them.


I need to create a random of 50% for each state, individually. In other words if Nevada as 200 individuals and Utah has 500 individuals, I'd need to randomly select 100 and 250 from each list, respectably. Obviously I know I could break the file up into 50 different sample groups, then recombine, but I wasn't sure if there was a way to achieve this sampling.

Any suggestions how to do this easily?
 

Excel Facts

Easy bullets in Excel
If you have a numeric keypad, press Alt+7 on numeric keypad to type a bullet in Excel.
Two possible ways come to mind. Both start with adding a column with =RAND() as the formula. Put that in the top data row, and copy down to the end.

The first way is just to sort the data first by the state column, then by the RAND() column, then take the top n rows.

You could use a COUNTIF formula to find out how many items are in each state.

The second way is to filter the rows by that column, select Number Filter, Top 10, and change the 10 to the number of rows you want. There are other ways, including VBA, but try those first and see if they work for you.
 
Last edited:
Upvote 0
Hi Eric,

Thanks for the help. Unfortunately, because I need to do this for 50 states and each state would have a different number of people to sort these two approaches would still take a significant amount of time. Is there any other types of conditional formatting you can recommend? I'm happy to use VBA if you'd like to suggest that.

-Tom
 
Upvote 0
If you need to do it for every state, then it could take quite a bit of time if you have to do it manually. A VBA solution might be best. If you'd like that, let me know and I'll write something up. You do need to provide some additional information.

What sheet is your data on? What columns have each field? Are there headers? Any blank lines or subtotal lines? (If you could provide a sample of that data using the HTML maker link in my signature, that could help.)

How do you want the results presented? A sheet for each state? One sheet just like the original sheet, but just containing the random records? Something else?

Let me know.
 
Last edited:
Upvote 0
Thank you very much for your help.

The data will be on sheet 1 - it's currently a CSV (so there's only one page). The very first column has the field, it's labeled as "CouncilCodeName". There will not be any blank lines or subtotal lines. My work computer has permissions in place where I can't edit Excel to allow the HTML maker - however I've attached a sample document to view via Box.com: https://app.box.com/s/88yatmlch8rsdbpizxrqv6dclj04v1ty

Lastly I would prefer to have a new column created that I could filter by, essentially a dummy variable "Y/N" or "1/0" to include/exclude them from the sample. I don't care if they end up on a new sheet - or if they are pasted onto the existing document. Thanks
 
Upvote 0
Based on the spreadsheet you uploaded, here's a macro that will flag 50% of the records, by state, for filtering. You can change the percentage to anything, but there will be a minimum of 1 record selected from each state.

1) Open your workbook (CSV file)
2) Right click on the sheet tab on the bottom and select View Code
3) Paste this macro onto the sheet that opens:
Rich (BB code):
Sub StateSampler()
Dim Percent As Integer, r As Long, LastRow As Long, wk As String, States As Object

    Percent = 50
    Set States = CreateObject("Scripting.Dictionary")
    
    LastRow = Columns("M:M").End(xlDown).Row
    Application.ScreenUpdating = False
    
    For r = 2 To LastRow
        wk = Cells(r, "M")
        States.Item(wk) = States.Item(wk) + Percent / 100
        Cells(r, "W") = r
        Cells(r, "X") = Rnd()
    Next r
    
    With ActiveSheet.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("M2")
        .SortFields.Add Key:=Range("X2")
        .SetRange Range("A1:X" & LastRow)
        .Header = xlYes
        .Apply
    End With
    
    For r = 2 To LastRow
        wk = Cells(r, "M")
        If States.Item(wk) > 0 Then
            Cells(r, "X") = 1
            States.Item(wk) = States.Item(wk) - 1
        Else
            Cells(r, "X") = 0
        End If
    Next r
    
    With ActiveSheet.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("W2")
        .SetRange Range("A1:X" & LastRow)
        .Header = xlYes
        .Apply
    End With
    
    Columns("W:W").EntireColumn.Delete
    
End Sub
4) Change the percentage in red to the number you want. (You may also note in the code references to "M" (the state column) and W and X (work columns), which you can change as needed. The 1/0 flag is in column W.
5) Run the macro by pressing F5.

Depending on what you want to do at this point, you can delete the code and save it as a macro-free workbook, or save it with the code. You can close the VBA editor with the red X, or Alt-Q.

There are probably other methods, and maybe some of the real VBA gurus will chime in, but this does work. Let me know how this works for you.
 
Last edited:
Upvote 0
Thank you very, very much Eric. This appears to work perfectly.

If i have any other questions, please let me know. Very much appreciate your assistance.

-Tom
 
Upvote 0
Based on the spreadsheet you uploaded, here's a macro that will flag 50% of the records, by state, for filtering. You can change the percentage to anything, but there will be a minimum of 1 record selected from each state.

1) Open your workbook (CSV file)
2) Right click on the sheet tab on the bottom and select View Code
3) Paste this macro onto the sheet that opens:
Rich (BB code):
Sub StateSampler()
Dim Percent As Integer, r As Long, LastRow As Long, wk As String, States As Object

    Percent = 50
    Set States = CreateObject("Scripting.Dictionary")
   
    LastRow = Columns("M:M").End(xlDown).Row
    Application.ScreenUpdating = False
   
    For r = 2 To LastRow
        wk = Cells(r, "M")
        States.Item(wk) = States.Item(wk) + Percent / 100
        Cells(r, "W") = r
        Cells(r, "X") = Rnd()
    Next r
   
    With ActiveSheet.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("M2")
        .SortFields.Add Key:=Range("X2")
        .SetRange Range("A1:X" & LastRow)
        .Header = xlYes
        .Apply
    End With
   
    For r = 2 To LastRow
        wk = Cells(r, "M")
        If States.Item(wk) > 0 Then
            Cells(r, "X") = 1
            States.Item(wk) = States.Item(wk) - 1
        Else
            Cells(r, "X") = 0
        End If
    Next r
   
    With ActiveSheet.Sort
        .SortFields.Clear
        .SortFields.Add Key:=Range("W2")
        .SetRange Range("A1:X" & LastRow)
        .Header = xlYes
        .Apply
    End With
   
    Columns("W:W").EntireColumn.Delete
   
End Sub
4) Change the percentage in red to the number you want. (You may also note in the code references to "M" (the state column) and W and X (work columns), which you can change as needed. The 1/0 flag is in column W.
5) Run the macro by pressing F5.

Depending on what you want to do at this point, you can delete the code and save it as a macro-free workbook, or save it with the code. You can close the VBA editor with the red X, or Alt-Q.

There are probably other methods, and maybe some of the real VBA gurus will chime in, but this does work. Let me know how this works for you.
Hi Eric,

This is great, I'm trying to adapt it to my use case - QA purposes - which is a bit different but I'm having trouble doing so.

I have a dataset with info from cases that need to be audited (4k+ lines) and I need to draw samples based on a chart on a different sheet, this chart tells me how many lines have to be audited for each person per country.

I made this example to better explain what I'm trying to do:


In Sheet 1 I would need the macro to select X amount of rows for each analyst based on the info in Sheet 2 - so for the first person "Maci Huber" I would need it to select and highlight 14 rows for NZ and 13 rows for UK randomly.

That way each analyst would know which lines to audit by filtering according a color code.

Example: Analyst 1 filters by red and gets the lines he has to work on.

Before selecting the sample I also need to filter out everything that says "no" on column "C" Sheet 1

I'm learning VBA and most of what I do is simple code and try to make up frankenstein code from what I find online but this one is getting to tricky for me.

I would really appreciate your help

- Dan
 
Upvote 0
Welcome to the MrExcel forum!

First, it's better to open a new thread with a question rather than append to one that's 6 years old. I am an old-timer so you're lucky, but more than likely no one will see it.

Next, some people can't or won't download files from the internet for security reasons. I'm one of those, so I'm sorry I can't see your spreadsheet. There is a tool here called XL2BB (see a link in my signature or the reply box) which shows you how to get it. It lets you show a representative sample of your data in a manner that makes it easy for the helpers here to copy your data to a test workbook. It's easy to download, install, and use.

So if you can show a sample, it looks like you'll need to show samples of 2 sheets, I'll gladly take a look. If not, then please open a new thread with your question. There are some people here who will download files, and one of them will probably take a look.
 
Upvote 0

Forum statistics

Threads
1,223,164
Messages
6,170,444
Members
452,326
Latest member
johnshaji

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top