VBA to take a Random 10% sample of rows with data in column A.

anewman5high

New Member
Joined
Aug 25, 2017
Messages
11
Hello,

I'm trying to get a piece of code to sample 10% of rows in a sheet by copying them into another sheet.

I've managed to do this but the caveat is that I only want rows which have data in Column A. The way I've tried to do this so far is by hiding anything without data in column A but the vba still copies the hidden ones over.

This is the code so far:

Code:
Sub TakeSample()
Dim LastRow As Long
Dim NbRows As Long
Dim RowList()
Dim i As Long, J As Long, k As Long
Dim RowNb As Long
Dim s As String
Sheets("All Data").Activate


Application.ScreenUpdating = False
For i = 1 To Range("A" & Rows.Count).End(xlUp).Row
    s = i & ":" & i
    If IsEmpty(Cells(i, 1).Value) Then
         Rows(s).EntireRow.Hidden = True
    End If
Next
Application.ScreenUpdating = True
    
    
    Sheets("All Data").Activate
    
    LastRow = Range("A" & Rows.Count).End(xlUp).Row
    NbRows = LastRow * 0.1
    'NbRows = IIf(LastRow < 200, LastRow * 0.1, 10)
    ReDim RowList(1 To NbRows)
    k = 1
    For i = 1 To NbRows
        RowNb = Rnd() * LastRow
        For J = 1 To k
            If (RowList(J) = RowNb) Then GoTo NextStep
        Next J
        RowList(k) = RowNb
        Rows(RowNb).Copy Destination:=Sheets("1 in 10 Sample").Cells(k, "A")
        k = k + 1
NextStep:
    Next i
End Sub

Bonus points if the solution can only select from rows where the data in column A starts with UR.

Even more bonus points if the code can copy the rows below the one picked at random until it finds another row with data in column A.

Thanks in advance, let me know if you need more info!

Alan
 

Excel Facts

Using Function Arguments with nested formulas
If writing INDEX in Func. Arguments, type MATCH(. Use the mouse to click inside MATCH in the formula bar. Dialog switches to MATCH.
Hello,

I'm trying to get a piece of code to sample 10% of rows in a sheet by copying them into another sheet.

I've managed to do this but the caveat is that I only want rows which have data in Column A. The way I've tried to do this so far is by hiding anything without data in column A but the vba still copies the hidden ones over.

Bonus points if the solution can only select from rows where the data in column A starts with UR.

Even more bonus points if the code can copy the rows below the one picked at random until it finds another row with data in column A.

Thanks in advance, let me know if you need more info!

Alan


I doubt this is final, but let me know where we need to go from here.

Code:
Sub Cherry_Picker()
    For i = 0 To 9
        Randomize Timer
        IsValid = False
        Do Until IsValid
            rand_row = CInt(Rnd * Sheets("All Data").UsedRange.Rows.Count / 10) + (1 + (Sheets("All Data").UsedRange.Rows.Count / 10) * i)
            If Left(Sheets("All Data").Cells(rand_row, 1), 2) = "UR" Then
                IsValid = True
                Valid_Count = Valid_Count + 1
                Sheets("All Data").Rows(rand_row).Copy Destination:=Sheets("1 in 10 Sample").Cells(Valid_Count, "A")
            End If
        Loop
    Next i
End Sub

This code will look at your number of rows, and then pick 1 result from each 10% chunk (ie: if 1,000 rows, it will pick a sample from 1-100, then from 201 to 300, then 301 to 400, etc), if column A does not start with "UR" it will pick another sample.
 
Upvote 0
I doubt this is final, but let me know where we need to go from here.

Code:
Sub Cherry_Picker()
    For i = 0 To 9
        Randomize Timer
        IsValid = False
        Do Until IsValid
            rand_row = CInt(Rnd * Sheets("All Data").UsedRange.Rows.Count / 10) + (1 + (Sheets("All Data").UsedRange.Rows.Count / 10) * i)
            If Left(Sheets("All Data").Cells(rand_row, 1), 2) = "UR" Then
                IsValid = True
                Valid_Count = Valid_Count + 1
                Sheets("All Data").Rows(rand_row).Copy Destination:=Sheets("1 in 10 Sample").Cells(Valid_Count, "A")
            End If
        Loop
    Next i
End Sub

This code will look at your number of rows, and then pick 1 result from each 10% chunk (ie: if 1,000 rows, it will pick a sample from 1-100, then from 201 to 300, then 301 to 400, etc), if column A does not start with "UR" it will pick another sample.

That's great thanks, it almost does exactly what I was hoping! I have two follow up questions if that's OK.

1. is there a way to copy all rows below the one it's taken at random until it finds the next cell that starts with UR? Essentially the cell starting with UR is the name of the item and the rows below have some information in them with would be useful to copy until it hits the next instance of UR in column A.

2. This is less important but can the count be based on the number of rows which start with UR in column A instead of the total number of filled rows? I've played around with the "UsedRange.Rows.Count" line but I can't get it to work and am probably doing it very wrong!

Thanks again, even if the above two points aren't possible then what you've done is massively useful!

Alan
 
Upvote 0
That's great thanks, it almost does exactly what I was hoping! I have two follow up questions if that's OK.

1. is there a way to copy all rows below the one it's taken at random until it finds the next cell that starts with UR? Essentially the cell starting with UR is the name of the item and the rows below have some information in them with would be useful to copy until it hits the next instance of UR in column A.

2. This is less important but can the count be based on the number of rows which start with UR in column A instead of the total number of filled rows? I've played around with the "UsedRange.Rows.Count" line but I can't get it to work and am probably doing it very wrong!

Thanks again, even if the above two points aren't possible then what you've done is massively useful!

Alan


Please clarify both of your requests. Provide me some examples if possible. I have no doubt what you want is possible. I just need to make sure I fully understand it.
 
Upvote 0

Forum statistics

Threads
1,224,812
Messages
6,181,091
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top