Selecting a random 10% of data from different tabs

Tori Murphy

New Member
Joined
Jul 5, 2022
Messages
10
Office Version
  1. 2013
Platform
  1. Windows
Hi Everyone.

I am a total newbie to Macros, I love them, think they are amazing but have no clue how to use them or make them so I am praying someone out there can help me. I have googled this to hell and back and I'm still none the wiser!

I have an excel document with 5 tabs containing data. I would like to select a random 10% from each tab so that information can be placed in a new 'audit tab'. I guess it would be like a VLookup but randomly and without an originating source.

God this sounds so much easier now I've typed it.

Any help would be super appreciated.

Thanks

Tori
 
Press the Ctrl + F11 keys at the same time, so you can see VBE(Visual Basic Editor) screen. Press the Alt + I keys at the same time then press the M key, to a standard module will be added. Please all the code lines below in there. Then you can run a macro named "RandomSelect".

To make sure all the transferred data are unique, you can check a log at the immediate window on VBE. To see the immediate window, activate VBE then press Ctrl +G keys at the same time.

VBA Code:
Option Explicit

Const Level_1 As String = "Assurance Check CM's"    'change here to your Tab6 name
Const Level_2 As String = "Assurance Check Gov"    'change here to your Tab7 name

Sub RandomSelect()
    Dim sh As Worksheet
    Application.ScreenUpdating = False
    'Clear worksheets for result
    ThisWorkbook.Sheets(Level_1).Cells.Clear
    ThisWorkbook.Sheets(Level_2).Cells.Clear

    'Level 1 Check
    For Each sh In ThisWorkbook.Worksheets     'loop all the worksheets
        If sh.Name <> Level_1 And sh.Name <> Level_2 Then
            Call CommonRoutine(sh, Sheets(Level_1))
        End If
    Next

    'Level 2 Check
    Set sh = ThisWorkbook.Sheets(Level_1)
    Call CommonRoutine(sh, Sheets(Level_2))
    Application.ScreenUpdating = True
    
    MsgBox "Done"
End Sub

Sub CommonRoutine(ByVal sh As Worksheet, ByVal TargetSh As Worksheet)
    Const percentage As Long = 10
    Dim i As Long, myNum As Long, lLastRow As Long
    Dim myFlag() As Boolean
    Randomize    'initialize
    lLastRow = sh.Range("A" & Rows.Count).End(xlUp).Row     'last row
    ReDim myFlag(1 To lLastRow - 1)    'array for judjeng unique

    For i = 1 To (lLastRow - 1) * (percentage / 100)  '10%
        Do    'generate unique random numbers
            myNum = Int((lLastRow - 2 + 1) * Rnd + 1)
        Loop Until myFlag(myNum) = False

        Debug.Print sh.Name & " - Row # " & myNum & "has been transferred"    'just for checking which data was chosen

        'Copy Data to the result worksheet
        sh.Cells(myNum, 1).EntireRow.Copy TargetSh.Cells(Rows.Count, 1).End(xlUp).Offset(1)

        myFlag(myNum) = True
    Next
End Sub
 
Upvote 0

Excel Facts

Does the VLOOKUP table have to be sorted?
No! when you are using an exact match, the VLOOKUP table can be in any order. Best-selling items at the top is actually the best.
Murphy

The 4 datasheets contain a table with a lot of empty listrows.
lLastRow = sh.Range("A" & Rows.Count).End(xlUp).Row 'last row
That number 'll show the number of the last row of that table, not the number of the last used line.

VBA Code:
Option Compare Text

Sub random10()
     Dim a(), iSh, DC()
     t = Timer
     Set dict = CreateObject("scripting.dictionary")     'dictionary that collects all your data

     For iSh = 1 To 5    'loop first 5 worksheets
          Set sh = ThisWorkbook.Worksheets(iSh)
          With sh
               Set c = .UsedRange.Columns(1).SpecialCells(xlConstants)     'the cells filled with data in the 1st column of the usedrange
               If c.Areas.Count > 1 Then MsgBox "their addresses are " & c.Address, vbInformation, UCase("multiple areas in " & .Name)
               If c.Cells(1).Address <> "$A$1" Then MsgBox "no data in A1", vbInformation, UCase(.Name)
               ll = c.Areas(1).Rows.Count - 2     'the first area minus A1:A2
               If ll < 1 Then GoTo volgende
               arr = c.Areas(1).Offset(2).Resize(ll, 5).Value2     'read 5 columns to array
          End With
          seq = Evaluate("transpose(row(A1:A" & ll & "))")     'hopefully your last line < 65.000 (problem with the transpose)
          If UBound(arr) > UBound(seq) Then MsgBox "problem, sheet " & sh.Name & " oversized"

          ptr = 0     'reset pointer
     'we want to random pick a row, but if that row wasn't okay, later never pick it again !!!!
          For i = UBound(seq) To 1 Step -1    'in the 1st loop, you can choose all the numbers, 2nd loop = 1 less, etc
               r0 = Application.RandBetween(1, i)     'random number between 1 and the actual max
               r = seq(r0)
               If Len(arr(r, 1)) > 0 Then     'there was data in that cell
                    ReDim a(1 To 5)     'of this line we reserve 5 elements to be added to the dictionary
                    For j = 1 To 5     'in a loop copy the other cells in that same row
                         a(j) = arr(r, j)
                    Next
                    dict.Add dict.Count, a      'add to dictionary
                    ptr = ptr + 1     'increment pointer
                    If ptr >= UBound(arr) * 0.1 Then Exit For      '10% achieved : YOU CAN MODIFY THIS TO 0.99 AND CHECK IF THERE ARE NO DUPLICATES IN THE TABLE
               End If
               seq(r0) = seq(i)     'move the last number of that loop to the position of r0, so every row 'll only be used once !! <<< VERY IMPORTANT ROW !!!!!
          Next

volgende:
     Next

     If dict.Count Then
          arr = Application.Index(dict.items, 0, 0)
          seq = Evaluate("transpose(row(A1:A" & UBound(arr) & "))")     'hopefully your last line < 65.000 (problem with the transpose)
          ReDim DC(1 To 1, 1 To 1)
          ReDim DC(1 To WorksheetFunction.RoundUp(UBound(arr) * 0.1, 0), 1 To 1)
          ptr = 0     'reset pointer
     'we want to random pick a row, but if that row wasn't okay, later never pick it again !!!!
          For i = UBound(seq) To 1 Step -1    'in the 1st loop, you can choose all the numbers, 2nd loop = 1 less, etc
               r0 = Application.RandBetween(1, i)     'random number between 1 and the actual max
               r = seq(r0)
               If Len(arr(r, 1)) > 0 Then     'there was data in that cell
                    ptr = ptr + 1
                    DC(ptr, 1) = r
                    If ptr >= UBound(DC) Then Exit For      '10% achieved : YOU CAN MODIFY THIS TO 0.99 AND CHECK IF THERE ARE NO DUPLICATES IN THE TABLE
               End If
               seq(r0) = seq(i)     'move the last number of that loop to the position of r0, so every row 'll only be used once !! <<< VERY IMPORTANT ROW !!!!!
          Next

          With Worksheets("level 1 Check").Range("A" & Rows.Count).End(xlUp).Offset(2).Resize(dict.Count, UBound(a))
               .Value = Application.Index(dict.items, 0, 0)
               MyLayout .Offset(0)
          End With

          seq = Evaluate("transpose(row(A1:A" & UBound(a) + 1 & "))")   'hopefully your last line < 65.000 (problem with the transpose)
          With Worksheets("Level 2 Check").Range("A" & Rows.Count).End(xlUp).Offset(2).Resize(UBound(DC), UBound(a))
               .Value = Application.Index(dict.items, DC, seq)
               MyLayout .Offset(0)
          End With

     End If
     Application.CutCopyMode = False

     MsgBox "Ready in " & Format(Timer - t, "0.00\s"), vbInformation, UCase("Statistics")
End Sub

Sub MyLayout(MyRange)

     With MyRange
          With .Font
               .Name = "Calibri"     'or another font
               .FontStyle = "General"     'i hope this is the correct term in english (???)
               .Size = 12
          End With

          .Offset(, .Columns.Count - 2).Resize(, 2).NumberFormat = "dd/mm/yy"
          .HorizontalAlignment = xlCenter

          With .Borders
               .LineStyle = xlContinuous
               .ColorIndex = 0
               .TintAndShade = 0
               .Weight = xlThin
          End With

     End With
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,248
Messages
6,171,027
Members
452,374
Latest member
keccles

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top