VBA: Macro to flatten out dataset

coffeenoir

New Member
Joined
Jul 11, 2017
Messages
1
I’m working with a dataset that has 6 columns and approximately 5000 rows. Column A contains the name of a training course. Columns B through F have event types. Each training course can have multiple event types. So, for example:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Row/Column[/TD]
[TD]A[/TD]
[TD="align: center"]B[/TD]
[TD="align: center"]C[/TD]
[TD="align: center"]D[/TD]
[TD="align: center"]E[/TD]
[TD="align: center"]F[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Name[/TD]
[TD="align: center"]Event Type 1[/TD]
[TD="align: center"]Event Type 2[/TD]
[TD="align: center"]Event Type 3[/TD]
[TD="align: center"]Event Type 4[/TD]
[TD="align: center"]Event Type 5[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]T4[/TD]
[TD]Workshop[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]T7[/TD]
[TD]Workshop[/TD]
[TD]Drill[/TD]
[TD]Exercise[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]T9[/TD]
[TD]Drill[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]T6[/TD]
[TD]Exercise[/TD]
[TD]Drill[/TD]
[TD]Workshop[/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]

I would like to create a macro that flattens this dataset out. I’m looking for a macro that accomplishes the following:


  1. Looks at the Event Type 2 column (Column C)
  2. If cell C2 has content, add a new row below that item
  3. Copy over the content from A2 into the new row
  4. Copy the content in C2 (Event 2) into C3 (Event 1) in the new row
  5. Do the same for the rest of the event types
  6. Do the same for the rest of the line items

So, after running the macro, I would like it to rearrange my example above into the following configuration:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Row/Column[/TD]
[TD]A[/TD]
[TD] B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[TD]E[/TD]
[TD]F[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Name[/TD]
[TD="align: center"]Event Type 1[/TD]
[TD="align: center"]Event Type 2[/TD]
[TD="align: center"]Event Type 3[/TD]
[TD="align: center"]Event Type 4[/TD]
[TD="align: center"]Event Type 5[/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]T4[/TD]
[TD]Workshop[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]T7[/TD]
[TD]Workshop[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]T7[/TD]
[TD]Drill[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]T7[/TD]
[TD]Exercise[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]6[/TD]
[TD]T9[/TD]
[TD]Drill[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]7[/TD]
[TD]T6[/TD]
[TD]Exercise[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]8[/TD]
[TD]T6[/TD]
[TD]Drill[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
[TR]
[TD]9[/TD]
[TD]T6[/TD]
[TD]Workshop[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]


Any help is appreciated!
 

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
Try the code below.
<acronym title="visual basic for applications" style="border-width: 0px 0px 1px; border-top-style: initial; border-right-style: initial; border-bottom-style: dotted; border-left-style: initial; border-top-color: initial; border-right-color: initial; border-bottom-color: rgb(0, 0, 0); border-left-color: initial; border-image: initial; cursor: help; color: rgb(51, 51, 51); background-color: rgb(250, 250, 250);">VBA</acronym> code should UnPivot Data into database format. Item in Bold where change range and column headers to repeat.
Macro code also checks and creates "UnPivot Data Files" folder wherever default location for <acronym title="visual basic for applications" style="border-width: 0px 0px 1px; border-top-style: initial; border-right-style: initial; border-bottom-style: dotted; border-left-style: initial; border-top-color: initial; border-right-color: initial; border-bottom-color: rgb(0, 0, 0); border-left-color: initial; border-image: initial; cursor: help; color: rgb(51, 51, 51); background-color: rgb(250, 250, 250);">VBA</acronym> executing this code.
Eg A new file would be created under "UnPivot Data Files" folder like "Data 2017-06-29 23-18-03.xlsx".

Please note two InputBox requires manual inputs
1) InputBox to capture the range to process
2) InputBox to capture RepeatColsCount on left side. In your case Input 1.

Hope the macro code below answers your question.



Create two Modules
1) modMain

Code:
Option Explicit


'List: The range to be normalized.
'RepeatingColsCount: The number of columns, starting with the leftmost,
'   whose headings remain the same.
'NormalizedColHeader: The column header for the rolled-up category.
'DataColHeader: The column header for the normalized data.
'NewWorkbook: Put the sheet with the data in a new workbook?
'
'NOTE: The data must be in a contiguous range and the
'rows that will be repeated must be to the left,
'with the rows to be normalized to the right.


Dim aStartTime
Const DblSpace As String = vbNewLine & vbNewLine
Const SpecialCharacters As String = "!,?,@,#,$,%,^,&,*,(,),{,[,],}"  'modify as needed
Dim strFullName As String
Dim ProcName As String
Dim bErrorHandle As Boolean
Dim SourceWbk As Workbook


Sub Test()
    Dim lRow As Long, lCol As Long
    Dim RepeatColsCount As Long
    Dim rngDel As Range
    Dim ActWbk As Workbook
    Dim UserRange As Range
    On Error GoTo errHandler
    bErrorHandle = False
    
    '~~> Start Timer
    aStartTime = Now()
    
    
    Set SourceWbk = ActiveWorkbook
    
    On Error Resume Next
    Set UserRange = Application.InputBox(Prompt:="Please Select Range", Title:="Range Select", Default:=Selection.Address, Type:=8)
    If UserRange Is Nothing Then
        MsgBox "You press Cancel, this procedure will now terminate."
        Exit Sub
    End If
    
    
redo:
    Dim v As Variant
    v = Application.InputBox(Prompt:="How many columns, at the left side will repeat?", Title:="Input Whole Numbers Only", Type:=1)
    If v = "False" Then
        bErrorHandle = True
        MsgBox "Terminate Processing", vbCritical
        GoTo BeforeExit
    End If
    
    '<~~ More Testing for positive Integer only
    If v < 1 Or Not (v = Int(v)) Then
        MsgBox "How many columns are needed - this must be a postive integer?", vbCritical
        GoTo redo
    End If
    RepeatColsCount = Int(v)
    
    '~~> Speeding Up VBA Code
    Call SpeedUp(False)
    
    Call NormalizeList(UserRange, RepeatColsCount, "Event Type", "Event Name", True)
    
    '~~> New UnPivot Data workbook
    Set ActWbk = ActiveWorkbook
    
    '~~> Remove Data
    Call FilterDel(ActWbk.Sheet(1), 3, "=")
    
    '~~>
    Call SaveAs(ActWbk)
    
BeforeExit:
    '~~> Remove items from memory
    Set ActWbk = Nothing
    Set SourceWbk = Nothing
    Set rngDel = Nothing
    
    
    '~~> Speeding Up VBA Code
    Call SpeedUp(True)
    
    If bErrorHandle = False Then
        MsgBox "Time taken: " & Format(Now() - aStartTime, "h:mm:ss") & vbNewLine _
            & DblSpace & " You're good to go!" & DblSpace & _
            "UnPivot Done" & Chr(169) & Chr(32) & Year(Date), vbInformation, "Excellent"
        
        '~~> Close Workbook with VBA Code too
        ThisWorkbook.Close False
    End If
    
    Exit Sub
errHandler:
    '~~> Error Occurred
    bErrorHandle = True
    ProcName = Application.VBE.ActiveCodePane.CodeModule.ProcOfLine(Application.VBE.ActiveCodePane.TopLine, 0)
    MsgBox "Procedure: - " & ProcName & DblSpace & Err.Description, vbCritical, "Oops I did it again...."
    Resume BeforeExit
    
End Sub


Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _
    NormalizedColHeader As String, DataColHeader As String, _
    Optional NewWorkbook As Boolean = False)


Dim FirstNormalizingCol As Long, NormalizingColsCount As Long
Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range
Dim NormalizedRowsCount As Long
Dim RepeatingList() As String
Dim NormalizedList() As Variant
Dim ListIndex As Long, i As Long, j As Long
Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook
Dim wsTarget As Excel.Worksheet


With List
    'If the normalized list won't fit, you must quit.
    If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then
        MsgBox "The normalized list will be too many rows.", _
            vbExclamation + vbOKOnly, "Sorry"
        Exit Sub
    End If
    
    'You have the range to be normalized and the count of leftmost rows to be repeated.
    'This section uses those arguments to set the two ranges to parse
    'and the two corresponding arrays to fill
    FirstNormalizingCol = RepeatingColsCount + 1
    NormalizingColsCount = .Columns.Count - RepeatingColsCount
    Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount)
    Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount)
    NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count
    ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount)
    ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2)
End With




'Fill in every i elements of the repeating array with the repeating row labels.
For i = 1 To NormalizedRowsCount Step NormalizingColsCount
    ListIndex = ListIndex + 1
    For j = 1 To RepeatingColsCount
        RepeatingList(i, j) = List.Cells(ListIndex, j).Value2
    Next j
Next i




'We stepped over most rows above, so fill in other repeating array elements.
For i = 1 To NormalizedRowsCount
    For j = 1 To RepeatingColsCount
        If RepeatingList(i, j) = "" Then
            RepeatingList(i, j) = RepeatingList(i - 1, j)
        End If
    Next j
Next i




'Fill in each element of the first dimension of the normalizing array
'with the former column header (which is now another row label) and the data.
With ColsToNormalize
    For i = 1 To .Rows.Count
        For j = 1 To .Columns.Count
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j)
            NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j)
        Next j
    Next i
End With


'Put the normal data in the same workbook, or a new one.
If NewWorkbook Then
    Set wbTarget = Workbooks.Add
    Set wsTarget = wbTarget.Worksheets(1)
Else
    Set wbSource = List.Parent.Parent
    With wbSource.Worksheets
        Set wsTarget = .Add(After:=.Item(.Count))
    End With
End If




With wsTarget
    'Put the data from the two arrays in the new worksheet.
    .Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList
    .Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList
    
    
    'At this point there will be repeated header rows, so delete all but one.
    .Range("1:" & NormalizingColsCount - 1).EntireRow.Delete
    
    
    'Add the headers for the new label column and the data column.
    .Cells(1, FirstNormalizingCol).Value = NormalizedColHeader
    .Cells(1, FirstNormalizingCol + 1).Value = DataColHeader
End With
End Sub


Private Sub SaveAs(wbDestination As Workbook)
    Dim strFile As String
    Dim NewFile As String
    Dim FileExtStr As String
    Dim FileFormatNum As Long
    Dim bIsDirectoryExist As Boolean
    Const ANewString As String = "Data"
    Const newFolder As String = "UnPivot Data Files"
    Const sExt As String = ".xlsx"
    
    On Error GoTo errHandler
    '~~> Speeding Up VBA Code
    Call SpeedUp(False)
    
    
    With wbDestination
        NewFile = Left(.Name, Len(.Name) - 5) & ANewString
        FileExtStr = sExt
        
        '~~> Checks If Directory Exists
        bIsDirectoryExist = IsDirectoryExist(SourceWbk.Path & Application.PathSeparator & newFolder)
        
        Select Case FileExtStr
        Case ".xlsb": FileFormatNum = 50
        Case ".xlsx": FileFormatNum = 51
        Case ".xlsm": FileFormatNum = 52
        Case ".xls": FileFormatNum = 56
        Case ".csv": FileFormatNum = 6
        Case ".txt": FileFormatNum = -4158
        Case ".prn": FileFormatNum = 36
        Case Else: FileFormatNum = 0
        End Select
        
        
        'Now we can create/Save the file with the xlFileFormat parameter
        'value that match the file extension
        If FileFormatNum = 0 Then
            MsgBox "Sorry, unknown file extension"
        Else
            '~~> Delete Parameters worksheet tab
            On Error Resume Next
            Sheets("Parameters").Delete
            On Error GoTo 0
            
            
            '~~> Saveas default file path location
            .SaveAs Filename:=SourceWbk.Path & Application.PathSeparator & newFolder & Application.PathSeparator & NewFile & " " & Format(Now(), "yyyy-MM-dd hh-mm-ss"), FileFormat:=FileFormatNum
            '~~> Save and close new file
            .Saved = True
            .Close
            
        End If
    End With
    
    
BeforeExit:
    '~~> Remove items from memory
    Set wbDestination = Nothing
    
    
    '~~> Speeding Up VBA Code
    Call SpeedUp(True)
    
    '    If bErrorHandle = False Then
    '        MsgBox "Time taken: " & Format(Now() - aStartTime, "h:mm:ss") & vbNewLine _
        '            & DblSpace & " You're good to go!" & DblSpace & _
        '            "Job Done! " & Chr(169) & Chr(32) & Year(Date), vbInformation, "Excellent"
    '    End If
    
    Exit Sub
errHandler:
    '~~> Error Occurred
    bErrorHandle = True
    ProcName = Application.VBE.ActiveCodePane.CodeModule.ProcOfLine(Application.VBE.ActiveCodePane.TopLine, 0)
    MsgBox "Procedure: - " & ProcName & DblSpace & Err.Description, vbCritical, "Oops I did it again...."
    Resume BeforeExit
    
End Sub


[COLOR=#333333]
[/COLOR]


2.modTools

Code:
Option Explicit


Public Function SpeedUp(Optional bSpeed As Boolean = True)
With Application
.ScreenUpdating = bSpeed 'Prevent screen flickering
.Calculation = IIf(bSpeed, xlAutomatic, xlCalculationManual) 'Preventing calculation
.DisplayAlerts = bSpeed 'Turn OFF alerts
.EnableEvents = bSpeed 'Prevent All Events
End With
End Function


Function IsDirectoryExist(newFolder As String) As Boolean
If Len(Dir(newFolder, vbDirectory)) = 0 Then
MkDir newFolder
End If
End Function


'************************************************************************************************
'* VBA to filter by Worksheet, Field and Criteria and delete items
'*
'*************************************************************************************************
Function FilterDel(sh As Worksheet, iField As Long, sCriteria As String)
Dim lCount As Long, iRecordcount As Long, rowsAffected As Long
Dim rnData As Range, rngArea As Range, rnDataFilter As Range, aCell As Range
Dim ans As Long, iLastColumn As Long
Dim sField As String, sLast_Types As String


iLastColumn = LastCol(sh)
'~~> Find Column Letter for Last Column Number
sLast_Types = Col_Letter(iLastColumn)


With sh
'~~> Defines Autofilter range
Set rnData = .Range("$A$1:$" & sLast_Types & "$" & .Range("A" & Rows.Count).End(xlUp).Row)

With rnData
.AutoFilter Field:=iField, Criteria1:=sCriteria


'~~> Counts filtered records
For Each rngArea In .SpecialCells(xlCellTypeVisible).Areas
lCount = lCount + rngArea.Rows.Count
Next rngArea
iRecordcount = lCount - 1
'MsgBox "Autofilter " & lcount - 1 & " records"

'~~> Defines Auto Filter excluding Headers
On Error Resume Next
Set rnDataFilter = rnData.Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
On Error GoTo 0


'~~> Delete Filitered Records
If Not rnDataFilter Is Nothing Then rnDataFilter.EntireRow.Delete
'~~> Remove the AutoFilter
sh.AutoFilterMode = False

End With

End With
End Function


Function Col_Letter(lngCol As Long) As String 'this is for the 2 way look up
Dim vArr
vArr = Split(Cells(1, lngCol).Address(True, False), "$")
Col_Letter = vArr(0)
End Function


Function LastCol(sh As Worksheet)
On Error Resume Next
LastCol = sh.Cells.Find(What:="*", _
After:=sh.Range("A1"), _
Lookat:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByColumns, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Column
On Error GoTo 0
End Function
[COLOR=#333333]
[/COLOR]


Kind Regards

Biz
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,992
Messages
6,175,825
Members
452,672
Latest member
missbanana

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top