Separate large data set onto multiple tabs

crimsonexcel

New Member
Joined
Feb 22, 2011
Messages
17
Hi,

I have a large data set (150,000 rows). Each row has 16 columns. I need to separate this data set onto separate tabs based on the text description in column E.

For example there may be 2,000 rows with a description like "Swedish actions in the second world war" I need for excel to recognize the country in the description "Swedish" and pull all rows like this to a new tab in the document called "Swedish". I actually already have the tabs created (about 50 countries) but realize it may be easier to start from scratch.

To further the example, the next 8,000 rows might have titles like "British icons in the 18th century" or "British revolutions and their outcomes". These 50,000 rows may have different descriptions but they are all common in that they start with "British" Again, would need to pull this data to a new tab called "British".

It looks like the first word of every description line is in fact the country if that is at all helpful.

To slightly complicate things, I have around 60 excel files that are all 500,000 rows and need to be sorted this way. Ideally I could drop a master data list (i.e. 1 file) into the first tab of a master document with all of these tabs and have it sort into each tab. I would then repeat this with the next file so that in the end, I have one master file that has everything sorted into tabs (with data collected from all 60 excel files)

Obviously realize I will probably need to do some manual labor to make this happen, just trying to cut my work time down from weeks to days or hours.

Any help would be much appreciated. I have not posted before, so please let me know if you need further information.

Thanks!
 
CrimsonExcel,

Try this code on a single worksheet and confirm that it works in the manner you want. After you confirm this, we'll work on getting it to work on all files in your folder to consolidate down to a single master file:

Code:
Public Sub CrimsonExcel()
Dim i               As Long, _
    LR              As Long, _
    x               As Variant, _
    CountryNames    As Variant, _
    ws              As Worksheet, _
    wb              As Workbook
 
' Initial Variable Definitions
CountryNames = Array("Argentine", "Australian", "Austrian", "Belgian", "Brazilian", "Bulgarian", _
                     "Croatian", "Cypriot", "Czech", "Danish", "Dutch", "English", "Finnish", _
                     "French", "German", "Greek", "Hungarian", "Irish", "Italian", "Japanese", _
                     "Mexican", "Northern Ireland", "Norwegian", "Polish", "Portuguese", "Romanian", _
                     "Russian", "Scottish", "Slovakian", "Slovenian", "Spanish", "Swedish", "Swiss", _
                     "Turkish", "Ukranian", "UnitedStates", "Welsh")
Set wb = ActiveWorkbook
Set ws = wb.ActiveSheet
LR = ws.Range("E" & Rows.Count).End(xlUp).row
 
Application.ScreenUpdating = False
 
' Create a worksheet for each country name.
For i = LBound(CountryNames) To UBound(CountryNames)
    If Not WorksheetExists(CountryNames(i)) Then
        Sheets.Add After:=Sheets(Sheets.Count)
        ActiveSheet.Name = CountryNames(i)
        ws.Rows(1).Copy Destination:=Sheets(CountryNames(i)).Range("A1")
    End If
Next i
 
' Loop through column E and place rows in corresponding workbooks
For i = 1 To LR
    x = Application.Match(Left(ws.Range("E" & i).Value, InStr(ws.Range("E" & i).Value & " ", " ") - 1), CountryNames, 0)
    If Not IsError(x) Then
        With Sheets(CountryNames(x - 1))
            ws.Rows(i).Copy Destination:=.Range("A" & .Range("E" & Rows.Count).End(xlUp).row + 1)
        End With
    End If
Next i
 
Application.ScreenUpdating = True
 
End Sub

@Daverunt - I appreciate the accolades. By all means, post your ideas as well. I may know Excel, but I'm definitely not the best! The great thing about these forums is that we are a combined mind - the more ideas/approaches we come up with, the better chance we will find not only the right solution, but also find the right solution for the user. Also, there very well could be something you post that I'm not well versed in that could be used in future code. :)
 
Last edited:
Upvote 0

Excel Facts

Format cells as date
Select range and press Ctrl+Shift+3 to format cells as date. (Shift 3 is the # sign which sort of looks like a small calendar).
Thanks to both of you for this amazing help. Sorry, but I am incredibly new to macros. I am not too bad at basica excel but am not sure how to even implement this. Sorry for the basic questions, but can you walk me through how to actually run this?
 
Upvote 0
Thanks to both of you for this amazing help. Sorry, but I am incredibly new to macros. I am not too bad at basica excel but am not sure how to even implement this. Sorry for the basic questions, but can you walk me through how to actually run this?

First off, make sure to try this on a copy of your worksheet. That way if the macro does do what you want it to do, you won't lose valuable data.

To install/run the code, in the workbook you want it to run in.
  1. Go to Tools>Macro>Visual Basic Editor (Shortcut: Alt+F11).
  2. In here, go to Insert>Module.
  3. Copy and paste the code given into the module.
  4. Go to the worksheet you want to run the code on (the one with all of the data) and go to Tools>Macro>Macros (Shortcut: Alt+F8)
  5. Choose the name of the macro from the list and click Run.
 
Upvote 0
MrKowz,

Thanks for the step by steps instructions - very helpful. I did this and then ran it and got an error that says "Compile Error: Sub or Function Not Defined" Did I do something wrong?

Thanks!
 
Upvote 0
I forgot to give you an accompanying function. Copy/paste this under the code I provided above:

Code:
Function WorksheetExists(WSName As Variant) As Boolean
On Error Resume Next
WorksheetExists = Worksheets(WSName).Name = WSName
On Error GoTo 0
End Function
 
Upvote 0
It worked! This is amazing! Thanks so much. So what would be the best way to do this for 60 different excel docs to combine the output all into one document with multiple tabs?

One concern I have is that I think for some tabs it may go over the million row limit so certain countries would need overflow tab.

Also, would their be a way to alter this first script so that it actually deletes the data from the first page after copying it to the individual tabs? Just so that I can easily go back and see what was sorted and if anything was missed? If not or if that is complicated no worries.

Thanks again!
 
Upvote 0
Mr. Kowz,

I just wanted to check in and see if you are still going to be able to help me on this. If not, no worries - what you have already done is incredibly helpful. I can use what you have helped me with and do the rest manually if need be.

If you do still have a moment, it would be great to be able to adjust this to try to make it so that it deletes the rows from the master tab that it then puts on each new tab (So that they only exist on the new tabs).

Thanks again for the help!
 
Upvote 0
Crimsonexcel,

Give this a shot. It will cut the data from the source file and paste it to the master file.

Also, I added functionality for an overflow that that I believe will work. Please test this on a copy of your data (and force it to require an overflow tab). I made the threshhold be row 1048500 instead of the true limit of 1048576 so that you have 76 rows to play with should you need to do calculations, manipulate data, etc.

Note: The code has been altered dramatically to accomodate for this. If it errors, please indicate on what line, and what error message you received.

Also, a bit of a status indicator has been added for the status bar, so you can see what row is currently being moved out of how many rows in each file. (If I knew a way to count how many files it needs, I would do that as well).

Hope this works for you!:

Code:
Public Sub CrimsonExcel()
Dim i               As Long, _
    LR              As Long, _
    LR2             As Long, _
    overflowcount   As Long, _
    x               As Variant, _
    CountryNames    As Variant, _
    wbMaster        As Workbook, _
    wsMaster        As Worksheet, _
    wbsource        As Workbook, _
    wsSource        As Worksheet, _
    fPath           As String, _
    fName           As String, _
    overflowbool    As Boolean
    
    
' Initial Variable Definitions
CountryNames = Array("Argentine", "Australian", "Austrian", "Belgian", "Brazilian", "Bulgarian", _
                     "Croatian", "Cypriot", "Czech", "Danish", "Dutch", "English", "Finnish", _
                     "French", "German", "Greek", "Hungarian", "Irish", "Italian", "Japanese", _
                     "Mexican", "Northern Ireland", "Norwegian", "Polish", "Portuguese", "Romanian", _
                     "Russian", "Scottish", "Slovakian", "Slovenian", "Spanish", "Swedish", "Swiss", _
                     "Turkish", "Ukranian", "UnitedStates", "Welsh")
Set wbMaster = ActiveWorkbook
Set wsMaster = wbMaster.ActiveSheet
LR = ws.Range("E" & Rows.Count).End(xlUp).row
Application.ScreenUpdating = False
' Create a worksheet for each country name.
For i = LBound(CountryNames) To UBound(CountryNames)
    If Not WorksheetExists(CountryNames(i) & " (1)") Then
        Sheets.Add After:=Sheets(Sheets.Count)
        ActiveSheet.Name = CountryNames(i) & " (1)"
        wsMaster.Rows(1).Copy Destination:=Sheets(CountryNames(i) & "(1)").Range("A1")
    End If
Next i
'
fPath = wbMaster.Path
fName = dir(fPath & "\" & "*.xls")
Do While Len(fName) > 0
    If fName <> wbMaster.Name Then
        Set wbsource = Workbooks.Open(fPath & "\" & fName)
        Set wsSource = wbsource.ActiveSheet
        ' Loop through column E and place rows in corresponding workbooks
        For i = 1 To LR
            overflowcount = 1
            Application.StatusBar = "Currently moving row " & i & " of " & LR & " in " & fName
            x = Application.Match(Left(wsSource.Range("E" & i).Value, InStr(wsSource.Range("E" & i).Value & " ", " ") - 1), CountryNames, 0)
            If Not IsError(x) Then
                Do
                    If Not WorksheetExists(CountryNames(x - 1) & " (" & overflowcount & ")") Then
                        Sheets.Add After:=Sheets(CountryNames(x - 1) & " (" & overflowcount - 1 & ")")
                        ActiveSheet.Name = CountryNames(x - 1) & " (" & overflowcount & ")"
                        wsMaster.Rows(1).Copy Destination:=Sheets(CountryNames(x - 1) & " (" & overflowcount & ")")
                    End If
                    With wbMaster.Sheets(CountryNames(x - 1) & " (" & overflowcount & ")")
                        LR2 = .Range("E" & .Rows.Count).End(xlUp).row + 1
                        If LR2 < 1048500 Then
                            wsSource.Rows(i).Cut Destination:=.Range("A" & LR2)
                            overflowbool = False
                        Else
                            overflowbool = True
                            overflowcount = overflowcount + 1
                        End If
                    End With
                Loop Until overflowbool = False
            End If
        Next i
    End If
Loop
With Application
    .ScreenUpdating = True
    .StatusBar = False
End With
End Sub
 
Upvote 0
MrKowz,

Thanks so much for the additional help on this. I gave it a try and it gave me the following error:

Run-time Error '424'
Object Required

The line it highlighted in Yellow when I selected "Debug" was:

LR = ws.Range("E" & Rows.Count).End(xlUp).Row

Thanks again for the continued help!
 
Upvote 0
Try this, the previous code wasn't checking for last row in each sheet anyway. This should fix that:

Code:
Public Sub CrimsonExcel()
Dim i               As Long, _
    LR              As Long, _
    LR2             As Long, _
    overflowcount   As Long, _
    x               As Variant, _
    CountryNames    As Variant, _
    wbMaster        As Workbook, _
    wsMaster        As Worksheet, _
    wbsource        As Workbook, _
    wsSource        As Worksheet, _
    fPath           As String, _
    fName           As String, _
    overflowbool    As Boolean
    
    
' Initial Variable Definitions
CountryNames = Array("Argentine", "Australian", "Austrian", "Belgian", "Brazilian", "Bulgarian", _
                     "Croatian", "Cypriot", "Czech", "Danish", "Dutch", "English", "Finnish", _
                     "French", "German", "Greek", "Hungarian", "Irish", "Italian", "Japanese", _
                     "Mexican", "Northern Ireland", "Norwegian", "Polish", "Portuguese", "Romanian", _
                     "Russian", "Scottish", "Slovakian", "Slovenian", "Spanish", "Swedish", "Swiss", _
                     "Turkish", "Ukranian", "UnitedStates", "Welsh")
Set wbMaster = ActiveWorkbook
Set wsMaster = wbMaster.ActiveSheet
Application.ScreenUpdating = False
' Create a worksheet for each country name.
For i = LBound(CountryNames) To UBound(CountryNames)
    If Not WorksheetExists(CountryNames(i) & " (1)") Then
        Sheets.Add After:=Sheets(Sheets.Count)
        ActiveSheet.Name = CountryNames(i) & " (1)"
        wsMaster.Rows(1).Copy Destination:=Sheets(CountryNames(i) & "(1)").Range("A1")
    End If
Next i
'
fPath = wbMaster.Path
fName = dir(fPath & "\" & "*.xls")
Do While Len(fName) > 0
    If fName <> wbMaster.Name Then
        Set wbsource = Workbooks.Open(fPath & "\" & fName)
        Set wsSource = wbsource.ActiveSheet
       [COLOR=red][B] LR = wsSource.Range("E" & Rows.Count).End(xlUp).row
[/B][/COLOR]        ' Loop through column E and place rows in corresponding workbooks
        For i = 1 To LR
            overflowcount = 1
            Application.StatusBar = "Currently moving row " & i & " of " & LR & " in " & fName
            x = Application.Match(Left(wsSource.Range("E" & i).Value, InStr(wsSource.Range("E" & i).Value & " ", " ") - 1), CountryNames, 0)
            If Not IsError(x) Then
                Do
                    If Not WorksheetExists(CountryNames(x - 1) & " (" & overflowcount & ")") Then
                        Sheets.Add After:=Sheets(CountryNames(x - 1) & " (" & overflowcount - 1 & ")")
                        ActiveSheet.Name = CountryNames(x - 1) & " (" & overflowcount & ")"
                        wsMaster.Rows(1).Copy Destination:=Sheets(CountryNames(x - 1) & " (" & overflowcount & ")")
                    End If
                    With wbMaster.Sheets(CountryNames(x - 1) & " (" & overflowcount & ")")
                        LR2 = .Range("E" & .Rows.Count).End(xlUp).row + 1
                        If LR2 < 1048500 Then
                            wsSource.Rows(i).Cut Destination:=.Range("A" & LR2)
                            overflowbool = False
                        Else
                            overflowbool = True
                            overflowcount = overflowcount + 1
                        End If
                    End With
                Loop Until overflowbool = False
            End If
        Next i
    End If
Loop
With Application
    .ScreenUpdating = True
    .StatusBar = False
End With
End Sub
 
Upvote 0

Forum statistics

Threads
1,224,542
Messages
6,179,424
Members
452,914
Latest member
echoix

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top