Split comma delimited data but ignoring commas between brackets

DeonM

New Member
Joined
Sep 18, 2014
Messages
26
Hi

I have a set of data in the following format:

[TABLE="width: 1004"]
<tbody>[TR]
[TD]code[/TD]
[TD]other members[/TD]
[TD]modified[/TD]
[/TR]
[TR]
[TD]ABC[/TD]
[TD]Babalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)[/TD]
[TD]2014-05-08 15:27:10[/TD]
[/TR]
</tbody>[/TABLE]

I need to split column B into separate rows and repeat the other data. The data is essentially comma delimited.

I have the following code that successfully splits Column B to new rows as needed:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        ar = Split(r.Value, ",")
        If UBound(ar) >= 0 Then r.Value = ar(0)
        For i = UBound(ar) To 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = ar(i)
        Next
        Set r = r.Offset(-1)
    Loop
End Sub

My problem is that some of the data sometimes includes commas that appear between brackets (marked red in data above) that should be ignored as delimiters. I assume I should first replace all commas between brackets with something else and then reverse later, but have no idea how to do this.

Any help appreciated.
 

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
Alternatively, if every entry has bracketed content at the end then you can use ")" as your delimiter and append it back on again once the data has been split?
 
Upvote 0
Thanks. The entries don't always have bracketed position titles. So many are also just 'Name Surname, Name Surname, etc.' Then commas are the only delimiters.
 
Upvote 0
Is that what you want?

[Table="width:, class:head"]
[tr=bgcolor:#FFFFFF][td=bgcolor:#5B9BD5]code[/td][td=bgcolor:#5B9BD5]other members[/td][td=bgcolor:#5B9BD5]modified[/td][td][/td][td=bgcolor:#70AD47]code[/td][td=bgcolor:#70AD47]other members[/td][td=bgcolor:#70AD47]modified[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#DDEBF7]ABC[/td][td=bgcolor:#DDEBF7]Babalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury), John Doe, Hamri Salaam[/td][td=bgcolor:#DDEBF7]
08/05/2014 15:27​
[/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Babalwa Lobishe (Economic Development, Tourism and Agriculture)[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]ABC[/td][td]Fikile Desi (Constituency Coordinator)[/td][td]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Thembinkosi Mafana (Safety and Security)[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]ABC[/td][td]Nomamerika Magopeni (Sport, Recreation, Arts and Culture)[/td][td]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Wandisile Jikeka (Corporate Services)[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]ABC[/td][td]Mbuyiseli Mkavu (Human Settlements)[/td][td]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Andile Mfunda (Infrastructure, Engineering and Energy)[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]ABC[/td][td]Paticia Ndlovu (Public Health)[/td][td]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Balu Naran (Budget and Treasury)[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td]ABC[/td][td]John Doe[/td][td]
08/05/2014 15:27​
[/td][/tr]

[tr=bgcolor:#FFFFFF][td][/td][td][/td][td][/td][td][/td][td=bgcolor:#E2EFDA]ABC[/td][td=bgcolor:#E2EFDA]Hamri Salaam[/td][td=bgcolor:#E2EFDA]
08/05/2014 15:27​
[/td][/tr]
[/table]
 
Last edited:
Upvote 0
Altering your code a bit you could do this:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar, l As Long, flag As Boolean
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        flag = False
        For l = 1 To Len(r)
            If Mid(r, l, 1) = "(" Then flag = True
            If Mid(r, l, 1) = ")" Then flag = False
            If flag = True And Mid(r, l, 1) = "," Then
                r = Left(r, l - 1) & "|" & Mid(r, l + 1, Len(r))
            End If
        Next
        ar = Split(r, ",")
        If UBound(ar) >= 0 Then r.Value = Trim(Replace((ar(0)), "|", ","))
        For i = UBound(ar) To LBound(ar) + 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = Trim(Replace((ar(i)), "|", ","))
        Next
        Set r = r.Offset(-1)
    Loop
End Sub

Give it a test and see how you get on.
 
Upvote 0
You could also give this a try. At the moment, I have it writing results in columns E:G rather than over-writing the original data in columns A:C. Can change that later if it appears to be doing what you want.

Code:
Sub splitByColB_v2()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  a = Range("A2", Range("A" & Rows.Count).End(xlUp)).Resize(, 3).Value
  uba2 = UBound(a, 2)
  ReDim b(1 To Rows.Count, 1 To uba2)
  For i = 1 To UBound(a)
    For Each m In RX.Execute(a(i, 2))
      k = k + 1
      For j = 1 To uba2
        b(k, j) = a(i, j)
      Next j
      b(k, 2) = m
    Next m
  Next i
  Range("E2").Resize(k, uba2).Value = b
End Sub

Here is my sample data and results of code:

Excel Workbook
ABCDEFG
1codeother membersmodified
2ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture), Fikile Desi (Constituency Coordinator), Thembinkosi Mafana (Safety and Security), Nomamerika Magopeni (Sport, Recreation, Arts and Culture), Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)8/05/2014 15:27ABCBabalwa Lobishe (Economic Development, Tourism and Agriculture)8/05/2014 15:27
3DEFThembinkosi Mafana, Nomamerika Magopeni, Wandisile Jikeka (Corporate Services), Mbuyiseli Mkavu (Human Settlements), Andile Mfunda (Infrastructure, Engineering and Energy), Paticia Ndlovu (Public Health), Balu Naran (Budget and Treasury)xzscABCFikile Desi (Constituency Coordinator)8/05/2014 15:27
4ABCThembinkosi Mafana (Safety and Security)8/05/2014 15:27
5ABCNomamerika Magopeni (Sport, Recreation, Arts and Culture)8/05/2014 15:27
6ABCWandisile Jikeka (Corporate Services)8/05/2014 15:27
7ABCMbuyiseli Mkavu (Human Settlements)8/05/2014 15:27
8ABCAndile Mfunda (Infrastructure, Engineering and Energy)8/05/2014 15:27
9ABCPaticia Ndlovu (Public Health)8/05/2014 15:27
10ABCBalu Naran (Budget and Treasury)8/05/2014 15:27
11DEFThembinkosi Mafanaxzsc
12DEFNomamerika Magopenixzsc
13DEFWandisile Jikeka (Corporate Services)xzsc
14DEFMbuyiseli Mkavu (Human Settlements)xzsc
15DEFAndile Mfunda (Infrastructure, Engineering and Energy)xzsc
16DEFPaticia Ndlovu (Public Health)xzsc
17DEFBalu Naran (Budget and Treasury)xzsc
18
Sheet1
 
Last edited:
Upvote 0
Brilliant. That's exactly what I was looking for. Thank you.

When doing the last = Trim(Replace((ar(i)), "|", ",")) is there a simple way to also remove any carriage returns that may have been left over in column B?
 
Upvote 0
Altering your code a bit you could do this:

Code:
Sub splitByColB()
    Dim r As Range, i As Long, ar, l As Long, flag As Boolean
    Set r = Worksheets("Metropolitans").Range("B999999").End(xlUp)
    Do While r.Row > 1
        flag = False
        For l = 1 To Len(r)
            If Mid(r, l, 1) = "(" Then flag = True
            If Mid(r, l, 1) = ")" Then flag = False
            If flag = True And Mid(r, l, 1) = "," Then
                r = Left(r, l - 1) & "|" & Mid(r, l + 1, Len(r))
            End If
        Next
        ar = Split(r, ",")
        If UBound(ar) >= 0 Then r.Value = Trim(Replace((ar(0)), "|", ","))
        For i = UBound(ar) To LBound(ar) + 1 Step -1
            r.EntireRow.Copy
            r.Offset(1).EntireRow.Insert
            r.Offset(1).Value = Trim(Replace((ar(i)), "|", ","))
        Next
        Set r = r.Offset(-1)
    Loop
End Sub

Give it a test and see how you get on.




Brilliant. That's exactly what I was looking for. Thank you.

When doing the last = Trim(Replace((ar(i)), "|", ",")) is there a simple way to also remove any carriage returns that may have been left over in column B?

Thanks to all the other suggestions.
 
Upvote 0
.. is there a simple way to also remove any carriage returns that may have been left over in column B?
For my code I have
- adjusted for that, assuming that the "carriage return" is CHAR(10), and added TRIM.
- set to specifically target sheet 'Metropolitans' which my earlier code had not
- set to now over-write the original data

Code:
Sub splitByColB_v3()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  With Sheets("Metropolitans")
    a = .Range("A2", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 3).Value
    uba2 = UBound(a, 2)
    ReDim b(1 To Rows.Count, 1 To uba2)
    For i = 1 To UBound(a)
      For Each m In RX.Execute(Replace(a(i, 2), Chr(10), ""))
        k = k + 1
        For j = 1 To uba2
          b(k, j) = a(i, j)
        Next j
        b(k, 2) = Trim(m)
      Next m
    Next i
    .Range("A2").Resize(k, uba2).Value = b
  End With
End Sub


Thanks to all the other suggestions.
Not sure if you tried the other suggestions or how much data you have but for me, testing on about 200 rows of data like your sample data, my code took about 0.2 seconds versus nearly 30 seconds for the other code.
 
Upvote 0
For my code I have
- adjusted for that, assuming that the "carriage return" is CHAR(10), and added TRIM.
- set to specifically target sheet 'Metropolitans' which my earlier code had not
- set to now over-write the original data

Code:
Sub splitByColB_v3()
  Dim RX As Object
  Dim a As Variant, b As Variant, m As Variant
  Dim i As Long, j As Long, k As Long, uba2 As Long
  
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = "([^\,].+?)(\(.+?\))?(?=\,|$)"
  With Sheets("Metropolitans")
    a = .Range("A2", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 3).Value
    uba2 = UBound(a, 2)
    ReDim b(1 To Rows.Count, 1 To uba2)
    For i = 1 To UBound(a)
      For Each m In RX.Execute(Replace(a(i, 2), Chr(10), ""))
        k = k + 1
        For j = 1 To uba2
          b(k, j) = a(i, j)
        Next j
        b(k, 2) = Trim(m)
      Next m
    Next i
    .Range("A2").Resize(k, uba2).Value = b
  End With
End Sub


Not sure if you tried the other suggestions or how much data you have but for me, testing on about 200 rows of data like your sample data, my code took about 0.2 seconds versus nearly 30 seconds for the other code.

Hey Peter

Thanks for that. It works perfectly. I have been testing with about 200 rows too - and get similar time results. Your code is much, much faster. My total dataset has about 10 000 rows. Even though I only really have to run this procedure once to convert legacy data, the speed will be a big factor.

Much appreciated.
 
Upvote 0

Forum statistics

Threads
1,223,227
Messages
6,170,848
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top