How would you improve this code?..Create an array from a delimited string but ignore delimiters within quotations.

MoshiM

Active Member
Joined
Jan 31, 2018
Messages
439
Office Version
  1. 2016
Platform
  1. Windows
I wrote this function in the past to help parse 100k+ rows worth of strings into cells. I gave up on it shortly after as it would take far too much time to do and opted to retrieve data in the form of workbooks and write their contents to arrays.

I'm more curious than anything, but solutions that work on both Mac and Windows would be appreciated.

VBA Code:
Public Function Change_Delimiter_Not_Between_Quotes(ByRef Current_String As Variant, ByVal Delimiter As String, Optional ByVal Changed_Delimiter As String = ">Ý") As Variant
    
'returns a 0 based array
    
Dim String_Array() As String, X As Long, Right_CHR As String

    If InStr(1, Current_String, Chr(34)) = 0 Then 'if there are no quotation marks then split with the supplied delimiter
        
        Change_Delimiter_Not_Between_Quotes = Split(Current_String, Delimiter)
        Exit Function

    End If
    
    Right_CHR = Right(Changed_Delimiter, 1) 'RightMost character in at least 2 character string that will be used as a replacement delimiter

    'Replace ALL quotation marks with the ChangedDelimiter[Quotation mark] EX: " --> $+
    Current_String = Replace(Current_String, Chr(34), Changed_Delimiter)

    String_Array = Split(Current_String, Left(Changed_Delimiter, 1))
    '1st character of Changed_Delimiter will be used to delimit a new array
    'element [0] will be an empty string if the first value in the delmited string begins with a Quotation mark.
    
    For X = LBound(String_Array) To UBound(String_Array) 'loop all elements of the array

        If Left(String_Array(X), 1) = Right_CHR And Not Left(String_Array(X), 2) = Right_CHR & Delimiter Then
            'If the string contains a valid comma
            'Checked by if [the First character is the 2nd Character in the Changed Delimiter] and the 2nd character isn't the delimiter
            'Then offset the string by 1 character to remove the 2nd portion of the changed Delimiter
            String_Array(X) = Right(String_Array(X), Len(String_Array(X)) - 1)
        
        Else
        
            If Left(String_Array(X), 1) = Right_CHR Then 'If 1st character = 2nd portion of the Changed Delimiter
                                                         'Then offset string by 1 and then repalce all [Delimiter]
                String_Array(X) = Replace(Right(String_Array(X), Len(String_Array(X)) - 1), Delimiter, Changed_Delimiter)
            
            Else 'Just replace
                
                String_Array(X) = Replace(String_Array(X), Delimiter, Changed_Delimiter)
            
            End If
            
        End If
        
    Next X
    'Join the Array elements back together {Do not add another delimiter] and split with the changed Delimiter
    Change_Delimiter_Not_Between_Quotes = Split(Join(String_Array), Changed_Delimiter)
    
    Erase String_Array
End Function
 

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
Assuming a comma delimiter, replace "," with something like "|" then using | as delimiter to split the string to an array should work.

I would work on the method of, copy all strings into a single column, change the delimiter as above using find and replace for the whole list, then split using text to columns.
 
Upvote 0
Try this. The idea is to first split on the quote mark.
Then split on the delimiter if and only if its an even index in that split (i.e. outside the quote marks).

VBA Code:
Function QuoteSensitiveSplit(aString As String, Optional Delimiter As String = " ") As Variant
    Dim quoteWords As Variant
    Dim Result As Variant, resultPointer As Long
    Dim strWorking As String, workingIndex As Long
    Dim WorkingSubWords As Variant
    Dim i As Long

    ReDim Result(0 To 0): resultPointer = -1
  
    quoteWords = Split(aString, Chr(34))
  
    For workingIndex = 0 To UBound(quoteWords) Step 2
        strWorking = quoteWords(workingIndex)
        WorkingSubWords = Split(strWorking, Delimiter)
      
        For i = 0 To UBound(WorkingSubWords)
            resultPointer = resultPointer + 1
            If UBound(Result) < resultPointer Then ReDim Preserve Result(0 To 2 * resultPointer)
            Result(resultPointer) = WorkingSubWords(i)
        Next i
        If workingIndex < UBound(quoteWords) Then
            Result(resultPointer) = Result(resultPointer) & Chr(34) & quoteWords(workingIndex + 1) & Chr(34)
        End If
    Next workingIndex
    ReDim Preserve Result(0 To resultPointer)
    QuoteSensitiveSplit = Result
End Function
 
Upvote 0
Try this. The idea is to first split on the quote mark.
Then split on the delimiter if and only if its an even index in that split (i.e. outside the quote marks).

VBA Code:
Function QuoteSensitiveSplit(aString As String, Optional Delimiter As String = " ") As Variant
    Dim quoteWords As Variant
    Dim Result As Variant, resultPointer As Long
    Dim strWorking As String, workingIndex As Long
    Dim WorkingSubWords As Variant
    Dim i As Long

    ReDim Result(0 To 0): resultPointer = -1

    quoteWords = Split(aString, Chr(34))

    For workingIndex = 0 To UBound(quoteWords) Step 2
        strWorking = quoteWords(workingIndex)
        WorkingSubWords = Split(strWorking, Delimiter)
    
        For i = 0 To UBound(WorkingSubWords)
            resultPointer = resultPointer + 1
            If UBound(Result) < resultPointer Then ReDim Preserve Result(0 To 2 * resultPointer)
            Result(resultPointer) = WorkingSubWords(i)
        Next i
        If workingIndex < UBound(quoteWords) Then
            Result(resultPointer) = Result(resultPointer) & Chr(34) & quoteWords(workingIndex + 1) & Chr(34)
        End If
    Next workingIndex
    ReDim Preserve Result(0 To resultPointer)
    QuoteSensitiveSplit = Result
End Function
Genious!!! I got an error which an on error resume next statement seemed to resolve for the first "proper string" but it resulted in that element becoming an empty string while working fine for everything else.

I tested on this string :

"WHEAT-SRW - CHICAGO BOARD OF TRADE",200428,2020-04-28,001602,CBT ,00,001 , 453802, 103281, 86897, 143334, 178186, 180303, 424802, 410534, 29000, 43268, 4121, 429, 1008, 0, 3010, 1720, 3439, 2728, 682, 1393, 449681, 103302, 86339, 142884, 175176, 178583, 421363, 407806, 28318, 41875, -50313, -922, 10781, -44504, -2194, -13087, -47621, -46810, -2692, -3504, 100.0, 22.8, 19.1, 31.6, 39.3, 39.7, 93.6, 90.5, 6.4, 9.5, 100.0, 10.4, 24.5, 0.0, 73.0, 41.7, 83.5, 66.2, 16.5, 33.8, 100.0, 23.0, 19.2, 31.8, 39.0, 39.7, 93.7, 90.7, 6.3, 9.3, 347, 103, 96, 120, 89, 115, 268, 272, 46, 7, 11, 0, 12, 18, 19, 29, 347, 104, 96, 118, 88, 114, 265, 271, 11.3, 11.8, 19.8, 18.8, 9.4, 7.5, 16.0, 12.7, 71.3, 38.0, 81.3, 51.2, 71.3, 38.0, 81.3, 51.2, 11.4,
11.8, 20.0, 18.8, 9.5, 7.4, 16.1, 12.7,"(CONTRACTS OF 5,000 BUSHELS)","001602","CBT ","001 "

I adapted it based on the idea you mentioned [Testing only the even elements] and it worked splendidly. I just don't know under what conditions if any it might fail.
VBA Code:
Function Quote_Delimiter_Array(ByVal InputA As String, Delimiter As String, Optional N_Delimiter As String = "*")

Dim X As Long, SA() As String

If InStr(1, InputA, Chr(34)) = 0 Then 'if there are no quotation marks then split with the supplied delimiter
   
    Quote_Delimiter_Array = Split(Current_String, Delimiter)
    Exit Function

Else
   
    SA = Split(InputA, Chr(34))
   
    For X = LBound(SA) To UBound(SA) Step 2
        SA(X) = Replace(SA(X), Delimiter, N_Delimiter)
    Next X
   
   Quote_Delimiter_Array = Split(Join(SA), N_Delimiter)

End If

End Function
 
Upvote 0
I would look at strings that ended in a quote
Mary said "Hello"
paired quote

Tom Apple Smith "cat""goat" fish

un closed qutoe marks

Tom Apple Smith "cat" goat fish"
 
Upvote 0

Forum statistics

Threads
1,223,237
Messages
6,170,924
Members
452,366
Latest member
TePunaBloke

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top