Split Text String based on listing

Lenard

New Member
Joined
Jan 19, 2010
Messages
35
Office Version
  1. 2019
Platform
  1. Windows
Hi,

I have a set of data in column A with text string in each row under Data sheet and would like to split text string in each row based on a listing of keywords ( Unique name ) under List sheet.
I copied set of formulas from google search and modified, the result is not met my expectation.

{=IFERROR(INDEX(List!$A$2:$A$71, SMALL(IF(COUNTIF($A2, "*"&List!$A$2:$A$71&"*"), MATCH(ROW(List!$A$2:$A$71), ROW(List!$A$2:$A$71)), ""), COLUMNS($M$1:M1))), "")}

Sample file_working.xlsx
A
1Keyword
2ABS
3ADDRESS
4AGGREGATE
5AND
6AVERAGE
7CELL
8COLUMN
9CONCAT
10COUNT
11COUNTA
12COUNTBLANK
13COUNTIF
14COUNTIFS
15DATE
16EDATE
17EOMONTH
18EXACT
19FILTER
20FIND
21FLOOR
22FREQUENCY
23HYPERLINK
24IF
25INDEX
26INDIRECT
27ISBLANK
28ISERR
29ISERROR
30ISNA
31ISNUMBER
32LARGE
33LEFT
34LEN
35LET
36LOWER
37MATCH
38MAX
39MID
40MIN
41MMULT
42MOD
43NOT
44ODD
45OFFSET
46OR
47POWER
48PROPER
49RANK
50REPT
51RIGHT
52ROW
53SEARCH
54SEQUENCE
55SMALL
56SORT
57SUBTOTAL
58SUM
59SUMIF
60SUMPRODUCT
61TEXT
62TODAY
63TRANSPOSE
64UNIQUE
65UPPER
66LOOKUP
67XLOOKUP
68HLOOKUP
69VLOOKUP
70WEEKDAY
71YEAR
List


Sample file_working.xlsx
ABCDEFGHIJ
1DataResults
2COUNTIFSFINDLEFTRIGHTCOUNTCOUNTIFCOUNTIFSFINDIFLEFTRIGHT
3COUNTIFSDATECOUNTCOUNTIFCOUNTIFSDATEIF  
4COUNTIFSCOUNTIFCOUNTCOUNTIFCOUNTIFSIF   
5COUNTIFCOUNTCOUNTIFIF    
6SUMPRODUCTEXACTEXACTSUMSUMPRODUCT    
7COUNTIFSUMPRODUCTISNUMBERMATCHCOUNTCOUNTIFCOUNTIFSIFISNUMBERMATCHSUMSUMPRODUCT
8COUNTIFSUMPRODUCTCOUNTCOUNTIFCOUNTIFSIFSUMSUMPRODUCT  
9MATCHISNASUMPRODUCTCOUNTACOUNTIFCOUNTCOUNTACOUNTIFIFISNAMATCHSUMSUMPRODUCT
10COUNTIFSSUMPRODUCTCOUNTCOUNTIFCOUNTIFSIFSUMSUMPRODUCT  
11SUMPRODUCTLENNLENSUMSUMPRODUCT     
12COUNTBLANKCOUNTACOUNTCOUNTACOUNTBLANK     
13COUNTACOUNTIFCOUNTIFSCOUNTCOUNTACOUNTIFCOUNTIFSIF   
14COUNTIFCOUNTIFSCOUNTCOUNTIFCOUNTIFSIF   
15SUMPRODUCTISNUMBERFINDFINDISNUMBERSUMSUMPRODUCT   
16COUNTIFSUMPRODUCTISNUMBERFINDCOUNTCOUNTIFCOUNTIFSFINDIFISNUMBERSUM
17SUMPRODUCTISERRORISERRSUMISERRISERRORORSUMSUMPRODUCT  
18COUNTSUMPRODUCTCOUNTSUMSUMPRODUCT    
19SUMPRODUCTMODMODSUMSUMPRODUCT    
20COUNTIFSUMPRODUCTFINDISNUMBERCOUNTCOUNTIFCOUNTIFSFINDIFISNUMBERSUMSUMPRODUCT
21COUNTIFISTEXTSUMPRODUCTCOUNTIFSCOUNTCOUNTIFCOUNTIFSIFSUMSUMPRODUCTTEXT
22ISERRORNOTSUMPRODUCTISERRISERRISERRORNOTORSUMSUMPRODUCT 
23ISNUMBERSEARCHMMULTTRANSPOSEISNUMBERMMULTSEARCHTRANSPOSE   
24SUMPRODUCTWEEKDAYSUMSUMPRODUCTWEEKDAY    
25YEARSUMPRODUCTSUMSUMPRODUCTYEAR    
26SUMPRODUCTSUMSUMPRODUCT     
27COUNTIFSCOUNTCOUNTIFCOUNTIFSIF   
28SUMPRODUCTCOUNTIFCOUNTCOUNTIFIFSUMSUMPRODUCT  
29SUMPRODUCTISNUMBERMATCHSEARCHISNUMBERMATCHSEARCHSUMSUMPRODUCT  
30SUMPRODUCTISNAMATCHISNAMATCHSUMSUMPRODUCT   
Data
Cell Formulas
RangeFormula
C21:I30,C20:J20,C14:I19,C7:J13,C2:I6C2=IFERROR(INDEX(List!$A$2:$A$71, SMALL(IF(COUNTIF($A2, "*"&List!$A$2:$A$71&"*"), MATCH(ROW(List!$A$2:$A$71), ROW(List!$A$2:$A$71)), ""), COLUMNS($M$1:M1))), "")
Press CTRL+SHIFT+ENTER to enter array formulas.



The expected result is to separate text string in each row by columnar result or the result can be concatenated with "+" as per sample below : -

1628573847220.png


Or Expected Results :-
1628573920300.png



Appreciate if someone can help to use better set of formulas or vba to solve the above problem
Thanks in advance

Regards
Lenard
 

Excel Facts

Which came first: VisiCalc or Lotus 1-2-3?
Dan Bricklin and Bob Frankston debuted VisiCalc in 1979 as a Visible Calculator. Lotus 1-2-3 debuted in the early 1980's, from Mitch Kapor.
Is there a list box control? Could be that, but I am not sure. Will take a look at it again in about 8-10 hours time.
 
Upvote 0
What version of Excel are you using?
I suggest that you update your Account details (or click your user name at the top right of the forum) so helpers always know what Excel version(s) & platform(s) you are using as the best solution often varies by version. (Don’t forget to scroll down & ‘Save’)
 
Upvote 0
Hi Lenard, I've tried a different approach this time using a late-binding SortList (might not make any sense to you but I hope this works). Don't require any additional references for this.

VBA Code:
Function SplitText(Keywords As Range, Data As Range) As String
    Dim c As Range, s As String, d As String, i As Long, j As Long, v
 
    With CreateObject("System.Collections.SortedList")
        For Each c In Keywords
            If .Item(Len(c.Value)) = "" Then
                .Add Len(c.Value), c.Value
            Else
                .Item(Len(c.Value)) = .Item(Len(c.Value)) & "_" & c.Value
            End If
        Next
     
        d = Data.Value
        For i = .Count - 1 To 0 Step -1
            v = Split(.GetByIndex(i), "_")
            For j = 0 To UBound(v)
                If InStr(d, v(j)) > 0 Then
                    s = s & v(j) & "+"
                    d = Replace(d, v(j), "")
                End If
            Next
        Next
     
        SplitText = ""
        If s <> "" Then SplitText = Left(s, Len(s) - 1)
    End With
End Function

Keywords are still in column A (I just didn't show in this minisheet to minimise clutter)
Book1
BC
1DataResults
2COUNTIFSFINDLEFTRIGHTCOUNTIFS+RIGHT+FIND+LEFT
3COUNTIFSDATECOUNTIFS+DATE
4COUNTIFSCOUNTIFCOUNTIFS+COUNTIF
5COUNTIFCOUNTIF
6SUMPRODUCTEXACTSUMPRODUCT+EXACT
7COUNTIFSUMPRODUCTISNUMBERMATCHSUMPRODUCT+ISNUMBER+COUNTIF+MATCH
8COUNTIFSUMPRODUCTSUMPRODUCT+COUNTIF
9MATCHISNASUMPRODUCTCOUNTACOUNTIFSUMPRODUCT+COUNTIF+COUNTA+MATCH+ISNA
Sheet1
Cell Formulas
RangeFormula
C2:C9C2=SplitText($A$2:$A$71,B2)



Edit:
I can foresee a potential erroneous result though, since there isn't really a 'best' approach to prioritise which Keyword to search for first.
If your data is like "COUNTIFSUMPRODUCTSUM", the code will look for SUMPRODUCT first, since that's the longest length.
Then the data gets 'trimmed' to "COUNTIFSUM" ("SUMPRODUCT" is removed).
The code then goes in a descending order in length, and when it reaches COUNTIFS, it gets trimmed again, leaving you with "UM".
So in this example the expected result was supposed to be SUMPRODUCT+COUNTIF+SUM but you will instead get SUMPRODUCT+COUNTIFS
 
Last edited:
Upvote 0
What version of Excel are you using?
I suggest that you update your Account details (or click your user name at the top right of the forum) so helpers always know what Excel version(s) & platform(s) you are using as the best solution often varies by version. (Don’t forget to scroll down & ‘Save’)
Sure, my account details updated, thanks Fluff.
 
Upvote 0
Hi Lenard, I've tried a different approach this time using a late-binding SortList (might not make any sense to you but I hope this works). Don't require any additional references for this.

VBA Code:
Function SplitText(Keywords As Range, Data As Range) As String
    Dim c As Range, s As String, d As String, i As Long, j As Long, v
 
    With CreateObject("System.Collections.SortedList")
        For Each c In Keywords
            If .Item(Len(c.Value)) = "" Then
                .Add Len(c.Value), c.Value
            Else
                .Item(Len(c.Value)) = .Item(Len(c.Value)) & "_" & c.Value
            End If
        Next
    
        d = Data.Value
        For i = .Count - 1 To 0 Step -1
            v = Split(.GetByIndex(i), "_")
            For j = 0 To UBound(v)
                If InStr(d, v(j)) > 0 Then
                    s = s & v(j) & "+"
                    d = Replace(d, v(j), "")
                End If
            Next
        Next
    
        SplitText = ""
        If s <> "" Then SplitText = Left(s, Len(s) - 1)
    End With
End Function

Keywords are still in column A (I just didn't show in this minisheet to minimise clutter)
Book1
BC
1DataResults
2COUNTIFSFINDLEFTRIGHTCOUNTIFS+RIGHT+FIND+LEFT
3COUNTIFSDATECOUNTIFS+DATE
4COUNTIFSCOUNTIFCOUNTIFS+COUNTIF
5COUNTIFCOUNTIF
6SUMPRODUCTEXACTSUMPRODUCT+EXACT
7COUNTIFSUMPRODUCTISNUMBERMATCHSUMPRODUCT+ISNUMBER+COUNTIF+MATCH
8COUNTIFSUMPRODUCTSUMPRODUCT+COUNTIF
9MATCHISNASUMPRODUCTCOUNTACOUNTIFSUMPRODUCT+COUNTIF+COUNTA+MATCH+ISNA
Sheet1
Cell Formulas
RangeFormula
C2:C9C2=SplitText($A$2:$A$71,B2)



Edit:
I can foresee a potential erroneous result though, since there isn't really a 'best' approach to prioritise which Keyword to search for first.
If your data is like "COUNTIFSUMPRODUCTSUM", the code will look for SUMPRODUCT first, since that's the longest length.
Then the data gets 'trimmed' to "COUNTIFSUM" ("SUMPRODUCT" is removed).
The code then goes in a descending order in length, and when it reaches COUNTIFS, it gets trimmed again, leaving you with "UM".
So in this example the expected result was supposed to be SUMPRODUCT+COUNTIF+SUM but you will instead get SUMPRODUCT+COUNTIFS
Hi,

Thanks for new approach.

However, SplitText UDF is not working as the late binding sortlist "System.Collections.SortedList" which need to add VBA reference "mscorlib.dll", also not found.

Regards
Lenard
 
Upvote 0
Can you go to Tools -> References and list what are the checked ones? or does anything shows with a "MISSING:" in the front?
 
Upvote 0
You could try this procedure.

VBA Code:
Sub SearchWords()
  Dim a As Variant, b As Variant, c As Variant, itm As Variant
  Dim i As Long, pos As Long
  Dim s As String
  
  With Sheets("List")
    With .Range("A2", .Range("A" & Rows.Count).End(xlUp))
      a = .Value
      .Value = Evaluate(Replace("text(len(#),""00"")&#", "#", .Address(External:=True)))
      .Sort Key1:=.Cells(1), Order1:=xlDescending, Header:=xlNo
      .TextToColumns DataType:=xlFixedWidth, FieldInfo:=Array(Array(0, 9), Array(2, 1))
      b = .Value
      .Value = a
    End With
  End With
  With Sheets("Data")
    a = .Range("A2", Range("A" & Rows.Count).End(xlUp))
    ReDim c(1 To UBound(a), 1 To 1)
    For i = 1 To UBound(a)
      s = vbNullString
      For Each itm In b
        pos = InStr(1, a(i, 1), itm)
        If pos > 0 Then
          s = s & "+" & Mid(a(i, 1), pos, Len(itm))
          Mid(a(i, 1), pos, Len(itm)) = String(Len(itm), ".")
          If Len(Replace(a(i, 1), ".", "")) = 0 Then Exit For
        End If
      Next itm
      c(i, 1) = Mid(s, 2)
    Next i
    .Range("B2").Resize(UBound(c)).Value = c
  End With
End Sub

My test data and results:

'List' sheet is the same as yours from post #1

Lenard.xlsm
AB
1DataResults
2COUNTIFSFINDLEFTRIGHTCOUNTIFS+RIGHT+LEFT+FIND
3COUNTIFSDATECOUNTIFS+DATE
4COUNTIFSCOUNTIFCOUNTIFS+COUNTIF
5COUNTIFCOUNTIF
6SUMPRODUCTEXACTSUMPRODUCT+EXACT
7COUNTIFSUMPRODUCTISNUMBERMATCHSUMPRODUCT+ISNUMBER+COUNTIF+MATCH
8COUNTIFSUMPRODUCTSUMPRODUCT+COUNTIF
9MATCHISNASUMPRODUCTCOUNTACOUNTIFSUMPRODUCT+COUNTIF+COUNTA+MATCH+ISNA
Data
 
Upvote 0

Forum statistics

Threads
1,223,888
Messages
6,175,205
Members
452,618
Latest member
Tam84

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top