Substring Search !

Eddny

New Member
Joined
Jun 26, 2018
Messages
26
Hello!

So let's say I have:

Dim MainString as string, Substring as string
Dim Result as Boolean

MainString ="ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"

Question: Is there an easier (faster) way to check whether MainString contains all elements of Substring without using loops?

Example 1: If Substring ="QRSTU|ABCDE|CGHMN" then Result = True since all elements of Substring are found in MainString.

Example 2: If Substring ="ABCDF|QRSTU|HPXYZ" then Result = False since not all elements of Substring are found in MainString.

Is there a way to perform these checks to see whether all the elements in Substring exist in MainString without using loops? I have thought about the Instr function, the Like operator, etc, However, it appears in each case I will have to use a loop.
Any help would be appreciated.

Eddny
 
Last edited:
See if this non-looping method works for you.
Code:
Sub CheckAllSubstrings()
  Dim MainString As String, Substring As String
  Dim Result As Boolean
  Dim RX As Object

  MainString = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  Substring = "QRSTU|ABCDE|CGHMN"
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = MainString
  Result = Replace(RX.Replace(Substring, ""), "|", "") = vbNullString
End Sub

Hi Peter - I tested your code and it is working so far.


Question: Can you create a modification where 2 substrings can be combined to produce a single string in which each section of the 2 substrings appears only once?


Example 1: If Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM" and Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
then CombinedUniqueString ="ABCDE|ABCDF|CGHMN|QRSTU|CJKLM|OXYZ|KUBYV"




Example 2: If Substring1 = "ABCFG|PLCDF|UIOMN|JKLTU|HOKJM" and Substring2 = "HOKJM|UIOMN|BCVXZ"
then CombinedUniqueString ="ABCFG|PLCDF|UIOMN|JKLTU|HOKJM|BCVXZ"


So CombinedUniqueString will always contain all sections of each of the 2 substrings appearing only once.

Thanks!
 
Last edited:
Upvote 0

Excel Facts

Is there a shortcut key for strikethrough?
Ctrl+S is used for Save. Ctrl+5 is used for Strikethrough. Why Ctrl+5? When you use hashmarks to count |||| is 4, strike through to mean 5.
I did assume that all sections of your substrings are 5 characters, like your samples. If that is not the case and/or there are other differences (eg upper/lower case or different delimiters) then more information and examples would help.

Peter - Yes, my example showed 5 character-sections of the substrings separated by the "|" as delimiter.
However, the characters can be all 5 characters or all X characters. If they are 5, all characters in both the Mainstring and Substring will all have 5 character sections. Similarly, If they are 6, all characters in both the Mainstring and Substring will all have 6 character sections, and so on. Case sensitivity does not apply.
Please let me know if I clarified it. I appreciate your assistance.
 
Upvote 0
Question: Can you create a modification where 2 substrings can be combined to produce a single string in which each section of the 2 substrings appears only once?

Example 1: If Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM" and Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
then CombinedUniqueString ="ABCDE|ABCDF|CGHMN|QRSTU|CJKLM|OXYZ|KUBYV"

Example 2: If Substring1 = "ABCFG|PLCDF|UIOMN|JKLTU|HOKJM" and Substring2 = "HOKJM|UIOMN|BCVXZ"
then CombinedUniqueString ="ABCFG|PLCDF|UIOMN|JKLTU|HOKJM|BCVXZ"

So CombinedUniqueString will always contain all sections of each of the 2 substrings appearing only once.
Try this
Code:
Sub Make_String()
  Dim Substring1 As String, Substring2 As String, CombinedUniqueString As String
  Dim d As Object
  Dim itm As Variant
  
  Set d = CreateObject("Scripting.Dictionary")
  Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
  For Each itm In Split(Substring1 & "|" & Substring2, "|")
    d(itm) = Empty
  Next itm
  CombinedUniqueString = Join(d.Keys, "|")
End Sub



Peter - Yes, my example showed 5 character-sections of the substrings separated by the "|" as delimiter.
However, the characters can be all 5 characters or all X characters. If they are 5, all characters in both the Mainstring and Substring will all have 5 character sections. Similarly, If they are 6, all characters in both the Mainstring and Substring will all have 6 character sections, and so on. Case sensitivity does not apply.
Please let me know if I clarified it. I appreciate your assistance.
Provided each time the code is used, the Mainstring and Substring have matching numbers of characters, the existing code should work fine.
 
Upvote 0
Try this
Code:
Sub Make_String()
  Dim Substring1 As String, Substring2 As String, CombinedUniqueString As String
  Dim d As Object
  Dim itm As Variant
  
  Set d = CreateObject("Scripting.Dictionary")
  Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
  For Each itm In Split(Substring1 & "|" & Substring2, "|")
    d(itm) = Empty
  Next itm
  CombinedUniqueString = Join(d.Keys, "|")
End Sub

Peter -

Thanks. Your code works. I am aware of the dictionary route, and below is how I have been executing it by first splitting the strings into a string vector V() and using .Exist() method to load into dict with a loop. However, your code is shorter and more advanced.

I was looking for a "loopless" regexp-like solution if possible. Is it even possible to accomplish same result with regexp Replace method or some other method? I will have to do this on a very, very large data set with other iterations and will like to use a solution without loops if possible.

Code:
Dim dict As Scripting.Dictionary, V() As String, I as integer
Set dict = New Scripting.Dictionary


Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
V = Split(Substring1 & "|" & Substring2, "|")


For I = LBound(V) To UBound(V)
    If Not dict.Exists(V(I)) Then dict.Add (V(I)), ""
Next I


CombinedUniqueString = Join(dict.Keys, "|")
 
Upvote 0
Is it even possible to accomplish same result with regexp Replace method or some other method?
Not that I can think of.


... like to use a solution without loops if possible.
I would be wary of dismissing loops out of hand. They can often be structured to be faster that non-looping methods. Depends on the individual circumstances of each case of course.
 
Upvote 0
Is it even possible to accomplish same result with regexp Replace method or some other method?
Actually, if there are no repeat sets of characters withing Substring1, then you could try this, though I'm not sure about speed, especially if the strings are large.
Code:
Sub Make_String_v2()
  Dim Substring1 As String, Substring2 As String, CombinedUniqueString As String
  Dim RX As Object
  
  Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = Substring1
  CombinedUniqueString = Replace(Application.Trim(Replace(Substring1, "|", " ") & "" & RX.Replace(Replace(Substring2, "|", " "), "")), " ", "|")
End Sub
 
Upvote 0
Actually, if there are no repeat sets of characters withing Substring1, then you could try this, though I'm not sure about speed, especially if the strings are large.
Code:
Sub Make_String_v2()
  Dim Substring1 As String, Substring2 As String, CombinedUniqueString As String
  Dim RX As Object
  
  Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  Substring2 = "ABCDE|OXYZ|QRSTU|KUBYV"
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = Substring1
  CombinedUniqueString = Replace(Application.Trim(Replace(Substring1, "|", " ") & "" & RX.Replace(Replace(Substring2, "|", " "), "")), " ", "|")
End Sub

Peter - Correct. Within any string, no split section (or x-character token) will appear more than once. For example, in Substring1, the split section "ABCDF" will never repeat. However, Substring2 could also contain "ABCDF", but if it does, that too will never repeat within Substring2.


I tested the code for:
Substring1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
Substring2 = "BVCXZ|ABCDE|ABCDF|KUBYV|CGHMN|QRSTU|TGYHU|CJKLM"


It gives the output
CombinedUniqueString = ABCDE|ABCDF|CGHMN|QRSTU|CJKLMBVCXZ|KUBYV|TGYHU. It is correct except it is missing a pipe "|" between "CJKLM" and "BVCXZ"
The correct CombinedUniqueString output ought to be = ABCDE|ABCDF|CGHMN|QRSTU|CJKLM|BVCXZ|KUBYV|TGYHU

Thanks!
 
Upvote 0
The correct CombinedUniqueString output ought to be = ABCDE|ABCDF|CGHMN|QRSTU|CJKLM|BVCXZ|KUBYV|TGYHU
Try adding a space between the quote marks where shown in red.
Code:
  CombinedUniqueString = Replace(Application.Trim(Replace(Substring1, "|", " ") & [COLOR="#FF0000"][B]" "[/B][/COLOR] & RX.Replace(Replace(Substring2, "|", " "), "")), " ", "|")[/QUOTE]
 
Upvote 0
Try adding a space between the quote marks where shown in red.

Peter -
Thanks, the second regexp code is also working fine now.
For your first regexp code that checks if all sections of Substring are found in MainString and returns True in Result and False otherwise,
let's assume I have multiple MainStrings in a dictionary and want to test whether all sections of Substring are found in any one of the MainStrings (that is, Substring is found in any one of the dictionary keys), below is my approach to solving it using parts of your first regexp code. Is this the most efficient way to approach it or do you have better recommendations for me?

Code:
Dim TestDict As Scripting.Dictionary,
Dim MainString1 as string, MainString2 as string, MainString3 as string, MainString4 as string
Dim J as Long, Substring as String
Dim Result As Boolean
Dim RX As Object

MainString1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
MainString2 = "ABCDX|OMPHY|EDCRF|QRSTU|UPLKM"
MainString3 = "ABCDY|ABCDF|THYFE|QRSTU|VBNMK|DCSXE"
MainString4 = "ABCDZ|ABCDF|CGHMN|QRSTU|HUJNB"

Substring = "QRSTU|ABCDE|CGHMN"

Set TestDict = New Scripting.Dictionary

'Assume we added X number of MainStrings to TestDict.keys like below:

TestDict.add MainString1, ""
TestDict.add MainString2, ""
TestDict.add MainString3, ""
TestDict.add MainString4, ""
TestDict.add MainStringx, ""

Set RX = CreateObject("VBScript.RegExp")
RX.Global = True


For J=0 to TestDict.count-1
    RX.Pattern = trim(TestDict.keys(j))
    Result = Replace(RX.Replace(Substring, ""), "|", "") = vbNullString
    If Result=true then 
       Msgbox "Sub-string Found in at least one Dictionary Key"
       exit for
    end if
Next J
 
Last edited:
Upvote 0
If I have understood the requirement correctly there should still be no need for the dictionary or the loop. Try something like this.
Code:
Sub CheckAllSubInAnyMain()
  Dim MainString As String, Substring As String
  Dim MainString1 As String, MainString2 As String, MainString3 As String, MainString4 As String
  Dim Result As Boolean
  Dim RX As Object
  
  MainString1 = "ABCDE|ABCDF|CGHMN|QRSTU|CJKLM"
  MainString2 = "ABCDX|OMPHY|EDCRF|QRSTU|UPLKM"
  MainString3 = "ABCDY|ABCDF|THYFE|QRSTU|VBNMK|DCSXE"
  MainString4 = "ABCDZ|ABCDF|CGHMN|QRSTU|HUJNB"

  MainString = Join(Array(MainString1, MainString2, MainString3, MainString4), "|")
  Substring = "QRSTU|ABCDE|CGHMN"
  Set RX = CreateObject("VBScript.RegExp")
  RX.Global = True
  RX.Pattern = MainString
  Result = Replace(RX.Replace(Substring, ""), "|", "") = vbNullString
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,286
Members
452,631
Latest member
a_potato

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top