Decompress Alternate Spellings

badman3000

New Member
Joined
Jul 1, 2019
Messages
5
It would be incredibly helpful if someone could help me solve this problem.

I have a list of alternate spellings of words (Column B), coupled with a unique number (Column A)

Essentially I need a list of all the alternate spellings.

There are two rules
/ means or
() means optional

sounds very simple, and it is in many cases:
(A/E)ldridge should output Aldridge and Eldridge
Al(l)man should output Alman and AllmanThey can also be used together e.g. [TABLE="width: 64"]
<tbody>[TR]
[TD="width: 64"]Bes((t/w))ick[/TD]
[/TR]
</tbody>[/TABLE]
should output Beswick, Bestick and Besick

This can be complicated though e.g.
[TABLE="width: 64"]
<tbody>[TR]
[TD="width: 64"]B(e(a)/i)d(((w)e/a)ll/les) should output
Bedwell,Bedell,Bedall,Bedles,Beadwell,Beadell, Beadall,Beadles,Bidwell,Bidell,Bidall,Bidles[/TD]
[/TR]
</tbody>[/TABLE]

and [TABLE="width: 64"]
<tbody>[TR]
[TD="width: 64"](H)((A((l/u))/El/Il)/O)g(a/e)r(d)[/TD]
[/TR]
</tbody>[/TABLE]
should output
Hagar, Hagard,Hager, Hagerd, Halgar, Halgard, Halger, Halgerd, Haugar, Haugard, Hauger, Haugerd, Helgar, Helgard, Helger, Helgerd, Hilgar, Hilgard, Hilger, Hilgerd, Hogar, Hogard, Hoger, Hogerd, Agar, Agard, Ager, Agerd, Algar, Algard, Alger, Algerd, Augar, Augard, Auger, Augerd, Elgar, Elgard, Elger, Elgerd, Ilgar, Ilgard, Ilger, Ilgerd, Ogar, Ogard, Oger, Ogerd

The ideal output format would be in two columns, Column A would hold the unique identifier and Column B would hold the alternate spelling. Each row would hold a different alternate spelling. For example, if the input data was:

[TABLE="width: 500"]
<tbody>[TR]
[TD]Column A[/TD]
[TD]Column B[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD](A/E)ldridge[/TD]
[/TR]
</tbody>[/TABLE]

The output data would be:
[TABLE="width: 500"]
<tbody>[TR]
[TD]Column A[/TD]
[TD]Column B[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Aldridge[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Eldridge[/TD]
[/TR]
</tbody>[/TABLE]


Thank you very much for reading I hope you can help.
 
Try

Data exactly as in post 9 (header in A1)

Code:
Option Explicit

Dim regexOR As Object ' regex for OR
Dim regexOPT As Object ' regex for Optional
Dim dic As Object ' Stores the alternative Spelling

Sub AlternativeSpellings()
    Dim s As String, rCell As Range, LR As Long, lInd As Long
    
    Columns("B").Clear ' clear previous results
    Range("B1") = "List"
    Range("C1") = "Number"
    
    Set regexOR = CreateObject("VBScript.RegExp")
    regexOR.Pattern = "\(([^()/]+(/[^()/]+)+)\)"
    
    Set regexOPT = CreateObject("VBScript.RegExp")
    regexOPT.Pattern = "\(([^()]+)\)"
    
    For Each rCell In Range("A2:A" & Cells(Rows.Count, "A").End(xlUp).Row)
        s = rCell.Value
        Set dic = CreateObject("Scripting.Dictionary")
        AlternativeSpellings1 s
        LR = Cells(Rows.Count, "B").End(xlUp).Row
        Range("B" & LR + 1).Resize(dic.Count).Value = Application.Transpose(dic.keys)
        lInd = lInd + 1
        Range("C" & LR + 1).Resize(dic.Count).Value = lInd
    Next rCell
End Sub

Sub AlternativeSpellings1(s As String)
    Dim regexMatches As Object
    Dim vOR As Variant
    
    Set regexMatches = regexOR.Execute(s)
    If regexMatches.Count = 1 Then
        For Each vOR In Split(regexMatches(0).submatches(0), "/")
            AlternativeSpellings1 regexOR.Replace(s, vOR)
        Next vOR
    Else
        Set regexMatches = regexOPT.Execute(s)
        If regexMatches.Count = 1 Then
            AlternativeSpellings1 regexOPT.Replace(s, "")
            AlternativeSpellings1 regexOPT.Replace(s, regexMatches(0).submatches(0))
        Else
            s = StrConv(s, vbProperCase)
            If Not dic.exists(s) Then dic.Add s, ""
        End If
    
    End If
End Sub

M.
 
Upvote 0

Excel Facts

Save Often
If you start asking yourself if now is a good time to save your Excel workbook, the answer is Yes
You are welcome. Glad to help :)

We should thank pgc01 who did the important part. I just contributed by tailoring his code.

M.
 
Upvote 0
I'm glad it helped.
Marcelo, thanks much for helping out.
I've been really tied up these last days.
 
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top