Identifying Variant or Simialr Text Entry in One Column

leeandoona

New Member
Joined
Oct 13, 2016
Messages
45
Hi everyone, I've been struggling with this one for some time and I thought I'd throw it out there to see if anyone has experience of solving this type of problem? Sorry if its already been covered, I have looked and can't find anything quite like it.

The problem I have is I have large datasets that contain amongst other things a column which contains text. Many of the text entries are potential variations of what is essentially the same item. For example:

Column A
.............
Rose Scented Candle Black
Scented Candle Rose Blue
Rose Scented Candle Yellow
Scented Candle (Rose) Red

All of the above could loosely be reffered to as simply 'Rose Scented Candle'. So if I could find which items in my column were potentially the same, I could then go a step further and enter a validation column:

[TABLE="width: 380"]
<tbody>[TR]
[TD]Title[/TD]
[TD]Is Variant[/TD]
[TD]Vanilla Title[/TD]
[/TR]
[TR]
[TD]Rose Scented Candle Black[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[/TR]
[TR]
[TD]Scented Candle Rose Blue[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[/TR]
[TR]
[TD]Rose Scented Candle Yellow[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[/TR]
[TR]
[TD]Scented Candle (Rose) Red[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[/TR]
</tbody>[/TABLE]

This would mean I could then identify a SINGLE title, e.g. 'Rose Scented Candle' and have this in an adjacent column. My final step would then be to enter into a further validation column what it's variation actually is, for example:

[TABLE="width: 476"]
<tbody>[TR]
[TD]Title[/TD]
[TD]Is Variant[/TD]
[TD]Vanilla Title[/TD]
[TD]Variant Option[/TD]
[/TR]
[TR]
[TD]Rose Scented Candle Black[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[TD]Black[/TD]
[/TR]
[TR]
[TD]Scented Candle Rose Blue[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[TD]Blue[/TD]
[/TR]
[TR]
[TD]Rose Scented Candle Yellow[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[TD]Yellow[/TD]
[/TR]
[TR]
[TD]Scented Candle (Rose) Red[/TD]
[TD]YES[/TD]
[TD]Rose Scented Candle[/TD]
[TD]Red[/TD]
[/TR]
</tbody>[/TABLE]

The purpose of this is so I can sift through what is a very large dataset of over 20,000 lines and avoid duplicating products to appear on a website and instead populate with unique records which, where applicable, have variations in colour, size or design.

I've tried using Fuzzy lookup but this seems not to work on one column of data. I've tried using Find, Match, IF but my excel skills are limited beyond this. I know there's bound to be an easier way of doing this. Currently my only work around is to manually skip down each cell removing the 'variant' data leaving just the 'vanilla' data and then using substitute to enter the data I deleted into an adjacent column which I then use as the column for the option.

The problem is of course made worse in that there is not always a pattern to the text I'm supplied with, so the variant can appear at the start, middle or end of the text.

Help?:eeek:
 

Excel Facts

How to find 2nd largest value in a column?
MAX finds the largest value. =LARGE(A:A,2) will find the second largest. =SMALL(A:A,3) will find the third smallest
Hi leeandoona,

I just slapped this together quick, but it should help you get the idea.



Code:
Sub test()
 'user enters rose scented candle or any other phrase to search for
 keyPhrase = InputBox("enter the key phrase to search", "Key Phrase", , 9500, 5500)
 
 'splits the phrase into individual strings
 phraseSplit = Split(keyPhrase, " ")




 '1 represents the first column (column 'A')
 lastrow = Cells(Rows.Count, 1).End(xlUp).Row
 
 
 'loop to iterate through the column
 For Each Row In Range("A1:A" & lastrow)
    'loop to check for each key phrase
    For i = LBound(phraseSplit) To UBound(phraseSplit)
        If InStr(LCase(Row), LCase(phraseSplit(i))) Then
            x = True
        Else
            x = False
            Exit For
        End If
    Next
    'if all key phrases match place Yes in the cell adjacent
    If x = True Then
        Row.Offset(0, 1).Value = "yes"
    End If
    'you could also do something to get the color,
        If InStr(LCase(Row), "blue") Then
            Row.Offset(0, 2).Value = "Blue"
        ElseIf InStr(LCase(Row), "red") Then
            Row.Offset(0, 2).Value = "Red"
        End If


 Next
End Sub

All the best!
 
Upvote 0
That's totally helpful thanks very much indeed! I've been using this today and it's sped things up hugely!!!:biggrin:
Now I've just got to figure out how to build this to help me identify 'variants' when I don't know what the key phrase is! That's going to be a challenge!:eeek:

This though is a MAJOR help so thank you thank you thank you thank you!

Hi leeandoona,

I just slapped this together quick, but it should help you get the idea.



Code:
Sub test()
 'user enters rose scented candle or any other phrase to search for
 keyPhrase = InputBox("enter the key phrase to search", "Key Phrase", , 9500, 5500)
 
 'splits the phrase into individual strings
 phraseSplit = Split(keyPhrase, " ")




 '1 represents the first column (column 'A')
 lastrow = Cells(Rows.Count, 1).End(xlUp).Row
 
 
 'loop to iterate through the column
 For Each Row In Range("A1:A" & lastrow)
    'loop to check for each key phrase
    For i = LBound(phraseSplit) To UBound(phraseSplit)
        If InStr(LCase(Row), LCase(phraseSplit(i))) Then
            x = True
        Else
            x = False
            Exit For
        End If
    Next
    'if all key phrases match place Yes in the cell adjacent
    If x = True Then
        Row.Offset(0, 1).Value = "yes"
    End If
    'you could also do something to get the color,
        If InStr(LCase(Row), "blue") Then
            Row.Offset(0, 2).Value = "Blue"
        ElseIf InStr(LCase(Row), "red") Then
            Row.Offset(0, 2).Value = "Red"
        End If


 Next
End Sub

All the best!
 
Upvote 0
Hi again hateme28, I've been using this with some amendments to expand the colour range I'm searching through and its working brilliantly. I'm trying to tidy up the code a bit because I've got like 200+ colour combinations (so thats a lot of dots to run slowing it down) but also I'm betting there's a 'neater' way of writing this? Also I'm wondering is there a way to reference the range of contents rather than manually enter the keyphrase into the InputBox. Reason for trying to do that is simply to speed it up again. Currently I'm manually entering a lot of key phrases which is cool on smaller ranges but on the larger ranges is taking an age. I tried to make the InputBox a local Variable and making it explicit but I goofed up somewhere as I get a bunch of errors back. Is it possible to use an InputBox as a local varaiable or should I be looping through the range and declaring that a variable? Sorry...getting confused now! All in all though, this works like a charm on the smaller datasets.

Cheers

Hi leeandoona,

I just slapped this together quick, but it should help you get the idea.



Code:
Sub test()
 'user enters rose scented candle or any other phrase to search for
 keyPhrase = InputBox("enter the key phrase to search", "Key Phrase", , 9500, 5500)
 
 'splits the phrase into individual strings
 phraseSplit = Split(keyPhrase, " ")




 '1 represents the first column (column 'A')
 lastrow = Cells(Rows.Count, 1).End(xlUp).Row
 
 
 'loop to iterate through the column
 For Each Row In Range("A1:A" & lastrow)
    'loop to check for each key phrase
    For i = LBound(phraseSplit) To UBound(phraseSplit)
        If InStr(LCase(Row), LCase(phraseSplit(i))) Then
            x = True
        Else
            x = False
            Exit For
        End If
    Next
    'if all key phrases match place Yes in the cell adjacent
    If x = True Then
        Row.Offset(0, 1).Value = "yes"
    End If
    'you could also do something to get the color,
        If InStr(LCase(Row), "blue") Then
            Row.Offset(0, 2).Value = "Blue"
        ElseIf InStr(LCase(Row), "red") Then
            Row.Offset(0, 2).Value = "Red"
        End If


 Next
End Sub

All the best!
 
Upvote 0

Forum statistics

Threads
1,223,214
Messages
6,170,774
Members
452,353
Latest member
strainu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top