How to create a smart split?

PAK

New Member
Joined
Nov 11, 2009
Messages
4
Hi, thanks for taking the time to look at this.
Have looked high and low so hopefully someone here can point me in the right direction.

I am on XP, Excel 2003 or 2007.
I need to split a description field provided by suppliers in their varied formats into pre defined columns so that I can clean, format & concatenate it back to suit it's use in Catalogues, Invoices & Web pages.
Below is a small example of the data provided by a particular supplier.

<TABLE style="WIDTH: 434pt; BORDER-COLLAPSE: collapse" cellSpacing=0 cellPadding=0 width=579 border=0 x:str><COLGROUP><COL style="WIDTH: 215pt; mso-width-source: userset; mso-width-alt: 10496" width=287><COL style="WIDTH: 219pt; mso-width-source: userset; mso-width-alt: 10678" width=292><TBODY><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; WIDTH: 215pt; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" width=287 height=17>MARBIG TWIN WIRE SPIRAL NOTEBOOKS</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; WIDTH: 219pt; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent" width=292>A5 Blue 80 Pg SB</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG TWIN WIRE SPIRAL NOTEBOOKS</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Red 80 Pg SB</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG 2 TONE NOTEBOOK</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Twin Wire Pink 160Pg</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG 2 TONE NOTEBOOK</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Twin Wire Purple 160Pg</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG 2 TONE NOTEBOOK</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Twin Wire Teal 160Pg</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG TWIN WIRE SPIRAL NOTEBOOKS</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Green Pastel 80 Pg SB</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG TWIN WIRE SPIRAL NOTEBOOKS</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Pink Pastel 80 Pg SB</TD></TR><TR style="HEIGHT: 12.75pt" height=17><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; HEIGHT: 12.75pt; BACKGROUND-COLOR: transparent" height=17>MARBIG TWIN WIRE SPIRAL NOTEBOOKS</TD><TD style="BORDER-RIGHT: #e0dfe3; BORDER-TOP: #e0dfe3; BORDER-LEFT: #e0dfe3; BORDER-BOTTOM: #e0dfe3; BACKGROUND-COLOR: transparent">A5 Purple Pastel 80 Pg SB</TD></TR></TBODY></TABLE>

A description may contain a "Brand" (MARBIG), "Basic Desc" (NOTEBOOK), "Size" (A5), "Capacity" (80Pg), "Colour" (Pink Pastel) and so on that need to be extracted and sorted into specific columns titled as such.
I would like to be able to define for example "MARBIG" as a brand in a "Brand" table so that when it is found in a description it will place it in a "Brand" column. The more I define, the more details are split into the matching column. Once a component is defined it would obviously then ideally resolve all cases to the appropriate column saving years of work as every supplier has their own fandangled way of writing a description.

If you can help that would be grouse, I have about 40000 SKU's and an obsession for neat data.

Regards
PAK

Melbourne, AUSTRALIA
 

Excel Facts

Pivot Table Drill Down
Double-click any number in a pivot table to create a new report showing all detail rows that make up that number
hi PAK,

Maybe a ReverseLookup function, which will do most of the work:
Code:
Option Explicit
Function ReverseLookup(ByVal LookupValue As String, ByRef LookupRange As Range) As Variant
Dim iPtr As Integer
Dim lMatch As Long
Dim saCheckString() As String

saCheckString = Split(WorksheetFunction.Trim(LookupValue), " ")
For iPtr = 0 To UBound(saCheckString)
    lMatch = 0
    On Error Resume Next
    lMatch = WorksheetFunction.Match(saCheckString(iPtr), LookupRange, 0)
    On Error GoTo 0
    If lMatch<> 0 Then
        ReverseLookup = saCheckString(iPtr)
        Exit Function
    End If
Next iPtr
ReverseLookup = CVErr(xlErrNA)

End Function

Example of call:
Excel Workbook
ABCDEF
1BrandBasic DescSizeColour
2MARBIG TWIN WIRE SPIRAL NOTEBOOKSA5 Blue 80 Pg SBMARBIGNOTEBOOKSA5Blue
3MARBIG TWIN WIRE SPIRAL NOTEBOOKSA5 Red 80 Pg SBMARBIGNOTEBOOKSA5Red
4MARBIG 2 TONE NOTEBOOKA5 Twin Wire Pink 160PgMARBIGNOTEBOOKA5Pink
Sheet1
Excel 2003
Cell Formulas
RangeFormula
C2=reverselookup($A2&" "&$B2,I:I)
D2=reverselookup($A2&" "&$B2,J:J)
E2=reverselookup($A2&" "&$B2,K:K)
F2=reverselookup($A2&" "&$B2,L:L)


Example of tables:
Excel Workbook
IJKL
1BrandsBasic DescsSizesColours
2aaaaNotebookA5Red
3bbbbNetBookA6Yellow
4ccccTabletA7blue
5ddddNotebooksBgreen
6MARBIGPink
7Purple
8Pastel
9Teal
Sheet1
Excel 2003
 
Upvote 0
you say the info "may" contain a brand
will the brand name always be in the beginning of the text?
if the info does not have a brand, is there a way of identifying the brand?
 
Upvote 0
That is spot on al_b_cnu! :)
The best IT help I have ever had.

I have tested and found I need to be able to treat multiple words as one.
EG "Twin Wire", "F Cap", "Manilla Folder", "80 pg" or "ABC CBA BCA"
Currently it returns #N/A.
Is there a way of refining this?

Otherwise you are on my Christmas card list now anyway.
Thanks again.
 
Upvote 0
you say the info "may" contain a brand
will the brand name always be in the beginning of the text?
if the info does not have a brand, is there a way of identifying the brand?

Thanks for asking wsjackman.
The data is from various suppliers and they all have various orders in which they write their descriptions, hence my dilema. Some don't put brand anywhere. After using the reverse lookup as provided above I will still have blank cells to be filled and data to clean & format but it's a massive improvement on doing one by one.

I have 21 columns so far for breaking up a product description.
Once split I will then need to concatanate back together based on product group/style as different products require only certain details in their description. It will allow me to plan a logical short description for Invoicing and product comparative fields for the web commerce instead of just random bullet points.

Anyway sorry for waffling and thanks for any help.
Regards
PAK
 
Upvote 0
That is spot on al_b_cnu! :)
The best IT help I have ever had.

I have tested and found I need to be able to treat multiple words as one.
EG "Twin Wire", "F Cap", "Manilla Folder", "80 pg" or "ABC CBA BCA"
Currently it returns #N/A.
Is there a way of refining this?

Otherwise you are on my Christmas card list now anyway.
Thanks again.

Hi PAK,

Try this UDF:
Code:
Option Explicit
Function ReverseLookup(ByVal LookupValue As String, ByRef LookupRange As Range, Optional MultipleWords As Integer = 2) As Variant
Dim iPtr As Integer, iPtr1 As Integer, iPtr2 As Integer
Dim lMatch As Long
Dim saCheckString() As String, sCurCheckString As String

saCheckString = Split(WorksheetFunction.Trim(LookupValue), " ")
For iPtr1 = 1 To MultipleWords
    For iPtr = 0 To UBound(saCheckString) - iPtr1 + 1
        sCurCheckString = ""
        For iPtr2 = 1 To iPtr1
            sCurCheckString = sCurCheckString & " " & saCheckString(iPtr + iPtr2 - 1)
        Next iPtr2
        sCurCheckString = Mid$(sCurCheckString, 2)
        
        lMatch = 0
        On Error Resume Next
        lMatch = WorksheetFunction.Match(sCurCheckString, LookupRange, 0)
        On Error GoTo 0
        If lMatch<> 0 Then
            ReverseLookup = sCurCheckString
            Exit Function
        End If
    Next iPtr
Next iPtr1
ReverseLookup = CVErr(xlErrNA)

End Function

You'll notice the additional optional parameter 'MultipleWords' which is defaulted to 2.

Example:
Excel Workbook
ABCDEFG
1BrandBasic DescSizeColourCapacity
2MARBIG TWIN WIRE SPIRAL NOTEBOOKSA5 Blue 80 Pg SBMARBIGNOTEBOOKSA5Blue80 Pg
3MARBIG TWIN WIRE SPIRAL NOTEBOOKSA5 Red 80 Pg SBMARBIGNOTEBOOKSA5Red80 Pg
4MARBIG 2 TONE NOTEBOOKA5 Twin Wire Pink 160PgMARBIGNOTEBOOKA5Pink160Pg
Sheet1
Excel 2003
Cell Formulas
RangeFormula
C2=reverselookup($A2&" "&$B2,I:I)
D2=reverselookup($A2&" "&$B2,J:J)
E2=reverselookup($A2&" "&$B2,K:K)
F2=reverselookup($A2&" "&$B2,L:L)
G2=reverselookup($A2&" "&$B2,M:M)


The lookup tables:
Excel Workbook
IJKLM
1BrandsBasic DescsSizesColoursCapacities
2aaaaNotebookA5Red160pg
3bbbbNetBookA6Yellow160 pg
4ccccTabletA7blue80pg
5ddddNotebooksBgreen80 pg
6MARBIGPink
7Purple
8Pastel
9Teal
Sheet1
Excel 2003

Can I have a card with a Robin on then?
 
Upvote 0
Actually, this is slightly better - it checks for multiple words before single words:
Code:
Option Explicit
Function ReverseLookup(ByVal LookupValue As String, ByRef LookupRange As Range, Optional MultipleWords As Integer = 2) As Variant
Dim iPtr As Integer, iPtr1 As Integer, iPtr2 As Integer
Dim lMatch As Long
Dim saCheckString() As String, sCurCheckString As String

saCheckString = Split(WorksheetFunction.Trim(LookupValue), " ")
For iPtr1 = MultipleWords To 1 Step -1
    For iPtr = 0 To UBound(saCheckString) - iPtr1 + 1
        sCurCheckString = ""
        For iPtr2 = 1 To iPtr1
            sCurCheckString = sCurCheckString & " " & saCheckString(iPtr + iPtr2 - 1)
        Next iPtr2
        sCurCheckString = Mid$(sCurCheckString, 2)
        
        lMatch = 0
        On Error Resume Next
        lMatch = WorksheetFunction.Match(sCurCheckString, LookupRange, 0)
        On Error GoTo 0
        If lMatch <> 0 Then
            ReverseLookup = sCurCheckString
            Exit Function
        End If
    Next iPtr
Next iPtr1
ReverseLookup = CVErr(xlErrNA)

End Function
 
Upvote 0
Hi PAK,

In anticipation to your next requirement, this UDF includes an optional parameter which will treat the specified characters (comma by default) as word breaks:
Code:
Option Explicit
Function ReverseLookup(ByVal LookupValue As String, _
                       ByRef LookupRange As Range, _
                       Optional MultipleWords As Integer = 2, _
                       Optional TreatAsSpace As String = ",") As Variant
Dim iPtr As Integer, iPtr1 As Integer, iPtr2 As Integer
Dim lMatch As Long
Dim saCheckString() As String, sCurCheckString As String

For iPtr = 1 To Len(TreatAsSpace)
    LookupValue = Replace(LookupValue, Mid$(TreatAsSpace, iPtr, 1), " ")
Next iPtr

saCheckString = Split(WorksheetFunction.Trim(LookupValue), " ")
For iPtr1 = MultipleWords To 1 Step -1
    For iPtr = 0 To UBound(saCheckString) - iPtr1 + 1
        sCurCheckString = ""
        For iPtr2 = 1 To iPtr1
            sCurCheckString = sCurCheckString & " " & saCheckString(iPtr + iPtr2 - 1)
        Next iPtr2
        sCurCheckString = Mid$(sCurCheckString, 2)
        
        lMatch = 0
        On Error Resume Next
        lMatch = WorksheetFunction.Match(sCurCheckString, LookupRange, 0)
        On Error GoTo 0
        If lMatch <> 0 Then
            ReverseLookup = sCurCheckString
            Exit Function
        End If
    Next iPtr
Next iPtr1
ReverseLookup = CVErr(xlErrNA)

End Function
 
Upvote 0

Forum statistics

Threads
1,223,909
Messages
6,175,310
Members
452,634
Latest member
cpostell

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top