Extracting text/numbers from the text string

alexgroup

New Member
Joined
Aug 2, 2014
Messages
14
Hello Everyone,

I gave up... and searching for help from those whose IQ is definitely much higher!

I took a task (big mistake!!!) at my work to help my team to clean the DB. The long story short - it is a small DB with 68K or rows of data of groceries. I need to extract all units of weight for every single items. I thought it's going to be easy... but ... It is not at all!

If you can help me with I'll pray for you ! No jokes here

Sample of data:

[TABLE="class: grid, width: 500, align: left"]
<tbody>[TR]
[TD][/TD]
[TD]A[/TD]
[TD]B[/TD]
[TD]C[/TD]
[TD]D[/TD]
[/TR]
[TR]
[TD]1[/TD]
[TD]Name[/TD]
[TD]Price[/TD]
[TD]Description[/TD]
[TD]Warehouse [/TD]
[/TR]
[TR]
[TD]2[/TD]
[TD]Lay's Potato Chips[/TD]
[TD]$1.67[/TD]
[TD]Selected products, 160 - 240 g, 4 types[/TD]
[TD]34499238[/TD]
[/TR]
[TR]
[TD]3[/TD]
[TD]Dare Cookies[/TD]
[TD]$2.05[/TD]
[TD]657 g Selected varieties[/TD]
[TD]34491123[/TD]
[/TR]
[TR]
[TD]4[/TD]
[TD]Condensed Milk[/TD]
[TD]$1.50[/TD]
[TD]Eagle Brand Selected varieties 250-370 g, East region[/TD]
[TD]3449A224[/TD]
[/TR]
[TR]
[TD]5[/TD]
[TD]Nelson Milk[/TD]
[TD]$3.90[/TD]
[TD]1%, 2% or 4% 4L, available in different containers[/TD]
[TD]3449M18[/TD]
[/TR]
</tbody>[/TABLE]

In this particular table I was tasked to extract all UNITS such as:

from C2: 160 -240 g
from C3: 657 g
from C4: 250-370 g
from C5: 4L

As you can see there are no consistency whatsoever.
Any thoughts guys?

Thanks a million all those who found time to respond.
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
This does not pull out everthing (still working on that), but it will give you a big jump in the right direction. I put this in E2, copied down...
=LOOKUP(99^99,--("0"&MID(C2,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},C2&"0123456789")),ROW($1:$10000)))) Courtesy of:*Ron Coderre
 
Upvote 0
This does not pull out everthing (still working on that), but it will give you a big jump in the right direction. I put this in E2, copied down...
=LOOKUP(99^99,--("0"&MID(C2,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},C2&"0123456789")),ROW($1:$10000)))) Courtesy of:*Ron Coderre


Thanks a lot! I'll try it.
Just by looking at it, I was even close to it! :)
 
Upvote 0
Try this UDF , that is:

1) Open your workbook
2) Press Alt-F11 to open the VBA editor
3) Click Insert / Module from the menu
4) Paste the following code into that window

Code:
Function ExtUnit(s As String) As String
    With CreateObject("VBScript.Regexp")
        .Pattern = ".*?([\d- ]+[ ]*(g|L)).*"
        If .test(s) Then ExtUnit = Application.Trim(.Replace(s, "$1"))
    End With
End Function

5) Close the VBA editor, so you return to the worksheet
6) Go to the list of functions (fx) and in „Select a category” select „User defined”
7) Click on ExtUnit and enter the address of the first cell to process then copy down

I am sure further changes will be needed as the data does not seem to be consistent.
 
Upvote 0
Try this UDF , that is:

1) Open your workbook
2) Press Alt-F11 to open the VBA editor
3) Click Insert / Module from the menu
4) Paste the following code into that window

Code:
Function ExtUnit(s As String) As String
    With CreateObject("VBScript.Regexp")
        .Pattern = ".*?([\d- ]+[ ]*(g|L)).*"
        If .test(s) Then ExtUnit = Application.Trim(.Replace(s, "$1"))
    End With
End Function

5) Close the VBA editor, so you return to the worksheet
6) Go to the list of functions (fx) and in „Select a category” select „User defined”
7) Click on ExtUnit and enter the address of the first cell to process then copy down

I am sure further changes will be needed as the data does not seem to be consistent.


Thanks a million for your help. I could not comprehend the logic though. Also I tried to play with it and it does work.
 
Upvote 0
You are welcome.

Regular expressions used in this UDF are relatively easy to create, but are much slower than formulas or regex-free VBAs. So, before using the UDF, it is advisable to check the processing time only on a small part of your 68K data (say on 5K). I guess, processing 68K data similar to the data you provided will take more than 2 minutes. If this is not acceptable, a formula or regex-free VBA will be needed.

To keep the code, the workbook should be saved as „Excel Macro-enabled Workbook” after processing.
 
Upvote 0

Forum statistics

Threads
1,225,401
Messages
6,184,760
Members
453,254
Latest member
topeb

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top