Extract numerical data from a specific type of substrings

vladimiratanasiu

Active Member
Joined
Dec 17, 2010
Messages
347
Office Version
  1. 365
  2. 2021
Platform
  1. Windows
Hello!

I have a very large database of products. Each product is defined by a string, that includes information about its name (e.g. METRO CHEF MURATURI ASORTATE, MIRINDA STRUGURI SI PEPENE GALBEN etc. – see the red colored texts) and quantity. The quantity of the contents is defined by specific units of measure (UM - see black colored texts), depending on the type of product: weight – (kilo)grams – KG /G; volume – (mili)litre – ML /L; length – centimeters – CM; number of pieces / set - BUC. Some products are not allocated any UM (e.g. ORNAMENT MOS CRACIUN BARBA, CIOCAN LEMN PT CARNE). Most of products have the UM substrings data placed at the end of the string, but in some cases they are inserted in the middle of it (e.g. SOS USTUROI 420G UNIVER, CANA OPAL 250ML PARADISE). No name has the UM at the beginning of the string. I mention that all UM substrings have a maximum number of 4 digits, and are separated from the rest of string’s elements by space(s). Rarely, some values are expressed as decimal numbers with dots (e.g. MIRINDA STRUGURI SI PEPENE GALBEN 0.5L, ZAREA DACIC CIDRU AFINE 0.275L). I need to extract in another column the numerical data of UM substrings, defined on the basis of the above mentioned UMs. I wonder if the substrings could be defined as wildcards, built on the basis of UMs standard names. If the UM substring misses from a string, I would like result be blank or 0 in the destination cell.

Thank you!

P.S. I attached a screenshot too of the table , as the XL2BB couldn't import the original table formatted completly.

Book1.xlsx
AB
1Initial dataDesired results
2ORNAMENT MOS CRACIUN BARBA
3METRO CHEF MURATURI ASORTATE 1600G1600
4MIRINDA STRUGURI SI PEPENE GALBEN 0.5L0,5
5MIRINDA STRUGURI/PEPENE GALBEN 2L2
6IAURT CAPRA AMILACT 340ML340
7COSMIN MUSLI 30% FRUCTE 250G250
8DR.O PRAF TORT GELLE FRUCTE PADURE 8G8
9CHIO POMBAR PIZZA 40G40
10GEL MAINI AMANTE 65ML65
11FORMA TEFLON CHEC 25CM ZENKER25
12TAVA CUPTOR OTI 38X26CM38X26
13MONODOZE CAFEA CAPSULE BAROCCO 50BUCATI50
14CIOCAN LEMN PT CARNE
15APARATE RAS GILLETTE VENUS3 3BUCATI3
16ZAREA DACIC CIDRU AFINE 0.275L0.275
17GIANA ULEI TURTE MASLINE 1000ML1000
18SOS USTUROI 420G UNIVER420
19CANA OPAL 250ML PARADISE250
Sheet1
 

Attachments

  • Untitled.png
    Untitled.png
    18.1 KB · Views: 12
Last edited:

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
This is the result of previous macro, extracting your mentioned substring as I needed in my first message. As I said, it needs to be updated, but its incipient result is ok. Have a nice day / evening!
 

Attachments

  • Macro results.png
    Macro results.png
    28.2 KB · Views: 11
Last edited:
Upvote 0
Hello, rlv01! I tried to run the macro updated. Unfortunately, it generates the photo messages from below. I attached the full structure of database Fisier Migrare Materiale RETAIL.xlsm, containing more relevant products for testing. Please, verify it and tell me how could I solve this issue. Thank you!

That error means you have two copies of Function OnlyNumbers in your code

VBA Code:
Function OnlyNumbers(S As String) As String
    Dim RX As Object

    Set RX = CreateObject("VBScript.RegExp")
    
    RX.Global = True
    RX.Pattern = "[^0-9]"
    
    OnlyNumbers = RX.Replace(S, "")
End Function

You only need one, so delete one of them.

With respect to your dropbox file, I cannot download unsecure files on the internet to this PC, so I am unable to look at it.
 
Upvote 0
From post #1
What about string like 38X26. Is it be extracted or not. If yes It should be 38X26 or 3826.

That error means you have two copies of Function OnlyNumbers in your code

VBA Code:
Function OnlyNumbers(S As String) As String
    Dim RX As Object

    Set RX = CreateObject("VBScript.RegExp")
  
    RX.Global = True
    RX.Pattern = "[^0-9]"
  
    OnlyNumbers = RX.Replace(S, "")
End Function

You only need one, so delete one of them.

With respect to your dropbox file, I cannot download unsecure files on the internet to this PC, so I am unable to look at it.
Hello, Rlv01! I used your last macro ad litteram, missing this module. I updated it and now errors are gone. However, when applying the macro neither column P (numerical quantities), nor the column Q (UMs) are populated with the necessary data. I attach a photo of the macro used in the file and please tell me what to do in order to solve this problem. Thank you!
 

Attachments

  • Macro.jpg
    Macro.jpg
    135.1 KB · Views: 12
Upvote 0
Hello, Rlv01! I used your last macro ad litteram, missing this module. I updated it and now errors are gone. However, when applying the macro neither column P (numerical quantities), nor the column Q (UMs) are populated with the necessary data. I attach a photo of the macro used in the file and please tell me what to do in order to solve this problem. Thank you!
The problem is that you moved your data to a different column than in your post #1. When data moves, the code must generally be adjusted.

VBA Code:
Sub ExtractNumbers3()
    Dim WS As Worksheet
    Dim CellRange As Range
    Dim R As Range
    Dim I As Long
    Dim NS As String, S As String
    Dim SA As Variant
    Dim StartRow As Long, Ofs As Long
    Dim ResultsCol As String
    Dim DataCol As String

    Set WS = ActiveSheet

    With WS
        StartRow = 9                                  'row to start at
        DataCol = "I"                                 'column for data.
        ResultsCol = "P"                              'column for results.

        Ofs = .Columns(ResultsCol).Column - .Columns(DataCol).Column
        Set CellRange = .Range(DataCol & "1", .Range(DataCol & .Rows.Count).End(xlUp)).Offset(StartRow - 1)
        With CellRange
            Set CellRange = .Resize(.Count - (StartRow - 1))
        End With
    End With

    For Each R In CellRange
        SA = Split(Application.Trim(Replace(R.Value, Chr(9), " ")))
        For I = UBound(SA) To 0 Step -1
            NS = OnlyNumbers(CStr(SA(I)))
            If NS <> "" Then
                NS = SA(I)
                S = CStr(SA(I))
                Do While Not IsNumeric(Right(NS, 1))
                    NS = Left(NS, Len(NS) - 1)
                Loop
                R.Offset(0, Ofs).Value = NS
                S = Mid(S, Len(NS) + 1, Len(S))
                If Len(S) <= 4 Then
                    R.Offset(0, Ofs + 1).Value = S
                Else
                    R.Offset(0, Ofs + 1).Value = ""
                End If
                Exit For
            Else
                R.Offset(0, Ofs).Value = ""
                R.Offset(0, Ofs + 1).Value = ""
            End If
        Next I
    Next R
End Sub


Results:
Book2
AIPQR
1MAKTXBRGEWGEWEI
2Date de baza 1Date de baza 1Date de baza 1
3Descriere material lb romanaGreutate brutăUnitate de greutate
4CHARQUANUNIT
540173
6denumirea produsului in 40 caracteregreutatea bruta
71LEGUME-FRUCTE PRET REDUS 0.4
82ARDEI IUTE GALBEN KG
93BOROMIR CHEC MARMORAT CIOCOLATA/ALUNE 400G400G
104FOX SALAM RUSTIC GASTRO KG
117JUCARIE MASINA FARMER TRUCK
1211JUCARIE SET ANIMALE
1312ORNAMENT MOS CRACIUN BARBA
1413METRO CHEF MURATURI ASORTATE 1600G1600G
1514MIRINDA STRUGURI SI PEPENE GALBEN 0.5L0.5L
1615MIRINDA STRUGURI/PEPENE GALBEN 2L2L
1716LIPTON ICE TEA GREEN WHITE PIERSICA 1.5L1.5L
1817IAURT CAPRA AMILACT 340ML340ML
Sheet7
 
Upvote 0
Solution
The problem is that you moved your data to a different column than in your post #1. When data moves, the code must generally be adjusted.

VBA Code:
Sub ExtractNumbers3()
    Dim WS As Worksheet
    Dim CellRange As Range
    Dim R As Range
    Dim I As Long
    Dim NS As String, S As String
    Dim SA As Variant
    Dim StartRow As Long, Ofs As Long
    Dim ResultsCol As String
    Dim DataCol As String

    Set WS = ActiveSheet

    With WS
        StartRow = 9                                  'row to start at
        DataCol = "I"                                 'column for data.
        ResultsCol = "P"                              'column for results.

        Ofs = .Columns(ResultsCol).Column - .Columns(DataCol).Column
        Set CellRange = .Range(DataCol & "1", .Range(DataCol & .Rows.Count).End(xlUp)).Offset(StartRow - 1)
        With CellRange
            Set CellRange = .Resize(.Count - (StartRow - 1))
        End With
    End With

    For Each R In CellRange
        SA = Split(Application.Trim(Replace(R.Value, Chr(9), " ")))
        For I = UBound(SA) To 0 Step -1
            NS = OnlyNumbers(CStr(SA(I)))
            If NS <> "" Then
                NS = SA(I)
                S = CStr(SA(I))
                Do While Not IsNumeric(Right(NS, 1))
                    NS = Left(NS, Len(NS) - 1)
                Loop
                R.Offset(0, Ofs).Value = NS
                S = Mid(S, Len(NS) + 1, Len(S))
                If Len(S) <= 4 Then
                    R.Offset(0, Ofs + 1).Value = S
                Else
                    R.Offset(0, Ofs + 1).Value = ""
                End If
                Exit For
            Else
                R.Offset(0, Ofs).Value = ""
                R.Offset(0, Ofs + 1).Value = ""
            End If
        Next I
    Next R
End Sub


Results:
Book2
AIPQR
1MAKTXBRGEWGEWEI
2Date de baza 1Date de baza 1Date de baza 1
3Descriere material lb romanaGreutate brutăUnitate de greutate
4CHARQUANUNIT
540173
6denumirea produsului in 40 caracteregreutatea bruta
71LEGUME-FRUCTE PRET REDUS 0.4
82ARDEI IUTE GALBEN KG
93BOROMIR CHEC MARMORAT CIOCOLATA/ALUNE 400G400G
104FOX SALAM RUSTIC GASTRO KG
117JUCARIE MASINA FARMER TRUCK
1211JUCARIE SET ANIMALE
1312ORNAMENT MOS CRACIUN BARBA
1413METRO CHEF MURATURI ASORTATE 1600G1600G
1514MIRINDA STRUGURI SI PEPENE GALBEN 0.5L0.5L
1615MIRINDA STRUGURI/PEPENE GALBEN 2L2L
1716LIPTON ICE TEA GREEN WHITE PIERSICA 1.5L1.5L
1817IAURT CAPRA AMILACT 340ML340ML
Sheet7
Thank you very much, Rlv01! It works perfectly, you are brilliant!:)
 
Upvote 0

Forum statistics

Threads
1,223,893
Messages
6,175,240
Members
452,621
Latest member
Laura_PinksBTHFT

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top