Find numbers before unit of measurement in string

strangebiscuit

New Member
Joined
Nov 25, 2013
Messages
35
Hi, I've got what seems like a simple conversion task that I'm having trouble puzzling out.

I have a column of product descriptions (text/html) that contain measurements (i.e. "Medium - 6.3cm x 7cm") and I need to find and convert them from cm to inches. I'm having trouble figuring out how to grab the numbers before the "cm" so I can convert them.

I'm fine to do a search/replace for the actual unit of measurement (i.e. changes "cm" to "inches"), but how do I isolate and convert just the numbers that come before the unit?

Would be happy to just do this as a formula in an adjacent column if possible and copy/paste the whole thing, but also fine to use a macro if it makes it easier to figure out.

There are no spaces between the numbers and the unit of measurement.

Need to end up with something like this:

[TABLE="width: 500"]
<tbody>[TR]
[TD][TABLE="width: 393"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Small 7cm x 4.5cm - blah blah blah[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[TD][TABLE="width: 393"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Small 17.8in x 11.4in - blah blah blah[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[/TR]
[TR]
[TD][TABLE="width: 393"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Medium 11.5cm x 7cm - blah blah blah[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[TD]Medium 4.5cin x 2.6in - blah blah blah[TABLE="width: 500"]
<tbody></tbody>[/TABLE]
[/TD]
[/TR]
[TR]
[TD][TABLE="width: 393"]
<colgroup><col></colgroup><tbody>[TR]
[TD]Range: from 5cm x 5cm to 15cm x 5cm. Blah blah blah[/TD]
[/TR]
</tbody>[/TABLE]
[/TD]
[TD]Range: from 2in x 2in to 5.9in x 2in. Blah blah blah[/TD]
[/TR]
</tbody>[/TABLE]

Thanks in advance for any tips you can provide, greatly appreciated!
 

Excel Facts

Format cells as date
Select range and press Ctrl+Shift+3 to format cells as date. (Shift 3 is the # sign which sort of looks like a small calendar).
Assuming you data starts in cell A1, the following macro will put the converted data in Column B starting at cell B1...
Code:
[table="width: 500"]
[tr]
	[td]Sub ChangeCM2IN()
  Dim X As Long, V As Double, Cell As Range, CM() As String
  Application.ScreenUpdating = False
  For Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    CM = Split(Replace(Cell.Value, "x", "x ", , , vbTextCompare), "cm", , vbTextCompare)
    For X = 0 To UBound(CM) - 1
      V = Format(Mid(CM(X), InStrRev(CM(X), " ")) * 0.393700787401575, "0.0")
      CM(X) = Left(CM(X), InStrRev(CM(X), " ")) & V
    Next
    Cell.Offset(, 1).Value = Join(CM, "in")
  Next
  Application.ScreenUpdating = True
End Sub[/td]
[/tr]
[/table]
 
Last edited:
Upvote 0
Whoah that's super impressive (at least to me!). Thanks so much for the response, I deeply appreciate it. Don't fully understand everything going on in there but tested it and it's almost working.

It does work for some cells but breaks on others with either "Run-time error '5': Invalid procedure call or argument" or "Run-time error '13': Type mismatch". This seems to be occurring in places where there is no space prior to the number that comes before the unit of measurement, either because it's at the beginning of a cell (i.e. cell starts with "10.5cm x 9.2cm") or because it's wrapped in an HTML tag or something similar (i.e. "
HTML:
<strong>10.5cm x 9.2cm</strong>
").

Adding a space in those cases seems to allow it to proceed, but there's way too much data to do that for all occurrences. Is there a way to account for that in the code?
 
Last edited:
Upvote 0
It does work for some cells but breaks on others with either "Run-time error '5': Invalid procedure call or argument" or "Run-time error '13': Type mismatch". This seems to be occurring in places where there is no space prior to the number that comes before the unit of measurement, either because it's at the beginning of a cell (i.e. cell starts with "10.5cm x 9.2cm") or because it's wrapped in an HTML tag or something similar (i.e. "
HTML:
<strong>10.5cm x 9.2cm</strong>
").
Does this version solve the problems...
Code:
[table="width: 500"]
[tr]
	[td]Sub ChangeCM2IN()
  Dim X As Long, V As Double, Cell As Range, CM() As String
  Application.ScreenUpdating = False
  For Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    CM = Split(Replace(Replace(" " & Cell.Value, "x", "x ", , , vbTextCompare), ">", " "), "cm", , vbTextCompare)
    For X = 0 To UBound(CM) - 1
      V = Format(Mid(CM(X), InStrRev(CM(X), " ")) * 0.393700787401575, "0.0")
      CM(X) = Left(CM(X), InStrRev(CM(X), " ")) & V
    Next
    Cell.Offset(, 1).Value = Join(CM, "in")
  Next
  Application.ScreenUpdating = True
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
That seems to get a little farther but still breaks. Looks like there are other symbols that could be preceding the number other than space or ">". I believe part of the issue is that the data contains UTF-8 encoding, so there can be various symbols that might precede the numbers. Even when I import the data to Excel specifying UTF-8, the macro seems to hit these symbols and break even if they're not visible.

An example of where it's breaking is a cell which contains:
HTML:
<p><strong>Sizes in this set range from: 9cm x 4.5cm to 2.8cm x 3.5cm.</strong></p>

But when I look at the raw data without specifying encoding it looks like:
HTML:
<p><strong>Sizes in this set range from: 9cm x 4.5cm to 2.8cm x 3.5cm.</strong></p>

So it looks in this case like its the "to 2.8cm" that's breaking things here. Guess it has to be assumed that there may be any non-numerical character preceding the actual value to convert.
 
Last edited:
Upvote 0
Sorry, hate to bump but just wondering if there's any other suggestion to help us put the last pieces of this puzzle together.

Noticed the last version posted is also removing all instances of ">" (due to the extra Remove()) which is problematic as this needs to stay as valid HTML.

Is there any way to do this by identifying the number to convert as all numerical characters that come before "cm" in the text, rather than looking for a space preceding them?
 
Upvote 0
Sorry, hate to bump but just wondering if there's any other suggestion to help us put the last pieces of this puzzle together.
Sorry about the delay... been battling a major system crash (all seems to be better now).

Does this modified version of my macro work as you want...
Code:
[table="width: 500"]
[tr]
	[td]Sub ChangeCM2IN()
  Dim X As Long, Z As Long, FirstDigit As Long, V As Double, Cell As Range, CM() As String
  Application.ScreenUpdating = False
  For Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    CM = Split(" " & Cell.Value, "cm", , vbTextCompare)
    For X = 0 To UBound(CM) - 1
      For Z = Len(CM(X)) To 1 Step -1
        If Mid(CM(X), Z, 1) Like "[!0-9.]" Then
          FirstDigit = Z + 1
          V = Format(Mid(CM(X), FirstDigit) * 0.393700787401575, "0.0")
          Exit For
        End If
      Next
      CM(X) = Left(CM(X), FirstDigit - 1) & V
    Next
    Cell.Offset(, 1).Value = Join(CM, "in")
  Next
  Application.ScreenUpdating = True
End Sub[/td]
[/tr]
[/table]
 
Upvote 0
Yes, that seems to have done it! Amazing!

Only tiny issue I've noticed is that for some reason it seems to be adding a space to the beginning of each cell. Believe that's due to the " " & Cell.Value in:
Code:
CM = Split(" " & Cell.Value, "cm", , vbTextCompare)

Not sure if that " " & is still needed as it no longer relies on looking for spaces, but hardly matters. Should be quite easy to get rid of by simply removing the first character of each cell.

Absolutely no need to apologize, you're helping us immensely. I can't begin to thank you enough. If there's anything I can do to repay your kindness, please let me know.

People like you are utterly inspiring to me!
 
Upvote 0
Only tiny issue I've noticed is that for some reason it seems to be adding a space to the beginning of each cell.
Sorry, I did not even notice that. Here is the fix (highlighted in red)...
Code:
[table="width: 500"]
[tr]
	[td]Sub ChangeCM2IN()
  Dim X As Long, Z As Long, FirstDigit As Long, V As Double, Cell As Range, CM() As String
  Application.ScreenUpdating = False
  For Each Cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    CM = Split(" " & Cell.Value, "cm", , vbTextCompare)
    For X = 0 To UBound(CM) - 1
      For Z = Len(CM(X)) To 1 Step -1
        If Mid(CM(X), Z, 1) Like "[!0-9.]" Then
          FirstDigit = Z + 1
          V = Format(Mid(CM(X), FirstDigit) * 0.393700787401575, "0.0")
          Exit For
        End If
      Next
      CM(X) = Left(CM(X), FirstDigit - 1) & V
    Next
    Cell.Offset(, 1).Value = [B][COLOR="#FF0000"]Trim([/COLOR][/B]Join(CM, "in")[B][COLOR="#FF0000"])[/COLOR][/B]
  Next
  Application.ScreenUpdating = True
End Sub[/td]
[/tr]
[/table]
 
Upvote 0

Similar threads

Forum statistics

Threads
1,224,822
Messages
6,181,165
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top