Extracting Multiple Numbers from String

bolillo1234

New Member
Joined
Jan 5, 2009
Messages
5
I have thousands of fields that contain addresses and I need to extract only the numbers. The problem I am facing is that there are two numbers within the string and they begin at different times.

Examples of the data
1409 N 250 W
259 West 158 Johnson Blvd
109829 South 344 East

What Im trying to get
1409250
259158
109829344

Thanks,
Bolillo
 
Can you please provide us with a sample of the data, along with the expected result?


Example:

[TABLE="width: 496"]
<tbody>[TR]
[TD="width: 496"]SCM EQUITY JV:CORPORATE ACTIONS:CA EVENT BCSL:9448:PREVIOUS YEARS LOCKED:Y2010 LOCKED


The data I was looking to extract, was the first set of numbers.
What I did for example was:

Column G:
=TRIM(LEFT(C1090,MIN(FIND({1,2,3,4,5,6,7,8,9},C1090&"123456789"))-1))

Column H:
=TRIM(REPLACE(C1090,1,LEN(J1090),""))

Column I:
=LEFT(K1090,4)

99.9% of the data population worked. A few of the numbers came back with a :944, possibly due to an additional character. I liked your formula, but I could not understand how to use it unfortunately.[/TD]
[/TR]
</tbody>[/TABLE]
 
Upvote 0

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
Example:

[TABLE="width: 496"]
<tbody>[TR]
[TD="width: 496"]SCM EQUITY JV:CORPORATE ACTIONS:CA EVENT BCSL:9448:PREVIOUS YEARS LOCKED:Y2010 LOCKED


The data I was looking to extract, was the first set of numbers.
What I did for example was:

Column G:
=TRIM(LEFT(C1090,MIN(FIND({1,2,3,4,5,6,7,8,9},C1090&"123456789"))-1))

Column H:
=TRIM(REPLACE(C1090,1,LEN(J1090),""))

Column I:
=LEFT(K1090,4)

99.9% of the data population worked. A few of the numbers came back with a :944, possibly due to an additional character. I liked your formula, but I could not understand how to use it unfortunately.[/TD]
[/TR]
</tbody>[/TABLE]

One example does not usually give us all the information necessary (a representative sample from all the possible values is more useful). Using the colon as a delimiter, is the value you want always the 4th field? If so, then this formula should work (assuming your text is never longer than 500 characters)...

=TRIM(MID(SUBSTITUTE(":"&A1,":",REPT(" ",500)),2000,500))
 
Upvote 0
One example does not usually give us all the information necessary (a representative sample from all the possible values is more useful). Using the colon as a delimiter, is the value you want always the 4th field? If so, then this formula should work (assuming your text is never longer than 500 characters)...

=TRIM(MID(SUBSTITUTE(":"&A1,":",REPT(" ",500)),2000,500))

Rick,

In this instance it remains constant because each colon is used as a LEVEL function (levels 1 - 5, or 6 depending on info)
 
Upvote 0
Rick,

In this instance it remains constant because each colon is used as a LEVEL function (levels 1 - 5, or 6 depending on info)

Then the formula I posted in Message #22 (once you adjust it for the cell that has data in it) should be working for you then, correct?
 
Upvote 0
Then the formula I posted in Message #22 (once you adjust it for the cell that has data in it) should be working for you then, correct?


In this instance it works perfectly. Would you be able to explain the logic behind the formula?

I followed the substitute/mid/trim function, as well as the Rept (repetition function). I am having a bit of trouble interpreting the use of the Max 500 characters, and the other.
 
Upvote 0
In this instance it works perfectly. Would you be able to explain the logic behind the formula?

I followed the substitute/mid/trim function, as well as the Rept (repetition function). I am having a bit of trouble interpreting the use of the Max 500 characters, and the other.
Not able to answer you fully at the moment, but since you were able to work out the other parts of the formula, seeing the generalized formula my posting was constructed from may give you all the hints you need to figure it out...

=TRIM(MID(SUBSTITUTE(delimiter&A1,delimiter,REPT(" ",999)),fieldnumber*999,999))

Note: I replaced the 999 with 500 for my response to you... all that is necessary is that the number be larger (maybe by 10% minimum) than the maximum length of your text (for the generized setup... earlier fields do not require as much extra space as latter fields).
 
Upvote 0
Not able to answer you fully at the moment, but since you were able to work out the other parts of the formula, seeing the generalized formula my posting was constructed from may give you all the hints you need to figure it out...

=TRIM(MID(SUBSTITUTE(delimiter&A1,delimiter,REPT(" ",999)),fieldnumber*999,999))

Note: I replaced the 999 with 500 for my response to you... all that is necessary is that the number be larger (maybe by 10% minimum) than the maximum length of your text (for the generized setup... earlier fields do not require as much extra space as latter fields).

So what I am inferring here, is that this function Replaces any text information with only a number in a field of 500 characters or less?
 
Upvote 0
I have some comments.

=TRIM(MID(SUBSTITUTE(":"&A1,":",REPT(" ",500)),2000,500))
Rick, given your usual penchant for compactness, I can't see the reason for the red part.
Wouldn't this do the same job?
=TRIM(MID(SUBSTITUTE(A1,":",REPT(" ",500)),1500,500))


Pyrrhus272BC
In relation to the use of 500/2000/500 that I think you are asking about, I'm going to assume that we have a colon delimited string and we are trying to extract a number, being the 4th 'term' in the string and that a structure like the blue formula above can be used.

  1. The three numbers used would be n, 3n and n. That is n, (term number of interest - 1)*n and n

  2. n is unrelated to the overall length of the original string. Row 1 length in my samples below is 3 times n & the formula works fine

  3. Instead, n is important in relation to the length of the original string from the start to the end of the desired term, excluding delimiters. n must be greater than this number.
    In row 2 I've added an extra 2 and in row 3 I've added an extra 4. In each case the length (excluding colons) to the end of the desired term is now 6, which is not less than n (=6). Hence the formula fails to extract the term correctly.

  4. In the 1 example given in this problem the distance to the end of term (excl : ) is 47 so that particular extraction would work with 48, 144, 48
    Of course the preceding text could be longer but I don't think it unreasonable to expect that your distance to end of term would be less than 255 characters. That is over 5 times that part of the length of the example given. Therefore, I think we could use
    =TRIM(MID(SUBSTITUTE(A1,":",REPT(" ",255)),765,255))
    200,600,200 might still be plenty & easier to work with but I've chosen 255 for a particular reason...

  5. Since in your question, we are extracting a number it may be that you do actually want a number & not text. All the formulas above return text.
    For any of them, including Rick's & my blue one above, the result can be turned into a number by adding a "+0" on the end of the formula.

    However, if n is 255 or less, we can return a number and eliminate one of the functions (TRIM) from the formula:
    =MID(SUBSTITUTE(A1,":",REPT(" ",255)),765,255)+0

    For n > 255 the last formula returns an error & we'd have to revert to, for example,
    =TRIM(MID(SUBSTITUTE(A1,":",REPT(" ",500)),1500,500))+0 to extract the number as a number


Excel Workbook
ABCDE
1StringExpected ResultFormula ResultOriginal LengthLen to end of desired term (excl :)
21:2:3:44:5555555554444185
31:22:3:44:555555555444196
41:2:3:444:55555555544444196
5
61:2:3:444:55555555444444
71:2:3:444:55555555444#VALUE!
81:2:3:444:55555555444444
Sample




Now, that may be well more than you wanted to know but I actually set it out for myself as much as anything as I've never looked closely at the workings, just tend to pick bigger values for n for longer strings and/or terms further through the strings. :)

.. and if any of the above is not actually right, I'm happy to be corrected.
 
Upvote 0
SCM EQUITY JV:CORPORATE ACTIONS:CA EVENT BCSL:9448:PREVIOUS YEARS LOCKED:Y2010 LOCKED


The data I was looking to extract, was the first set of numbers.



If you want to do it without a formula, select the column and

Find „*:*:*:” and Replace with nothing then
Find ”:*” and Replace with nothing.
(without quotes).

If you want to keep the original data, do this with a copy of the data column.
 
Last edited:
Upvote 0
Hi All,

I have one question. I have to extract particular number.

Eg:

E2 Abc21mhz

I want to extract just 21.

Can any one please tell me how can I in excel?
 
Upvote 0

Forum statistics

Threads
1,224,829
Messages
6,181,224
Members
453,025
Latest member
Hannah_Pham93

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top