Formula to Extract Numbers from an Alphanumeric String

gonesalsa

New Member
Joined
May 5, 2014
Messages
7
Hello
I'm wondering, if there is formula to extract numbers from an alphanumeric string. The numbers will be 5 digits, but the placement within the string will vary. I was using MID, but I have to keep changing the formula based on the placement of the numbers in the string and it's not very efficent. Thanks!

Examples are:
[TABLE="width: 227"]
<tbody>[TR]
[TD]22047 - Cjljj
S022509 Chhheefe 5/12/14
Self Pay # 022597 jjjkjkj
jljlj cheque #022378
[TABLE="width: 170"]
<colgroup><col style="width: 170pt; mso-width-source: userset; mso-width-alt: 8265;" width="226"> <tbody>[TR]
[TD="width: 226, bgcolor: transparent"] [/TD]
[/TR]
</tbody>[/TABLE]


[TABLE="width: 170"]
<colgroup><col style="width: 170pt; mso-width-source: userset; mso-width-alt: 8265;" width="226"> <tbody>[TR]
[TD="width: 226, bgcolor: transparent"] [/TD]
[/TR]
</tbody>[/TABLE]


[TABLE="width: 227"]
<tbody>[TR]
[TD]

[/TD]
[/TR]
</tbody><colgroup><col></colgroup>[/TABLE]


[/TD]
[/TR]
</tbody><colgroup><col></colgroup>[/TABLE]
 

Excel Facts

How can you automate Excel?
Press Alt+F11 from Windows Excel to open the Visual Basic for Applications (VBA) editor.
What would the answers be for the examples above?

Are you omitting the 0's(Zeroes)
 
Upvote 0
See which of these gives your desired result


Excel 2010
AB
122047 - Cjljj22047
2S022509 Chhheefe 5/12/1422509
3Self Pay # 022597 jjjkjkj22597
4jljlj cheque #02237822378
Sheet12
Cell Formulas
RangeFormula
B1{=MID(A1,MATCH(TRUE,IF(LEFT(MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5),1)<>"0",IF(LEFT(MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5),1)<>" ",ISNUMBER(1*MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5)))),0),5)}
Press CTRL+SHIFT+ENTER to enter array formulas.


OR


Excel 2010
AB
122047 - Cjljj22047
2S022509 Chhheefe 5/12/1402250
3Self Pay # 022597 jjjkjkj02259
4jljlj cheque #02237802237
Sheet13
Cell Formulas
RangeFormula
B1{=MID(A1,MATCH(TRUE,IF(LEFT(MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5),1)<>" ",ISNUMBER(1*MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5))),0),5)}
Press CTRL+SHIFT+ENTER to enter array formulas.
 
Upvote 0
Thank you so much! The first formula works.

Can you explain this formula? I want to learn how to solve these problems myself.
 
Upvote 0
Thank you so much! The first formula works.

Can you explain this formula? I want to learn how to solve these problems myself.

Let me give this a try although I might leave out some parts you deem as important :)

Feel free to ask for further clarification, I will be sure to provide the ones i have answers to

EXPLANATION

The Logic behind the formula is to pick all consecutive 5 characters(could be numbers) that can be extracted from the string 1.e from the 1st character, pick 5 characters, from the second character pick 5 and on and on which in your case would be "22047","2047 ","047 - ","47 - C", "7 - Cj"
That’s what this section does MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5)

What you want to check now is to ensure that the strings9all the set of 5 characters) it has selected only contains numbers, hence i used ISNUMBER which tells me true for numbers and false for non-numbers, the new formula at this stage gives us ISNUMBER(1*MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5)) and it will return TRUE for 22047, TRUE for "2047", FALSE for "047 - "

What I then do is to use the MATCH Function to get the first instance of TRUE 'cos we would be having something like
{TRUE,TRUE,FALSE,FALSE........)

The match in this case gives us 1 as the first instance of TRUE meaning the first string in the set of strings meets our criteria which is 5 digits and they are all numbers
So you wrap that in the MID to say, give me five characters starting from where the match starts from an give me 5 characters
That’s the basic thing(I hope I have explained well)

But you will see I use like two IF statements in there
1. One IF is to ensure that we don’t have a string of 5 numbers starting with 0, cos you seem to want the numbers but not starting with the zero
so IF(LEFT(MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5),1)<>"0"

2. second IF to be sure the first or last character/number in the string is not a SPACE cos "2047 " will give you true when i use ISNUMBER, so to avoid that I used an IF i.e. IF(LEFT(MID(A1,ROW(INDIRECT("1:" &LEN(A1)-4)),5),1)<>" " but this only takes care of the first character bot being a space 'cos " 2047" will give TRUE but you don’t want this

So the summary of the whole formula is
1. PICK 5 characters starting from position 1,2,3.......length(string)-4 so you don’t pick characters that don’t exist in the string, that’s why i stop at len(string)-4
2. Check that the first character is not a 0 and that the first character is also not a SPACE, once these 2 conditions are met then
3. USE MATCH to give me the first set of 4-characters that satisfy 2 above and are all numbers
4. Once you get the starting position of the SET that satisfies the condition, pick 5 characters.
 
Upvote 0
Thank you so much! The first formula works.
Maybe it works... depending... can you ever have any "short" numbers mixed in with text before the 5-digit number you want? I ask because the formula will return a wrong value for something like this...

123E4 - lf Pay # 022597 jjjkjkj
 
Upvote 0
Maybe it works... depending... can you ever have any "short" numbers mixed in with text before the 5-digit number you want? I ask because the formula will return a wrong value for something like this...

123E4 - lf Pay # 022597 jjjkjkj

Nice point, Mr Rothstein. Didn't see that one coming :) I hope he doesn't have data like that.

Am sure you can give us a formula that would get us the desired result
 
Upvote 0
Here is a variation of Momentman's formula that also handles the example from Rick Rothstein. Since Excel evaluates 123E4 as an exponential, my array formula first changes all E's to spaces.
Code:
=MID(A1,MATCH(TRUE,MID(LEFT(SUBSTITUTE(A1,"E"," "),LEN(A1)),ROW(INDIRECT("1:" &LEN(A1)-4)),5)*1>=10000,0),5)
 
Upvote 0
Here is a variation of Momentman's formula that also handles the example from Rick Rothstein. Since Excel evaluates 123E4 as an exponential, my array formula first changes all E's to spaces.
Code:
=MID(A1,MATCH(TRUE,MID(LEFT(SUBSTITUTE(A1,"E"," "),LEN(A1)),ROW(INDIRECT("1:" &LEN(A1)-4)),5)*1>=10000,0),5)

It should also work without the left in there

Code:
=MID(A1,MATCH(TRUE,MID(SUBSTITUTE(A1,"E"," "),ROW(INDIRECT("1:" &LEN(A1)-4)),5)*1>=10000,0),5)

Don't forget CTRL+SHIFT+ENTER
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,243
Messages
6,170,964
Members
452,371
Latest member
Frana

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top