Extract a 5 Digit Number From Inconsistent Text Strings

jeffgibson55

New Member
Joined
Aug 22, 2011
Messages
17
I know how to extract a number from a text string using:

LOOKUP(99^99,--("0"&MID(J1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},J1&"0123456789")),ROW($1:$10000))))

But in my current data set I have thousands of inconsistent text strings some of which contain other numbers that I don't care about - with the above formula it just grabs the first number it finds left to right. The number I need to grab is always a 5 digit number. Is there any formula I can use that would only grab out a 5 digit number ignoring all other numbers? Here's an example text string: "9/14/2010.NTL.TeleBroc.55129T_V1_N" where 55129 is the desired extract value but like I said the other text strings don't necessarily follow this example's format/delimiters.

Thanks!
Jeff
 
This means that, in order to have my udf recalculate when you change the number of digits you have to change it.

You either

1 - add also the volatile statement at the beginning of the udf

or

2 - include the number of digits as a parameter, like
I like #2 the best... it makes the function independent of where or how it is being used... and, thus, more universal.
 
Upvote 0

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Rick
More than happy to send them to you.
I'm away from my office until tomorrow morning, so I'll do it then
Don't get too excited though, I've simply used the Function that PGC posted and changed the Patterns to meet my needs.
I'm sure there is a VBA equivalent, but it's something new to play with.....OMG that makes me feel like a Geek !!!
 
Upvote 0
More than happy to send them to you.
I'm away from my office until tomorrow morning, so I'll do it then
No rush... I just want to see what I can come up with.

Don't get too excited though, I've simply used the Function that PGC posted and changed the Patterns to meet my needs.
In that case, don't forget the brief explanation of what the function is supposed to be doing.

I'm sure there is a VBA equivalent, but it's something new to play with.....OMG that makes me feel like a Geek !!!
My one recollection from my RegEx days back in the 1980s... I remember being fascinated at being able to write complex, fully working expressions that 30 minutes later I couldn't figure out how it worked. Knowing the rules and constructing is one thing, but I found reading expressions (especially the complex ones) was not so intuitive.
 
Upvote 0
Would you prefer I send you a sample spreadsheet with the Functions working ??
 
Upvote 0
Would you prefer I send you a sample spreadsheet with the Functions working ??
If the function names are reasonable self-explanatory (that is, not too abbreviated or terse), then sure... we can try that as a start at least. I think maybe we should move any continuation of this discussion off line rather than keep extending this thread. You have my email address (I don't have yours, hence this message here in this thread). By the way, I'll be going to sleep in a few minutes, so if you send me anything in the next few hours, that will be why I don't respond.
 
Upvote 0
I'm glad it helped Jeff.

Also a formula solution, please test. In B1:

=LOOKUP(2,1/(MMULT(0+(ISNUMBER(-MID(" "&A1,ROW(INDIRECT("1:"&LEN(A1)))+{0,1,2,3,4,5,6},1))+0={0,1,1,1,1,1,0}),{1;1;1;1;1;1;1})=7),MID(A1,ROW(INDIRECT("1:"&LEN(A1))),5))

Copy down


[TABLE="width: 2"]
<tbody>[TR]
[TH] [/TH]
[TH="align: center"]A[/TH]
[TH="align: center"]B[/TH]
[TH="width: 30, align: center"]C[/TH]
[/TR]
[TR]
[TD="align: center"]1[/TD]
[TD="align: left"]10/12/2010.IBN.NTL.WebcastEBrochure.5469
5LVWCR_V2[/TD]
[TD="align: left"]54695[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]2[/TD]
[TD="align: left"]10/13/2010.IBN.NTL.WebcastEBrochure.5471
3LVWCR_V2[/TD]
[TD="align: left"]54713[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]3[/TD]
[TD="align: left"]4/20/2010.IBN.NTL.WebcastEBrochure.52511
LVWCR_V1[/TD]
[TD="align: left"]52511[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]4[/TD]
[TD="align: left"]1/19/2010.IBN.NTL.WebcastEBrochure.52509
LVWCR_V1[/TD]
[TD="align: left"]52509[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]5[/TD]
[TD="align: left"]57639ER_Drop1-110808-1335[/TD]
[TD="align: left"]57639[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]6[/TD]
[TD="align: left"]57818ER_Drop1-110803-0646[/TD]
[TD="align: left"]57818[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]7[/TD]
[TD="align: left"]9/14/2010.NTL.TeleBroc.55129T_V1_N[/TD]
[TD="align: left"]55129[/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="align: center"]8[/TD]
[TD="align: right"] [/TD]
[TD="align: right"] [/TD]
[TD="align: right"] [/TD]
[/TR]
[TR]
[TD="colspan: 4"] [Book1]Sheet1[/TD]
[/TR]
</tbody>[/TABLE]

Hello PCG
Thanks for this formula. It works great for me.
Can you kindly advise on the modification i need to perform in order to extract a 7 digit number using this formula?

Thanks and regards
 
Upvote 0
As above - the formula worked great but I have no idea how to modify it to suit my needs (which also happens to be extracting a 7 digit number - and a four digit number). Can anyone help?

Many thanks
 
Upvote 0
the easiest way is a custom formula from PGC.

I modified it so i can input a number for the length of the number to extract (i added the part in red)

Code:
Function XDigitNo(s As String[COLOR=#ff0000], X As Integer[/COLOR]) As String
With CreateObject("VBScript.RegExp")
  .Pattern = "(?:^|\D)(\d{[COLOR=#ff0000]" & X & "[/COLOR]})(?!\d)"
  If .test(s) Then XDigitNo = .Execute(s)(0).SubMatches(0)
End With
End Function
 
Upvote 0
This is close to my requirements..
My text in cell would like.. 1-anfkjdn-2,3-nngnkfngnfd ngfndnfg 454532-34-5 mngfdsngf 7674-46-9, nfkjndsgkfdnm,nkjfdf 768-54-1 nfngfdngfn 96-78-1
I would like extract only the (text) independent number strings 454532-34-5 7674-46-9 768-54-1 96-78-1.
ignore all the strings where text and digits are mixes.. extract only digit strings between space and (space or comma)..
Please note that 2 hyphens are constant with last 1 digit and 2 digits before that.. in the first set of digits the digit count could be 2-7.
Though there could a way to define pattern like .Pattern = "(\d{7})(-)(\d{2})(-)(\d{1})".. but that did not work.. Please help.
 
Upvote 0

Forum statistics

Threads
1,223,230
Messages
6,170,883
Members
452,364
Latest member
springate

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top