Find and extract sequence within a text string

drousyblinx

New Member
Joined
Oct 7, 2015
Messages
13
Hi,

I have a long list of file names and within them sits a five digit sequence of numbers and letters. The problem is, the location of that sequence within the file name, differs for every file.

For example:
1) ThisisafilenameAB123.doc - sequence = AB123
2) ThisCD453isafilename.doc - sequence = CD453
3) ThisisEF713afilename.doc - sequence = EF713

I would be extremely grateful if anyone could provide me a formula that extracts these five digit sequences.
 
Your answer muddied things up a little bit for me. When you say "most of them", does that mean some may have only one upper case with four digits or three upper case letters with two digits? Or were you referring to the what you wrote in parentheses as the exception? If the latter, can the two scenarios (two upper case letters followed by three digits and two upper case letters followed by two digits) both appear in your data? If so, do you want them both pulled out or only the five-character one? Please try and be clear on your scenarios... remember we here on the forum have no idea what your data looks like, we can only go on what you tell us... even though something is obvious to you, you cannot expect us to "figure it out".


Is your sequence always two uppercase letters followed by three digits?''

Also, will your filenames ever have digits in them besides the three digits in your sequence?




Extractions needed (exact specifications below, aside from one exception):
Scenario A: five character sequence - 2 upper case letters 3 digits AB123
Scenario B: four character sequence - 2 upper case letters 2 digits AB12
Scenario C: three character sequence- 2 upper case letters 1 digit AB1

Exception:
The majority of these scenarios do not have other digits in them, but there are a few which do. E.g. ThisAB123isafilename3.
 
Upvote 0

Excel Facts

Can a formula spear through sheets?
Use =SUM(January:December!E7) to sum E7 on all of the sheets from January through December
Is your sequence always two uppercase letters followed by three digits?''

Also, will your filenames ever have digits in them besides the three digits in your sequence?




Extractions needed (exact specifications below, aside from one exception):
Scenario A: five character sequence - 2 upper case letters 3 digits AB123
Scenario B: four character sequence - 2 upper case letters 2 digits AB12
Scenario C: three character sequence- 2 upper case letters 1 digit AB1

Exception:
The majority of these scenarios do not have other digits in them, but there are a few which do. E.g. ThisAB123isafilename3.

Try this array formula,

=MID(A1,MIN(FIND(ROW($1:$9)-1,A1&5^19))-2,COUNT(-MID(A1,ROW($1:$99),1))+2)

Confirm press with Shift+Ctrl+Enter

Regards
 
Upvote 0
Extractions needed (exact specifications below, aside from one exception):
Scenario A: five character sequence - 2 upper case letters 3 digits AB123
Scenario B: four character sequence - 2 upper case letters 2 digits AB12
Scenario C: three character sequence - 2 upper case letters 1 digit AB1

Exception:
The majority of these scenarios do not have other digits in them, but there are a few which do. E.g. ThisAB123isafilename3.
See if this UDF works the way you want...
Code:
Function FiveChars(ByVal S As String) As String
  Dim X As Variant
  For X = 2 To Len(S)
    If Mid(S, X, 1) Like "[!A-Z0-9]" Then Mid(S, X) = " "
  Next
  For Each X In Split(Application.Trim(S))
    If X Like "[A-Z][A-Z]#*" Then
      If Not Mid(X, 3) Like "*[!0-9]*" And Len(X) < 6 Then
        FiveChars = X
        Exit Function
      End If
    End If
  Next
  FiveChars = ""
End Function
 
Upvote 0

Forum statistics

Threads
1,223,248
Messages
6,171,027
Members
452,374
Latest member
keccles

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top