Help with a formula to extract all 8 character alpha-numeric IDs from an Excel column

techie2

New Member
Joined
Jun 4, 2013
Messages
21
I am new to this forum. I have seen some excellent suggestions and solutions being posted.

I have a spread sheet with Special Notes column which has notes text in each cell of the column. Each cell text contains one or more 8 character numeric and alpha numeric IDs. I need to extract all these IDs and dump them in a separate column on the spread sheet . I am giving examples of text from two cells in the column:

K1= Model Z behind model 8 (ABC Company) Please generate AMC 47052130 (replacing dialup AMC 47059003)
K2= Model J behind model 9 (XYZ Company) Please Generate AMC ID 4554G023 (replacing 4554A032)

Is there a formula I can use to extract those 8 character AMC IDs with or without AMC in front of them.

I am very excited to be on this forum.

Thanks in advance

Techie
 
Thanks 'TheCman81'. The formula you provided works too. However, I wonder if it can be modified so it extracts all 8 character strings (regardless of whether or not they are preceded by AMC or AMC ID). This is because when I applied other solutions to a larger number of records, I found that many IDs are missed because they do have another term used instead of AMC. This might make the code more complex since there are 7 - 8K records and some IDs have '-' in front of them or there is just a regular space. So I think the best option is to pull all 8 character strings and then filter the data some how or just leave the strings that are not IDs in there as invalid data.
 
Upvote 0

Excel Facts

Convert text numbers to real numbers
Select a column containing text numbers. Press Alt+D E F to quickly convert text to numbers. Faster than "Convert to Number"
On second thought, the above does not seem like a good idea. When I applied the functions on the spreadsheet, I got 8 character strings like 'screen in', 'is a NCR' etc.. So it would not really help. I will stick to the solution that pulls all 8 character IDs in front of AMC or AMC ID. That way, I will have fewer number of invalid IDs. This does not solve the issue hundred percent but it remains a user training issue I guess. If they consistently input AMC IDs in a standard format then we don't have to struggle to try to get the data out.

Thanks all for providing such wonderful help. I really appreciate that.
 
Upvote 0
On second thought, the above does not seem like a good idea. When I applied the functions on the spreadsheet, I got 8 character strings like 'screen in', 'is a NCR' etc..
Did you try ZVI's suggestion in Message #19? I am not completely up on RegExp, but what he proposed looks like it would solve the above problem... the pattern he suggested would only return 8-character text where those 8 characters were composed of digits and upper case letters only. Since all your examples showed the ID to use upper case letters (when letters were part of them), it would seem his suggestion should actually do what you have finally settled on having the function return.
 
Upvote 0
Thanks, Rick. That seem to be the best approach. Do I just need to replace RegEx.Pattern = "\d{8}" with the line of code Zvi suggested (RegEx.Pattern="[0-9A-Z]{8}" and try? I am trying that but please let me know if I need to just add that line and not replace the above. Thank you, again for all your help.
 
Upvote 0
Thanks, Rick. That seem to be the best approach. Do I just need to replace RegEx.Pattern = "\d{8}" with the line of code Zvi suggested (RegEx.Pattern="[0-9A-Z]{8}" and try? I am trying that but please let me know if I need to just add that line and not replace the above. Thank you, again for all your help.
As I said, RegExp coding is not something I am deeply familiar with, so I'll defer to ZVI to answer that question when he next visits this thread. However, in the meantime, you might want to consider this non-RegExp code which works exactly like my previous code (second argument 0 returns all ID's delimited by comma/space.. second argument a number returns that occurrence of the ID) except that it does what ZVI's suggestion will do (returns 8-character ID's composed of only digits or upper case letters)...
Code:
Function GetIDs(S As String, Optional Index = 0) As String
  Dim X As Long, Count As Long
  For X = 1 To Len(S)
    If Mid(S, X, 10) Like "[!0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][!0-9A-Z]" Then
      Count = Count + 1
      If Index = 0 Then
        GetIDs = GetIDs & ", " & Mid(S, X + 1, 8)
      ElseIf Count = Index Then
        GetIDs = Mid(S, X + 1, 8)
        Exit Function
      End If
    End If
  Next
  GetIDs = Mid(GetIDs, 3)
End Function
 
Last edited:
Upvote 0
Thanks, Rick. That seem to be the best approach. Do I just need to replace RegEx.Pattern = "\d{8}" with the line of code Zvi suggested (RegEx.Pattern="[0-9A-Z]{8}" and try? I am trying that but please let me know if I need to just add that line and not replace the above. Thank you, again for all your help.

Wow! That worked like a charm. It pulled all valid 8 character numeric and apha-numeric IDs. I have to go through everything but in some places it did not pull IDs from text like this "Request - ASO requesting install of P182905 on 2/15/12 : abccompany.com : LAH" but it is because there is a user error in entering ID. There are only 7 characters in (P182905).
 
Upvote 0
Do I just need to replace RegEx.Pattern = "\d{8}" with the line of code Zvi suggested (RegEx.Pattern="[0-9A-Z]{8}" and try?
Yes, in the code of post #8
this: RegEx.Pattern = "\d{8}"
need to be replaced by that: RegEx.Pattern = " [0-9A-Z]{8}"
I like Rick's solution by the way :)
 
Upvote 0
As I said, RegExp coding is not something I am deeply familiar with, so I'll defer to ZVI to answer that question when he next visits this thread. However, in the meantime, you might want to consider this non-RegExp code which works exactly like my previous code (second argument 0 returns all ID's delimited by comma/space.. second argument a number returns that occurrence of the ID) except that it does what ZVI's suggestion will do (returns 8-character ID's composed of only digits or upper case letters)...
Code:
Function GetIDs(S As String, Optional Index = 0) As String
  Dim X As Long, Count As Long
  For X = 1 To Len(S)
    If Mid(S, X, 10) Like "[!0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][!0-9A-Z]" Then
      Count = Count + 1
      If Index = 0 Then
        GetIDs = GetIDs & ", " & Mid(S, X + 1, 8)
      ElseIf Count = Index Then
        GetIDs = Mid(S, X + 1, 8)
        Exit Function
      End If
    End If
  Next
  GetIDs = Mid(GetIDs, 3)
End Function



Hi Rick,

This is something that I am looking for but just say if I want to return only 5 CHARACTERS which consist of ONLY mixtures of numbers and letters (eg I to want to extract ONLY 1DD23 NOT FHGOG from "FHGOG 3445 1DD23 4FF 35234"). How do I go about it? Many thanks in advance. ( I tried to modify Mid(S, X + 1, 8) to Mid(S, X + 1, 5) but it doesnt work).
 
Upvote 0
Hi Rick,

This is something that I am looking for but just say if I want to return only 5 CHARACTERS which consist of ONLY mixtures of numbers and letters (eg I to want to extract ONLY 1DD23 NOT FHGOG from "FHGOG 3445 1DD23 4FF 35234"). How do I go about it? Many thanks in advance. ( I tried to modify Mid(S, X + 1, 8) to Mid(S, X + 1, 5) but it doesnt work).
This modification of my to my function will work exactly like the original (with respect to the Optional Index argument) except that it will only find 5-character text strings that are composed of combinations of both digits and upper case letters...
Code:
[table="width: 500"]
[tr]
	[td]Function GetIDs(ByVal S As String, Optional Index = 0) As String
  Dim X As Long, Count As Long
  S = " " & S & " "
  For X = 1 To Len(S)
    If Mid(S, X, 7) Like "[!0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][!0-9A-Z]" Then
      If Mid(S, X, 7) Like "*#*[A-Z]*" Or Mid(S, X, 7) Like "*[A-Z]*#*" Then
        Count = Count + 1
        If Index = 0 Then
          GetIDs = GetIDs & ", " & Mid(S, X + 1, 5)
        ElseIf Count = Index Then
          GetIDs = Mid(S, X + 1, 5)
          Exit Function
        End If
      End If
    End If
  Next
  GetIDs = Mid(GetIDs, 3)
End Function[/td]
[/tr]
[/table]
 
Upvote 0

Forum statistics

Threads
1,223,264
Messages
6,171,081
Members
452,377
Latest member
bradfordsam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top