Find and extract a substring that meets a specific character format

SuttieB2404

New Member
Joined: Sep 9, 2011
Messages: 7
Office Version: 365
Platform: Windows
Hi, I'm trying to write a formula that extracts an invoice reference, which is two capital letters followed by four or five digits, from a variable string in a cell.
e.g. extract 'AB1234' from 'Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123456'

I found a post that counted double capitals, which I may be able to modify?
=SUM((CODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1)-1)),1))>=65)*(CODE(MID(A1,ROW(INDIRECT("1:"&LEN(A1)-1)),1))<=90)
*(CODE(MID(A1,ROW(INDIRECT("2:"&LEN(A1))),1))>=65)*(CODE(MID(A1,ROW(INDIRECT("2:"&LEN(A1))),1))<=90))

Cheers
Stu
 

The formula is taken from one of the solutions by Fluff.

If it is always in the 3rd position from the last word, try:

Excel Practice 05.22.2023.xlsx
     A                                                    B
1    Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123456    AB1234
2
3
Sheet: Sheet3

Cell Formulas
Range    Formula
B1       B1 =FILTERXML("<k><m>"&SUBSTITUTE(A1," ","</m><m>")&"</m></k>","//m[.!=number()][position()=last()-2]")
 
Thanks for the reply. Unfortunately, the reference could appear anywhere in the string, which is why I thought it might be better to search for the substring format/length.
 
You could consider employing a user-defined function like this. To implement:
1. Right-click the sheet name tab and choose "View Code".
2. In the Visual Basic window, use the menu to Insert|Module.
3. Copy and paste the code below into the main right-hand pane that opens at step 2.
4. Close the Visual Basic window.
5. Enter the formula as shown in the table below and copy down.
6. Your workbook will need to be saved as a macro-enabled workbook (*.xlsm).

VBA Code:
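Function InvNo(s As String) As String
  ' Returns the first invoice reference (2 capital letters + 4 or 5 digits) found in s, or "" if none.
  With CreateObject("VBScript.RegExp")
    ' The middle group captures the reference; it must not follow a capital letter or run into a further digit.
    .Pattern = "(^|[^A-Z])([A-Z]{2}\d{4,5})(\D|$)"
    ' SubMatches is zero-based, so index 1 is the second capture group.
    If .Test(s) Then InvNo = .Execute(s)(0).SubMatches(1)
  End With
End Function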

SuttieB2404.xlsm
     A                                                    B
1    Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123456    AB1234
2    Other Text
3
4    Last Inv BB23765                                     BB23765
5    Last Inv aB23766
6    ABC12345 or DE1234567 or FG76543                     FG76543
Sheet: Sheet1

Cell Formulas
Range    Formula
B1:B6    B1 =InvNo(A1)
 
Unfortunately it needs to be without VBA :(
I'd have been OK otherwise
Stu
 
If the Ref number is always after 'Ref: ' then maybe:
Book1
     A                                                                      B
1    Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123456                      A123456
2    Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123457 & Cheese on Toast    A123457
3    Ref: A123457 Invoice 2222 Date: 01/01/2002 & Cheese on Toast           A123457
Sheet: Sheet1

Cell Formulas
Range    Formula
B1:B3    B1 =LET(ta,TEXTAFTER(A1:A3,"Ref: "),IF(ISNUMBER(FIND(" ",ta)),TEXTBEFORE(ta," "),ta))
Dynamic array formulas.
 
If the value is always after the date, another option is
Fluff.xlsm
     A                                                    B
1
2    Invoice 2222 Date: 01/01/2002 AB1234 Ref: A123456    AB1234
Sheet: Main

Cell Formulas
Range    Formula
B2       B2 =FILTERXML("<k><m>"&SUBSTITUTE(A2," ","</m><m>")&"</m></k>","//m[contains(.,'/') ]/following::m[1]")
 
No, the full string is received from various sources and is not constant, but it always contains the substring I need to extract, as either TTNNNN or TTNNNNN (two capital letters followed by four or five digits).
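
One non-VBA possibility for a reference that can sit anywhere in the string, offered as a sketch rather than a tested solution: newer Microsoft 365 builds include a REGEXEXTRACT function, which can apply much the same pattern as the UDF posted earlier in this thread. This assumes REGEXEXTRACT is available in your build (older versions of Excel do not have it) and that the source text is in A1. The lookbehind and lookahead stand in for the UDF's outer groups: the match must not follow a capital letter and must not run into a further digit.

```excel
=IFERROR(REGEXEXTRACT(A1,"(?<![A-Z])[A-Z]{2}\d{4,5}(?!\d)"),"")
```

REGEXEXTRACT returns the first match by default and gives #N/A when nothing matches, so the IFERROR wrapper returns an empty string instead, as the UDF does. Matching is case-sensitive by default, so 'aB23766' would not be picked up.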
 