Extracting text between certain words

JJ EXCEL1234

New Member
Joined
Jan 30, 2018
Messages
12
Hello, I have a worksheet that contains hyperlinks to hundreds of documents (Column B). The text string for hyperlinks under Column B contains volume and page numbers that I would like to extract and place in Column E.

(1) What formula would I need to extract the number after volume?

(2) What formula would I need to extract the number after page?

Possible Issue:
There hyperlink formulas should have similar formats under Column B, however there are 3 differences, there are formulas that only contain volume and page, ones that contain both volume and page and record number, and formulas which only contain record number.


Below is a picture showing a clip of my worksheet and the results I am looking for in the function.

Nic1hR6

https://imgur.com/a/Nic1hR6
 

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
Any chance you could post a sample workbook rather than a picture of the workbook?
 
Upvote 0
Hi,

Here's a couple of ways.
1. Use E20 formula in E14 copied down if the "Volume" and "Page" is Always at the beginning.
2. Use E14 formula copied down if the "Volume" and "Page" might be somewhere Other than at the beginning and/or there might be Other "_" (underscores) within the text string.


Book1
BCDE
13file linkcountyinstNumbervolume/page
14volume-11_page-423_6fe5a853-b1e0-49fc-8c09-8ffd9e7aadb2.pdfMIDLAND ABSTRACT PLANT11/423
15volume-688_page-53_record number-00011946_800bfc07-8abf-4520-ab2d-37daf9a4d15f.pdfMIDLAND ABSTRACT PLANT11946688/53
16record number-201821983_d77b2289-28c8-4db5-8c4b-a96edb7ce4df.pdfMIDLAND ABSTRACT PLANT201821983NONE
17record number-201823925_e7703b85-e14f-49c4-80cd-87a064075f2a.pdfMIDLAND ABSTRACT PLANT201823925NONE
18record number-201823926_eefa3464-e98f-4405-a6ac-9f0427e73d52.pdfMIDLAND ABSTRACT PLANT201823926NONE
19record number-201824074_d5306241-68f4-40c4-8289-a8dc3492f9cd.pdfMIDLAND ABSTRACT PLANT201824074NONE
2011/423
21688/53
22NONE
23NONE
24NONE
25NONE
SAMPLE WORKBOOK
Cell Formulas
RangeFormula
E14=IFERROR(TRIM(MID(SUBSTITUTE(B14,"_",REPT(" ",30)),SEARCH("Volume-",SUBSTITUTE(B14,"_",REPT(" ",30)))+7,30))&"/"&TRIM(MID(SUBSTITUTE(B14,"_",REPT(" ",30)),SEARCH("Page-",SUBSTITUTE(B14,"_",REPT(" ",30)))+5,30)),"NONE")
E20=IFERROR(TRIM(MID(SUBSTITUTE(B14,"_",REPT(" ",30)),8,30))&"/"&TRIM(MID(SUBSTITUTE(B14,"_",REPT(" ",30),2),SEARCH("Page-",B14)+5,30)),"NONE")
 
Upvote 0
Based on your samples data, two other ways.

A) Standard worksheet functions in E14

B) User-defined function in F14 (if you are interested further in this approach & need help with how to implement it, post back)

Code:
Function VolPage(s As String) As String
  VolPage = "NONE"
  With CreateObject("VBScript.RegExp")
    .Pattern = "(^.*volume-)(\d+)(_page-)(\d+)(.*$)"
    If .Test(s) Then VolPage = .Replace(s, "$2/$4")
  End With
End Function

Excel Workbook
BEF
14volume-11_page-423_6fe5a853-b1e0-49fc-8c09-8ffd9e7aadb2.pdf11/42311/423
15volume-688_page-53_record number-00011946_800bfc07-8abf-4520-ab2d-37daf9a4d15f.pdf688/53688/53
16record number-201821983_d77b2289-28c8-4db5-8c4b-a96edb7ce4df.pdfNONENONE
17record number-201823925_e7703b85-e14f-49c4-80cd-87a064075f2a.pdfNONENONE
18record number-201823926_eefa3464-e98f-4405-a6ac-9f0427e73d52.pdfNONENONE
19record number-201824074_d5306241-68f4-40c4-8289-a8dc3492f9cd.pdfNONENONE
SAMPLE WORKBOOK
 
Last edited:
Upvote 0
In case the volumn/page data is not always at the start of the text as jtakw suggested in relation to his E14 formula, this slight modification of my formula from the previous post would also allow for that (see row 17 below).
(The user-defined function posted previously works in this circumstance as it is too)

Excel Workbook
BE
14volume-11_page-423_6fe5a853-b1e0-49fc-8c09-8ffd9e7aadb2.pdf11/423
15volume-688_page-53_record number-00011946_800bfc07-8abf-4520-ab2d-37daf9a4d15f.pdf688/53
16record number-201821983_d77b2289-28c8-4db5-8c4b-a96edb7ce4df.pdfNONE
17some other text volume-11_page-423_6fe5a853-b1e0-49fc-8c09-8ffd9e7aadb2.pdf11/423
SAMPLE WORKBOOK
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,823
Messages
6,181,181
Members
453,022
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top