Efficient Formula to Extract Text Between Second and Third Hyphen in Excel 2021

irfman

New Member
Joined
Jan 1, 2019
Messages
19
Office Version
  1. 2021
Hi,

I have a large dataset stored as string values in Column A. I need to extract the text that appears immediately after the second hyphen (-) and before the third hyphen (-).

Currently, I am using the following formula:
=MID(A2, FIND(CHAR(1), SUBSTITUTE(A2, "-", CHAR(1), 2)) + 1, FIND(CHAR(1), SUBSTITUTE(A2, "-", CHAR(1), 3)) - FIND(CHAR(1), SUBSTITUTE(A2, "-", CHAR(1), 2)) - 1)

However, this formula is lengthy, complicated, and not efficient for a large dataset.

I am using Excel 2021. Are there any simpler and more efficient formulas to achieve the same result?

I appreciate your usual support and assistance.

1742729042964.png
 
May be something like this for Excel 2021,
Book2
ABC
1StringExtract thisFormula
2ACC-1-81-10103-1-58181
3ACC-1-605-11177-5-3605605
4ACC-1-650-11177-1-4650650
5ACC-1-568-11177-2-4568568
6ACC-1-543-11177-1-4543543
7ACC-1-615-11177-1-1615615
8ACC-1-656-11177-1-0656656
9ACC-1-583-11177-1-2583583
10ACC-1-680-11177-1-0680680
11ACC-1-622-11177-1-0622622
12ACC-1-701-11177-1-6701701
13ACC-1-641-11177-1-3641641
14ACC-1-655-11177-1-9655655
15ACC-1-570-11177-1-7570570
16ACC-1-714-11177-1-1714714
17ACC-1-701-11177-2-7701701
18ACC-1-1592-11177-1-615921592
19ACC-1-680-11177-2-1680680
Sheet2
Cell Formulas
RangeFormula
C2:C19C2=TRIM(MID(A2,FIND("-",A2,FIND("-",A2)+1)+1, FIND("-",A2,FIND("-",A2,FIND("-",A2)+1)+1) - FIND("-",A2,FIND("-",A2)+1) -1))
 
Upvote 0
For a large dataset you might be better off with using Power Query.
This will be slightly more efficient and should be easier to follow.
Note: the double dash before the Mid converts it to a number. Remove them if you want the result as text.
Excel Formula:
=LET(hyphen2,FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),2)),
     hyphen3,FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),3)),
     --MID(A2,hyphen2+1,hyphen3-hyphen2-1))
 
Upvote 0
I am using Excel 2021.
Please update your forum profile (click your user name at the top right of the forum, then ‘Account details’) so helpers always know what Excel version(s) & platform(s) you are using as the best solution often varies by version. (Don’t forget to scroll down & ‘Save’)



... a large dataset.

... Are there any ... more efficient formulas to achieve the same result?
Re more efficient and large data set. Just put this formula (adjusted for the number of possible rows) in B2. The efficiency/advantages are ..
  1. There is no need to copy the formula down all the rows of the large data set, and
  2. The results are returned as numbers, not text
25 03 24.xlsm
AB
1StringExtract this
2ACC-1-81-10103-1-581
3ACC-1-605-11177-5-3605
4ACC-1-650-11177-1-4650
5ACC-1-568-11177-2-4568
6ACC-1-543-11177-1-4543
7ACC-1-615-11177-1-1615
8ACC-1-656-11177-1-0656
9ACC-1-583-11177-1-2583
10ACC-1-680-11177-1-0680
11ACC-1-622-11177-1-0622
12ACC-1-701-11177-1-6701
13ACC-1-641-11177-1-3641
14ACC-1-655-11177-1-9655
15ACC-1-570-11177-1-7570
16ACC-1-714-11177-1-1714
17ACC-1-701-11177-2-7701
18ACC-1-1592-11177-1-61592
19ACC-1-680-11177-2-1680
20
21
IRFMAN
Cell Formulas
RangeFormula
B2:B1000B2=IFERROR(--MID(SUBSTITUTE(A2:A1000,"-",REPT(" ",100)),200,100),"")
Dynamic array formulas.
 
Upvote 0
@Peter_SSs, I considered using a dynamic array formula but since its a large dataset I wasn't sure of the performance implications.
It certainly seems to be the case that changing even one entry in column A makes the whole dynamic array "dirty" and it has to recalculate the whole array while in a standard formula it would only recalculate the one dependent formula.
For the current use case the difference seems too small to notice (I tried 50k records). Having said that the order of magnitude seemed to be that the Dynamic array formula took 10x as long as the standard formula.
I am sure it would start to add up if there were multiple array formulas in a workbook.
 
Upvote 0
@Alex Blakenburg
Fair enough points Alex. I suppose I was more thinking about efficiency for the user, not have to copy the formula down lots of rows. :biggrin:

We do not know just how big the OP's "large dataset" might be but I have done some testing with 10,000 rows of data similar to the sample given.
I tested the formulas from posts 1, 2, 3, 3b*, 4 (& later 9)

* 3b - I modified your formula slightly since the request (& the OP's working formula) was to extract the text from the string. Your conversion to numerical takes some time (extremely small) & also would drop any leading zero if there were any. So to compare the text extraction I simply removed the "--" from your formula
Excel Formula:
=LET(hyphen2,FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),2)),
     hyphen3,FIND(CHAR(1),SUBSTITUTE(A2,"-",CHAR(1),3)),
     MID(A2,hyphen2+1,hyphen3-hyphen2-1))

Post 1 was definitely the slowest and 3b (your formula modified as above) was the quickest but really there was very little difference between that and @Tom.Jones formula in Post 4.

However, in looking at Post 4, I note that the REPT function can be a little slow, more so the more any text is repeated.
So, assuming data like the samples, that formula could be sped up somewhat by reducing the number of space characters repeated. Say ..
Excel Formula:
=TRIM(MID(SUBSTITUTE(A2,"-",REPT(" ",10)),20,10))

Even more improvement though is achieved (again assuming the sample data) by actually coding in the space characters
Excel Formula:
=TRIM(MID(SUBSTITUTE(A2,"-","          "),20,10))

Calling this last formula the Post 9 formula, here are my timing results.

irfman.xlsm
JKLMNOP
1Post 1Post 2Post 3Post 3bPost 4Post 9
2Time (secs) to recalculate 10,000 rows0.06240.044680.043840.034870.03720.02735
3Rank654231
4Fastest compared to 2nd fastest78.43%
5Fastest compared to original43.83%
Sheet1
Cell Formulas
RangeFormula
K3:P3K3=RANK(K2,$K2:$P2,1)
P4P4=P2/N2
P5P5=P2/K2
 
Upvote 0
Thanks @Peter_SSs, that was really interesting.
I only compared the identical dynamic array version to the non-array version.
I was wondering about how the REPT function performed.
 
Upvote 0

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top