Extract multiple strings from 1 cell

daveac

New Member
Joined
Dec 15, 2010
Messages
4
Hi,

I have a list of topic codes in a cell and I need to extract just the codes. The only thing that gets me close is splitting the cell based on a delimiter but there are two issues with that 1) I then need to delete the columns which contain the unwanted text and 2) sometimes I may have 4 codes in the box, sometimes 40 so that will result in an unwieldy number of columns to deal with. The other complication is that not all the topic codes are the same length or format - some are 1.1.1, some are 1.1.12 and some are 1.1.1 (a). However, as I currently have the input cell formatted, the codes are all between some form of delimiter - square brackets at the moment - because I thought that might help to identify the codes to any function.

[TABLE="class: grid, width: 750, align: center"]
<tbody>[TR]
[TD]Input[/TD]
[TD]Desired output[/TD]
[/TR]
[TR]
[TD][1.2.3 (a)] first topic;[1.2.3 (b)]; second topic;[1.2.3 (d)]; third topic;[1.2.5]; fourth topic[/TD]
[TD]1.2.3 (a), 1.2.3 (b), 1.2.3 (d), 1.2.5[/TD]
[/TR]
</tbody>[/TABLE]

Ideally, the output would exclude the square brackets and it needs to be separated e.g. by a comma as shown. If this can be done with formulas - brilliant, because I don't know how to use VBA (if I have to, I will though!)

Thanks in advance.

Dave
 

Excel Facts

Why does 9 mean SUM in SUBTOTAL?
It is because Sum is the 9th alphabetically in Average, Count, CountA, Max, Min, Product, StDev.S, StDev.P, Sum, VAR.S, VAR.P.
You can use Regex to make it

Link: https://tinyurl.com/y5clsq3z

As Picture:

hBLN0j0.jpg
 
Upvote 0
Write this formula in B2:
=TEXTJOIN(", ",,MID(A2,MODE.MULT(IFNA(MATCH(--(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)="[")*ROW(INDIRECT("1:"&LEN(A2))),ROW(INDIRECT("1:"&LEN(A2))),{0,0}),""))+1,MODE.MULT(IFNA(MATCH(--(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)="]")*ROW(INDIRECT("1:"&LEN(A2))),ROW(INDIRECT("1:"&LEN(A2))),{0,0}),""))-MODE.MULT(IFNA(MATCH(--(MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)="[")*ROW(INDIRECT("1:"&LEN(A2))),ROW(INDIRECT("1:"&LEN(A2))),{0,0}),""))-1))

Write this formula in B3:
=TEXTJOIN(", ",,MID(A3,MODE.MULT(IFNA(MATCH(--(MID(A3,ROW(INDIRECT("1:"&LEN(A3))),1)="[")*ROW(INDIRECT("1:"&LEN(A3))),ROW(INDIRECT("1:"&LEN(A3))),{0,0}),"")),MODE.MULT(IFNA(MATCH(--(MID(A3,ROW(INDIRECT("1:"&LEN(A3))),1)="]")*ROW(INDIRECT("1:"&LEN(A3))),ROW(INDIRECT("1:"&LEN(A3))),{0,0}),""))-MODE.MULT(IFNA(MATCH(--(MID(A3,ROW(INDIRECT("1:"&LEN(A3))),1)="[")*ROW(INDIRECT("1:"&LEN(A3))),ROW(INDIRECT("1:"&LEN(A3))),{0,0}),""))+1))

These should work. :)
 
Upvote 0
Thanks both for the rapid replies. I've tried both approaches and I'm not quite there. Sadboy, I tried your approach first. I couldn't get it to work at all in Excel but it works in Google sheets. That's actually not a bad thing as this project is to process the output from a Google Form so that obviously goes natively into Sheets. I'm less familiar with Sheets so, when I posted the initial question, I had exported as a CSV to Excel (but still couldn't make it work, hence why I'm here ;)). The only problem now is that the export to CSV has altered the formatting a little. Now when I try to apply your solution directly in Sheets, it doesn't quite work. The formatting in sheets is a comma and space between each code so:

[1.2.3 (a)] First topic, [1.2.3 (b)] Second topic, [1.2.3 (d)] Third topic, [1.2.5] Fourth topic
Rather than
[1.2.3 (a)] First topic;[1.2.3 (b)]; Second topic;[1.2.3 (d)]; Third topic;[1.2.5]; Fourth topic

And in sheets, that then gives me (as the output from the regexreplace):
[1.2.3 ()] F, [1.2.3 ()] S, [1.2.3 ()] T, [1.2.5] F

i.e. it's leaving me the first letter of the topic description as an unwanted remnant, and also removing any alpha elements of the code which appear between ( ). I've read around and also fiddled with the "[a-z]*[a-z] [a-z]*[a-z]" expression but I can't figure out how that element is working adequately to trouble shoot the problems. Do you have any suggestions?

JBorg, thanks for your formulae. I could never have hoped to getting close to coming up with those!! Unfortunately, if I run them in Excel I get [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=NA]#NA ME[/URL] and #N/A if I run them in Sheets. Again, I'm not sufficiently savvy to be able to troubleshoot that. Any ideas gratefully received!

Thanks in advance and for what's gone so far.

Dave
 
Upvote 0

Forum statistics

Threads
1,224,822
Messages
6,181,165
Members
453,021
Latest member
Justyna P

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top