Extract text from a cell (VBA)

rimrattlerla11

New Member
Joined
Jan 27, 2016
Messages
17
I have a large dataset (about 500k rows) w/ text and I am only trying to extract the portion of text whenever the code is null or has a "C-*" code.

The nulls or "C-" codes don't appear in every row, so for those a simple "OK" in the output to differentiate those will do.


Each cell of the worksheet is in a format similar to below:

[{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]},{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]},{"code":"ABC","long_name":"ABC","name":"ABC","msrp":100,"includes":[]},.........]

The beginning and end of the cell has the "[" and "]" braces and each code and its pertinent info is enclosed within "{" and "}" with a comma separating each.


The desired output of the above example is to extract only the portion of text where "code":null or "code":"C-*" into the adjacent cell. I am using the "*" as a wildcard, since the "C-" codes can have a variety of letters following it.


{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]},{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]}

OR if "code":null or "code":"C-*" is not found, output the word "OK."

OK



Please let me know if anything was unclear. I can try to provide more examples.
I'm out of my element with VBA and I'm fairly certain I need some VBA code to make this happen.

Thanks in advance for any help you can provide!
 

Excel Facts

Lock one reference in a formula
Need 1 part of a formula to always point to the same range? use $ signs: $V$2:$Z$99 will always point to V2:Z99, even after copying
You do not need VBA

To extract by formula ...

Create 2 named ranges

String1 with RefersTo
="{"&CHAR(34)&"code"&CHAR(34)&":"&"null"

String2 with RefersTo
="{"&CHAR(34)&"code"&CHAR(34)& ":"&CHAR(34)&"C*"

Formula to extract String1
=MID(A1,SEARCH(String1,A1),SEARCH("}",A1,SEARCH(String1,A1)+1)-SEARCH(String1,A1)+1)

(or you could put the RefersTo formulas in a cell, instead of using named ranges)

Excel 2016 (Windows) 32 bit
[Table="width:, class:head"][tr=bgcolor:#E0E0F0][th] [/th][th]
A
[/th][th]
B
[/th][/tr]
[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
1
[/td][td][{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]},{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]},{"code":"ABC","long_name":"ABC","name":"ABC","msrp":100,"includes":[]},.........][/td][td][/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
2
[/td][td] RESULT[/td][td] FORMULA[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
3
[/td][td]{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]}[/td][td] =MID(A1,SEARCH(String1,A1),SEARCH("}",A1,SEARCH(String1,A1)+1)-SEARCH(String1,A1)+1)[/td][/tr]

[tr=bgcolor:#FFFFFF][td=bgcolor:#E0E0F0]
4
[/td][td]{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]}[/td][td] =MID(A1,SEARCH(String2,A1),SEARCH("}",A1,SEARCH(String2,A1)+1)-SEARCH(String2,A1)+1)[/td][/tr]
[/table]
[Table="width:, class:grid"][tr][td]Sheet: Sheet1[/td][/tr][/table]
 
Upvote 0
Hi

Is there only just a maximum of 1 of each records in the text or can there be, for ex., several {"code":"C- ...} records in the text?
Do you want all the results concatenated in 1 cell or each match in its own cell?
 
Upvote 0
Hi

Is there only just a maximum of 1 of each records in the text or can there be, for ex., several {"code":"C- ...} records in the text?
Do you want all the results concatenated in 1 cell or each match in its own cell?
@pgc01 - There can be several. So I'm trying to extract the ones w/ code:null or code:"C-*".
The results for each cell would be concatenated into another cell (comma delimited would be nice).
 
Upvote 0
You do not need VBA

To extract by formula ...

Create 2 named ranges

String1 with RefersTo
="{"&CHAR(34)&"code"&CHAR(34)&":"&"null"

String2 with RefersTo
="{"&CHAR(34)&"code"&CHAR(34)& ":"&CHAR(34)&"C*"

Formula to extract String1
=MID(A1,SEARCH(String1,A1),SEARCH("}",A1,SEARCH(String1,A1)+1)-SEARCH(String1,A1)+1)

(or you could put the RefersTo formulas in a cell, instead of using named ranges)

Excel 2016 (Windows) 32 bit
[TABLE="class: head"]
<tbody>[TR="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=E0E0F0]#E0E0F0[/URL] "]
[TH][/TH]
[TH]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]A[/COLOR]​
[/TH]
[TH]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]B[/COLOR]​
[/TH]
[/TR]
[TR="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] "]
[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=E0E0F0]#E0E0F0[/URL] "]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]1[/COLOR]​
[/TD]
[TD][{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]},{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]},{"code":"ABC","long_name":"ABC","name":"ABC","msrp":100,"includes":[]},.........][/TD]
[TD][/TD]
[/TR]
[TR="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] "]
[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=E0E0F0]#E0E0F0[/URL] "]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]2[/COLOR]​
[/TD]
[TD] RESULT[/TD]
[TD] FORMULA[/TD]
[/TR]
[TR="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] "]
[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=E0E0F0]#E0E0F0[/URL] "]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]3[/COLOR]​
[/TD]
[TD]{"code":null,"long_name":"null","name":"QWERTY","msrp":0,"includes":[]}[/TD]
[TD] =MID(A1,SEARCH(String1,A1),SEARCH("}",A1,SEARCH(String1,A1)+1)-SEARCH(String1,A1)+1)[/TD]
[/TR]
[TR="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] "]
[TD="bgcolor: [URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=E0E0F0]#E0E0F0[/URL] "]
[COLOR=[URL=https://www.mrexcel.com/forum/usertag.php?do=list&action=hash&hash=FFFFFF]#FFFFFF[/URL] ]4[/COLOR]​
[/TD]
[TD]{"code":"C-STD","long_name":"BLK","name":"BLK ","msrp":0,"includes":[]}[/TD]
[TD] =MID(A1,SEARCH(String2,A1),SEARCH("}",A1,SEARCH(String2,A1)+1)-SEARCH(String2,A1)+1)[/TD]
[/TR]
</tbody>[/TABLE]
[TABLE="class: grid"]
<tbody>[TR]
[TD]Sheet: Sheet1[/TD]
[/TR]
</tbody>[/TABLE]
@Yongle - This works when there is only instance of a null or a "C-" in the cell. However, often time there are multiple instances of this occurring. Is there a way to amend the formula to account for multiples instead of stopping at the first instance?
 
Upvote 0
Is there a way to amend the formula to account for multiples instead of stopping at the first instance?
SEARCH 3rd attribute allows search to begin at specified character
So it may be possible to use an array formula to extract all instances
Sadly array formulas are not my strong suit - hopefully someone else reading this thread can crack that for you.
 
Upvote 0
I think this UDF (user defined function) will do what you want...
Code:
[table="width: 500"]
[tr]
	[td]Function CodeNullMinus(S As String) As String
  Dim X As Long, Parts() As String
  Parts = Split(Replace(S, " ", Chr(1)), "{""code"":", , vbTextCompare)
  For X = 0 To UBound(Parts)
    If Left(Parts(X), 3) <> """C-" And LCase(Left(Parts(X), 4)) <> "null" Then
      Parts(X) = ""
    End If
  Next
  CodeNullMinus = "{""code"":" & Replace(Replace(Replace(Replace(Application.Trim(Join(Parts)), ", ", " "), " ", ",{""code"":"), Chr(1), " "), "[,", "[")
  If CodeNullMinus Like "*," Then CodeNullMinus = Left(CodeNullMinus, Len(CodeNullMinus) - 1)
End Function[/td]
[/tr]
[/table]

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use CodeNullMinus just like it was a built-in Excel function. For example,

=CodeNullMinus(A1)

If you are using XL2007 or above, make sure you save your file as an "Excel Macro-Enabled Workbook (*.xlsm) and answer the "do you want to enable macros" question as "yes" or "OK" (depending on the button label for your version of Excel) the next time you open your workbook.
 
Upvote 0
Hi

Another option:

Code:
Function ExtractRecords(s As String) As String
With CreateObject("VBScript.RegExp")
    .Pattern = ",?{""code"":(?!(null|""C-))[^}]*,?"
    .Global = True
    ExtractRecords = .Replace(s, "")
End With
End Function
 
Upvote 0

Forum statistics

Threads
1,223,903
Messages
6,175,284
Members
452,630
Latest member
OdubiYouth

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top