Extract # and @ from text in cell

MarlonvanDijk

New Member
Joined
Sep 23, 2019
Messages
4
Hello,

I have a large dataset with tweets in them. These are displayed in a single cell as text. I want to extract all hashtags and tags in this tweet and display them in their own cells. For example:

"My bilateral programme will comprise meetings with PM @GakhariaGiorgi and FM @DZalkaliani as well with the chairman of the #Parliament of #Georgia " would give" - @GakhariaGiorgi @DZalkaliani #Parliament #Georgia . Preferably all in their own cell.

I have already tried the following code, but this gave me an error. https://www.mrexcel.com/forum/excel...ing-mentions-hashtags-column-columns-b-c.html

Any help would be very much appreciated.
 

Excel Facts

When they said...
When they said you are going to "Excel at life", they meant you "will be doing Excel your whole life".
Hi
what about
Code:
Sub testy()
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .Pattern = "(#+)+(\w+)"
        Set m = .Execute(Range("a1"))
        ReDim x(1 To m.Count)
        For i = 0 To m.Count - 1
            x(i + 1) = m.Item(i)
        Next
        Cells(1, 2).Resize(UBound(x)) = Application.Transpose(x)
    End With
End Sub
 
Upvote 0
Or may be
Code:
Sub testy()
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .Pattern = "(#+)+(\w+)"
        Set m = .Execute(Range("a1"))
        ReDim x(1 To m.Count)
        For i = 0 To m.Count - 1
            x(i + 1) = m.Item(i)
        Next
        Cells(1, 2).Resize(UBound(x)) = Application.Transpose(x)
        .Pattern = "(\@+\w+)"
        Set m = .Execute(Range("a1"))
        ReDim x(1 To m.Count)
        For i = 0 To m.Count - 1
            x(i + 1) = m.Item(i)
        Next
        Cells(1, 3).Resize(UBound(x)) = Application.Transpose(x)
    End With
End Sub
 
Last edited:
Upvote 0
Welcome to the Board.

It can be done with formulas too. If your cell is A1, put this formula in B1, then drag right as far as needed:

=IFERROR(TRIM(LEFT(SUBSTITUTE(MID($A1,FIND("|",SUBSTITUTE(SUBSTITUTE($A1,"#","@"),"@","|",COLUMNS($B1:B1))),LEN($A1))," ",REPT(" ",LEN($A1))),LEN($A1))),"")
 
Upvote 0
Try this

<table border="1" cellspacing="0" style="font-family:Calibri,Arial; font-size:11pt; background-color:#ffffff; "> <colgroup><col style="font-weight:bold; width:30px; " /><col style="width:290.85px;" /><col style="width:144.48px;" /><col style="width:112.16px;" /><col style="width:103.6px;" /><col style="width:101.7px;" /></colgroup><tr style="background-color:#cacaca; text-align:center; font-weight:bold; font-size:8pt; "><td > </td><td >A</td><td >B</td><td >C</td><td >D</td><td >E</td></tr><tr style="height:74px ;" ><td style="font-size:8pt; background-color:#cacaca; text-align:center; " >1</td><td >My bilateral programme will comprise meetings with PM @GakhariaGiorgi and FM @DZalkaliani as well with the chairman of the #Parliament of #Georgia " would give</td><td >@GakhariaGiorgi</td><td >@DZalkaliani</td><td >#Parliament </td><td >#Georgia </td></tr></table>

Code:
Sub Extract_tweets()
  Dim t, n As Long
  n = 2
  For Each t In Split(Range("A1").Value)
    If InStr(1, t, "@") > 0 Or InStr(1, t, "#") Then
      Cells(1, n).Value = Trim(t)
      n = n + 1
    End If
  Next
End Sub
 
Upvote 0
Thanks for all your help! It is much appreciated.

I used the method by Eric W, since this is most intuitive for me. It worked perfectly.
I quickly ran the other methods to see it they worked. It seems like they only work correctly on the first row, but after this doesn't. I only tried this,very quickly so maybe I did something wrong there.
Anyways, thanks again very much!
 
Upvote 0
Thanks for all your help! It is much appreciated.

I used the method by Eric W, since this is most intuitive for me. It worked perfectly.
I quickly ran the other methods to see it they worked. It seems like they only work correctly on the first row, but after this doesn't. I only tried this,very quickly so maybe I did something wrong there.
Anyways, thanks again very much!

The macro works only for the first cell.
I put the updated macro to work with all the rows in column A. Change 1 In A1 through the row where your data begins

Code:
Sub Extract_tweets()
  Dim t, n As Long, c As Range
  For Each c In Range("[COLOR=#ff0000]A1[/COLOR]", Range("A" & Rows.Count).End(xlUp))
    Range(Cells(c.Row, 2), Cells(c.Row, Columns.Count)).ClearContents
    n = 2
    For Each t In Split(c.Value)
      If InStr(1, t, "@") > 0 Or InStr(1, t, "#") Then
        Cells(c.Row, n).Value = Trim(t)
        n = n + 1
      End If
    Next
  Next
End Sub
 
Upvote 0

Forum statistics

Threads
1,223,893
Messages
6,175,246
Members
452,623
Latest member
cliftonhandyman

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top